SFERA: Archive of Research Products of the University of Ferrara

A new Italian dataset of parallel acoustic and articulatory data

Canevari, C.; Badino, L.; Fadiga, Luciano

2015

Abstract

In this paper we introduce a new Italian dataset consisting of simultaneous recordings of continuous speech and trajectories of important vocal tract articulators (i.e. tongue, lips, incisors) tracked by Electromagnetic Articulography (EMA). It includes more than 500 sentences uttered in citation condition by three speakers, one male (cnz) and two females (lls, olm), for approximately 2 hours of speech material. The dataset has been designed to be large enough and phonetically balanced enough to be used in speech applications (e.g. speech recognition systems). We then test our speaker-dependent articulatory Deep-Neural-Network Hidden-Markov-Model (DNN-HMM) phone recognizer on the data recorded from the cnz speaker. We show that phone recognition results are comparable to those we previously obtained using two well-known British-English datasets with EMA data of equivalent vocal tract articulators. This suggests that the new dataset is an equally useful and coherent resource. The dataset is session 1 of a larger Italian corpus, the Multi-SPeaKing-style-Articulatory (MSPKA) corpus, which includes parallel audio and articulatory data in diverse speaking styles (e.g. read, hyperarticulated and hypoarticulated speech). It is freely available at http://www.mspkacorpus.it for research purposes. The whole corpus will be released in the immediate future.

Year of publication: 2015

Keywords: Acoustic-to-articulatory mapping; Articulatory corpora; Deep-neural-networks; Electromagnetic articulography; Phone recognition

Appears in types: 04.3 Abstract (summary) in conference proceedings in Journal/Volume

Documents in SFERA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11392/2371023
