Data-driven design of a sentence list for an articulatory speech corpus

FADIGA, Luciano
2013

Abstract

Articulatory data offers promising developments in our understanding of speech production and advances in speech technologies. However, it is more expensive and difficult to obtain than audio data, which means data collection must be carefully planned. This paper presents a method for designing an articulatory speech corpus comparable to the widely used TIMIT corpus, for languages other than English, using Italian as a case study. This data-driven method searches freely available online text corpora for a set of sentences that provides broad phonetic coverage while remaining small enough to be read in a single session, which is important given the often invasive nature of articulatory data collection. Sentences are first phonemically transcribed and scored by the negative log-likelihood of their triphones, so that sentences with many rare triphones score higher. The search algorithm then finds sentences that have high scores but also contain the most frequent triphones. Experiments show that the distribution of triphones in the automatically selected sentences is similar to that found in hand-constructed sentence sets for English, such as TIMIT, and that the selection offers broader phonetic coverage than random sets of sentences.
Articulatory data; Corpus design; Information theory; Text corpus
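
The scoring and selection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes sentences are already transcribed into phoneme lists, it estimates triphone probabilities as relative frequencies over the candidate corpus, and it uses a simple greedy coverage heuristic as a stand-in for the search algorithm, which the abstract does not specify. All function names are hypothetical.

```python
import math
from collections import Counter


def triphones(phonemes):
    """Return the overlapping triphones of a phoneme sequence."""
    return [tuple(phonemes[i:i + 3]) for i in range(len(phonemes) - 2)]


def triphone_nll(corpus):
    """Map each triphone to -log of its relative frequency in the corpus."""
    counts = Counter(t for sent in corpus for t in triphones(sent))
    total = sum(counts.values())
    return {t: -math.log(c / total) for t, c in counts.items()}


def sentence_score(phonemes, nll):
    """Sum of negative log-likelihoods: rare triphones contribute more."""
    return sum(nll[t] for t in triphones(phonemes))


def greedy_select(corpus, nll, k):
    """Pick up to k sentence indices (assumed heuristic, not the paper's search).

    At each step, take the sentence whose not-yet-covered triphones
    have the highest total negative log-likelihood.
    """
    covered, chosen = set(), []
    remaining = list(range(len(corpus)))
    for _ in range(min(k, len(corpus))):
        def gain(i):
            return sum(nll[t] for t in set(triphones(corpus[i])) - covered)
        best = max(remaining, key=gain)
        chosen.append(best)
        covered |= set(triphones(corpus[best]))
        remaining.remove(best)
    return chosen
```

In this sketch a sentence's score grows with both its length and the rarity of its triphones; a real corpus design would also need to cap the total reading time of the selected set, which is omitted here.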
Use this identifier to cite or link to this document: https://hdl.handle.net/11392/2371032