Corpora

The Tundra Corpus

A corpus of audiobooks in European languages

Spanish Speaking Styles Corpora

Spanish data in several speaking styles

Blizzard 2014 Annotations

Annotations generated for the 2014 Blizzard Challenge

Text Normalisation Datasets

Text datasets in three languages: English, Spanish, and Romanian

Romanian Broadcast News

A speech and text dataset of Romanian broadcast news

Romanian Parliamentary Speeches

Speech and text of Romanian parliamentary speeches