Spanish Speaking Styles Corpora

Spanish Speaking Styles (SSS) is a Spanish multi-style corpus designed specifically for training and testing speaking-style TTS systems, including genre-detection algorithms, style-dependent synthesis models and style transplantation algorithms. SSS contains a set of high-quality studio recordings of an actor performing a set of prosodically diverse speaking styles and their associated text genres.

The recorded styles of SSS are: broadcast news, interviews, political speeches, and live sport commentaries. The corpus contains about 1 hour of speech per style. Broadcast news scripts are adapted from the “Agencia EFE” website. Political speech scripts reproduce three different speeches given by former Spanish presidents. Both news and political speeches were fully prompted and recorded in paragraphs. Live sport scripts were lightly prompted from the original broadcasters' commentary of the recorded match, but were recorded continuously with a high degree of improvisation. Interviews were fully unscripted and also recorded continuously. The improvised styles (e.g. interviews and sports) were hand-labelled and segmented offline. For all four styles, three types of pauses are labelled for better prosody modeling: intra-sentence pauses, inter-sentence pauses, and filled pauses. The WAV files are sampled at 48 kHz with 16 bits per sample. The TXT files are available in ISO-8859 format.
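As a quick sanity check after obtaining the files, the stated audio format (48 kHz, 16 bits per sample) can be verified and a transcript decoded with a few lines of Python. This is only a minimal sketch: the function names are hypothetical, and it assumes that "ISO-8859" refers to ISO-8859-1 (Latin-1), which the description does not state explicitly.

```python
import wave

# Values taken from the corpus description.
EXPECTED_RATE_HZ = 48_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample


def check_wav_format(path):
    """Return (sample_rate_hz, bits_per_sample, duration_s, matches_spec)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        frames = wav.getnframes()
    duration = frames / rate if rate else 0.0
    matches = rate == EXPECTED_RATE_HZ and width == EXPECTED_SAMPLE_WIDTH_BYTES
    return rate, width * 8, duration, matches


def read_transcript(path, encoding="iso-8859-1"):
    """Read a transcript file.

    Assumption: the ISO-8859 part is ISO-8859-1; pass another
    encoding (e.g. "iso-8859-15") if that turns out to be wrong.
    """
    with open(path, encoding=encoding) as handle:
        return handle.read()
```

Calling `check_wav_format` on a sample file returns the measured rate, bit depth and duration together with a flag that is `False` when the file deviates from the documented format, for example after an unintended resampling step.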

For each of the 4 styles, 2 samples (the text and the corresponding speech) are available below:
* broadcast news: [ Text 1 Wav 1 ], [ Text 2 Wav 2 ];
* interviews: [ Text 1 Wav 1 ], [ Text 2 Wav 2 ];
* political speech: [ Text 1 Wav 1 ], [ Text 2 Wav 2 ];
* sport commentary: [ Text 1 Wav 1 ], [ Text 2 Wav 2 ];

The SSS database is the property of Universidad Politécnica de Madrid, Departamento de Ingeniería Electrónica, Grupo de Tecnología del Habla. The user is not authorised to copy the SSS database or to distribute it to a third party. The user is authorised to use the database for research and publication purposes only. Any commercial or industrial use of the database or derivative works, and any commercial or industrial use of the results obtained from the use of the database, is explicitly forbidden.

Any publication using this corpus must include a reference to the paper:
J. Lorenzo-Trueba et al.
“Development of a Genre-Dependent TTS System with Cross-Speaker Speaking-Style Transplantation”, Proceedings of the 2nd International Workshop on Speech, Language and Audio in Multimedia (SLAM2014), pp. 39-42, 2014.

