User Feedback data

This data was collected as a first step in an investigation of spoken user feedback. The purpose is to investigate how well listeners agree on the types of mistakes that a poor-quality synthetic voice makes.

The data includes:

  • synthetic speech waveforms generated from a model set trained on 300 sentences of speech from the “Roger” database used in the Blizzard Challenge 2009;
  • a demonstration web page of the listening test procedure;
  • listening test results. The listening test was run on 200 sentences with 34 listeners. Each listener listened to 100 sentences and classified them into the following categories:
    • a) Ok: Quality is good;
    • b) Ok: It’s not great but it will do;
    • c) Not ok: Mispronunciation of word(s);
    • d) Not ok: Incomprehensible segments;
    • e) Not ok: Bad rhythm or intonation;
    • f) Not ok: Bad audio quality (artifacts etc.).
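
Since the stated purpose is to study listener agreement, the following is a minimal sketch of one way to summarise it: for each sentence, find the most frequently chosen category and the share of listeners who chose it. The input layout (tuples of listener, sentence and category letter), the function name and the toy data are all assumptions made for illustration; they do not reflect the actual file format in the data package, and the sketch assumes one category per judgment. If sentences were distributed evenly, 34 listeners × 100 sentences over 200 sentences would give about 17 judgments per sentence.

```python
from collections import Counter, defaultdict

# The six answer options a) to f) listed above.
CATEGORIES = ("a", "b", "c", "d", "e", "f")

def per_sentence_agreement(judgments):
    """Map each sentence id to (most frequent category, share of votes for it).

    `judgments` is an iterable of (listener_id, sentence_id, category) tuples.
    This layout is hypothetical, not the format of the released result files.
    """
    by_sentence = defaultdict(Counter)
    for _listener, sentence, category in judgments:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        by_sentence[sentence][category] += 1

    result = {}
    for sentence, counts in by_sentence.items():
        # Ties are resolved in favour of the category seen first.
        top_category, top_count = counts.most_common(1)[0]
        result[sentence] = (top_category, top_count / sum(counts.values()))
    return result

# Toy input, invented for illustration only.
toy = [
    ("l1", "s1", "a"), ("l2", "s1", "a"), ("l3", "s1", "e"),
    ("l1", "s2", "c"), ("l2", "s2", "c"),
]
agreement = per_sentence_agreement(toy)
```

A chance-corrected statistic such as Fleiss' kappa would be a natural next step, but it needs the real per-sentence judgment counts from the data package.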

A demonstration web page of the test arrangement is available online (as of October 2014). It is also included with the data.

The synthetic speech samples are made public under the licensing and copyright terms of the Creative Commons Attribution 3.0 Unported License (CC-BY 3.0). The listening test results are free to use, but for any published work a citation to the preliminary report published in Simple4all deliverable 4.3 is expected.

A description of the organisation of the directories is included in the data package.

The dataset is available according to the license described above and may be downloaded from here.