@inproceedings{stan_IS13a,
title = {Lightly Supervised Discriminative Training of Grapheme Models for Improved Sentence-level Alignment of Speech and Text Data},
author = {Adriana Stan and Peter Bell and Junichi Yamagishi and Simon King},
url = {http://consortium.simple4all.org/files/2013/03/master.pdf},
year = {2013},
date = {2013-08-24},
booktitle = {Proc. Interspeech 2013},
abstract = {This paper introduces a method for lightly supervised discriminative training using MMI to improve the alignment of speech and text data for use in training HMM-based TTS systems for low-resource languages. In TTS applications, due to the use of long-span contexts, it is important to select training utterances which have wholly correct transcriptions. In a low-resource setting, when using poorly trained grapheme models, we show that the use of MMI discriminative training at the grapheme level enables us to increase the amount of correctly aligned data by 40%, while maintaining a 7% sentence error rate and 0.8% word error rate. We present the procedure for lightly supervised discriminative training with regard to the objective of minimising sentence error rate.},
keywords = {automatic alignment, discriminative training, grapheme models, light supervision}
}
@article{pylkkonen12taslp,
title = {Analysis of Extended Baum-Welch and Constrained Optimization for Discriminative Training of HMMs},
author = {Janne Pylkkönen and Mikko Kurimo},
url = {http://dx.doi.org/10.1109/TASL.2012.2203805},
issn = {1558-7916},
year = {2012},
date = {2012-10-25},
journal = {IEEE Transactions on Audio, Speech and Language Processing},
volume = {20},
number = {9},
pages = {2409--2419},
keywords = {discriminative training, HMMs}
}