Publications

Public deliverables

Internship reports

 

Academic publications

Julian David Echeverry-Correa, J. Ferreiros-López, A. Coucheiro-Limeres, R. Córdoba, Juan M Montero (2015): Topic identification techniques applied to dynamic language model adaptation for automatic speech recognition. In: Expert Systems with Applications, 42 (1), pp. 101–112, 2015, ISSN: 0957-4174. (Type: Article | Abstract | Links | BibTeX | Tags: genre identification, language model, speech recognition)
Onur Babacan, Thomas Drugman, Tuomo Raitio, Daniel Erro, Thierry Dutoit (2014): Parametric Representation for Singing Voice Synthesis: A Comparative Evaluation. In: Proc. ICASSP 2014, 2014. (Type: Inproceeding | BibTeX | Tags: Parametric Representation, Singing Voice, Synthesis, vocoder)
Gilles Degottex, John Kane, Thomas Drugman, Tuomo Raitio, Stefan Scherer (2014): COVAREP - A collaborative voice analysis repository for speech technologies. In: Proc. ICASSP 2014, 2014. (Type: Inproceeding | BibTeX | Tags: glottal source, sinusoidal modeling, spectral envelope, Speech processing, toolkit, voice quality)
Thomas Drugman, Tuomo Raitio (2014): Excitation Modeling for HMM-based Speech Synthesis: Breaking Down the Impact of Periodic and Aperiodic Components. In: Proc. ICASSP 2014, 2014. (Type: Inproceeding | BibTeX | Tags: excitation modeling, glottal flow, HMM-based speech synthesis, residual signal)
József Domokos, Adriana Stan, Mircea Giurgiu (2014): An Approach to Lexical Stress Detection from Transcribed Continuous Speech Using Acoustic Features. In: Proc. Telfor2014, 2014. (Type: Inproceeding | BibTeX | Tags: lexical stress, Speech synthesis)
Jaime Lorenzo-Trueba, Roberto Barra-Chicote, Junichi Yamagishi, Juan M. Montero (2014): Towards Cross-lingual Emotion Transplantation. In: Proc. Iberspeech 2014, 2014. (Type: Inproceeding | BibTeX | Tags: cross-lingual, emotion transplantation, expressive speech synthesis)
Antti Suni, Tuomo Raitio, Dhananjaya Gowda, Reima Karhila, Matt Gibson, Oliver Watts (2014): The Simple4All entry to the Blizzard Challenge 2014. In: Proc. Blizzard Challenge 2014 Workshop, Singapore, 2014. (Type: Inproceeding | Links | BibTeX | Tags: Deep neural network, glottal flow pulse library, glottal inverse filtering, statistical parametric speech synthesis, unsupervised learning, vector space model)
A. Gallardo-Antolín, J.M. Montero, S. King (2014): A Comparison of Open-Source Segmentation Architectures for Dealing with Imperfect Data from the Media in Speech Synthesis. In: Proc. Interspeech 2014, 2014. (Type: Inproceeding | Abstract | BibTeX | Tags: expressive speech synthesis, speaker diarization, speaking styles, Speech synthesis)
Ling-Hui Chen, Tuomo Raitio, Cassia Valentini-Botinhao, Junichi Yamagishi, Zhen-Hua Ling (2014): DNN-based stochastic postfilter for HMM-based speech synthesis. In: Proc. Interspeech 2014, pp. 1954-1958, Singapore, 2014. (Type: Inproceeding | Links | BibTeX | Tags: DNN, HMM, modulation spectrum, postfilter, segmental quality, Speech synthesis)
Thomas Merritt, Tuomo Raitio, Simon King (2014): Investigating source and filter contributions, and their interaction, to statistical parametric speech synthesis. In: Proc. Interspeech 2014, pp. 1509-1513, Singapore, 2014. (Type: Inproceeding | Links | BibTeX | Tags: GlottHMM, hidden Markov modelling, source filter interaction, source filter model, Speech synthesis)
Tuomo Raitio, Antti Suni, Lauri Juvela, Martti Vainio, Paavo Alku (2014): Deep neural network based trainable voice source model for synthesis of speech with varying vocal effort. In: Proc. Interspeech 2014, pp. 1969-1973, Singapore, 2014. (Type: Inproceeding | Links | BibTeX | Tags: Deep neural network, DNN, glottal flow, Speech synthesis, Vocal effort, voice source modelling)
Manu Airaksinen, Paavo Alku (2014): Parameterization of the glottal source with the phase plane plot. In: Proc. Interspeech 2014, pp. 96-99, ISCA, 2014. (Type: Inproceeding | Abstract | BibTeX | Tags: glottal source, phase plane)
Manu Airaksinen, Tom Bäckström, Paavo Alku (2014): Automatic estimation of the lip radiation effect in glottal inverse filtering. In: Proc. Interspeech 2014, pp. 398-402, 2014. (Type: Inproceeding | Abstract | BibTeX | Tags: glottal inverse filtering, glottal source)
Jouni Pohjalainen, Paavo Alku (2014): Filtering and subspace selection for spectral features in detecting speech under physical stress. In: Proc. Interspeech 2014, pp. 432-436, ISCA, 2014. (Type: Inproceeding | Abstract | BibTeX | Tags: physical stress, spectral features)
Dhananjaya Gowda, Heikki Kallasjoki, Reima Karhila, Cristian Contan, Kalle Palomäki, Mircea Giurgiu, Mikko Kurimo (2014): On the role of missing data imputation and NMF feature enhancement in building synthetic voices using reverberant speech. In: Proc. Interspeech 2014, pp. 2947-2951, ISCA, Singapore, 2014, ISSN: 1990-9770. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: dereverberation, missing data methods, nonnegative matrix factorization, robust speech synthesis)
J. Lorenzo-Trueba, J. D. Echeverry-Correa, R. Barra-Chicote, R. San-Segundo, J. Ferreiros, A. Gallardo-Antolín, J. Yamagishi, S. King, J. M. Montero (2014): Development of a Genre-Dependent TTS System with Cross-Speaker Speaking-Style Transplantation. In: ISCA/IEEE Workshop on Speech, Language and Audio in Multimedia (SLAM) 2014, 2014. (Type: Inproceeding | Abstract | BibTeX | Tags: expressive speech synthesis, genre detection, speaking styles, TTS)
Tuomo Raitio, Heng Lu, John Kane, Antti Suni, Martti Vainio, Simon King, Paavo Alku (2014): Voice source modelling using deep neural networks for statistical parametric speech synthesis. In: Proc. of the 22nd European Signal Processing Conference (EUSIPCO), Lisbon, Portugal, 2014. (Type: Inproceeding | Links | BibTeX | Tags: Deep neural network, DNN, glottal flow, statistical parametric speech synthesis, voice source modelling)
Stig-Arne Grönroos, Sami Virpioja, Peter Smit, Mikko Kurimo (2014): Morfessor FlatCat: An HMM-Based Method for Unsupervised and Semi-Supervised Learning of Morphology. In: Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics, pp. 1177–1185, Dublin City University and Association for Computational Linguistics, Dublin, Ireland, 2014. (Type: Inproceeding | Links | BibTeX | Tags: HMM, machine learning, morfessor, morphology)
Jouni Pohjalainen, Cemal Hanilci, Tomi Kinnunen, Paavo Alku (2014): Mixture linear prediction in speaker verification under vocal effort mismatch. In: IEEE Signal Processing Letters, 21 (12), pp. 1516-1520, 2014. (Type: Article | Abstract | BibTeX | Tags: )
Jouni Pohjalainen, Paavo Alku (2014): Gaussian mixture linear prediction. In: Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pp. 6285 - 6289, IEEE, 2014. (Type: Inproceeding | Abstract | BibTeX | Tags: weighted linear prediction)
Jouni Pohjalainen, Paavo Alku (2014): Multi-scale modulation filtering in automatic detection of emotions in telephone speech. In: Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pp. 980 - 984, IEEE, 2014. (Type: Inproceeding | Abstract | BibTeX | Tags: emotions)
Peter Smit, Sami Virpioja, Stig-Arne Grönroos, Mikko Kurimo (2014): Morfessor 2.0: Toolkit for statistical morphological segmentation. 2014. (Type: | Links | BibTeX | Tags: morfessor, morphological segmentation)
Thomas Drugman, Paavo Alku, Abeer Alwan, Bayya Yegnanarayana (2014): Glottal source processing: from analysis to applications. In: Computer Speech and Language, 28 (5), pp. 1117-1138, 2014. (Type: Article | Abstract | BibTeX | Tags: review on glottal inverse filtering)
Manu Airaksinen, Tuomo Raitio, Brad Story, Paavo Alku (2014): Quasi closed phase glottal inverse filtering analysis with weighted linear prediction. In: IEEE Transactions on Audio, Speech, and Language Processing, 22 (3), pp. 596-607, 2014. (Type: Article | Abstract | Links | BibTeX | Tags: glottal inverse filtering, weighted linear prediction)
Beatriz Martínez-González, Jose Manuel Pardo, J.D. Echeverry-Correa, J. M. Montero (2014): New experiments on speaker diarization for unsupervised speaking style voice building for speech synthesis. In: Procesamiento del Lenguaje Natural, 52 (0), pp. 77-84, 2014, ISSN: 1989-7553. (Type: Article | Abstract | Links | BibTeX | Tags: )
Wei Zhang, Robert A. J. Clark, Yongyuan Wang (2014): Unsupervised Language Filtering using the Latent Dirichlet Allocation. In: Proc. Interspeech 2014, pp. 1268–1272, 2014. (Type: Inproceeding | BibTeX | Tags: )
Susana Palmaz López-Peláez, Robert A. J. Clark (2014): Speech synthesis reactive to dynamic noise environmental conditions. In: Proc. Interspeech 2014, pp. 2927–2931, 2014. (Type: Inproceeding | BibTeX | Tags: )
Chetana Prakash, Dhananjaya Gowda, Suryakanth Gangashetty (2013): Analysis of Acoustic Events in Speech Signals Using Bessel Series Expansion. In: Circuits, Systems, and Signal Processing, 32 (6), pp. 2915-2938, 2013, ISSN: 1531-5878. (Type: Article | Abstract | Links | BibTeX | Tags: bessel series expansion, glottal closure instants, speech analysis, vowel onset time)
Sami Virpioja, Peter Smit, Stig-Arne Grönroos, Mikko Kurimo (2013): Morfessor 2.0: Python Implementation and Extensions for Morfessor Baseline. Aalto University (SCIENCE + TECHNOLOGY, 25/2013), 2013, ISSN: 1799-490X. (Type: Techreport | Abstract | Links | BibTeX | Tags: machine learning, morfessor, morpheme segmentation, morphology induction, semi-supervised learning, unsupervised learning)
Antti Suni, Daniel Aalto, Tuomo Raitio, Paavo Alku, Martti Vainio (2013): Wavelets for intonation modeling in HMM speech synthesis. In: Proc. 8th ISCA Speech Synthesis Workshop, 2013. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: HMM-based synthesis, intonation modeling, wavelet decomposition)
Dhananjaya Gowda, Jouni Pohjalainen, Paavo Alku, Mikko Kurimo (2013): Robust Spectral Representation Using Group Delay Function and Stabilized Weighted Linear Prediction for Additive Noise Degradations. In: Speech Technology and Human-Computer Dialogue, SpeD 2013, IEEE, 2013. (Type: Inproceeding | Abstract | BibTeX | Tags: frequency weighted segmental SNR, group delay function, robust spectrum estimation, stabilized weighted linear prediction)
Ioana Muresan, Adriana Stan, Mircea Giurgiu, Rodica Potolea (2013): Evaluation of Sentiment Polarity Prediction using a Dimensional and a Categorical Approach. In: Speech Technology and Human-Computer Dialogue, 2013. (Type: Inproceeding | Links | BibTeX | Tags: sentiment polarity, statistic metrics, VAD model)
Jouni Pohjalainen, Paavo Alku (2013): Extended weighted linear prediction using the autocorrelation snapshot - A robust speech analysis method and its application to recognition of vocal emotions. In: Proc. Interspeech 2013, 2013. (Type: Inproceeding | BibTeX | Tags: Robust emotion detection)
Manu Airaksinen, Brad Story, Paavo Alku (2013): Quasi closed phase analysis for glottal inverse filtering. In: Proc. Interspeech 2013, 2013. (Type: Inproceeding | BibTeX | Tags: Quasi closed phase)
S. Lebai Lutfi, F. Fernández-Martínez, J. Lorenzo-Trueba, R. Barra-Chicote, J. M. Montero (2013): I Feel You: The Design and Evaluation of a Domotic Affect-Sensitive Spoken Conversational Agent. In: Sensors, 13 (8), pp. 10519-10538, 2013, ISSN: 1424-8220. (Type: Article | Abstract | Links | BibTeX | Tags: affective agent, emotional speech, Speech synthesis, spoken conversational agents; evaluation)
J. Lorenzo-Trueba, R. Barra-Chicote, J. Yamagishi, O. Watts, J. M. Montero (2013): Towards Speaking Style Transplantation in Speech Synthesis. In: Proc. 8th ISCA Speech Synthesis Workshop, 2013. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: Adaptation, expressive speech synthesis, speaking styles, transplantation)
J. Lorenzo-Trueba, R. Barra-Chicote, J. Yamagishi, O. Watts, J.M. Montero (2013): Evaluation of a Transplantation Algorithm for Expressive Speech Synthesis. In: Proceedings of Workshop en Tecnologías Accesibles, IV Congreso Español de Informática CEDI2013, 2013. (Type: Inproceeding | Abstract | BibTeX | Tags: expressive speech synthesis, transplantation)
S. Lebai Lutfi, F. Fernández-Martínez, J. M. Lucas-Cuesta, López-Lebón, J. M. Montero (2013): A Satisfaction-based Model for Affect Recognition from Conversational Features in Spoken Dialog Systems. In: Speech Communication, 55 (7-8), pp. 825–840, 2013, ISSN: 0167-6393. (Type: Article | Abstract | Links | BibTeX | Tags: affect prediction, emotional speech)
Verónica López-Ludeña, Roberto Barra-Chicote, Syaheerah Lutfi, Juan Manuel Montero, Rubén San-Segundo (2013): LSESpeak: A spoken language generator for Deaf people. In: Expert Systems with Applications, 40 (4), pp. 1283–1295, 2013, ISSN: 0957-4174. (Type: Article | Abstract | Links | BibTeX | Tags: affective agent, emotional speech synthesis)
Oliver Watts, Adriana Stan, Rob Clark, Yoshitaka Mamiya, Mircea Giurgiu, Junichi Yamagishi, Simon King (2013): Unsupervised and lightly supervised learning for rapid construction of TTS systems in multiple languages from 'found' data: evaluation and analysis. In: Proc. 8th ISCA Speech Synthesis Workshop, 2013. (Type: Inproceeding | Links | BibTeX | Tags: audiobook data, multilingual speech synthesis, text-to-speech, unsupervised learning, vector space model)
Yoshitaka Mamiya, Adriana Stan, Junichi Yamagishi, Peter Bell, Oliver Watts, Robert Clark, Simon King (2013): Using Adaptation to Improve Speech Transcription Alignment in Noisy and Reverberant Environments. In: Proc. 8th ISCA Speech Synthesis Workshop, 2013. (Type: Inproceeding | Links | BibTeX | Tags: adaptive training, CMLLR, MAP, speech alignment, speech segmentation, VAD)
Dhananjaya Gowda, Mikko Kurimo (2013): Analysis of breathy, modal and pressed phonation based on low frequency spectral density. In: Proc. Interspeech 2013, 2013. (Type: Inproceeding | Abstract | BibTeX | Tags: amplitude quotient, breathy voice, glottal features, harmonic-to-noise ratio, low-frequency spectral density, Phonation, pressed voice)
Dhananjaya Gowda, Jouni Pohjalainen, Mikko Kurimo, Paavo Alku (2013): Robust formant detection using group delay function and stabilized weighted linear prediction. In: Proc. Interspeech 2013, 2013. (Type: Inproceeding | Abstract | BibTeX | Tags: formant detection, group delay, robust spectrum estimation, stabilized weighted linear prediction, SWLP)
Adriana Stan, Peter Bell, Junichi Yamagishi, Simon King (2013): Lightly Supervised Discriminative Training of Grapheme Models for Improved Sentence-level Alignment of Speech and Text Data. In: Proc. Interspeech 2013, 2013. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: automatic alignment, discriminative training, grapheme models, light supervision)
Adriana Stan, Oliver Watts, Yoshitaka Mamiya, Mircea Giurgiu, Rob Clark, Junichi Yamagishi, Simon King (2013): TUNDRA: A Multilingual Corpus of Found Data for TTS Research Created with Light Supervision. In: Proc. Interspeech 2013, 2013. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: audiobook data, found data, imperfect data, light supervision, multilingual corpus, text-to-speech)
Paavo Alku, Jouni Pohjalainen, Martti Vainio, Anne-Maria Laukkanen, Brad Story (2013): Formant frequency estimation of high-pitched vowels using weighted linear prediction. In: Journal of the Acoustical Society of America, 134 (2), pp. 1295-1313, 2013. (Type: Article | Links | BibTeX | Tags: formants, high pitch)
Yoshitaka Mamiya, Junichi Yamagishi, Oliver Watts, Robert A. J. Clark, Simon King, Adriana Stan (2013): Lightly Supervised GMM VAD to Use Audiobook for Speech Synthesiser. In: Proc. ICASSP 2013, pp. 7987-7991, 2013. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: audiobook, HMM-based speech synthesis, lightly supervised, voice activity detection)
Jouni Pohjalainen, Paavo Alku (2013): Robust speech analysis by lag-weighted linear prediction. In: Proc. ICASSP 2012, 2013, ISSN: 1520-6149. (Type: Inproceeding | BibTeX | Tags: robust linear prediction)
Bajibabu Bollepalli, Tuomo Raitio, Paavo Alku (2013): Effect of MPEG Audio Compression on HMM-based Speech Synthesis. In: Proc. Interspeech 2013, 2013. (Type: Inproceeding | Abstract | BibTeX | Tags: GlottHMM, HMM, MP3, Speech synthesis)
Harri Auvinen, Tuomo Raitio, Manu Airaksinen, Samuli Siltanen, Brad Story, Paavo Alku (2013): Automatic Glottal Inverse Filtering with the Markov Chain Monte Carlo Method. In: Computer Speech and Language, 2013, (In press). (Type: Article | Abstract | Links | BibTeX | Tags: glottal inverse filtering, Markov chain Monte Carlo)
Tuomo Raitio, Antti Suni, Jouni Pohjalainen, Manu Airaksinen, Martti Vainio, Paavo Alku (2013): Analysis and Synthesis of Shouted Speech. In: Proc. Interspeech 2013, 2013. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: shouting, speech analysis, Speech synthesis)
Tuomo Raitio, John Kane, Thomas Drugman, Christer Gobl (2013): HMM-based synthesis of creaky voice. In: Proc. Interspeech 2013, 2013. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: Contextual Factors, Creaky voice, excitation modeling, F0 estimation, Speech synthesis)
Antti Suni, Reima Karhila, Tuomo Raitio, Mikko Kurimo, Martti Vainio, Paavo Alku (2013): Lombard Modified Text-to-Speech Synthesis for Improved Intelligibility: Submission for the Hurricane Challenge 2013. In: Proc. Interspeech 2013, 2013. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: GlottHMM, Hurricane challenge, intelligibility, Lombard speech, Speech synthesis)
Jouni Pohjalainen, Tuomo Raitio, Santeri Yrttiaho, Paavo Alku (2013): Detection of shouted speech in noise: human and machine. In: Journal of the Acoustical Society of America, 133 (4), pp. 2377-2389, 2013. (Type: Article | Links | BibTeX | Tags: shouting)
Jouni Pohjalainen, Paavo Alku (2013): Automatic detection of anger in telephone speech with robust autoregressive modulation filtering. In: Proc. ICASSP 2013, 2013. (Type: Inproceeding | BibTeX | Tags: Anger detection)
Reima Karhila, Ulpu Remes, Mikko Kurimo (2013): HMM-Based Speech Synthesis Adaptation Using Noisy Data: Analysis and Evaluation Methods. In: Proceedings of ICASSP-13, 2013, (Accepted to ICASSP 2013). (Type: Inproceeding | Abstract | BibTeX | Tags: Adaptation, Evaluation, Feature extraction, Noise robustness, Speech synthesis)
Tuomo Raitio, Antti Suni, Martti Vainio, Paavo Alku (2013): Comparing Glottal-Flow-Excited Statistical Parametric Speech Synthesis Methods. In: Proc. ICASSP 2013, 2013. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: excitation, glottal flow, principal component analysis, pulse library, statistical parametric speech synthesis)
Thomas Drugman, John Kane, Tuomo Raitio, Christer Gobl (2013): Prediction of Creaky Voice from Contextual Factors. In: Proc. ICASSP 2013, 2013. (Type: Inproceeding | Abstract | BibTeX | Tags: Contextual Factors, Creaky voice, Expressive Speech, Speech synthesis)
Tuomo Raitio, Antti Suni, Martti Vainio, Paavo Alku (2013): Synthesis and Perception of Breathy, Normal, and Lombard Speech in the Presence of Noise. In: Special issue of Computer Speech and Language on 'The Listening Talker', 2013. (Type: Article | Abstract | Links | BibTeX | Tags: Adaptation, Breathy speech, intelligibility, Lombard speech, statistical parametric speech synthesis, Vocal effort)
Bajibabu Bollepalli, Jerome Urbain, Tuomo Raitio, Joakim Gustafson, Huseyin Cakmak (2013): A Comparative Evaluation of Vocoding Techniques for HMM-based Laughter Synthesis. In: Proc. ICASSP 2013, 2013. (Type: Inproceeding | BibTeX | Tags: DSM, GlottHMM, HMM, HTS, Laughter synthesis, mel-cepstrum, STRAIGHT, vocoder)
Adriana Stan, Peter Bell, Simon King (2012): A Grapheme-based Method for Automatic Alignment of Speech and Text Data. In: Proc. Spoken Language Technology Workshop (SLT), 2012 IEEE, pp. 286 - 290, 2012. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: grapheme-based models, imperfect transcripts, speech alignment, word networks)
Jaime Lorenzo-Trueba, Oliver Watts, Roberto Barra-Chicote, Junichi Yamagishi, Simon King, Juan M Montero (2012): Simple4All proposals for the Albayzin Evaluations in Speech Synthesis. In: Proc. Iberspeech 2012, 2012. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: Albayzin challenge, expressive speech synthesis)
V. López-Ludeña, R. San-Segundo, J. M. Montero, R. Barra-Chicote, J. Lorenzo (2012): Architecture for Text Normalization using Statistical Machine Translation techniques. In: Proc. Iberspeech 2012, 2012. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: Abbreviations, Acronyms, language translation, numbers, text normalization, text to speech conversion)
Janne Pylkkönen, Mikko Kurimo (2012): Analysis of Extended Baum-Welch and Constrained Optimization for Discriminative Training of HMMs. In: IEEE Transactions on Audio, Speech and Language Processing, 20 (9), pp. 2409-2419, 2012, ISSN: 1558-7916. (Type: Article | Links | BibTeX | Tags: discriminative training, HMMs)
Tuomo Raitio, Marko Takanen, Olli Santala, Antti Suni, Martti Vainio, Paavo Alku (2012): On measuring the intelligibility of synthetic speech in noise – Do we need a realistic noise environment? In: Proc. ICASSP 2012, pp. 4025-4028, IEEE, 2012, ISSN: 1520-6149. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: intelligibility, Lombard speech, multichannel reproduction, speech in noise)
Harri Auvinen, Tuomo Raitio, Samuli Siltanen, Paavo Alku (2012): Utilizing Markov Chain Monte Carlo (MCMC) Method for Improved Glottal Inverse Filtering. In: Proc. Interspeech 2012, 2012, ISSN: 1990-9770. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: glottal inverse filtering, Markov chain Monte Carlo, MCMC)
Paavo Alku, Jouni Pohjalainen, Martti Vainio, Anne-Maria Laukkanen, Brad Story (2012): Improved formant frequency estimation from high-pitched vowels by downgrading the contribution of the glottal source with weighted linear prediction. In: Proc. Interspeech 2012, 2012, ISSN: 1990-9770. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: formants, linear prediction, weighted linear prediction)
Alan Pinheiro, Tuomo Raitio, Danyane Gomes, Paavo Alku (2012): Voice source analysis using biomechanical modeling and glottal inverse filtering. In: Proc. Interspeech 2012, 2012, ISSN: 1990-9770. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: biomechanical simulation, glottal flow, glottal inverse filtering)
Antti Suni, Tuomo Raitio, Martti Vainio, Paavo Alku (2012): The GlottHMM Entry for Blizzard Challenge 2012: Hybrid Approach. In: Proc. of the Blizzard Challenge 2012 Workshop, 2012. (Type: Inproceeding | Abstract | BibTeX | Tags: glottal inverse filtering, glottal flow pulse library, hybrid, statistical parametric speech synthesis)
Jaime Lorenzo-Trueba, Roberto Barra-Chicote, Tuomo Raitio, Nicolas Obin, Paavo Alku, Junichi Yamagishi, Juan M Montero (2012): Towards Glottal Source Controllability in Expressive Speech Synthesis. In: Proc. Interspeech 2012, Portland (Oregon), USA, 2012, ISSN: 1990-9772. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: expressive speech synthesis, glottal source modeling, speaking styles)
Ruben San-Segundo, Juan M. Montero, Veronica Lopez-Ludeña, Simon King (2012): Detecting Acronyms from Capital Letter Sequences in Spanish. In: Proc. Interspeech 2012, 2012, ISSN: 1990-9772. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: Abbreviations, Acronyms, Capital letter sequence pronunciation, Spanish, Speech synthesis, Spelling)
J. Lorenzo, B. Martinez, R. Barra-Chicote, V. Lopez-Ludena, J. Ferreiros, J. Yamagishi, J.M. Montero (2012): Towards an Unsupervised Speaking Style Voice Building Framework: Multi-Style Speaker Diarization. In: Proc. Interspeech 2012, 2012, ISSN: 1990-9772. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: expressive speech synthesis, speaker diarization, speaking styles, voice cloning)
Martti Vainio, Daniel Aalto, Antti Suni, Anja Arnhold, Tuomo Raitio, Henri Seijo, Juhani Jarvikivi, Paavo Alku (2012): Effect of noise type and level on focus related fundamental frequency changes. In: Proc. Interspeech 2012, ISCA, Portland, Oregon, 2012. (Type: Inproceeding | Abstract | BibTeX | Tags: f0, focus, intonation, Lombard speech, noise, prosody)
Anna C. Janska, Erich Schröger, Thomas Jacobsen, Robert A. J. Clark (2012): Asymmetries in the perception of synthesized speech. In: Proc. Interspeech 2012, 2012, ISSN: 1990-9770. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: perceptual evaluation, Speech synthesis)
Tuomo Raitio, Antti Suni, Martti Vainio, Paavo Alku (2012): Wideband Parametric Speech Synthesis Using Warped Linear Prediction. In: Proc. Interspeech 2012, 2012, ISSN: 1990-9770. (Type: Inproceeding | Abstract | Links | BibTeX | Tags: statistical parametric speech synthesis, warped linear prediction, wide-band, WLP)
Jouni Pohjalainen, Tuomo Raitio, Hannu Pulakka, Paavo Alku (2012): Automatic detection of high vocal effort in telephone speech. In: Proc. Interspeech 2012, 2012. (Type: Inproceeding | BibTeX | Tags: high vocal effort)
Reima Karhila, Rama Sanand Doddipatla, Mikko Kurimo, Peter Smit (2012): Creating synthetic voices for children by adapting adult average voice using stacked transformations and VTLN. In: Proc. ICASSP 2012, IEEE, 2012, ISSN: 1520-6149. (Type: Inproceeding | Links | BibTeX | Tags: speaker adaptation, Speech synthesis)
Jouni Pohjalainen, Paavo Alku (2012): Robust speech analysis by lag-weighted linear prediction. In: Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on, pp. 4453 - 4456, IEEE, 2012, ISSN: 1520-6149. (Type: Inproceeding | Abstract | BibTeX | Tags: weighted linear prediction)