I'm a researcher at the Center for Translational Neurophysiology of Speech and Communication at the Italian Institute of Technology (IIT). I received a PhD in Computer Science from the Centre for Speech Technology Research at the University of Edinburgh (2006-2010). Before moving to Edinburgh I worked as a software engineer/project manager at Loquendo, a speech technology company (2001-2006). I received a 5-year degree (BEng + MEng) in Electronic Engineering from the Università di Genova (1994-2000).

My research interests include text-to-speech (TTS) synthesis, automatic speech recognition (ASR) with a focus on neuro-inspired ASR, machine learning for speech and language processing, and analysis of non-verbal communication.

Since I joined IIT I have been working on the use of speech production information for ASR. I collaborate with the neuroscientists in my lab (Mirror Neurons and Interaction Lab) to 1) investigate the role of the motor cortex in speech perception through neurophysiological and behavioral experiments and 2) build acoustic models that take into account the contribution of motor information.

At IIT I also work on the analysis of non-verbal communication (i.e., sensory-motor communication).

During my PhD I worked on the identification and generation of prosodic prominence patterns for English TTS synthesis, with a particular focus on the mechanisms by which context (i.e., discourse and "sentence" context) affects prosodic prominence and on the automatic detection of discourse-context-related factors (e.g., focus, contrast) from text.

2010-2013 SIEMPRE - EU 7th Framework Programme ICT - FET

2012-2015 POETICON++ - EU 7th Framework Programme ICT - STREP

2015-2018 ECOMODE - H2020-ICT-2014-1


Selected Publications

[35] Badino, L., Franceschi, L., Donini, M., Pontil, M., "A Speaker Adaptive DNN Training Approach for Speaker-independent Acoustic Inversion", in Proc. of Interspeech, Stockholm, Sweden, 2017.

[34] Mukherjee, S., D'Ausilio, A., Nguyen, N., Fadiga, L., Badino, L., "The Relationship between F0 Synchrony and Speech Convergence in Dyadic Interaction", in Proc. of Interspeech, Stockholm, Sweden, 2017.

[33] Badino, L., "Phonetic Context Embeddings for DNN-HMM Phone Recognition", in Proc. of Interspeech, San Francisco, CA, USA, 2016. Code available at

[32] Coco, M.I., Badino, L., Cipresso, P., Chirico, A., Ferrari, E., Riva, G., Gaggioli, A., D’Ausilio, A., "Multilevel behavioral synchronisation in a joint tower-building task", in the IEEE Transactions on Cognitive and Developmental Systems, accepted.

[31] Volpe, G., D’Ausilio A., Badino L., Camurri, A., Fadiga, L., "Measuring social interaction in music ensembles", Philosophical Transactions R. Soc. B, 2016, DOI: 10.1098/rstb.2015.0377.

[30] Badino, L., Canevari, C., Fadiga, L., Metta, G., "Integrating Articulatory Data in Deep Neural Network-based Acoustic Modeling", Computer Speech and Language, vol 36, pp. 173–195, 2016. Some of the code written for this work is available at:

[29] D’Ausilio A., Badino L., Cipresso P., Chirico A., Ferrari E., Riva G., Gaggioli A. "Automatic imitation of the kinematic profile in interacting partners". Cogn Process, In Press, 2015.

[28] D’Ausilio A., Lohan, K., Badino, L., Sciutti A. "Studying Human-Human interaction to build the future of Human-Robot interaction". A. Gaggioli, A. Ferscha, G. Riva, S. Dunne and I. Viaud-Delmon, Eds. Human Computer Confluence: advancing our understanding of the emerging symbiotic relation between humans and computing devices. (pp. xxx). De Gruyter Open: Warsaw, Poland, In Press, 2015.

[27] Badino, L., Mereta, A., Rosasco, L., "Discovering discrete subword units with Binarized Autoencoders and Hidden-Markov-Model Encoders", Proc. of Interspeech, Dresden, Germany, 2015. Code available at

[26] Canevari, C., Badino, L., Fadiga, L., "A new Italian dataset of parallel acoustic and articulatory data", Proc. of Interspeech, Dresden, Germany, 2015. Dataset available at

[25] Bartoli, E., D'Ausilio, A., Berry, J., Badino, L., Bever, T., Fadiga, L., "Listener-speaker perceived distance predicts the degree of motor contribution to speech perception", Cerebral Cortex, vol. 25(2), pp. 281-288, 2015.

[24] Badino, L., Canevari, C., Fadiga, L., Metta, G. "An auto-encoder based approach to unsupervised learning of subword units", Proc. of IEEE ICASSP, Florence, Italy, 2014.

[23] Badino, L., D'Ausilio, A., Glowinski, D., Camurri, A., Fadiga, L. "Sensorimotor communication in professional quartets", Neuropsychologia vol. 55, pp. 98-104, 2014, doi: 10.1016/j.neuropsychologia.2013.11.012

[22] Badino, L., D'Ausilio, A., Fadiga, L., Metta, G. "Computational validation of the motor contribution to speech perception", in Topics in Cognitive Science, 6 (3), 461-475, 2014.

[21] Canevari, C., Badino, L., Fadiga, L., Metta, G., "Cross-corpus and cross-linguistic evaluation of a speaker-dependent DNN-HMM ASR system using EMA data", Workshop on Speech Production for Automatic Speech Recognition, Lyon, France, 2013.

[20] Canevari, C., Badino, L., D'Ausilio, A., Fadiga, L., Metta, G., "Modeling speech imitation and ecological learning of auditory-motor maps", in Frontiers in Psychology, doi: 10.3389/fpsyg.2013.00364, 2013.

[19] Canevari, C., Badino, L., Fadiga, L., Metta, G., "Relevance-weighted reconstruction of articulatory features in Deep Neural Network-based Acoustic-to-Articulatory Mapping", in Proc. of Interspeech, Lyon, France, 2013.

[18] Badino, L., Canevari, C., Fadiga, L., Metta, G. "Deep-Level Acoustic-to-Articulatory Mapping for DBN-HMM Based Phone Recognition", in IEEE SLT 2012, Miami, Florida, 2012 --- we later discovered a bug in the Viterbi decoder. Corrected results are available at ---

[17] Glowinski, D., Badino, L., D'Ausilio, A., Camurri, A., Fadiga, L. "Analysis of Leadership in a String Quartet", in Proceedings of the Third International Workshop on Social behaviour in Music at ACM ICMI 12, Santa Monica, USA, 2012.

[16] Badino, L., Clark, R.A.J., Wester, M., "Towards Hierarchical Prosodic Prominence Generation in TTS Synthesis", in Proc. of Interspeech 2012, Portland, Oregon, 2012.

[15] D'Ausilio, A., Badino, L., Yi, L., Tokay, S., Craighero, L., Canto, R., Aloimonos, Y., Fadiga, L., Leadership in Orchestra Emerges from the Causal Relationships of Movement Kinematics. PLoS ONE 7(5): e35757. doi:10.1371/journal.pone.0035757, 2012.

[14] Castellini C, Badino L, Metta G, Sandini G, Tavella M, Grimaldi M, Fadiga L. The Use of Phonetic Motor Invariants Can Improve Automatic Phoneme Discrimination. PLoS ONE 6(9): e24055. doi:10.1371/journal.pone.0024055, 2011.

[13] D'Ausilio, A., Badino, L., Yi, L., Tokay, S., Craighero, L., Canto, R., Aloimonos, Y., Fadiga, L., "Communication in orchestra playing as measured with Granger Causality". Intetain (Intelligent Technologies for Interactive Entertainment) 2011, Genova, Italy, 2011.

[12] Leonardo Badino. Identifying prosodic prominence patterns for English text-to-speech synthesis. PhD Thesis, University of Edinburgh, Edinburgh, 2010.

[11] J. Sebastian Andersson, Joao P. Cabral, Leonardo Badino, Junichi Yamagishi, Robert A.J. Clark. Glottal Source and Prosodic Prominence Modelling in HMM-based Speech Synthesis for the Blizzard Challenge 2009. In Proc. Blizzard Challenge Workshop 2009, Edinburgh, UK, 2009.

[10] Leonardo Badino, J. Sebastian Andersson, Junichi Yamagishi, Robert A.J. Clark. Identification of Contrast and Its Emphatic Realization in HMM based Speech Synthesis. In Proc. of Interspeech, Brighton, UK, 2009.

[9] Leonardo Badino, Robert A.J. Clark. Automatic labeling of contrastive word pairs from spontaneous spoken English. In 2008 IEEE/ACL Workshop on Spoken Language Technology, Goa, India, 2008.

[8] Leonardo Badino, Robert A.J. Clark and Volker Strom. Including Pitch Accent Optionality in Unit Selection Text-to-Speech Synthesis. In Proc. of Interspeech, Brisbane, Australia, 2008.

[7] J. Sebastian Andersson, Leonardo Badino, Oliver S. Watts and Matthew P. Aylett. The CSTR/Cereproc Blizzard Entry 2008: The Inconvenient Data. In Proc. Blizzard Challenge Workshop (in Proc. Interspeech 2008), Brisbane, Australia, 2008.

[6] Matthew P. Aylett, J. Sebastian Andersson, Leonardo Badino, and Christopher J. Pidcock. The Cerevoice Blizzard Entry 2007: Are small database errors worse than compression artifacts? In Proc. Blizzard Challenge Workshop (in Proc. SSW6), Bonn, Germany, 2007.

[5] Leonardo Badino and Robert A.J. Clark. Issues of optionality in pitch accent placement. In Proc. 6th ISCA Speech Synthesis Workshop, Bonn, Germany, 2007.

[4] Leonardo Badino. Chinese text word segmentation considering semantic links among sentences. In Proc. ICSLP 2004, Jeju, Korea, 2004.

[3] Leonardo Badino, Claudia Barolo, and Silvia Quazza. Language independent phoneme mapping for foreign TTS. In Proc. 5th ISCA Speech Synthesis Workshop, Pittsburgh, USA, 2004.

[2] Leonardo Badino, Claudia Barolo, and Silvia Quazza. A general approach to TTS reading of mixed-language texts. In Proc. ICSLP 2004, Jeju, Korea, 2004.

[1] Enrico Zovato, Stefano Sandri, Silvia Quazza, and Leonardo Badino. Prosodic analysis of a multi-style corpus in the perspective of emotional speech synthesis. In Proc. ICSLP 2004, Jeju, Korea, 2004.


