Dr Anton Ragni

BEng, MEng, PhD

School of Computer Science

Senior Lecturer in Speech and Language Technologies

Seminar Organiser

Member of the Speech and Hearing (SpandH) research group

a.ragni@sheffield.ac.uk
+44 114 222 1925

Full contact details

Dr Anton Ragni
School of Computer Science
Regent Court (DCS)
211 Portobello
Sheffield
S1 4DP

Profile

Dr Anton Ragni is a Senior Lecturer in Speech and Language Processing in the School of Computer Science at the University of Sheffield.

He graduated with BEng and MEng degrees in Information Technology from the University of Tartu, Estonia, in 2005 and 2007 respectively. He was awarded his PhD by the University of Cambridge in 2013.

From 2005 to 2008, he underwent graduate training at the Nordic Graduate School of Language Technology, and from 2007 to 2008 he was an intern in the Speech Technology Group at Toshiba Research Europe Ltd, UK. He was then a Research Associate (2013 to 2018) and a Senior Research Associate (2018 to 2019) in Speech Processing at the University of Cambridge.

His current research focuses on machine learning approaches to speech and language processing.

Research interests

Dr Anton Ragni's research interests include:

  • Core automatic speech recognition
  • Efficient and expressive speech synthesis
  • Spoken language translation
  • Information retrieval
  • Conversation modelling

Publications

Books

  • Young S, Evermann G, Gales M, Hain T, Kershaw D, Liu X, Moore G, Odell J, Ollason D, Povey D, Ragni A et al () The HTK Book (for HTK Version 3.5, documentation alpha version). Cambridge University Engineering Department.

Journal articles

  • Sun W, Tu Z & Ragni A (2024) Energy-Based Models for Speech Synthesis. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), abs/2310.12765, 12667-12671.
  • Flynn R & Ragni A (2024) Self-Train Before You Transcribe. CoRR, abs/2406.12937.
  • Ma Y, Øland A, Ragni A, Sette BMD, Saitis C, Donahue C, Lin C, Plachouras C, Benetos E, Quinton E, Shatri E et al (2024) Foundation Models for Music: A Survey. CoRR, abs/2408.14340.
  • Cross M & Ragni A (2024) What happens to diffusion model likelihood when your model is conditional?. 1st ECAI Workshop on Machine Learning Meets Differential Equations: From Theory to Applications, 255.
  • Ma Y, Yuan R, Li Y, Zhang G, Chen X, Yin H, Lin C, Benetos E, Ragni A, Gyenge N, Liu R et al (2023) On the Effectiveness of Speech Self-supervised Learning for Music. CoRR, abs/2307.05161.
  • Ragni A, Gales MJF, Rose O, Knill KM, Kastanos A, Li Q & Ness PM (2022) Increasing Context for Estimating Confidence Scores in Automatic Speech Recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 30, 1319-1329.
  • Li Y, Zhang G, Yang B, Lin C, Wang S, Ragni A & Fu J (2022) HERB: Measuring Hierarchical Regional Bias in Pre-trained Language Models. CoRR, abs/2211.02882.
  • Chen X, Liu X, Wang Y, Ragni A, Wong JHM & Gales MJF (2019) Exploiting future word contexts in neural network language models for speech recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27(9), 1444-1454.
  • Wu C, Gales MJF, Ragni A, Karanasou P & Sim KC (2018) Improving interpretability and regularization in deep learning. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 26(2), 256-265.
  • Ragni A, Li Q, Gales MJF & Wang Y (2018) Confidence Estimation and Deletion Prediction Using Bidirectional Recurrent Neural Networks. CoRR, abs/1810.13025.
  • Zhang S-X, Ragni A & Gales MJF (2010) Structured Log Linear Models for Noise Robust Speech Recognition. IEEE Signal Processing Letters, 17(11), 945-948.
  • Jacobsen SA & Ragni A () Continuous representations of intents for dialogue systems.
  • Wang Z & Ragni A () Approximate Fixed-Points in Recurrent Neural Networks.

Grants

Research Grants

  • Exemplar-based Expressive Speech Synthesis, EPSRC, 06/2021 - 11/2023, £218,290, as PI
  • Automatic voice conversion for transforming professional adult voice actors to artificial child voice actors, Innovate UK, 01/2021 - 01/2023, £173,605, as Co-PI

Professional activities and memberships

He is a member of IEEE and ISCA, and a regular reviewer for major speech and machine learning journals and conferences.

Since 2016, he has been an Officer of the ISCA Special Interest Group on Machine Learning in Speech and Language Processing. In 2011, he received the Best Student Paper Award at the IEEE Workshop on Automatic Speech Recognition and Understanding for the paper “Generative kernels for noise robust ASR”, co-authored with M. J. F. Gales.