Resume

I currently work on optimizing deep learning kernels using LLM agents at AMD. I am passionate about all aspects of speech and language processing.

Skills

Tools: PyTorch/TensorFlow, k2/icefall, sherpa-onnx, Kaldi
Deployment and automation: Docker, Kubernetes, Apache Kafka, WebSockets, FastAPI, Git, Jira
Programming languages: Python, C++, Bash
Leadership strengths: Innovation, idea prioritization & execution, teamwork, collaboration

Experience

AMD • Senior Member of Technical Staff [2025 - Present]

Amazon • Applied Scientist II [2025]
Focused on LLM-based speech-to-speech models.

Uniphore • Staff AI Scientist [2021 - 2025]
Technical and team leadership, focusing on end-to-end speech recognition, personalization, and related areas for conversational AI.

Google • Research Intern [2020]
Worked on a research project with the audio processing team.

Idiap Research Institute • Research Assistant [2017 - 2021]
Developed a PhD thesis on automatic speech assessment and recognition.

Interactive Intelligence • Senior Speech Engineer [2015 - 2017]
Built production-grade ASR models in multiple languages, improving acoustic modelling and reducing systematic ASR errors.

Samsung R&D Institute India • Lead Engineer [2013 - 2015]
Developed robust feature extraction techniques, implemented data selection methods, and improved acoustic models across multiple languages.

Education

Docteur ès Sciences (PhD) [2017 - 2021]
École polytechnique fédérale de Lausanne (EPFL), Electrical Engg.

Master of Science (MS) by Research [2010 - 2013]
Indian Institute of Technology Madras, Electrical Engg.

Bachelor of Engineering (BE) [2006 - 2010]
Andhra University, Electronics and Communication Engg.

Certifications

Business Concept, startup training by Innosuisse, 2020.
Manager Development Program from Uniphore, 2023.

Professional Services and Outreach

Invited speaker at the IEEE Workshop on Bridging Languages with Generative Models: Advances in Speech Translation, IIT Indore, December 2025.
Reviewer at ICASSP, Interspeech, ASRU, SLT, WASPAA and ICMI.
Chaired a session on Speech Signal Analysis at ISCA Interspeech 2023.
Participated in IndiaAI roundtable on Generative AI, April 2023.
Interviewed by IndiaAI, May 2022, and The Interview Portal, Feb. 2023.

Select publications

2024

Shashi Kumar, Srikanth Madikeri, Iuliia Nigmatulina, Esaú Villatoro-Tello, Petr Motlicek, Karthik Pandia D S, S. Pavankumar Dubagunta, and Aravind Ganapathiraju, “Multitask Speech Recognition and Speaker Change Detection for Unknown Number of Speakers,” in Proceedings of ICASSP. [DOI]

2023

Lokesh Bansal, S. Pavankumar Dubagunta, Malolan Chetlur, Pushpak Jagtap, and Aravind Ganapathiraju, “On the Efficacy and Noise-Robustness of Jointly Learned Speech Emotion and Automatic Speech Recognition,” in Proceedings of Interspeech. [PDF]

Tilak Purohit, Sarthak Yadav, Bogdan Vlasenko, S. Pavankumar Dubagunta, and Mathew Magimai Doss, “Towards Learning Emotion Information from Short Segments of Speech,” in Proceedings of ICASSP. [PDF]

2022

S. Pavankumar Dubagunta, Edoardo Moneta, Eleni Theocharopoulos, and Mathew Magimai Doss, “Towards Automatic Prediction of Non-Expert Perceived Speech Fluency Ratings,” in Companion proceedings of the International Conference on Multimodal Interaction (ICMI). [DOI]

S. Pavankumar Dubagunta, Rob J. J. H. van Son, and Mathew Magimai Doss, “Adjustable deterministic pseudonymization of speech,” in Computer Speech and Language. [DOI]

2021

S. Pavankumar Dubagunta, “Novel Methods for Incorporating Prior Knowledge for Automatic Speech Assessment,” PhD thesis, École polytechnique fédérale de Lausanne (EPFL). [PDF] [Talk]

(Full list at https://scholar.google.com/citations?user=--k6n58AAAAJ)