PhD Researcher,
Universal Perception (UP) lab,
People-Centred AI, University of Surrey, UK
2022.09 ~ Present
Vision and Language Processing.
I am a PhD student at the Surrey Institute for People-Centred AI, co-supervised by Dr. Xiatian (Eddy) Zhu and Dr. Diptesh Kanojia as part of the Universal Perception (UP) lab, University of Surrey, UK.
Prior to this, I was a Speech and NLP researcher at TCS-Research, Mumbai, under Dr. Sunil Kumar Kopparapu and Dr. Rupayan Chakraborty.
My primary research interest is multimodal deep learning, with a focus on vision and language.
Selected Publications
(A complete list of my publications is available here.)
AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis
NeurIPS (2024)
DiffSED: Sound Event Detection with Denoising Diffusion
AAAI [Oral] (2024)
A Novel Metric For Evaluating Audio Caption Similarity
IEEE ICASSP (2023)
Calibration Free Meta learning based approach for Subject Independent EEG Emotion Recognition
Biomedical Signal Processing and Control 72 (2022)
Contrastive Learning of Cough Descriptors for Automatic COVID-19 Preliminary Diagnosis
Interspeech (2021)
Deep Lung Auscultation using Acoustic biomarkers for Abnormal Respiratory Sound Event Detection
IEEE ICASSP (2021)
Deep Encoded Linguistic and Acoustic cues for Attention based End-to-End Speech Emotion Recognition
IEEE ICASSP (2020)
Automatic Speaker Independent Dysarthric Speech Intelligibility Assessment System
Computer Speech and Language 69 (2021)
End-to-End Spoken Language Understanding: Bootstrapping in Low Resource Scenarios
Interspeech (2019)
Research Experience
Researcher
Speech and NLP team, TCS-Research,
Mumbai, India.
2019.08 ~ 2022.09
Audio and Speech Signal Processing, Few-shot Audio Event Detection, Audio Captioning.
Research Intern
Speech and NLP team, TCS-Research,
Mumbai, India.
2019.01 ~ 2019.06
End-to-End Spoken Language Understanding.