Event News

Talk by Prof. Rajiv R. Shah from IIIT-Delhi, India:
"Lipreading Before, Between and Beyond: An AI-based Approach"

We are pleased to announce that Prof. Rajiv R. Shah from IIIT-Delhi, India, will give a talk on AI-based lipreading. Everyone is welcome to attend.


Lipreading Before, Between and Beyond: An AI-based Approach


Prof. Rajiv R. Shah
IIIT-Delhi India


13:00-14:00 / December 20, Friday


19F meeting room #1903, NII

Speechreading, or lipreading, is the technique of understanding and extracting phonetic features from a speaker's visual cues, such as the movement of the lips, face, teeth, and tongue. It has a wide range of multimedia applications, including surveillance, Internet telephony, and aids for people with hearing impairments. Our lab (MIDAS: Multimodal Digital Media Analysis Lab) has worked on several aspects of lipreading, in both application and theory. We started by viewing it as the conversion of silent speech videos to speech and text, and we are now working on using the same approach to help people suffering from pathologies such as aphasia and dysarthria. Speech research, like language research, is severely limited by the datasets available to the research community. The problem is aggravated for low-resource languages such as Hindi and Tamil. Even for high-resource languages such as English, the datasets are restricted in vocabulary size. For example, the largest lipreading dataset contains no more than a few thousand unique English words. Thus, one of our goals has been to overcome these limitations of current systems by generalizing them to different languages and extending their vocabulary size. All of this has helped us apply lipreading to modern applications such as video streaming and live video generation.


Rajiv Ratn Shah currently works as an Assistant Professor in the Department of Computer Science and Engineering (joint appointment with the Department of Human-centered Design) at IIIT-Delhi. He received his Ph.D. in computer science from the National University of Singapore (NUS), Singapore. Before joining IIIT-Delhi, he worked as a Research Fellow in the Living Analytics Research Center (LARC) at the Singapore Management University (SMU), Singapore. Prior to completing his Ph.D., he received his M.Tech. and M.C.A. degrees in Computer Applications from the Delhi Technological University (DTU), Delhi and Jawaharlal Nehru University (JNU), Delhi, respectively. He also received his B.Sc. in Mathematics (Honors) from the Banaras Hindu University (BHU), Varanasi.

Dr. Shah is the recipient of several awards, including the prestigious Heidelberg Laureate Forum (HLF) 2018 and European Research Consortium for Informatics and Mathematics (ERCIM) 2017 fellowships. He won the best student poster award at the 33rd AAAI Conference on Artificial Intelligence 2019 in Hawaii, USA, and the best poster runner-up award at the 20th IEEE International Symposium on Multimedia (ISM) 2018 in Taichung, Taiwan. Recently, he also won the best poster and best industry paper awards at the 5th IEEE International Conference on Multimedia Big Data (BigMM) 2019. He is also the winner of the 1st ACM India Student Chapter Grand Challenge 2019. He received the best paper award in the IWGS workshop at the ACM SIGSPATIAL conference 2016, San Francisco, USA, and was runner-up in the Grand Challenge competition of the ACM International Conference on Multimedia 2015, Brisbane, Australia.

He is involved in organizing and reviewing for many top-tier international conferences and journals. He is TPC co-chair for IEEE BigMM 2019 and 2020. He has also organized the Multimodal Representation, Retrieval, and Analysis of Multimedia Content (MR2AMC) workshop in conjunction with the first IEEE MIPR 2018 and the 20th IEEE ISM conferences. His research interests include multimedia content processing, natural language processing, image processing, multimodal computing, data science, social media computing, and the internet of things. Specifically, his current research interests include:

  • Multimodal deep learning based healthcare solutions
  • Multimodal fake news detection using deep learning techniques
  • Multimodal deep learning based solutions for education
  • Event detection and recommendation on social media
  • Multimodal multimedia search, retrieval, and recommendation
  • Deep learning based multimedia systems