Automatic speech-gesture mapping and engagement evaluation in human robot interaction

Bishal Ghosh, Abhinav Dhall, Ekta Singla

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research

2 Citations (Scopus)


In this paper, we present an end-to-end system for enhancing the effectiveness of non-verbal gestures in human robot interaction. We identify prominently used gestures in performances by TED talk speakers and map them to their corresponding speech context and modulated speech based upon the attention of the listener. Gestures are localised with a convolutional neural network based approach. Dominant gestures of TED speakers are used for learning the gesture-to-speech mapping. We evaluated the engagement of the robot with people by conducting a social survey. The effectiveness of the performance was monitored by the robot and it self-improvised its speech pattern on the basis of the attention level of the audience, which was calculated using visual feedback from the camera. The effectiveness of interaction as well as the decisions made during improvisation was further evaluated based on the head-pose detection and an interaction survey.

Original language: English
Title of host publication: 2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)
Editors: John-John Cabibihan, Mary Anne Williams, T. Asokan, Laxmidhar Behera
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Number of pages: 7
ISBN (Electronic): 9781728126227
ISBN (Print): 9781728126234
Publication status: Published - 2019
Event: IEEE/RSJ International Symposium on Robot and Human Interactive Communication 2019 - New Delhi, India
Duration: 14 Oct 2019 to 18 Oct 2019
Conference number: 28th (Proceedings)


Conference: IEEE/RSJ International Symposium on Robot and Human Interactive Communication 2019
Abbreviated title: RO-MAN 2019
City: New Delhi