DIF: Dataset of perceived Intoxicated Faces for drunk person identification

Vineet Mehta, Sai Srinadhu Katta, Devendra Pratap Yadav, Abhinav Dhall

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review


Traffic accidents cause over a million deaths every year, a large fraction of which is attributed to drunk driving. An automated in-vehicle system for detecting intoxicated drivers would help reduce accidents and the related financial costs. Existing solutions require special equipment such as electrocardiographs, infrared cameras, or breathalyzers. In this work, we propose a new dataset called DIF (Dataset of perceived Intoxicated Faces), which contains audio-visual data of intoxicated and sober people obtained from online sources. To the best of our knowledge, this is the first work on automatic bimodal non-invasive intoxication detection. Convolutional Neural Networks (CNNs) and Deep Neural Networks (DNNs) are trained to compute the video and audio baselines, respectively. A 3D CNN is used to exploit the spatio-temporal changes in the video. A simple variation of the traditional 3D convolution block is proposed, based on inducing non-linearity between the spatial and temporal channels. Extensive experiments are performed to validate the approach and the baselines.
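The abstract does not spell out how the modified 3D convolution block is built. One plausible reading, offered here only as an illustration, is a factorized block: a per-frame spatial convolution, then a non-linearity, then a convolution along the time axis. The NumPy sketch below follows that reading. The function names, the single-channel input, the choice of ReLU, and the kernel sizes are all assumptions and are not taken from the paper.

```python
import numpy as np


def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)


def spatial_conv(clip, kernel):
    """Valid 2D cross-correlation applied to every frame independently.

    clip:   array of shape (T, H, W), a single-channel video clip
    kernel: array of shape (kh, kw)
    returns an array of shape (T, H - kh + 1, W - kw + 1)
    """
    kh, kw = kernel.shape
    t, h, w = clip.shape
    out = np.zeros((t, h - kh + 1, w - kw + 1))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * clip[:, i:i + out.shape[1], j:j + out.shape[2]]
    return out


def temporal_conv(clip, kernel):
    """Valid 1D cross-correlation along the time axis only.

    clip:   array of shape (T, H, W)
    kernel: array of shape (kt,)
    returns an array of shape (T - kt + 1, H, W)
    """
    kt = kernel.shape[0]
    t, h, w = clip.shape
    out = np.zeros((t - kt + 1, h, w))
    for k in range(kt):
        out += kernel[k] * clip[k:k + out.shape[0]]
    return out


def factorized_block(clip, spatial_kernel, temporal_kernel):
    """Hypothetical block: spatial conv -> ReLU -> temporal conv.

    The ReLU between the two stages is the assumed "non-linearity between
    the spatial and temporal channels"; the paper's block may differ.
    """
    hidden = relu(spatial_conv(clip, spatial_kernel))
    return temporal_conv(hidden, temporal_kernel)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.standard_normal((8, 16, 16))  # 8 frames of 16x16 pixels
    out = factorized_block(clip, rng.standard_normal((3, 3)), rng.standard_normal(3))
    print(out.shape)
```

Without the intermediate non-linearity, the two stages would collapse into a single linear 3D filter whose kernel is the outer product of the temporal and spatial kernels. The activation between them is what makes the factorized form more expressive than that rank-one filter.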
Original language: English
Title of host publication: Proceedings of the 2019 International Conference on Multimodal Interaction
Editors: Susan R. Fussell, Björn Schuller, Yale Song, Kai Yu
Place of Publication: New York NY USA
Publisher: Association for Computing Machinery (ACM)
Number of pages: 8
ISBN (Electronic): 9781450368605
Publication status: Published - 2019
Externally published: Yes
Event: International Conference on Multimodal Interaction 2019 - Suzhou, China
Duration: 14 Oct 2019 to 18 Oct 2019
Conference number: 21st


Conference: International Conference on Multimodal Interaction 2019
Abbreviated title: ICMI 2019


  • Affect recognition
  • Intoxication Detection
  • Convolutional Neural Network

Cite this

Mehta, V., Srinadhu Katta, S., Pratap Yadav, D., & Dhall, A. (2019). DIF: Dataset of perceived Intoxicated Faces for drunk person identification. In S. R. Fussell, B. Schuller, Y. Song, & K. Yu (Eds.), Proceedings of the 2019 International Conference on Multimodal Interaction (pp. 367-374). Association for Computing Machinery (ACM). https://doi.org/10.1145/3340555.3353754