Appearance-based passenger counting in cluttered scenes with lateral movement compensation

Ricky Sutopo, Joanne Mun Yee Lim, Vishnu Monn Baskaran, Kok Sheik Wong, Massimo Tistarelli, Heng Fui Liau

Research output: Contribution to journal › Article › Research › peer-review


Autonomous passenger counting in public transportation represents an integral part of an intelligent transportation system, as it provides vital information to improve the efficiency and resource management of a public transportation network. However, counting passengers in highly crowded scenes is a challenging task due to their random movement, diverse appearance settings and inter-object occlusions. Furthermore, state-of-the-art methods in this domain rely heavily on additional custom cameras or sensors instead of existing onboard surveillance cameras, which consequently limits the feasibility of such systems for large-scale deployment. Hence, this paper puts forward an enhanced appearance descriptor with lateral movement compensation, which addresses the difficulty in counting passengers bidirectionally in cluttered scenes. We first construct a head re-identification dataset, which is used to train an appearance descriptor. This dataset addresses the absence of a person re-identification dataset, which in turn allows for accurate tracking of passengers in cluttered scenes. Then, a novel technique of applying a fedora counting line is introduced to count the number of passengers entering and exiting a bus. This technique compensates for the impact of passengers’ lateral movement, which crucially increases the accuracy of bidirectional passenger counting using onboard bus surveillance cameras. In addition, a real-time implementation of the proposed method, which includes the integration of DeepStream and the fedora counting line, is also presented. Experimental results on a challenging test dataset demonstrate that the proposed method outperforms benchmarked techniques with an average counting accuracy of 93.21% for entering and 96.10% for exiting public buses. Furthermore, the proposed system achieves this accuracy at an average frame rate of 16 frames per second, which makes it a practical solution for real-time applications.
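The abstract does not give the geometry of the fedora counting line or the counting rule, so the following is only a generic sketch of bidirectional line-crossing counting over tracked head centroids, not the authors' implementation. Everything in it is an assumption made for illustration: the polyline coordinates, the function names, the convention that a positive crossing means "entering", and the rule of counting a track by its net signed crossings. The sketch shows one plausible reason a shaped line can tolerate lateral movement: a track that walks sideways through the raised middle section crosses the line once in each direction, and the two crossings cancel.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates, y grows downward


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the 2D cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def crossing_direction(p0: Point, p1: Point, a: Point, b: Point) -> int:
    """Return +1 or -1 if the step p0 -> p1 properly crosses segment a -> b, else 0.

    The sign is the side of a -> b on which the step ends. Touching an
    endpoint or lying on the line is treated as no crossing.
    """
    d0, d1 = _cross(a, b, p0), _cross(a, b, p1)  # track points vs. line segment
    s0, s1 = _cross(p0, p1, a), _cross(p0, p1, b)  # line endpoints vs. track step
    if d0 * d1 < 0 and s0 * s1 < 0:
        return 1 if d1 > 0 else -1
    return 0


def net_crossings(track: List[Point], line: List[Point]) -> int:
    """Sum of signed crossings of one track over a polyline counting line."""
    net = 0
    for p0, p1 in zip(track, track[1:]):
        for a, b in zip(line, line[1:]):
            net += crossing_direction(p0, p1, a, b)
    return net


def count_passengers(tracks: List[List[Point]], line: List[Point]) -> Tuple[int, int]:
    """Return (entering, exiting); assumes positive net crossing means entering."""
    entering = exiting = 0
    for track in tracks:
        net = net_crossings(track, line)
        if net > 0:
            entering += 1
        elif net < 0:
            exiting += 1
    return entering, exiting


# Hypothetical polyline with a raised middle section (illustrative coordinates only).
line = [(0, 100), (60, 100), (80, 60), (120, 60), (140, 100), (200, 100)]

tracks = [
    [(100, 20), (100, 80), (100, 140)],  # moves down through the line
    [(100, 140), (100, 80), (100, 20)],  # moves up through the line
    [(20, 80), (100, 80), (180, 80)],    # lateral pass: two crossings that cancel
]

entering, exiting = count_passengers(tracks, line)
print(entering, exiting)
```

In a real pipeline the tracks would come from a detector and tracker (the paper mentions a DeepStream integration), and the line shape and direction convention would have to be calibrated to the camera view.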

Original language: English
Pages (from-to): 9891-9912
Number of pages: 22
Journal: Neural Computing and Applications
Issue number: 16
Publication status: Published - Aug 2021


  • Cluttered scenes
  • Deep learning
  • Intelligent transportation system
  • People counting
  • Person re-identification
