Local shape context based real-time endpoint body part detection and identification from depth images

Zhenning Li, Dana Kulić

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

1 Citation (Scopus)


For many human-robot interaction applications, accurate localization of the human, and in particular of endpoints such as the head, hands and feet, is crucial. In this paper, we propose a new Local Shape Context Descriptor designed specifically to describe the shape features of the endpoint body parts. The descriptor is computed from edge images obtained from depth data generated by a time-of-flight sensor. The proposed descriptor encodes the distance from a reference point to the nearest edges in uniformly sampled radial directions. Based on this descriptor, a new type of interest point is defined, and a hierarchical algorithm for searching for good interest points is developed. The interest points are then classified as head, feet, hands or others based on learned models. The system is computationally efficient and capable of handling large variations in translation, rotation, scaling and deformation of the body parts. The system is tested using videos containing a variety of motions from a publicly available dataset, and is shown to be capable of detecting and identifying endpoint body parts accurately at very high speed.
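
The descriptor idea in the abstract (distance from a reference point to the nearest edge along uniformly sampled radial directions) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `radial_edge_distances`, the parameters `n_dirs` and `max_radius`, the unit-step ray marching, and the use of a binary NumPy edge image are all assumptions made for illustration only.

```python
import numpy as np


def radial_edge_distances(edges, ref, n_dirs=16, max_radius=50):
    """Hypothetical sketch: for each of n_dirs uniformly spaced directions,
    march outward from ref = (row, col) one pixel step at a time and record
    the step count at which the first edge pixel is hit. Directions that
    leave the image or find no edge keep the value max_radius."""
    h, w = edges.shape
    r0, c0 = ref
    angles = 2.0 * np.pi * np.arange(n_dirs) / n_dirs
    dists = np.full(n_dirs, float(max_radius))
    for i, theta in enumerate(angles):
        dr, dc = np.sin(theta), np.cos(theta)
        for step in range(1, max_radius + 1):
            r = int(round(r0 + step * dr))
            c = int(round(c0 + step * dc))
            if not (0 <= r < h and 0 <= c < w):
                break  # ray left the image without meeting an edge
            if edges[r, c]:
                dists[i] = step
                break
    return dists


# Toy edge image (assumed): a square outline 5 pixels from the centre.
edges = np.zeros((21, 21), dtype=bool)
edges[5, 5:16] = edges[15, 5:16] = True
edges[5:16, 5] = edges[5:16, 15] = True
descriptor = radial_edge_distances(edges, (10, 10), n_dirs=4, max_radius=10)
```

In this toy setting the four axis-aligned rays each meet the square outline after the same number of steps. How the paper normalizes the vector, selects interest points hierarchically, or classifies them is not specified in the abstract and is not modelled here.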

Original language: English
Title of host publication: Proceedings - 2011 Canadian Conference on Computer and Robot Vision, CRV 2011
Number of pages: 8
Publication status: Published - 23 Aug 2011
Externally published: Yes
Event: Canadian Conference on Computer and Robot Vision 2011 - St. John's, Canada
Duration: 25 May 2011 to 27 May 2011
Conference number: 8th
https://ieeexplore.ieee.org/xpl/conhome/5955405/proceeding (Proceedings)

Publication series

Name: Proceedings - 2011 Canadian Conference on Computer and Robot Vision, CRV 2011


Conference: Canadian Conference on Computer and Robot Vision 2011
Abbreviated title: CRV 2011
City: St. John's


  • depth image
  • endpoint body part
  • gesture recognition
  • human motion capture
  • local shape context
