Abstract
In this work, we present LieNet, a novel deep learning framework that simultaneously detects and segments multiple object instances and estimates their 6D poses from a single RGB image, without requiring additional post-processing. Our system is accurate and fast (∼10 fps), making it well suited for real-time applications. In particular, LieNet detects and segments object instances in the image in a manner analogous to modern instance segmentation networks such as Mask R-CNN, but contains a novel additional sub-network for 6D pose estimation. LieNet estimates the rotation matrix of an object by regressing a Lie algebra-based rotation representation, and estimates the translation vector by predicting the distance of the object to the camera center. Experiments on two standard pose benchmark datasets show that LieNet greatly outperforms other recent CNN-based pose prediction methods when these are applied to monocular images without post-refinement.
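To make the pose parameterisation concrete, the following is a minimal sketch, not the authors' implementation, of how a regressed Lie algebra vector could be turned into a rotation matrix and how a predicted distance could be turned into a translation vector. The function names (`so3_exp`, `backproject_translation`) are hypothetical. The sketch assumes that the rotation is regressed as a 3-vector in so(3) (axis times angle), that the predicted distance is the depth `tz` along the optical axis, and that the remaining translation components are recovered from a pixel location `(u, v)` of the object centre with known pinhole intrinsics `fx, fy, cx, cy`. The abstract does not state these details.

```python
import math


def skew(w):
    """Return the 3x3 skew-symmetric matrix [w]_x of a 3-vector w."""
    x, y, z = w
    return [[0.0, -z, y],
            [z, 0.0, -x],
            [-y, x, 0.0]]


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def so3_exp(w):
    """Exponential map so(3) -> SO(3) via the Rodrigues formula.

    w is an axis-angle 3-vector: its direction is the rotation axis and
    its norm is the rotation angle in radians.
    """
    theta = math.sqrt(sum(c * c for c in w))
    K = skew(w)
    K2 = matmul3(K, K)
    if theta < 1e-8:
        # Small-angle limits of sin(t)/t and (1 - cos(t))/t^2.
        a, b = 1.0, 0.5
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / (theta * theta)
    # R = I + a*K + b*K^2
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]


def backproject_translation(u, v, tz, fx, fy, cx, cy):
    """Recover (tx, ty, tz) from a pixel (u, v) and depth tz.

    Assumption (not from the abstract): (u, v) is the projection of the
    object centre under a pinhole camera with intrinsics fx, fy, cx, cy.
    """
    tx = (u - cx) * tz / fx
    ty = (v - cy) * tz / fy
    return (tx, ty, tz)


# Example: a 90-degree rotation about the z-axis and an object 2 units away.
R = so3_exp([0.0, 0.0, math.pi / 2])
t = backproject_translation(420.0, 240.0, 2.0, 500.0, 500.0, 320.0, 240.0)
```

A three-parameter so(3) vector is an unconstrained regression target, unlike a rotation matrix (which must stay orthonormal) or a unit quaternion (which must stay normalised); this is one plausible motivation for such a representation.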
Original language | English |
---|---|
Title of host publication | 29th British Machine Vision Conference, BMVC 2018 |
Editors | Hubert P. H. Shum, Timothy Hospedales |
Place of Publication | London, UK |
Publisher | British Machine Vision Association |
Number of pages | 12 |
Publication status | Published - 2018 |
Externally published | Yes |
Event | British Machine Vision Conference 2018 - Newcastle, United Kingdom. Duration: 3 Sept 2018 → 6 Sept 2018. Conference number: 29th. http://bmvc2018.org/, https://dblp.org/db/conf/bmvc/bmvc2018.html |
Conference
Conference | British Machine Vision Conference 2018 |
---|---|
Abbreviated title | BMVC 2018 |
Country/Territory | United Kingdom |
City | Newcastle |
Period | 3/09/18 → 6/09/18 |
Internet address |