Abstract
Transformer-based approaches have exhibited outstanding performance in the field of human-object interaction (HOI) detection. However, these approaches rely on underlying object detectors that have undergone large-scale pre-training on the ImageNet and MS-COCO datasets. This limits the potential of unique architectural designs and induces a learning bias, causing ineffective HOI representation learning. In this paper, we propose ScratchHOI, a transformer-based method for HOI detection that can be trained from scratch, eliminating the need for pre-trained object detectors. ScratchHOI employs dynamic and static affinity-based feature aggregation to process local and long-range visual information. It also employs additional techniques to improve detection performance, such as dynamic and interactive anchor refinement for objects and interactions. Experiments on the HICO-Det dataset show that ScratchHOI achieves competitive performance against other state-of-the-art approaches across a variety of evaluation measures.
Original language | English |
---|---|
Title of host publication | 2023 IEEE International Conference on Image Processing, Proceedings |
Editors | Chong-Wah Ngo, John See |
Place of Publication | Piscataway NJ USA |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 1690-1694 |
Number of pages | 5 |
ISBN (Electronic) | 9781728198354 |
ISBN (Print) | 9781728198361 |
Publication status | Published - 2023 |
Event | IEEE International Conference on Image Processing 2023 - Kuala Lumpur, Malaysia; Duration: 8 Oct 2023 → 11 Oct 2023; Conference number: 30th; Proceedings: https://ieeexplore.ieee.org/xpl/conhome/10221937/proceeding; Website: https://2023.ieeeicip.org |
Conference
Conference | IEEE International Conference on Image Processing 2023 |
---|---|
Abbreviated title | ICIP 2023 |
Country/Territory | Malaysia |
City | Kuala Lumpur |
Period | 8 Oct 2023 → 11 Oct 2023 |