TY - JOUR
T1 - ASF-YOLO: A novel YOLO model with attentional scale sequence fusion for cell instance segmentation
AU - Kang, Ming
AU - Ting, Chee-Ming
AU - Ting, Fung Fung
AU - Phan, Raphaël C.W.
N1 - Funding Information:
This work was supported by the Monash University Malaysia and the Ministry of Higher Education, Malaysia under Fundamental Research Grant Scheme FRGS/1/2023/ICT02/MUSM/02/1.
Publisher Copyright:
© 2024 The Authors
PY - 2024/7
Y1 - 2024/7
N2 - We propose a novel Attentional Scale Sequence Fusion based You Only Look Once (YOLO) framework (ASF-YOLO), which combines spatial and scale features for accurate and fast cell instance segmentation. Built on the YOLO segmentation framework, we employ the Scale Sequence Feature Fusion (SSFF) module to enhance the network's multiscale information extraction capability, and the Triple Feature Encoder (TFE) module to fuse feature maps of different scales to increase detailed information. We further introduce a Channel and Position Attention Mechanism (CPAM) to integrate both the SSFF and TFE modules, which focuses on informative channels and spatial position-related small objects for improved detection and segmentation performance. Experimental validation on two cell datasets shows the remarkable segmentation accuracy and speed of the proposed ASF-YOLO model. It achieves a box mAP of 0.91, a mask mAP of 0.887, and an inference speed of 47.3 FPS on the 2018 Data Science Bowl dataset, outperforming state-of-the-art methods. The source code is available at https://github.com/mkang315/ASF-YOLO.
AB - We propose a novel Attentional Scale Sequence Fusion based You Only Look Once (YOLO) framework (ASF-YOLO), which combines spatial and scale features for accurate and fast cell instance segmentation. Built on the YOLO segmentation framework, we employ the Scale Sequence Feature Fusion (SSFF) module to enhance the network's multiscale information extraction capability, and the Triple Feature Encoder (TFE) module to fuse feature maps of different scales to increase detailed information. We further introduce a Channel and Position Attention Mechanism (CPAM) to integrate both the SSFF and TFE modules, which focuses on informative channels and spatial position-related small objects for improved detection and segmentation performance. Experimental validation on two cell datasets shows the remarkable segmentation accuracy and speed of the proposed ASF-YOLO model. It achieves a box mAP of 0.91, a mask mAP of 0.887, and an inference speed of 47.3 FPS on the 2018 Data Science Bowl dataset, outperforming state-of-the-art methods. The source code is available at https://github.com/mkang315/ASF-YOLO.
KW - Attention mechanism
KW - Medical image analysis
KW - Sequence feature fusion
KW - Small object segmentation
KW - You only look once (YOLO)
UR - http://www.scopus.com/inward/record.url?scp=85192340521&partnerID=8YFLogxK
U2 - 10.1016/j.imavis.2024.105057
DO - 10.1016/j.imavis.2024.105057
M3 - Article
AN - SCOPUS:85192340521
SN - 0262-8856
VL - 147
JO - Image and Vision Computing
JF - Image and Vision Computing
M1 - 105057
ER -