基于交叉注意力网络的小样本地铁轨道-车轮图像分割算法
曹建新1，张月莹2，姜伟昊3，高鋆豪2
Few-shot Metro Track-wheel Image Segmentation Algorithm Based on Cross-attention Network
CAO Jianxin1, ZHANG Yueying2, JIANG Weihao3, GAO Yunhao2
作者信息:1.杭州杭港地铁有限公司, 310018, 杭州
2.浙江省机电产品质量检测所有限公司, 310018, 杭州
3.杭州东上智能科技有限公司, 310018, 杭州
Affiliation:1.Hangzhou Hanggang Metro Co., Ltd., 310018, Hangzhou, China
2.Zhejiang Testing & Inspection Institute for Mechanical and Electrical Products Quality Co., Ltd., 310018, Hangzhou, China
3.Hangzhou Dongshang Intelligent Technology Co., Ltd., 310018, Hangzhou, China
关键词:
Key words:
DOI:10.16037/j.1007-869x.20230947
中图分类号/CLCN:U216.3
栏目/Col:运营管理 (Operation Management)
摘要:
[目的]地铁轨道图像间的域自适应问题,使得现有算法对具有类间相似性的轨道-车轮图像分割精度不高。对此,提出基于交叉注意力网络的小样本地铁轨道-车轮图像分割算法。[方法]详细阐述了基于交叉注意力网络的小样本地铁轨道-车轮图像分割算法的计算思路及过程。首先,利用一组共享权重的主干网络将支持分支和查询分支输入的轨道-车轮图片映射到深度特征空间;然后,将双分支映射特征的低层、中间层和高层特征进行尺度融合,并利用交叉注意力网络挖掘双分支融合特征间的关联语义,捕获同类不同地铁轨道-车轮图片在深度空间中的共有语义信息;最后,利用平均池化将双分支共有特征转换为类的特定原型,并利用原型指导查询分支中未标注轨道-车轮图片的分割。在自建的地铁轨道-车轮图像数据集上进行对比试验及消融试验,以验证算法的精度及有效性。[结果及结论]经测试,所提算法的mIoU(平均交并比)达66.17%,FB-IoU(前景背景交并比)达78.21%。与当前主流的语义分割算法相比,基于交叉注意力网络的小样本地铁轨道-车轮图像分割算法的分割性能提升明显,具有较好的实际应用价值。
Abstract:
[Objective] The domain adaptation issue among metro track images results in low segmentation accuracy of existing algorithms for track-wheel images with high inter-class similarity. To address this challenge, a few-shot metro track-wheel image segmentation algorithm based on a cross-attention network is proposed. [Method] The computational roadmap and process of the few-shot metro track-wheel image segmentation algorithm based on a cross-attention network are elaborated. First, a group of backbone networks with shared weights is employed to map the input track-wheel images from both the support branch and the query branch into a deep feature space. Then, the low-, mid-, and high-level features from the dual-branch mappings are fused across scales, and a cross-attention network is used to mine the relational semantics between the fused features of the two branches, enabling the capture of semantic information shared in the deep feature space by different metro track-wheel images of the same class. Finally, average pooling is applied to convert the common features of both branches into class-specific prototypes, and the prototypes are leveraged to guide the segmentation of the unannotated track-wheel images in the query branch. Comparative and ablation experiments are conducted on a self-constructed metro track-wheel image dataset to verify the accuracy and effectiveness of the algorithm. [Result & Conclusion] Testing shows that the proposed algorithm achieves an mIoU (mean intersection over union) of 66.17% and an FB-IoU (foreground-background intersection over union) of 78.21%. Compared with current mainstream semantic segmentation algorithms, the proposed few-shot metro track-wheel image segmentation algorithm based on a cross-attention network demonstrates significantly improved segmentation performance and shows good potential for practical application.
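To make the pipeline described in the abstract concrete, the following is a minimal PyTorch sketch of its three stages: shared-weight feature extraction with multi-scale fusion, cross-attention between the support and query branches, and prototype generation by average pooling followed by prototype-guided scoring of the query features. This is an illustrative sketch, not the authors' implementation: the small convolutional backbone stand-in, feature dimensions, attention head count, the use of the support mask inside the pooling step, and the names SharedBackbone, fuse_scales, CrossAttention, masked_average_pooling and prototype_guidance are all assumptions.

```python
# Minimal sketch of the pipeline in the abstract (illustrative only, not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedBackbone(nn.Module):
    """Stand-in for the shared-weight backbone: returns low-, mid-, high-level features."""
    def __init__(self, c=64):
        super().__init__()
        self.low = nn.Sequential(nn.Conv2d(3, c, 3, 2, 1), nn.ReLU())
        self.mid = nn.Sequential(nn.Conv2d(c, c, 3, 2, 1), nn.ReLU())
        self.high = nn.Sequential(nn.Conv2d(c, c, 3, 2, 1), nn.ReLU())

    def forward(self, x):
        f_low = self.low(x)
        f_mid = self.mid(f_low)
        f_high = self.high(f_mid)
        return f_low, f_mid, f_high


def fuse_scales(f_low, f_mid, f_high):
    """Scale fusion: resize all levels to the mid-level resolution and concatenate."""
    size = f_mid.shape[-2:]
    feats = [F.interpolate(f, size=size, mode="bilinear", align_corners=False)
             for f in (f_low, f_mid, f_high)]
    return torch.cat(feats, dim=1)


class CrossAttention(nn.Module):
    """Cross-attention: query-branch tokens attend to support-branch tokens."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, query_feat, support_feat):
        b, c, h, w = query_feat.shape
        q = query_feat.flatten(2).transpose(1, 2)    # (B, HW, C)
        s = support_feat.flatten(2).transpose(1, 2)  # (B, HW, C)
        out, _ = self.attn(q, s, s)                  # mine shared semantics from the support branch
        return out.transpose(1, 2).reshape(b, c, h, w)


def masked_average_pooling(feat, mask):
    """Average support features inside the annotated region to form a class prototype (assumed variant)."""
    mask = F.interpolate(mask, size=feat.shape[-2:], mode="bilinear", align_corners=False)
    return (feat * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)  # (B, C)


def prototype_guidance(query_feat, prototype):
    """Cosine similarity between the prototype and query features as a foreground score map."""
    return F.cosine_similarity(query_feat, prototype[:, :, None, None], dim=1, eps=1e-6)


if __name__ == "__main__":
    backbone = SharedBackbone()
    cross_attn = CrossAttention(dim=64 * 3)
    support_img = torch.randn(1, 3, 128, 128)
    support_mask = torch.rand(1, 1, 128, 128)             # annotated track-wheel mask of the support image
    query_img = torch.randn(1, 3, 128, 128)

    support_fused = fuse_scales(*backbone(support_img))   # shared weights: same backbone for both branches
    query_fused = fuse_scales(*backbone(query_img))
    query_aligned = cross_attn(query_fused, support_fused)

    prototype = masked_average_pooling(support_fused, support_mask)
    score = prototype_guidance(query_aligned, prototype)  # (1, H/4, W/4) foreground score map
    print(score.shape)
```

The cosine-similarity map in prototype_guidance is one common way to let a class prototype guide the segmentation of an unannotated query image in few-shot settings; the decoding head actually used by the paper may differ.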