Object Detection Algorithm Based on Atrous Convolutional Pyramid

HOU Shaoqi; LIANG Jie; YIN Kangning; LIU Xueting; YIN Guangqiang

doi:10.12178/1001-0548.2021032

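1. School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731
2. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054

Funding: National Key Research and Development Program of China (2018YFC0807501)

Author biography: HOU Shaoqi (born 1992), male, PhD candidate; his research focuses on computer vision.

Corresponding author: YIN Guangqiang, E-mail: yingq@uestc.edu.cn

Chinese Library Classification (CLC) number: TP391.4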

Abstract: Occlusion and scale variation are among the most prominent problems in object detection, and both seriously degrade the recall and precision of detectors. To address them, this paper starts from the receptive field and proposes an object detector based on the atrous convolution embedded feature pyramid network (ACFPN). First, atrous convolutional layers of different sizes are introduced into the feature pyramid network (FPN) to build a hybrid receptive field module (HRFM), which enlarges the receptive field to capture more global feature information while keeping the number of parameters roughly the same, thereby addressing the occlusion problem. Second, the FPN structure is improved with a lower embedding feature pyramid module (LEFPM), which fuses the detail information of shallow features with the semantic information of high-level features to enrich the feature maps, strengthen their representation ability, and improve the model's scale adaptability. In addition, to reduce missed detections, the anchor-free (AF) mechanism of the fully convolutional one-stage (FCOS) algorithm is introduced, which reduces the redundancy of candidate boxes and further improves localization accuracy. Tested on the public VOC dataset, the algorithm shows a substantial improvement in detection accuracy.

Keywords:
- atrous convolution
- feature fusion
- feature pyramid
- object detection
- receptive field

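Figure 1. Structural comparison of two feature pyramids

Figure 2. Overall structure of the proposed ACFPN

Figure 3. Comparison of the ResNet and Res2Net backbone networks

Figure 4. Structure of the HRFM

Figure 5. Structure of the proposed LEFPM

Figure 6. Comparison of per-class detection accuracy

Figure 7. Loss curves

Figure 8. Comparison of detection results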

Table 1. Statistics of the VOC datasets (image counts and object counts)

Dataset	Train images	Train objects	Val images	Val objects	Train+val images	Train+val objects	Test images	Test objects	Total images	Total objects
VOC2007	2501	6301	2510	6307	5011	12608	4952	12032	9963	24640
VOC2012	5717	13609	5823	13841	11540	27450	11540	27450	23080	54900
Total	8218	19910	8333	20148	16551	40058	16492	39482	33043	79540

Table 2. Performance comparison of the proposed modules

Method	mAP/%	Parameters/Mb
ResNet50+FPN (Baseline)	78.7	123.49
Res2Net50+FPN	79.4	124.18
Res2Net50+FPN+HRFM	84.4	125.24
Res2Net50+LEFPM+HRFM (ACFPN, ours)	86.4	124.19

Table 3. Accuracy comparison with other algorithms

Year	Algorithm	Dataset	mAP/%
2018	Pelee^[31]	VOC07+12	70.9
2018	SIN^[32]	VOC07+12	76.0
2019	FCOS^[18]	VOC07+12	78.7
2018	HKRM^[33]	VOC07+12	78.8
2018	MLKP^[34]	VOC07+12	80.6
2018	STDN^[35]	VOC07+12	80.9
2019	R-DAD^[36]	VOC07+12	81.2
2018	RFBNet^[23]	VOC07+12	82.2
2018	RefineDet^[37]	VOC07+12	83.8
2018	PFPNet^[38]	VOC07+12	84.1
---	Ours (ACFPN)	VOC07+12	86.4
2020	NAS Yolo (Top 1)^[28]	VOC07+12	86.5

[1]	XU D G, WANG L, LI F. Overview of research on typical target detection algorithms for deep learning[J]. Computer Engineering and Applications, 2021, 57(8): 10-25. (in Chinese)
[2]	VIOLA P, JONES M J. Robust real-time face detection[J]. International Journal of Computer Vision, 2004, 57(2): 137-154. doi: 10.1023/B:VISI.0000013087.49260.fb
[3]	DALAL N, TRIGGS B. Histograms of oriented gradients for human detection[C]// 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE, 2005: 886-893.
[4]	PAPAGEORGIOU C P, OREN M, POGGIO T. A general framework for object detection[C]// The Sixth International Conference on Computer Vision (ICCV). Bombay: IEEE, 1998: 555-562.
[5]	KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Advances in Neural Information Processing Systems, 2012, 25: 1097-1105.
[6]	GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of the IEEE conference on Computer Vision and Pattern Recognition (CVPR). Columbus: IEEE, 2014: 580-587.
[7]	REN S, HE K, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 39(6): 1137-1149.
[8]	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: Unified, real-time object detection[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas: IEEE, 2016: 779-788.
[9]	REDMON J, FARHADI A. YOLO9000: Better, faster, stronger[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Hawaii: IEEE, 2017: 7263-7271.
[10]	REDMON J, FARHADI A. YOLOv3: An incremental improvement[EB/OL]. [2018-04-08]. https://arxiv.org/abs/1804.02767v1.
[11]	LIU W, ANGUELOV D, ERHAN D, et al. SSD: Single shot multibox detector[C]// European Conference on Computer Vision (ECCV). Amsterdam: Springer, 2016: 21-37.
[12]	HE K, ZHANG X, REN S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916. doi: 10.1109/TPAMI.2015.2389824
[13]	LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Hawaii: IEEE, 2017: 2117-2125.
[14]	YU F, KOLTUN V. Multi-scale context aggregation by dilated convolutions[EB/OL]. [2016-04-30]. https://arxiv.org/abs/1511.07122.
[15]	GIRSHICK R. Fast R-CNN[C]// Proceedings of the IEEE International Conference on Computer Vision (ICCV). Santiago: IEEE, 2015: 1440-1448.
[16]	LAW H, DENG J. CornerNet: Detecting objects as paired keypoints[C]// Proceedings of the European Conference on Computer Vision (ECCV). Munich: Springer, 2018: 734-750.
[17]	DUAN K, BAI S, XIE L, et al. CenterNet: Keypoint triplets for object detection[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Seoul: IEEE, 2019: 6569-6578.
[18]	TIAN Z, SHEN C, CHEN H, et al. FCOS: Fully convolutional one-stage object detection[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Seoul: IEEE, 2019: 9627-9636.
[19]	GAO S, CHENG M M, ZHAO K, et al. Res2Net: A new multi-scale backbone architecture[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019.
[20]	WANG P, CHEN P, YUAN Y, et al. Understanding convolution for semantic segmentation[C]// 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). Nevada: IEEE, 2018: 1451-1460.
[21]	SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Boston: IEEE, 2015: 1-9.
[22]	ZHAO H, SHI J, QI X, et al. Pyramid scene parsing network[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Hawaii: IEEE, 2017: 2881-2890.
[23]	LIU S, HUANG D. Receptive field block net for accurate and fast object detection[C]// Proceedings of the European Conference on Computer Vision (ECCV). Munich: Springer, 2018: 385-400.
[24]	NAJIBI M, SAMANGOUEI P, CHELLAPPA R, et al. SSH: Single stage headless face detector[C]// Proceedings of the IEEE International Conference on Computer Vision (ICCV). Venice: IEEE, 2017: 4875-4884.
[25]	EVERINGHAM M, VAN GOOL L, WILLIAMS C K I, et al. The PASCAL visual object classes (VOC) challenge[J]. International Journal of Computer Vision, 2010, 88(2): 303-338. doi: 10.1007/s11263-009-0275-4
[26]	LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// Proceedings of the IEEE International Conference on Computer Vision (ICCV). Venice: IEEE, 2017: 2980-2988.
[27]	REZATOFIGHI H, TSOI N, GWAK J Y, et al. Generalized intersection over union: A metric and a loss for bounding box regression[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach: IEEE, 2019: 658-666.
[28]	BOTTOU L. Stochastic gradient descent tricks[M]// Neural networks: Tricks of the trade. Berlin: Springer, 2012: 421-436.
[29]	FAN X, JIANG W, LUO H, et al. SphereReID: Deep hypersphere manifold embedding for person re-identification[J]. Journal of Visual Communication and Image Representation, 2019, 60: 51-58. doi: 10.1016/j.jvcir.2019.01.010
[30]	EVERINGHAM M, VAN GOOL L, WILLIAMS C K I, et al. PASCAL VOC Challenge performance evaluation and download server[EB/OL]. [2021-01-30]. http://host.robots.ox.ac.uk:8080/leaderboard/displaylb_main.php?challengeid=11&compid=3.
[31]	WANG R J, LI X, LING C X. Pelee: A real-time object detection system on mobile devices[EB/OL]. [2019-01-18]. https://arxiv.org/abs/1804.06882.
[32]	LIU Y, WANG R, SHAN S, et al. Structure inference net: Object detection using scene-level context and instance-level relationships[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City: IEEE, 2018: 6985-6994.
[33]	JIANG C, XU H, LIANG X, et al. Hybrid knowledge routed modules for large-scale object detection[EB/OL]. [2018-10-30]. https://arxiv.org/abs/1810.12681.
[34]	WANG H, WANG Q, GAO M, et al. Multi-scale location-aware kernel representation for object detection[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City: IEEE, 2018: 1248-1257.
[35]	ZHOU P, NI B, GENG C, et al. Scale-transferrable object detection[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City: IEEE, 2018: 528-537.
[36]	BAE S H. Object detection based on region decomposition and assembly[C]// Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). Hawaii: AAAI, 2019, 33(1): 8094-8101.
[37]	ZHANG S, WEN L, BIAN X, et al. Single-shot refinement neural network for object detection[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City: IEEE, 2018: 4203-4212.
[38]	KIM S W, KOOK H K, SUN J Y, et al. Parallel feature pyramid network for object detection[C]// Proceedings of the European Conference on Computer Vision (ECCV). Munich: Springer, 2018: 234-250.

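Object detection is one of the most widely used applications in everyday life; its task is to attend to specific targets in an image. In general, generic object detection comprises two subtasks: determining the class probability of a specific target, and giving the location of that target. Object detection plays a very important role in practice and is applied to concrete scenarios such as face recognition, person re-identification, industrial inspection, license plate recognition, and medical imaging, spanning security, industry, the military, transportation, healthcare, and daily life. With the rapid development of machine learning, detection accuracy in ordinary scenes is already high, but complex environments with large numbers of targets, widely varying target scales, and severe occlusion remain a research focus for researchers at home and abroad^[1].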

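Traditional object detection algorithms built on hand-crafted features are complex and computationally expensive, but they laid the theoretical foundation for the field. As the most classic traditional algorithm, the detector of reference [2] uses multi-scale sliding windows to generate candidate target regions with different aspect ratios and then matches targets against templates. A similar traditional approach classifies targets using histogram of oriented gradient (HOG)^[3] features and a support vector machine (SVM)^[4].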
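
The multi-scale sliding-window step described above can be sketched as follows. This is only an illustrative sketch, not the detector of reference [2]: the function name `sliding_windows`, the base size, the scales, the aspect ratios, and the stride fraction are all assumed values chosen for the example.

```python
from itertools import product


def sliding_windows(img_w, img_h, base=64, scales=(1.0, 2.0),
                    ratios=(0.5, 1.0, 2.0), stride_frac=0.5):
    """Return (x, y, w, h) windows covering the image at several scales
    and aspect ratios. All default values are illustrative assumptions."""
    windows = []
    for scale, ratio in product(scales, ratios):
        # ratio = w / h; the window area stays close to (base * scale) ** 2
        w = int(round(base * scale * ratio ** 0.5))
        h = int(round(base * scale / ratio ** 0.5))
        if w > img_w or h > img_h:
            continue  # this window shape does not fit in the image
        step_x = max(1, int(w * stride_frac))
        step_y = max(1, int(h * stride_frac))
        for y in range(0, img_h - h + 1, step_y):
            for x in range(0, img_w - w + 1, step_x):
                windows.append((x, y, w, h))
    return windows


if __name__ == "__main__":
    boxes = sliding_windows(256, 256)
    print(len(boxes), "candidate windows for a 256x256 image")
```

Each window would then be scored by a template matcher or a HOG + SVM classifier, so the cost grows with the number of scales, ratios, and positions.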

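With the substantial progress of computer vision, deep-learning-based object detection has become a research hotspot. AlexNet^[5], which won the 2012 ImageNet competition, was the first deep neural network to achieve a breakthrough in large-scale image recognition and opened the way for the wide use of deep neural networks in computer vision. According to how they handle classification and regression, detectors based on deep neural networks fall into two families: one-stage and two-stage.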

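Among two-stage algorithms, those represented by RCNN^[6] are built around region proposals: selective search is run on the input image to generate region proposals, a convolutional neural network (CNN) extracts features from each proposal, and a classifier then classifies them. The main weakness of this family is the repeated computation over redundant boxes, so that even the fastest algorithm^[7] reaches an inference speed of only 7 frames/s on a GPU. The other family, one-stage detectors, consists of direct-regression algorithms represented by YOLO^[8-10] and SSD^[11]. These algorithms apply a single neural network to the whole image, divide the final feature map into grid cells, and predict the bounding boxes and target probability of each cell simultaneously, greatly reducing repeated computation at the cost of some accuracy.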

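After a series of variants, what the two families came to share is the need to generate a large number of anchors in advance during detection; such algorithms are collectively called anchor-based object detection algorithms. Anchors are a set of rectangular boxes obtained before training by running a clustering algorithm on the training set, and they represent the dominant widths and heights of the targets in the dataset. At inference time, n candidate boxes are first extracted from the feature map using these anchors, and the candidates are then further classified and regressed. For two-stage algorithms, the candidate boxes still pass through two steps: coarse foreground/background classification and fine multi-class classification.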
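
As a rough illustration of how anchors can be derived from training boxes, the sketch below clusters (width, height) pairs with k-means. The text only says "a clustering algorithm"; using IoU as the similarity measure (in the style popularized by YOLO9000^[9]) is an assumption here, as are the function names and the toy data.

```python
import random


def iou_wh(a, b):
    """IoU of two boxes given as (w, h), both anchored at the same corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(boxes, k, iters=50, seed=0):
    """Cluster (w, h) pairs; each box joins the center with the highest IoU."""
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centers[i]))
            clusters[best].append(box)
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centers.append((mean_w, mean_h))
            else:
                new_centers.append(centers[i])  # keep an empty cluster's center
        if new_centers == centers:
            break  # assignments no longer change
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])


if __name__ == "__main__":
    # Toy (w, h) data: three small boxes and three large ones.
    toy = [(10, 10), (12, 11), (11, 12), (100, 50), (104, 52), (98, 49)]
    print(kmeans_anchors(toy, k=2))
```

The number of clusters k and any per-level assignment of anchors are hyperparameters of exactly the kind that anchor-based detectors require.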

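Because one-stage detectors lack the fine-grained processing of two-stage detectors, they perform poorly on multi-scale and occluded targets. Moreover, although anchor-based algorithms ease, to some extent, the explosion of candidate-box computation caused by selective search, generating many anchors of different sizes in every grid cell still introduces redundant computation. Most importantly, anchor generation depends on a large number of hyperparameters, and manual tuning can seriously affect localization accuracy and classification performance.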
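
To give a rough sense of the redundancy mentioned above, the following sketch counts candidate boxes for a dense detector on a feature pyramid. The strides (8 to 128) and the 9 anchors per location are assumptions in the style of common anchor-based detectors, not values taken from this paper; an anchor-free head is modeled as one prediction per location.

```python
import math


def num_locations(img_h, img_w, strides=(8, 16, 32, 64, 128)):
    """Total number of feature-map locations over all pyramid levels."""
    return sum(math.ceil(img_h / s) * math.ceil(img_w / s) for s in strides)


def num_candidates(img_h, img_w, per_location, strides=(8, 16, 32, 64, 128)):
    """Candidate boxes when every location emits `per_location` boxes."""
    return per_location * num_locations(img_h, img_w, strides)


if __name__ == "__main__":
    anchor_based = num_candidates(512, 512, per_location=9)  # assumed 9 anchors
    anchor_free = num_candidates(512, 512, per_location=1)   # one box per location
    print(anchor_based, anchor_free)
```

Under these assumptions the anchor-based head produces nine times as many candidates as the anchor-free one for the same input size.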

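To address these problems, this paper proposes an object detection algorithm based on an atrous convolution pyramid, the atrous convolution embedded feature pyramid network (ACFPN), which effectively resolves the missed and false detections caused by scale variation and occlusion. The main contributions are as follows: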

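1) A hybrid receptive field module (HRFM) composed of atrous convolutions of multiple sizes is designed. Combined with the multi-scale outputs of the feature pyramid, it enlarges the receptive field to capture more global feature detail while keeping the number of model parameters under control, in order to address target occlusion;

2) The structure of the feature pyramid network is improved with a lower embedding feature pyramid module (LEFPM), which addresses the weakness of object detection in handling multi-scale variation. It fuses shallow and high-level feature information and adds normalization and an activation function to the fused output to ease model training;

3) The Anchor Free mechanism is introduced. Combined with the two designs above, it reduces the wasted computation caused by redundant candidate boxes, improves localization accuracy, and effectively reduces missed detections.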
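
The claim in contribution 1) that atrous convolution enlarges the receptive field without adding parameters follows from the standard relation k_eff = k + (k - 1)(d - 1) for a kernel of size k with dilation rate d. The sketch below works through that arithmetic. The dilation rates (1, 2, 3), the channel counts, and the simple stride-1 sequential stack are assumptions for illustration; they are not the actual HRFM configuration.

```python
def effective_kernel(k, d):
    """Spatial extent covered by a k x k kernel with dilation rate d."""
    return k + (k - 1) * (d - 1)


def stacked_receptive_field(layers):
    """Receptive field of stride-1 conv layers applied one after another.
    `layers` is a list of (kernel_size, dilation) pairs."""
    rf = 1
    for k, d in layers:
        rf += effective_kernel(k, d) - 1
    return rf


def conv_weights(k, c_in, c_out):
    """Weight count of a k x k convolution; it does not depend on dilation."""
    return k * k * c_in * c_out


if __name__ == "__main__":
    plain = [(3, 1), (3, 1), (3, 1)]
    atrous = [(3, 1), (3, 2), (3, 3)]  # assumed dilation rates
    print(stacked_receptive_field(plain), stacked_receptive_field(atrous))
    print(conv_weights(3, 256, 256))   # same for every dilation rate
```

With these assumed rates, three 3x3 layers cover a 13x13 region instead of 7x7 while the weight count per layer is unchanged.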

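4. Conclusion

To address the occlusion and multi-scale problems that are common in object detection, this paper proposes an object detection algorithm based on an atrous convolution feature pyramid. Exploiting the ability of atrous convolution to enlarge the receptive field, the hybrid receptive field module HRFM densely connects atrous convolutional layers of several different sizes, which effectively avoids the gridding effect caused by a single atrous convolution. The network structure is rebuilt on top of the existing FPN so that the detail information in low-level feature maps is embedded into high-level semantic information, compensating for missed detections of small objects and further improving localization accuracy. In the backbone, ACFPN replaces the commonly used ResNet50 with Res2Net50, which strengthens feature representation and speeds up model convergence. The Anchor Free mechanism effectively reduces candidate-box redundancy and thus improves localization accuracy, so this mechanism of FCOS is retained. Tested on the VOC dataset, the proposed ACFPN reaches an mAP of 86.4%. The method also offers a partial solution for the person re-identification work to follow.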
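
The gridding effect mentioned above can be illustrated in one dimension: when every layer uses the same dilation rate, the output only ever samples a sparse subset of input positions, whereas mixed rates fill the gaps. The sketch below is a simplified model under stated assumptions (1-D, stride 1, layers applied in sequence, example rates chosen by hand); it does not reproduce the dense connections of the HRFM.

```python
def sampled_offsets(dilations, k=3):
    """Input offsets that can influence one output position after stacking
    1-D convolutions of kernel size k with the given dilation rates."""
    half = k // 2
    offsets = {0}
    for d in dilations:
        taps = [t * d for t in range(-half, half + 1)]
        offsets = {o + t for o in offsets for t in taps}
    return offsets


def coverage(dilations, k=3):
    """Fraction of positions inside the receptive field that are sampled."""
    offsets = sampled_offsets(dilations, k)
    span = max(offsets) - min(offsets) + 1
    return len(offsets) / span


if __name__ == "__main__":
    print(sorted(sampled_offsets([2, 2, 2])))  # only even offsets are reached
    print(coverage([2, 2, 2]), coverage([1, 2, 3]))
```

Both stacks span 13 input positions, but the single-rate stack samples only 7 of them while the mixed-rate stack samples all 13.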

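Contents

1.1. Feature pyramid
1.2. Atrous convolution
1.3. Anchor Free mechanism
2.1. Overall framework
2.2. Design of the hybrid receptive field module (HRFM)
2.3. Design of the lower embedding feature pyramid module (LEFPM)
3.1. Datasets and evaluation metric
3.1.1. Datasets
3.1.2. Evaluation metric: mAP
3.2. Loss function
3.3. Parameter settings
3.4. Ablation study
3.5. Comparison with other algorithms
3.6. Detection results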
