Infrared Target Classification with Reconstruction Transfer Learning

MAO Yuan-hong; HE Zhan-zhuang; MA Zhong

doi:10.12178/1001-0548.2019162

Volume 49 Issue 4

Jul. 2020

MAO Yuan-hong, HE Zhan-zhuang, MA Zhong. Infrared Target Classification with Reconstruction Transfer Learning[J]. Journal of University of Electronic Science and Technology of China, 2020, 49(4): 609-614. doi: 10.12178/1001-0548.2019162

Xi’an Microelectronics Technology Institute　Xi’an　710065

Received Date: 2019-07-10
Rev Recd Date: 2020-03-01

Available Online: 2020-07-29

Publish Date: 2020-07-10

Abstract

Infrared target classification is of significant value in target recognition. Convolutional neural networks currently achieve excellent performance on visible-image classification. For infrared images, however, existing networks do not achieve satisfactory results, because annotated samples are scarce and the imaging characteristics differ greatly from those of visible images. In this paper, visible images serve as the source domain and infrared images as the target domain, and transfer learning is used to address these challenges within a deep learning framework. In transfer learning, the better the target-domain network represents the distribution of its own domain, the better its performance and generalization should be. A convolutional autoencoder is therefore trained on a large number of unannotated infrared samples, which greatly enhances feature representation in the infrared image domain. Reducing the distance between the feature distributions of the two domains makes the distributions similar, so the classification capability learned in the source domain is transferred to the target domain. With these changes, accuracy improves by 11.27% over fine-tuning a network pretrained on visible images.
Key words: convolutional autoencoder; convolutional neural networks; infrared images; target classification; transfer learning

References

[1]	CHENG Kai-sheng, LIN Huei-yung. Automatic target recognition by infrared and visible image matching[C]//2015 14th IAPR International Conference on Machine Vision Applications. Piscataway, NJ, USA: IEEE, 2015: 312-315.
[2]	ZHANG Di-fei, ZHANG Jin-suo, YAO Ke-ming, et al. Infrared ship-target recognition based on SVM classification[J]. Infrared and Laser Engineering, 2016, 45(1): 104004. doi: 10.3788/irla201645.0104004 (in Chinese)
[3]	LOWE D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60(2): 91-110. doi: 10.1023/B:VISI.0000029664.99615.94
[4]	BAY H, TUYTELAARS T, GOOL L J V. SURF: Speeded up robust features[C]//9th European Conference on Computer Vision. Graz, Austria: Springer-Verlag, 2006: 404-417.
[5]	RUBLEE E, RABAUD V, KONOLIGE K, et al. ORB: An efficient alternative to SIFT or SURF[C]//2011 International Conference on Computer Vision. Piscataway, NJ, USA: IEEE, 2011: 2564-2571.
[6]	LI Jiong, LEI Hu-min. A method of automated recognition and classification based on infrared images[J]. Aeronautical Computer Technique, 2005, 35(4): 26-28. doi: 10.3969/j.issn.1671-654X.2005.04.008 (in Chinese)
[7]	LI Rui-dong, SUN Xie-chang, LI Meng. Infrared feature extraction and recognition technology of space target[J]. Infrared Technology, 2017, 39(5): 427-435. (in Chinese)
[8]	SHAIK J S, IFTEKHARUDDIN K M. Automated tracking and classification of infrared images[C]// Proceedings of the International Joint Conference on Neural Networks. Piscataway, NJ, USA: IEEE, 2003: 1201-1206.
[9]	KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90.
[10]	SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. [2019-05-15]. https://arxiv.org/pdf/1409.1556.pdf.
[11]	HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ, USA: IEEE, 2016: 770-778.
[12]	HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ, USA: IEEE, 2018: 7132-7141.
[13]	YOSINSKI J, CLUNE J, BENGIO Y, et al. How transferable are features in deep neural networks[C]//Advances in Neural Information Processing Systems. Montreal, Canada: Curran Associates Inc, 2014: 3320-3328.
[14]	LONG M, CAO Y, WANG J, et al. Learning transferable features with deep adaptation networks[C]//International Conference on Machine Learning. Lille, France: IMLS, 2015: 97-105.
[15]	LONG M, WANG J, DING G, et al. Transfer feature learning with joint distribution adaptation[C]//IEEE International Conference on Computer Vision. Washington DC, USA: IEEE, 2013: 2200-2207.
[16]	HINTON G E. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313(5786): 504-507. doi: 10.1126/science.1127647
[17]	GRETTON A, SEJDINOVIC D, STRATHMANN H, et al. Optimal kernel choice for large-scale two-sample tests[C]//Advances in Neural Information Processing Systems. Nevada, USA: Curran Associates Inc, 2012: 1205-1213.
[18]	BADRINARAYANAN V, KENDALL A, CIPOLLA R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481-2495. doi: 10.1109/TPAMI.2016.2644615
[19]	MAATEN L VAN DER, HINTON G E. Visualizing data using t-SNE[J]. Journal of Machine Learning Research, 2008, 9: 2579-2605.

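Infrared target classification is of considerable practical value in computer vision applications. In automatic target recognition, for example, classifying detected targets in infrared imagery can effectively improve recognition performance^[1-2]. Compared with conventional visible-image classification, infrared target classification is also more difficult. First, because of the imaging characteristics of infrared sensors, infrared images contain the outer contours and regional features of a target but very little detail such as color and texture. Second, targets of the same class are affected by factors such as differing viewing angles and deformation, which makes classification harder. Finally, infrared samples are expensive to collect, so it is difficult to obtain a large annotated dataset for training. As a result, deep learning methods that rely on large-scale supervised data are hard to apply directly to infrared target classification.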

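Traditional methods for infrared target classification rely mainly on hand-crafted features. Matching methods based on hand-crafted features typically use descriptors such as SIFT^[3], SURF^[4], and ORB^[5] to extract feature points from the infrared image, which are then fed to a classifier. Reference [6] extracts features from infrared images with a wavelet transform before classification and recognition. Reference [7] builds infrared image features from PLB histograms and gray-level histograms and classifies them with an SVM. Reference [8] proposes edge features for infrared targets and classifies them with a SOM. In all of these methods, the hand-crafted features cannot be optimized end to end together with the downstream classifier. Moreover, because the available samples are relatively limited, generalization on infrared target classification is hard to guarantee.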

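With the rise of deep learning, visible-image classification has seen unprecedented gains in performance^[9-12]. Deep learning methods extract the features of the input images with a CNN (convolutional neural network) branch and classify the samples with the Softmax function. Because infrared images are so expensive to collect, no large-scale public supervised dataset for infrared target classification currently exists. If a deep network is trained on only a small amount of infrared data, it quickly overfits, which harms the generalization of the model.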

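To avoid overfitting when a convolutional neural network is trained on few samples, a model pretrained on the ImageNet dataset is usually trained further on a small number of annotated infrared samples; this is the widely used practice of network fine-tuning^[13]. Although this approach mitigates overfitting to some extent, visible and infrared sensors differ in imaging principle, so visible and infrared samples are not identically distributed. Parameter fine-tuning alone therefore rarely gives good results on infrared target classification.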

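In recent years, as deep learning has been widely adopted, it has been combined ever more closely with transfer learning^[14-15]. Through transfer learning, the source and target domains can be aligned in feature space by adapting their distributions, which addresses the distribution gap between infrared and visible samples caused by their different imaging mechanisms. In transfer learning, however, the target-domain samples available for training are usually very limited, and their distribution may deviate substantially from the overall distribution of the target domain. In that case, even if the domain adaptation method itself is sound, the adapted target-domain model may still fall short in performance and generalization. During transfer learning, therefore, the feature representation capability of the target-domain network should be improved as far as possible, so that the high-level features used for domain adaptation are as close as possible to the true distribution and the transfer remains effective.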
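As a rough illustration of what "distance between feature distributions" can mean in domain adaptation, the sketch below computes a biased estimate of the squared maximum mean discrepancy (MMD) between two small sets of feature vectors. The choice of MMD with a single Gaussian kernel is an assumption made for illustration; this excerpt does not state which distance the authors use. The function names and the toy feature values are hypothetical.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    # Gaussian (RBF) kernel between two feature vectors.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mmd2(source, target, sigma=1.0):
    # Biased estimate of squared MMD between two lists of feature vectors:
    # mean k(s, s') + mean k(t, t') - 2 * mean k(s, t).
    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))
    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))

# Hypothetical 2-D "high-level features" (not data from the paper).
visible_feats = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
infrared_far = [[2.0, 2.1], [2.2, 1.9], [1.9, 2.0]]    # poorly aligned
infrared_near = [[0.1, 0.1], [0.2, 0.1], [0.0, 0.2]]   # well aligned

d_far = mmd2(visible_feats, infrared_far)
d_near = mmd2(visible_feats, infrared_near)
print(d_far > d_near)
```

In an adaptation setting, a term like this would be minimized with respect to the network parameters so that the two feature sets move from the "far" case toward the "near" case.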

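On the other hand, unlike visible-image classification, infrared target classification lacks large public annotated datasets, yet large numbers of unannotated infrared images are relatively easy to obtain during data collection. These unlabeled infrared samples also implicitly carry the feature distribution of the infrared image domain. A deep autoencoder network^[16] can encode and decode infrared images and so make full use of these unlabeled samples for learning. Reconstruction ensures that the infrared features of the target are not lost in the target domain, which improves the network's ability to represent infrared image features.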
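The idea of learning from unlabeled samples through reconstruction can be shown with a deliberately tiny model. The sketch below is a toy linear autoencoder with tied weights (a 1-D code for 2-D inputs), trained by plain gradient descent on the mean squared reconstruction error. It is only a stand-in for the deep convolutional autoencoder discussed here; the data, the initial weights, and all function names are hypothetical.

```python
def reconstruct(w, x):
    # Encoder: project x onto w to get a 1-D code z.
    z = sum(wi * xi for wi, xi in zip(w, x))
    # Decoder with tied weights: map the code back to input space.
    return [wi * z for wi in w], z

def mean_loss(w, data):
    # Mean squared reconstruction error over the unlabeled samples.
    total = 0.0
    for x in data:
        x_hat, _ = reconstruct(w, x)
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(data)

def train(w, data, lr=0.05, steps=500):
    w = list(w)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x in data:
            x_hat, z = reconstruct(w, x)
            r = [a - b for a, b in zip(x_hat, x)]            # residual
            rw = sum(ri * wi for ri, wi in zip(r, w))
            for k in range(len(w)):
                # d/dw_k of ||w (w.x) - x||^2
                grad[k] += 2.0 * (r[k] * z + rw * x[k])
        w = [wk - lr * gk / len(data) for wk, gk in zip(w, grad)]
    return w

# Unlabeled toy samples lying on a line; no class labels are used.
unlabeled = [[-1.0, -1.0], [-0.5, -0.5], [0.5, 0.5], [1.0, 1.0]]
w0 = [0.5, 0.1]
loss_before = mean_loss(w0, unlabeled)
w_trained = train(w0, unlabeled)
loss_after = mean_loss(w_trained, unlabeled)
print(loss_after < loss_before)
```

The point of the toy is that the reconstruction error alone, with no labels, is enough to pull the encoder toward the direction along which the data actually varies.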

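This paper constructs a convolutional autoencoder based on the VGG16 network and trains it without supervision on a large number of unlabeled infrared images; this unsupervised learning improves the feature extraction capability of the infrared-domain branch. At the same time, transfer learning is used to adapt the high-level feature distributions of the infrared-domain and visible-domain branches so that they become similar, thereby transferring the learning capability of the visible-image (source-domain) network to the infrared-image (target) domain. Experiments show that reconstruction-based transfer learning effectively improves the network's ability to represent infrared features and also improves infrared target classification.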
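One plausible way to write the ingredients described above as a single objective is sketched below. This is an illustrative formulation under stated assumptions, not the loss given by the authors: the symbols $E$ and $D$ (infrared encoder and decoder), $F_s$ and $F_t$ (high-level features of the visible and infrared branches), the distance $d$, and the weights $\alpha$ and $\beta$ are all introduced here for illustration, and this excerpt does not say whether the terms are optimized jointly or in stages.

```latex
\mathcal{L}
  = \mathcal{L}_{\mathrm{cls}}
  + \alpha\,\mathcal{L}_{\mathrm{rec}}
  + \beta\, d\!\left(F_s, F_t\right),
\qquad
\mathcal{L}_{\mathrm{rec}}
  = \frac{1}{N}\sum_{i=1}^{N}
    \left\lVert x_i - D\!\left(E(x_i)\right) \right\rVert_2^2
```

Here $\mathcal{L}_{\mathrm{cls}}$ is a classification loss on labeled samples, $\mathcal{L}_{\mathrm{rec}}$ is the reconstruction error on $N$ unlabeled infrared images $x_i$, and $d(F_s, F_t)$ measures how far apart the two branches' feature distributions are.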

5. Conclusion

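To address the problems of scarce samples and mismatched sample distributions in infrared image classification, this paper proposes a method based on reconstruction transfer learning. It makes full use of a large number of unlabeled infrared samples to improve feature representation in the infrared domain, and uses transfer learning to make the feature distributions of infrared and visible images similar. With these improvements, classification accuracy is 11.27% higher than with the widely used parameter fine-tuning.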

Network under test	Accuracy/%
SIFT+SVM	67.21
VGG16 trained from scratch	65.75
VGG16+fine-tuning	78.63
transfer learning (two VGG16 branches)	86.74(92.29)
transfer learning (VGG16+autoencoder branches)	89.90(92.31)
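
The 11.27% improvement quoted in the abstract and conclusion can be checked against the table: it is the absolute difference, in percentage points, between the VGG16+autoencoder transfer row and the VGG16 fine-tuning row. The short check below uses only the first figure in each table cell; the dictionary keys are shorthand labels chosen here, and the values in parentheses are left out because this excerpt does not explain them.

```python
# Accuracy figures (%) copied from the table; keys are shorthand labels.
accuracy = {
    "sift_svm": 67.21,
    "vgg16_scratch": 65.75,
    "vgg16_finetune": 78.63,
    "transfer_two_vgg16": 86.74,
    "transfer_vgg16_autoencoder": 89.90,
}

# Gain of the reconstruction-based method over fine-tuning (percentage points).
gain_vs_finetune = round(
    accuracy["transfer_vgg16_autoencoder"] - accuracy["vgg16_finetune"], 2
)

# Gain attributable to swapping a VGG16 branch for the autoencoder branch.
gain_vs_two_branch = round(
    accuracy["transfer_vgg16_autoencoder"] - accuracy["transfer_two_vgg16"], 2
)

print(gain_vs_finetune, gain_vs_two_branch)
```

Expressed relative to the fine-tuning baseline rather than in percentage points, the same gain would be a different number, so the reported figure is best read as an absolute difference.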
