Dual-Path Vision Transformer for Auxiliary Diagnosis of Acute Ischemic Stroke

ZHANG Taohong; GUO Xueqiang; ZHENG Han; LUO Jichang; WANG Tao; JIAO Liqun; TANG Anying

doi:10.12178/1001-0548.2023081

Acute ischemic stroke is a potentially fatal brain dysfunction caused by interruption of the blood supply to brain tissue. Digital Subtraction Angiography (DSA) is the gold standard for diagnosing such cerebrovascular diseases. Based on paired frontal and lateral DSA images of patients, this paper constructs a dual-path image classification model, Dual-Path Vision Transformer (DPVF), to grade the treatment effectiveness of acute ischemic stroke. To speed up auxiliary diagnosis, the model follows the lightweight design of EdgeViT. To achieve high accuracy, a spatial-channel self-attention module is proposed so that the transformer captures more comprehensive feature information and gains representational power. In addition, a cross-attention module fuses the outputs of the two DPVF branches, which helps the model extract richer features and thus improves performance. Experimental results show that DPVF reaches an accuracy of 98.5% on the test set, which can effectively meet practical requirements.


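Intravenous thrombolysis can, to some extent, reopen occluded vessels and restore blood perfusion, making it an effective treatment for acute ischemic stroke (AIS). Evaluating the effect of intravenous thrombolysis often relies on X-ray digital subtraction angiography (DSA), an important method for diagnosing cerebrovascular diseases. Its basic principle is to subtract the X-ray images taken before and after contrast injection, which removes bone and soft-tissue structures from the angiogram and yields a clear image of the vessels. Once the DSA images are acquired, a physician can use them to assign an mTICI score to the degree of reperfusion after AIS treatment. The mTICI score has five grades by degree of recanalization: 0, 1, 2a, 2b, and 3. To grade more accurately, paired frontal and lateral DSA images are often used to obtain more complete information. However, recognizing, diagnosing, and grading DSA images is usually done by specialist physicians. In recent years, with the rapid development of artificial intelligence and deep learning, computer-aided diagnosis can significantly improve diagnostic efficiency^[1]. Deep-learning-based image classification is a common approach to computer-aided diagnosis: a medical image is fed to a trained model, which predicts the patient's condition to support intelligent auxiliary diagnosis.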
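The subtraction principle behind DSA can be illustrated with a toy example. This is a minimal sketch under simplifying assumptions, not a description of any clinical system: the function name `subtract_mask` and the 3x3 intensity values are hypothetical, and real DSA pipelines involve further processing that is not modeled here.

```python
def subtract_mask(mask_frame, contrast_frame):
    """Pixel-wise difference between the pre-contrast mask and a contrast frame."""
    return [
        [mask_px - contrast_px for mask_px, contrast_px in zip(mask_row, contrast_row)]
        for mask_row, contrast_row in zip(mask_frame, contrast_frame)
    ]

# Hypothetical toy intensities: bone (value 60, right column) appears in both
# frames, while the contrast-filled vessel darkens the middle column only in
# the frame taken after injection.
mask_frame = [[100, 100, 60], [100, 100, 60], [100, 100, 60]]
contrast_frame = [[100, 70, 60], [100, 70, 60], [100, 70, 60]]

dsa = subtract_mask(mask_frame, contrast_frame)
print(dsa)
```

Structures present in both frames cancel to zero, so only the vessel column keeps a non-zero value in the subtracted image.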

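Among studies of intelligent auxiliary diagnosis models for stroke, reference [2] proposed a video-based stroke impairment assessment system that achieves automatic scoring with Mask R-CNN^[3], a cascaded pyramid network, and a temporal convolutional network. Reference [4], based on angiographic parametric imaging (API) maps, designed a convolutional neural network that automatically evaluates neurovascular reperfusion during mechanical thrombectomy (MT), predicting whether recanalization succeeded with 81% accuracy. Reference [5], based on magnetic resonance imaging (MRI) of stroke patients, studied a hybrid deep learning and machine learning method for predicting the severity of patients' language impairment; it feeds high-level CNN features and principal component analysis (PCA) image features into ridge regression and outperforms models that use deep learning or machine learning alone. Reference [6], based on medical service utilization and health behavior data, used a deep neural network with PCA to predict the probability of stroke, reaching an AUC of 83.48%, which is important for the early detection of patients at high risk of stroke. Reference [7] used a convolutional neural network to predict tissue lesion volume in acute ischemic stroke patients so that physicians can plan treatment according to lesion volume, demonstrating the effectiveness of deep convolutional neural networks for predicting tissue outcome and treatment effect in stroke patients. Reference [8] used an ensemble network combining API maps from multiple planes to assess the level of reperfusion: CNNs classify API maps as adequate or inadequate reperfusion, and a grid search weights the output of each network. The results show that assessing reperfusion from multiple views is more effective than from a single view. Reference [9] proposed autoTICI, a fully automatic, CNN-based quantitative TICI scoring algorithm. First, a multi-path convolutional neural network divides each DSA image sequence into four phases: non-contrast, arterial, parenchymal, and venous. Next, a minimum intensity map is computed from the motion-corrected arterial and parenchymal phase sequences, and vessels, perfusion, and background are segmented on it. Finally, the autoTICI score is quantified as the ratio of reperfused pixels after treatment, giving a quantitative analysis of the reperfusion level.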

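These studies show that current deep learning models for image-based auxiliary diagnosis of AIS are mainly CNN-based and mostly process single-view images. The model in reference [9] can process frontal and lateral images together, but feeding two video streams into the model not only creates a heavy data-processing load but also cannot accommodate the inconsistent video stream specifications produced by different imaging devices. Moreover, the autoTICI quantitative analysis has four stages and cannot be trained end to end. CNN models also have small receptive fields and struggle to capture global image features. To obtain global image information and to match the clinical practice of diagnosing from combined frontal and lateral images, this paper designs a Transformer-based dual-path image classification model, Dual-Path Vision Transformer (DPVF), for auxiliary diagnosis of AIS. The two paths of the model extract features from the patient's frontal and lateral DSA images, respectively.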

3. Conclusion

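This paper constructs DPVF, a lightweight dual-path image classification model based on the Vision Transformer, for automatic grading of DSA images of acute ischemic stroke patients. The model follows the lightweight design of EdgeViT, and a spatial-channel self-attention module is proposed to improve the original self-attention module so that the model stays lightweight while capturing more comprehensive feature information and gaining representational power. In addition, a cross-attention module cross-fuses the two branches of DPVF, which helps the model extract richer features and thus improves performance. Experimental results show that DPVF achieves the highest accuracy (98.5%) and F1 score (98.4%) among the compared image classification models, demonstrating the feasibility and effectiveness of the proposed method.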
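The idea of cross-fusing two branches with attention can be sketched generically. The following is a minimal illustration of scaled dot-product cross-attention in plain Python, not the DPVF module itself: it assumes queries come from one branch and keys/values from the other, omits the learned projections, multiple heads, and normalization a real module would have, and the token values and names (`cross_attention`, `frontal`, `lateral`) are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys_values):
    """Each query token attends over the tokens of the other branch.

    Assumption: keys and values are the same un-projected tokens.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys_values]
        weights = softmax(scores)
        fused.append(
            [sum(w * kv[j] for w, kv in zip(weights, keys_values)) for j in range(dim)]
        )
    return fused

# Hypothetical token features from the two branches (2-dimensional for brevity).
frontal = [[1.0, 0.0], [0.0, 1.0]]
lateral = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]

frontal_fused = cross_attention(frontal, lateral)  # frontal queries, lateral keys/values
lateral_fused = cross_attention(lateral, frontal)  # lateral queries, frontal keys/values
```

Each output token is a weighted average of the other branch's tokens, so every branch keeps its own token count while absorbing information from the other view.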

References (14)

[1]	QIU J J, WU Y, HUI B, et al. Texture classification study of MR images for hepatocellular carcinoma[J]. Journal of University of Electronic Science and Technology of China, 2019, 48(4): 619-626.
[2]	SHEN Z Q, XIE W J, LIU X P. Automatic Fugl-Meyer assessment based on videos[J]. Journal of Electronic Measurement and Instrument, 2022, 36(2): 1-11.
[3]	HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision. [S.l.]: IEEE, 2017: 2961-2969.
[4]	SHIRAZ B M M, SNYDER K V, WAQAS M, et al. Use of quantitative angiographic methods with a data-driven model to evaluate reperfusion status (mTICI) during thrombectomy[J]. Neuroradiology, 2021, 63(9): 1429-1439.
[5]	CHAUHAN S, VIG L, DE F D G M, et al. A comparison of shallow and deep learning methods for predicting cognitive performance of stroke patients from MRI lesion images[J]. Frontiers in Neuroinformatics, 2019, 13: 53.
[6]	CHEON S, KIM J, LIM J. The use of deep learning to predict stroke patient mortality[J]. International Journal of Environmental Research and Public Health, 2019, 16(11): 1876.
[7]	NIELSEN A, HANSEN M B, TIETZE A, et al. Prediction of tissue outcome and assessment of treatment effect in acute ischemic stroke using deep learning[J]. Stroke, 2018, 49(6): 1394-1401.
[8]	BHURWANI M M S, SNYDER K V, WAQAS M, et al. Use of biplane quantitative angiographic imaging with ensemble neural networks to assess reperfusion status during mechanical thrombectomy[C]//Medical Imaging 2021: Computer-Aided Diagnosis. [S.l.]: SPIE, 2021, 11597: 328-336.
[9]	SU R, CORNELISSEN S A P, VAN D S M, et al. AutoTICI: Automatic brain tissue reperfusion scoring on 2D DSA images of acute ischemic stroke patients[J]. IEEE Transactions on Medical Imaging, 2021, 40(9): 2380-2391.
[10]	PAN J, BULAT A, TAN F, et al. EdgeViTs: Competing light-weight CNNs on mobile devices with vision transformers[C]//Computer Vision-ECCV 2022: 17th European Conference. [S.l.]: Springer, 2022: 294-311.
[11]	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[EB/OL]. [2022-10-22]. https://arxiv.org/pdf/1706.03762.pdf.
[12]	DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale[EB/OL]. [2022-06-15]. https://arxiv.org/abs/2010.11929v1.
[13]	BA J L, KIROS J R, HINTON G E. Layer normalization[EB/OL]. [2022-06-25]. https://arxiv.org/pdf/1607.06450.pdf.
[14]	KINGMA D P, BA J. Adam: A method for stochastic optimization[EB/OL]. [2022-07-10]. http://www.arxiv.org/pdf/1412.6980.pdf.

mTICI grade	Number of image pairs	Class label
0	368	0
1	40	0
2a	74	0
2b	170	1
3	367	1
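
The table above maps the five mTICI grades onto two classes. A minimal sketch of that mapping, using the pair counts from the table, tallies how many image pairs fall in each class. The names `GRADE_TO_CLASS` and `PAIR_COUNTS` are illustrative and not taken from the paper.

```python
from collections import Counter

# Binary class label for each mTICI grade, as listed in the table.
GRADE_TO_CLASS = {"0": 0, "1": 0, "2a": 0, "2b": 1, "3": 1}

# Number of frontal/lateral image pairs per grade, as listed in the table.
PAIR_COUNTS = {"0": 368, "1": 40, "2a": 74, "2b": 170, "3": 367}

class_totals = Counter()
for grade, count in PAIR_COUNTS.items():
    class_totals[GRADE_TO_CLASS[grade]] += count

total_pairs = sum(PAIR_COUNTS.values())
print(dict(class_totals), total_pairs)
```

Under this mapping, 482 pairs belong to class 0 and 537 to class 1, for 1019 pairs in total.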

Model	TP	FP	FN	TN	Accuracy/%	Precision/%	Recall/%	F1 score/%
DPVF (ours)	94	2	1	106	98.5	97.9	98.9	98.4
EdgeViT (frontal + lateral)	95	1	5	102	97.0	99.0	95.0	97.0
EdgeViT (frontal + lateral), concatenation	93	3	4	103	96.6	96.9	95.9	96.4
EdgeViT (frontal)	91	5	4	103	95.6	94.8	95.8	95.3
EdgeViT (lateral)	90	6	2	105	96.1	93.8	97.8	95.8
ViT-B/16 (frontal + lateral)	83	13	10	97	88.7	86.5	89.2	87.8
ViT-B/16 (frontal)	96	0	26	81	87.2	100	78.7	88.1
ViT-B/16 (lateral)	83	13	16	91	85.7	86.5	83.8	85.1
ShuffleNet V2 (frontal + lateral)	88	8	3	104	94.6	91.7	96.7	94.1
ShuffleNet V2 (frontal)	92	4	10	97	93.1	95.8	90.2	92.9
ShuffleNet V2 (lateral)	93	3	9	98	94.1	96.9	91.2	94.0
ResNet-50 (frontal + lateral)	95	1	4	103	97.5	99.0	96.0	97.5
ResNet-50 (frontal)	90	6	2	105	96.1	93.8	97.8	95.8
ResNet-50 (lateral)	95	1	5	102	97.0	99.0	95.0	97.0
MobileNet V2 (frontal + lateral)	76	20	3	104	88.7	79.2	96.2	86.9
MobileNet V2 (frontal)	73	23	18	89	79.8	76.0	80.2	78.0
MobileNet V2 (lateral)	75	21	15	92	82.3	78.1	83.3	80.6
ConvNeXt (frontal + lateral)	95	1	4	103	97.5	99.0	96.0	97.5
ConvNeXt (frontal)	92	4	2	105	97.0	95.8	97.9	96.8
ConvNeXt (lateral)	94	2	4	103	97.0	97.9	95.9	96.9
AlexNet (frontal + lateral)	94	2	20	87	89.2	97.9	82.5	89.5
AlexNet (frontal)	83	13	21	86	83.3	86.5	79.8	83.0
AlexNet (lateral)	78	18	12	95	85.2	81.3	86.7	83.9
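
The percentage columns in the table above follow from the TP, FP, FN, and TN counts through the standard binary-classification definitions. The sketch below recomputes them for the DPVF row. The function name `binary_metrics` is illustrative, and the code assumes non-zero denominators, which holds for every row of the table.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 as percentages (one decimal)."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    values = {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
    return {name: round(100 * value, 1) for name, value in values.items()}

# Counts from the DPVF row of the table.
dpvf = binary_metrics(tp=94, fp=2, fn=1, tn=106)
print(dpvf)
```

For the DPVF row this gives 98.5, 97.9, 98.9, and 98.4, matching the reported accuracy, precision, recall, and F1 score.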


