Lightweight Semantic Segmentation by Combining Multi-Link Feature Codec with Wavelet Pooling

  • Abstract: Semantic segmentation is one of the basic technologies in scene understanding. Existing semantic segmentation networks usually suffer from complex structures, large parameter counts, excessive loss of image feature information, and low computational efficiency. To address these problems, this work proposes MLWP-Net (Multi-Link Wavelet-Pooled Network), a lightweight semantic segmentation network built on the encoder-decoder framework and the Discrete Wavelet Transform (DWT) that combines multi-link feature encoding and decoding with wavelet pooling. In the encoding stage, a lightweight feature-extraction bottleneck is designed using a multi-link strategy together with depthwise separable convolution, dilated convolution, and channel compression; in addition, a low-frequency-mixed wavelet pooling operation replaces the traditional downsampling operation, reducing the information loss incurred during encoding. In the decoding stage, a multi-branch parallel dilated-convolution decoder fuses multi-level features from different encoder layers and restores the image resolution in parallel. Experimental results show that MLWP-Net achieves 74.1% and 68.2% mIoU on the Cityscapes and CamVid datasets, respectively, with only 0.74M parameters, which demonstrates its effectiveness for semantic segmentation.
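As a rough illustration of the wavelet-pooling idea, the sketch below applies a single-level 2D Haar DWT to a feature map and keeps only the low-frequency subband as the downsampled output. This is an assumption-laden simplification: the abstract does not specify how the low-frequency-mixed pooling combines subbands, so the function names `haar_dwt2` and `wavelet_pool` are hypothetical and the code should not be read as the MLWP-Net operator itself.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level orthonormal 2D Haar DWT of an (H, W) array with even H and W.

    Returns the low-frequency subband and three detail subbands,
    each of shape (H/2, W/2).
    """
    a = x[0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0         # scaled block sum (low frequency)
    detail_cols = (a - b + c - d) / 2.0  # difference across columns
    detail_rows = (a + b - c - d) / 2.0  # difference across rows
    detail_diag = (a - b - c + d) / 2.0  # diagonal difference
    return low, detail_cols, detail_rows, detail_diag


def wavelet_pool(x):
    """Hypothetical stand-in for wavelet pooling: keep only the low-frequency subband."""
    low, _, _, _ = haar_dwt2(x)
    return low


feature_map = np.arange(16, dtype=np.float64).reshape(4, 4)
pooled = wavelet_pool(feature_map)
print(pooled.shape)
```

Because the Haar transform is orthonormal, the four subbands together retain all of the input's energy, which is the property that makes DWT-based downsampling less lossy than max or average pooling when the detail subbands are reused rather than discarded.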

     
