The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLIII-B2-2020, 617–622, 2020
https://doi.org/10.5194/isprs-archives-XLIII-B2-2020-617-2020

  12 Aug 2020

DEEP SEMANTIC SEGMENTATION FOR THE OFF-ROAD AUTONOMOUS DRIVING

I. Sgibnev, A. Sorokin, B. Vishnyakov, and Y. Vizilter
  • FGUP «State Research Institute of Aviation Systems», Russia, 125319, Moscow, Viktorenko street, 7

Keywords: semantic segmentation, DCNN, off-road, autonomous driving, lightweight architectures

Abstract. This paper addresses semantic image segmentation for the machine vision system of an off-road autonomous robotic vehicle. Most modern convolutional neural networks require computing resources beyond the capabilities of many robotic platforms. The main drawback of such models is therefore their extremely high complexity, whereas real applications must run in real time on devices with limited resources. This paper focuses on the practical application of modern lightweight architectures to semantic segmentation on mobile robotic systems. We discuss backbones based on ResNet18, ResNet34, MobileNetV2, ShuffleNetV2, and EfficientNet-B0, decoders based on U-Net and DeepLabV3, and additional components that can increase segmentation accuracy and reduce inference time. We propose a model that combines a ResNet34 backbone and a DeepLabV3 decoder with Squeeze & Excitation blocks, which proved optimal in terms of inference time and accuracy. We also present our off-road dataset and a simulated dataset for semantic segmentation. Furthermore, we show that pre-training on the simulated dataset increases mIoU on our off-road dataset by 2.7% compared with pre-training on Cityscapes. Finally, we achieve 75.6% mIoU on the Cityscapes validation set and 85.2% mIoU on our off-road validation set at 37 FPS for a 1,024×1,024 input on a single NVIDIA GeForce RTX 2080 card using NVIDIA TensorRT.
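The accuracy figures above are reported as mIoU (mean intersection over union). As a minimal sketch of how this metric is commonly computed, and not necessarily the exact evaluation protocol used in the paper, the following pure-Python example builds a confusion matrix from flattened label maps and averages the per-class IoU. The function names `confusion_matrix` and `mean_iou` are illustrative, and skipping classes that appear in neither the prediction nor the ground truth is an assumption.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels per (ground-truth, predicted) class pair.

    Rows index ground-truth classes, columns index predicted classes.
    Both inputs are flattened label maps of equal length.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        matrix[t][p] += 1
    return matrix


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int) -> float:
    """Average per-class IoU = TP / (TP + FP + FN)."""
    matrix = confusion_matrix(pred, target, num_classes)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                  # class c pixels predicted as another class
        fp = sum(row[c] for row in matrix) - tp   # other pixels predicted as class c
        union = tp + fp + fn
        # Assumption: classes absent from both maps are left out of the mean.
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    predicted = [0, 0, 1, 1]
    ground_truth = [0, 1, 1, 1]
    print(mean_iou(predicted, ground_truth, num_classes=2))
```

In this toy example, class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so the mean is 7/12. Benchmark evaluations usually accumulate one confusion matrix over the whole validation set before computing IoU, rather than averaging per-image scores.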