DEEP SEMANTIC SEGMENTATION FOR THE OFF-ROAD AUTONOMOUS DRIVING

Sgibnev, I.; Sorokin, A.; Vishnyakov, B.; Vizilter, Y.

doi:https://doi.org/10.5194/isprs-archives-XLIII-B2-2020-617-2020

Articles | Volume XLIII-B2-2020


© Author(s) 2020. This work is distributed under
the Creative Commons Attribution 4.0 License.


12 Aug 2020


I. Sgibnev, A. Sorokin, B. Vishnyakov, and Y. Vizilter

Keywords: semantic segmentation, DCNN, off-road, autonomous driving, lightweight architectures

Abstract. This paper addresses semantic image segmentation for the machine vision system of an off-road autonomous robotic vehicle. Most modern convolutional neural networks require computing resources beyond the capabilities of many robotic platforms. The main drawback of such models is therefore their extremely high complexity, whereas real applications must run in real time on devices with limited resources. This paper focuses on the practical application of modern lightweight architectures to semantic segmentation on mobile robotic systems. We discuss backbones based on ResNet18, ResNet34, MobileNetV2, ShuffleNetV2, and EfficientNet-B0, and decoders based on U-Net and DeepLabV3, as well as additional components that can increase segmentation accuracy and reduce inference time. We propose a model that combines a ResNet34 backbone and a DeepLabV3 decoder with Squeeze & Excitation blocks, which proved optimal in terms of inference time and accuracy. We also present our off-road dataset and a simulated dataset for semantic segmentation. Furthermore, we show that using weights pre-trained on the simulated dataset increases mIoU on our off-road dataset by 2.7% compared with weights pre-trained on Cityscapes. Finally, we achieve 75.6% mIoU on the Cityscapes validation set and 85.2% mIoU on our off-road validation set at a speed of 37 FPS for a 1,024×1,024 input on one NVIDIA GeForce RTX 2080 card using NVIDIA TensorRT.
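The accuracy figures above are reported as mIoU (mean intersection over union). The abstract does not describe the evaluation code, so the following is only a minimal sketch of the standard definition, assuming flattened integer label lists and an ignore label of 255 (the function names and the toy data are hypothetical, not taken from the paper).

```python
def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count pixels per (true class, predicted class) pair.

    pred and target are flat sequences of integer class labels.
    ignore_index=255 is an assumption, not a value stated in the abstract.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not contribute to the score
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average the per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                  # true class c, predicted as another class
        fp = sum(row[c] for row in cm) - tp   # another class, predicted as c
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both prediction and ground truth
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes; the last pixel is unlabeled and ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
cm = confusion_matrix(pred, target, num_classes=3)
print(round(mean_iou(cm), 4))  # per-class IoUs are 1/2, 2/3 and 1
```

In practice the confusion matrix is accumulated over the whole validation set before the per-class IoUs are averaged, rather than averaging per-image scores.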
