The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLII-4, 717–724, 2018
https://doi.org/10.5194/isprs-archives-XLII-4-717-2018

  19 Sep 2018

INDOOR SEMANTIC SEGMENTATION FROM RGB-D IMAGES BY INTEGRATING FULLY CONVOLUTIONAL NETWORK WITH HIGHER-ORDER MARKOV RANDOM FIELD

J. Yang and Z. Kang
  • Department of Remote Sensing and Geo-Information Engineering, School of Land Science and Technology, China University of Geosciences, Xueyuan Road, Beijing, 100083, China

Keywords: Fully convolutional network, RGB-D images, Higher order potentials, Indoor scenes, Semantic segmentation

Abstract. Indoor scenes are characterized by abundant semantic categories, illumination changes, occlusions, and overlaps among objects, which pose great challenges for indoor semantic segmentation. In this paper, we therefore develop a method based on a higher-order Markov random field model for indoor semantic segmentation from RGB-D images. Instead of using the RGB-D images directly, we first train and apply a RefineNet model using only RGB information to generate high-level semantic information. Then, the spatial location relationships from the depth channel and the spectral information from the color channels are integrated as a prior for a marker-controlled watershed algorithm to obtain robust and accurate visually homogeneous regions. Finally, a higher-order Markov random field model encodes the short-range context among adjacent pixels and the long-range context within each visually homogeneous region to refine the semantic segmentation. To evaluate the effectiveness and robustness of the proposed method, experiments were conducted on the public SUN RGB-D dataset. Experimental results indicate that, compared with using RGB information alone, the proposed method remarkably improves the semantic segmentation results, especially at object boundaries.
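
The region-level refinement idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' higher-order Markov random field inference; it is a simplified stand-in that blends each pixel's class probabilities with the mean probabilities of its homogeneous region and then takes the most likely label. The function name `refine_with_regions`, the `weight` parameter, and the input layout (an `(H, W, C)` probability array such as a network might output, plus an `(H, W)` integer region map such as a watershed might produce) are all assumptions made for illustration.

```python
import numpy as np


def refine_with_regions(probs, regions, weight=0.5):
    """Blend per-pixel class probabilities with region-level averages.

    probs   : (H, W, C) array of per-pixel class probabilities (assumed input).
    regions : (H, W) integer array of region ids (assumed input).
    weight  : hypothetical blending factor in [0, 1]; 0 keeps the pixel-wise
              prediction, 1 assigns every pixel its region's dominant class.
    Returns an (H, W) array of refined class labels.
    """
    h, w, c = probs.shape
    flat_probs = probs.reshape(-1, c)
    flat_regions = regions.reshape(-1)

    # Re-index arbitrary region ids to 0..R-1.
    _, inverse = np.unique(flat_regions, return_inverse=True)
    inverse = inverse.reshape(-1)
    n_regions = int(inverse.max()) + 1

    # Accumulate class probabilities per region, then average.
    sums = np.zeros((n_regions, c), dtype=float)
    np.add.at(sums, inverse, flat_probs)
    counts = np.bincount(inverse, minlength=n_regions)[:, None]
    region_mean = sums / counts

    # Blend pixel-level and region-level evidence, then pick the best class.
    blended = (1.0 - weight) * flat_probs + weight * region_mean[inverse]
    return blended.argmax(axis=1).reshape(h, w)
```

With `weight=0` the function returns the plain per-pixel argmax, so increasing `weight` shows how region-level consistency can override isolated, weakly supported labels, which is the kind of effect the long-range context term is meant to capture near object boundaries.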