The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLII-2/W16, 61–66, 2019
https://doi.org/10.5194/isprs-archives-XLII-2-W16-61-2019

  17 Sep 2019

AUTOMATIC OBJECT EXTRACTION FROM HIGH RESOLUTION AERIAL IMAGERY WITH SIMPLE LINEAR ITERATIVE CLUSTERING AND CONVOLUTIONAL NEURAL NETWORKS

A. C. Carrilho1 and M. Galo1,2
  • 1Graduate Program in Cartographic Sciences - PPGCC, São Paulo State University - UNESP, Presidente Prudente, São Paulo, Brazil
  • 2Dept. of Cartography, São Paulo State University - UNESP, Presidente Prudente, São Paulo, Brazil

Keywords: Object extraction, Simple Linear Iterative Clustering, Convolutional Neural Networks

Abstract. Recent advances in machine learning techniques for image classification have led to the development of robust approaches to both object detection and extraction. Traditional CNN architectures, such as LeNet, AlexNet and CaffeNet, usually take fixed-size images of objects as input and attempt to assign labels to those images. Another possible approach is the Fast Region-based CNN (or Fast R-CNN), which works by using two models: (i) a Region Proposal Network (RPN), which generates a set of potential Regions of Interest (RoI) in the image; and (ii) a traditional CNN, which assigns labels to the proposed RoI. As an alternative, this study proposes an approach to automatic object extraction from aerial images similar to the Fast R-CNN architecture, the main difference being the use of the Simple Linear Iterative Clustering (SLIC) algorithm instead of an RPN to generate the RoI. The dataset used is composed of high-resolution aerial images and the following classes were considered: house, sport court, hangar, building, swimming pool, tree, and street/road. The proposed method can generate RoI of different sizes by running a multi-scale SLIC approach. The overall accuracy obtained for object detection was 89%, and the major advantage is that the proposed method is capable of semantic segmentation by assigning a label to each selected RoI. Some of the problems encountered are related to object proximity, in which case different instances appeared merged in the results.
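
The idea of replacing a region proposal step with multi-scale SLIC can be illustrated with a short sketch. The code below is not the authors' implementation: it is a simplified, NumPy-only SLIC for grayscale images (no connectivity post-processing, no CIELAB conversion), and the names `slic_gray`, `rois_from_labels` and `multiscale_rois`, the `scales` values and the `compactness` default are all hypothetical choices made for illustration. Each superpixel's bounding box is treated as one candidate RoI, and running the clustering with several superpixel counts yields RoI of different sizes.

```python
import numpy as np


def slic_gray(image, n_segments, compactness=10.0, n_iter=5):
    """Simplified SLIC on a 2-D grayscale image; returns an integer label map.

    Assumption: no connectivity enforcement, unlike full SLIC implementations.
    """
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    step = max(1, int(np.sqrt(h * w / n_segments)))  # grid interval S
    ys = np.arange(step // 2, h, step)
    xs = np.arange(step // 2, w, step)
    # Cluster centers as rows of (intensity, y, x), seeded on a regular grid.
    centers = np.array([[image[y, x], y, x] for y in ys for x in xs], dtype=float)
    yy, xx = np.mgrid[0:h, 0:w]
    labels = np.zeros((h, w), dtype=int)
    for _ in range(n_iter):
        dist = np.full((h, w), np.inf)
        for k, (ci, cy, cx) in enumerate(centers):
            # Only pixels in a window of about 2S x 2S around the center compete.
            y0, y1 = max(int(cy) - step, 0), min(int(cy) + step + 1, h)
            x0, x1 = max(int(cx) - step, 0), min(int(cx) + step + 1, w)
            d_color = (image[y0:y1, x0:x1] - ci) ** 2
            d_space = (yy[y0:y1, x0:x1] - cy) ** 2 + (xx[y0:y1, x0:x1] - cx) ** 2
            # Squared SLIC distance: color term plus (m / S)^2 times spatial term.
            d = d_color + (compactness / step) ** 2 * d_space
            closer = d < dist[y0:y1, x0:x1]
            dist[y0:y1, x0:x1][closer] = d[closer]
            labels[y0:y1, x0:x1][closer] = k
        # Move each center to the mean of the pixels currently assigned to it.
        for k in range(len(centers)):
            mask = labels == k
            if mask.any():
                centers[k] = [image[mask].mean(), yy[mask].mean(), xx[mask].mean()]
    return labels


def rois_from_labels(labels):
    """Bounding box (y0, x0, y1, x1), end-exclusive, for each superpixel."""
    boxes = []
    for k in np.unique(labels):
        ys, xs = np.nonzero(labels == k)
        boxes.append((int(ys.min()), int(xs.min()),
                      int(ys.max()) + 1, int(xs.max()) + 1))
    return boxes


def multiscale_rois(image, scales=(4, 16)):
    """Collect candidate RoI from several SLIC runs with different segment counts."""
    rois = []
    for n_segments in scales:
        rois.extend(rois_from_labels(slic_gray(image, n_segments)))
    return rois
```

In a pipeline like the one described in the abstract, each box returned by `multiscale_rois` would be cropped from the image, resized to the classifier's input size, and labeled by a CNN; that classification step is left out here because the abstract does not specify the network.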