The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLIII-B2-2022, 649–656, 2022
https://doi.org/10.5194/isprs-archives-XLIII-B2-2022-649-2022
30 May 2022

AN EFFICIENT HIERARCHICAL IMAGE RETRIEVAL METHOD FOR LARGE SET OF IMAGES USING LEARNING-BASED GLOBAL AND LOCAL IMAGE FEATURES

Z. Wang, Z. Zhan, G. Zhou, and X. Wang
  • School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China

Keywords: Learning-based Features, Local Image Features, Global Image Features, Hierarchical Image Retrieval

Abstract. Image retrieval is one of the supporting technologies for (near) real-time photogrammetry and for loop closure detection in visual SLAM. The conventional retrieval strategy is to first extract image features from the query image and the database images, and then to find the result images by nearest-feature search. However, image retrieval methods based on traditional hand-crafted features (SIFT, SURF, GIST) struggle to guarantee both time efficiency and precision in practical applications. Learning-based features have recently shown superior performance in many computer vision tasks. This paper therefore investigates several popular learning-based global features (ResNet101, VGG16+NetVLAD, Yolov3+VGG16+NetVLAD) and a local feature (SuperPoint). To balance time efficiency and precision, we present hierarchical image retrieval solutions that combine these two kinds of features: the global feature accelerates the search, and the local feature provides precision. Specifically, three hierarchical retrieval solutions are designed from different combinations of learning-based global and local features. Their precision and time efficiency are compared on several public benchmarks (one of which contains more than 10,000 images). The experimental results show that, among the proposed solutions, VGG16+NetVLAD+SuperPoint is the most efficient, but its precision is slightly lower than that of the solution preprocessed with Yolov3.
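
The coarse-to-fine idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the shortlist size, or the local matching rule, so cosine similarity, a `top_k` shortlist, and mutual nearest-neighbour counting are assumptions made here for illustration. Random vectors stand in for the global descriptors (e.g. NetVLAD) and local descriptors (e.g. SuperPoint), and all function names are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def coarse_search(query_global, db_global, top_k):
    """Stage 1 (assumed): rank database images by cosine similarity of global descriptors."""
    sims = l2_normalize(db_global) @ l2_normalize(query_global)
    return np.argsort(-sims)[:top_k]


def count_mutual_matches(desc_a, desc_b):
    """Count mutual nearest-neighbour pairs between two sets of local descriptors."""
    sim = l2_normalize(desc_a) @ l2_normalize(desc_b).T
    nn_ab = sim.argmax(axis=1)  # best match in b for each descriptor in a
    nn_ba = sim.argmax(axis=0)  # best match in a for each descriptor in b
    return int(np.sum(nn_ba[nn_ab] == np.arange(len(desc_a))))


def hierarchical_retrieve(query_global, query_local, db_global, db_local, top_k=5):
    """Shortlist by global descriptor, then re-rank the shortlist by local match count."""
    shortlist = coarse_search(query_global, db_global, top_k)
    scores = [count_mutual_matches(query_local, db_local[i]) for i in shortlist]
    # Stable sort keeps the coarse order when two candidates tie on match count.
    order = np.argsort(-np.asarray(scores), kind="stable")
    return [int(shortlist[j]) for j in order]


if __name__ == "__main__":
    # Synthetic stand-ins for real descriptors (sizes are arbitrary assumptions).
    rng = np.random.default_rng(0)
    db_global = rng.normal(size=(100, 64))
    db_local = [rng.normal(size=(50, 32)) for _ in range(100)]
    # The query is a slightly perturbed copy of database image 42.
    query_global = db_global[42] + 0.05 * rng.normal(size=64)
    query_local = db_local[42] + 0.05 * rng.normal(size=(50, 32))
    print(hierarchical_retrieve(query_global, query_local, db_global, db_local))
```

The expensive local matching runs only on the `top_k` shortlisted images rather than on the whole database, which is where the time saving of a hierarchical scheme comes from; the local stage then decides the final order.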