The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLIII-B2-2021, 47–52, 2021
https://doi.org/10.5194/isprs-archives-XLIII-B2-2021-47-2021

  28 Jun 2021


LARGER RECEPTIVE FIELD BASED RGB VISUAL RELOCALIZATION METHOD USING CONVOLUTIONAL NETWORK

J. Qin1, M. Li1,2, D. Li1, X. Liao3, J. Zhong1, and H. Zhang1
  • 1State Key Laboratory of Information Engineering in Surveying Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
  • 2Department of Physics, ETH Zurich, Zurich 8093, Switzerland
  • 3Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong 999077, China

Keywords: Visual Relocalization, Camera Relocalization, Pose Regression, Deep ConvNet, RGB Image

Abstract. Visual relocalization is a key technology in many computer vision applications. Traditional visual relocalization relies mainly on geometric methods, whereas PoseNet was the first to introduce a convolutional neural network into visual relocalization, enabling real-time camera pose estimation from a single image. To address the limited accuracy and robustness of the current PoseNet algorithm in complex environments, this paper proposes and implements a new high-precision, robust camera pose estimation method (LRF-PoseNet). The method resizes the input image directly, without cropping, so as to enlarge the receptive field over the training images. The images and their corresponding pose labels are then fed into an improved LSTM-based PoseNet network for training, and the Adam optimizer is used to optimize the network. Finally, the trained network estimates the camera pose. Experimental results on a public RGB dataset show that the proposed method obtains more accurate camera poses than existing CNN-based methods.
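The two ingredients the abstract names, full-frame resizing instead of cropping and regression of a 6-DoF pose, can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: `resize_nearest` stands in for whatever interpolation the paper uses, and `pose_loss` follows the standard PoseNet-style objective ||x − x̂||₂ + β·||q/‖q‖ − q̂||₂ over a position vector and a unit quaternion; the weight `beta` is a hypothetical hyperparameter.

```python
import math

def resize_nearest(img, out_h, out_w):
    """Resize a 2D image (list of rows) to out_h x out_w with nearest-neighbour
    sampling. Unlike cropping, this keeps the whole field of view, which is the
    receptive-field idea behind LRF-PoseNet's input handling."""
    in_h, in_w = len(img), len(img[0])
    return [[img[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
            for i in range(out_h)]

def pose_loss(pred_xyz, true_xyz, pred_q, true_q, beta=500.0):
    """PoseNet-style regression loss: Euclidean position error plus a weighted
    orientation error between unit quaternions. The predicted quaternion is
    normalized before comparison."""
    pos_err = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred_xyz, true_xyz)))
    norm = math.sqrt(sum(c * c for c in pred_q))
    q_hat = [c / norm for c in pred_q]
    ori_err = math.sqrt(sum((p - t) ** 2 for p, t in zip(q_hat, true_q)))
    return pos_err + beta * ori_err
```

A network trained with such a loss outputs seven numbers per image (x, y, z and a quaternion), which is what makes single-image, real-time relocalization possible.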