LARGER RECEPTIVE FIELD BASED RGB VISUAL RELOCALIZATION METHOD USING CONVOLUTIONAL NETWORK
- 1 State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
- 2 Department of Physics, ETH Zurich, Zurich 8093, Switzerland
- 3 Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong 999077, China
Keywords: Visual Relocalization, Camera Relocalization, Pose Regression, Deep ConvNet, RGB Image
Abstract. Visual relocalization is a key technology in many computer vision applications. Traditional visual relocalization is achieved mainly through geometric methods, whereas PoseNet introduced convolutional neural networks into visual relocalization for the first time, enabling real-time camera pose estimation from a single image. To address the limited accuracy and robustness of the current PoseNet algorithm in complex environments, this paper proposes and implements a new high-precision, robust camera pose estimation method (LRF-PoseNet). The method directly resizes the input image without cropping, thereby enlarging the receptive field over the training images. The images and their corresponding pose labels are then fed into an improved LSTM-based PoseNet network for training, with the Adam optimizer used to optimize the network. Finally, the trained network is used to estimate the camera pose. Experimental results on a public RGB dataset show that the proposed method obtains more accurate camera poses than existing CNN-based methods.
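The abstract does not state the exact training objective, but PoseNet-style pose regression is conventionally trained with a loss that combines a translation error and a weighted quaternion orientation error. The sketch below illustrates that standard formulation, assuming a unit-quaternion orientation parameterization and a balance weight `beta` (both assumptions, not details given in this paper):

```python
import numpy as np

def posenet_loss(t_pred, q_pred, t_true, q_true, beta=250.0):
    """PoseNet-style pose regression loss (illustrative sketch).

    t_pred, t_true : predicted / ground-truth camera position (3-vector).
    q_pred, q_true : predicted / ground-truth orientation quaternion (4-vector).
    beta           : weight balancing metres against quaternion error;
                     typical values range from ~120 (indoor) to ~2000 (outdoor).
    """
    # Normalize the predicted quaternion so it lies on the unit sphere,
    # as the network's raw 4-vector output is generally not unit-length.
    q_pred = np.asarray(q_pred, dtype=float)
    q_pred = q_pred / np.linalg.norm(q_pred)

    t_err = np.linalg.norm(np.asarray(t_true, float) - np.asarray(t_pred, float))
    q_err = np.linalg.norm(np.asarray(q_true, float) - q_pred)
    return t_err + beta * q_err
```

For example, a prediction 1 m off in position but with a perfect orientation yields a loss of 1.0 regardless of `beta`, since the quaternion term vanishes.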