SELF-LOCALIZATION METHOD BY INTEGRATING SENSORS

Recently, development of high performance CPU, cameras and other sensors on mobile devices have been used for wide variety of applications. Most of the applications require self-localization of the mobile device. Since the self-localization is based on GPS, gyro sensor, acceleration meter and magnetic field sensor (called as POS) of low accuracy, the applications are limited. On the other hand, self-localization method using images have been developed, and the accuracy of the method is increasing. This paper develops the self-localization method by integrating sensors, such as POS and cameras, on mobile devices simultaneously. The proposed method mainly consists of two parts: one is the accuracy improvement of POS data filtering, and another is development of self-localization method by integrating POS and camera. The POS data filtering combines all POS data by using Kalman filter in order to improve the accuracy of exterior orientation factors. The exterior orientation factors with POS filtering are used as initial value of ones in image-based self-localization method. The image-based self-localization method consists of feature points extraction and tracking, relative orientation, coordinates estimation of the feature points, and orientation factors updates of the mobile device. The proposed method is applied to POS data and images taken in urban area. Through experiments with real data, the accuracy improvement by POS data filtering is confirmed. The proposed self-localization method with POS and camera make the accuracy more sophisticated by comparing with only POS data filtering.


INTRODUCTION
Recently, development of high performance CPU, cameras and other sensors on mobile devices have been used for wide variety of applications. One of the most popular applications is augmented reality (AR). The AR technique superimposes various data on the scene which the users see at the time instead of comprehensive and detailed three dimensional modelling such as virtual reality. The AR uses sequential images taken from same view points of users as environmental scene, and then reality of visualization increase compared with the virtual reality. AR requires self-localization, namely orientation of the mobile device.
Since the self-localization of the mobile device is based on GPS, gyro sensor, acceleration meter and magnetic field sensor (called as position and orientation system (POS) in this paper) of low accuracy, the applications are limited to tags superimposition on the sequential images. On the other hand, self-localization method using images have been developed, and the accuracy of the method is increasing. The method, however, requires initial values of the orientation factors and distance of baseline. GCPs can be used to specify the absolute coordinates. But it takes a lot of time and effort, and the applicability is restrictive. This paper develops the self-localization method by integrating sensors, such as POS and cameras, on mobile devices simultaneously. The proposed method mainly consists of two parts: one is the accuracy improvement of POS data filtering, and another is development of self-localization method by integrating POS data and images.

Overview
In this study, an iPhone4s is used as a mobile device. The centre of the mobile device is defined as the origin of the coordinate system for the mobile device. In the coordinate system, z axis is set as the vertical direction of device monitor, and y axis is norm direction of the monitor plane, and x axis is defined according to right-handed coordinate system. The rotations with regard to the axis x, y, z (roll, pitch, yaw) are expressed as g  , g  , g  , respectively. For absolute coordinate system, origin is same as the device coordinate system. Z axis is set as the direction of gravitational force, and X axis is northseeking. When the both coordinate systems coincide, the rotation angles are define as 0, namely g  =0, g  =0, g  =0.
As mentioned before, POS includes GPS, gyro sensor, acceleration meter and magnetic field sensor. The corresponding data are three dimensional coordinates of the sensor (longitude: E (degree), latitude: N (degree), ellipsoidal height: h (m)), angular velocity ( g   , g   , g  (rad/s)), acceleration ( a x  , a y  , a z  (G)) and azimuth angle ( ti  (rad)), respectively.
Normally, each sensor is used separately (Daehne and Karigiannis, 2002), or some sensors are selected among them (Agrawal and Konolige, 2006;Kourogi and Kurata, 2005;Wagner et al., 2009). The proposed method combines all sensors data in order to improve the accuracy of exterior orientation factors. Figure 1 shows the pipeline of the POS data filtering.

Filtering method of rotation
Initial rotation of the mobile device is calculated from acceleration meter and magnetic field sensor, and rotation during movement is calculated from the gyro sensor. The data of gyro sensor have drift error. In order to correct the drift error, filtering method with the data of the acceleration meter and the magnetic field sensor is applied. In this step, the data of such sensors are combined by using Kalman filter (Schall et al., 2009). The assumption, that the mobile sensor moves linearly, is accepted here.

Calculation of rotation:
When the mobile device remains stationary, the acceleration meter detect only acceleration of gravity g. The relationship between the acceleration and initial rotation ( From the equation (1), roll and pitch can be calculated.
Yaw should be transformed according to the above defined coordinate system: For calculation of rotation during movement, data of the gyro sensor are added to the initial rotation. Derivatives of the rotation with respect to time (   ,   ,  ) are represented by using angular velocity ( g cos sin sin cos sin 1 0 cos cos sin cos cos 0 s i n c o s g g By integrating equation (4), the rotation during movement can be acquired.

Filtering of rotation:
According to integration, the error is accumulated. It is important to point out here that the acceleration meter and the magnetic field sensor are free from such cumulative error. With the acceleration meter and the magnetic field sensor, the rotation during movement is corrected. Filtering theory is applied in order to correct the cumulative error.
For the efficient computation, Kalman filter is adopted as filtering method (Ristic et al., 2004). The Kalman filter consists of dynamic equation and observation equation as follows: In this study, these vectors and matrices are constructed as follows: The B t is same as equation (4). Matrices H t and G t are set as identity matrices. In equation (8), suffix a, m, g express data of the acceleration meter, the magnetic field sensor, gyro sensor, respectively.
Since the Kalman filter is based on linear equation with normal distribution, the equations can be solved analytically. The equations (9) and (10) are related to forecast, and (11) -(13) filtering.
State vector x t|t is a result of filtering method for rotation.

Filtering method of position
Although the position of the mobile device is measured with GPS directly, the accuracy of the position can be improved by relating to the other sensors used for rotation calculation. Here the Kalman filter is also applied to all POS data to estimate the position.

Calculation of position:
Since GPS data are originally longitude E, latitude N, ellipsoidal height h, the coordinates are transformed to the Plane Rectangular Coordinate System (X G , Y G , Z G ). The coordinates are used for initial position of the mobile device. For measurement of position during movement, data of GPS and the acceleration meter are considered.
The relationship between the measured acceleration and second derivatives of position ( X  , Y  , Z  ) can be expressed as follows.
Here, rotation matrices are represented as R. Additionally, Where, Δt is GPS data acquisition interval. By using both of velocity and acceleration, position at time t is calculated.

Filtering of position:
In the similar fashion of filtering method of rotation, Kalman filter is applied for improving position. Dynamic and observation equations are same as equations (6), (7). State vector, observation vector and matrix F t are set for position filtering.
Analytical solution of the dynamic and observation equations can be conducted in the same manner of equations (9) -(13).

IMAGE INTEGRATING METHOD
The position and rotation based on POS data filtering are used as initial value of the orientation factors in image-based selflocalization method. The position is already represented in the geodetic coordinate, and then the distance of baseline is also supplied in the real scale. The image-based self-localization method consists of feature points extraction and tracking, relative orientation, three dimensional coordinates estimation of the feature points, and orientation factors updates of the mobile device. Figure 2 shows an overview of the proposed method.

Feature points extraction and tracking
Recently, sophisticated feature points extraction and tracking algorithms have been developed. One of the most reliable algorithms is SURF (Speeded-UP Robust Features) (Bay et al., 2008). Figure 3 shows an example of feature points extraction by the SURF in our experiments.

Figure 3. Feature points extraction by SURF
Even if the SURF is applied to feature points extraction and tracking, incorrect matching points are still exist. The feature points tracking are refined by using not only adjacent frames also sequential frames. Firstly, extracted feature points are searched in sequence between adjacent frames. After the tracking process within a certain number of frames, position of feature points are re-projected into first frames (back matching). If the displacement between first and last position of the points is larger than a threshold, the feature points are discarded. With the result of the matching, three dimensional coordinates of the feature points can be calculated. When the depth of the points is larger than a threshold, the feature points are also discarded.
After the above mentioned thresholding process, incorrect matching points still remain (Figure 4). Especially in the case  (Fischler and Bolles, 1981) is also applied ( Figure  5). RANSAC algorithm is a method of outlier removal. Finally, the remaining points are accepted as feature points.

Three dimensional coordinates estimation
The position and rotation of a mobile sensor are already acquired through POS data filtering described in the previous chapter. With the position and rotation, and feature points tracking results, three dimensional coordinates of the feature points can be calculated. Camera calibration is conducted in advance. For the calibration, checkerboard is used.
First of all, an initial tree dimensional coordinates is built based on intersection (Stewenius et al., 2006). For the optimization of intersection, RANSAC algorithm (Fischler and Bolles, 1981) is also applied.

Orientation factors updates
Once initial value of the orientation factors and three dimensional coordinates of the feature points are acquired, the orientation factors updates are conducted by bundle adjustment (Triggs et al., 2000;Luhmann et al., 2014).
The feature points have coordinates p i = (X i , Y i , Z i ). The each sequential frame has a three dimensional coordinates q j = (X j , Y j , Z j ) as the camera position. At the frames j, the feature point i has camera coordinate system (u ij , v ij ). A transformation between the camera coordinate and the world coordinate systems represents collinearity equation.
Tukey bi-weight objective function is applied as a robust objective function and x is a set of parameters. Iteration of reweighted least squares method is used to allow the robust estimator to converge.
There are two types of the bundle adjustment: full bundle adjustment and local bundle adjustment. The local bundle adjustment uses only some recent frames. The full bundle adjustment is more accurate than the local bundle adjustment, but computational load is more expensive. In the sense of computation, the local bundle adjustment is more preferable at the expense of accuracy (Arth et al., 2009). The local bundle adjustment can be applied recursively (Mclauchlan, 2000).
E 1:j expresses the objective function by using from 1st frame to jth frame. According to the recursive form, bundle adjustment can be conducted effectively. It is important to point out here that the accuracy depends on the number of frames with the recursive form. We examined the relationships between number of frames and computation time / sum of squared error. In this study, based on the length of baseline, the local bundle adjustment is applied for improving the accuracy.
In order to solve the bundle adjustment problem, Levenberg-Marquardt method (Hartley and Zisserman, 2004) is applied. The objective function E is approximated by the following formula:

EXPERIMENTS
The proposed method is applied to POS data and images taken in urban area. An iPhone4s is used as a mobile device, which is equipped with GPS, gyro sensor, acceleration meter, magnetic field sensor (POS) and a camera. Camera calibration is conducted in advance. In order to evaluate the accuracy, orientation data of Mobile Mapping System (MMS) of Trimble are compared. The MMS is also equipped with the iPhone4s.
The experimental site is in urban area (Figure 6), at which condition of GPS receiver is appropriate. Figure 6. Experimental site

Experiment of POS data filtering
First experiment is application of POS data filtering. Figure 7 compares estimation results of roll, pitch and yaw with MMS data. The displacement between estimated value and MMS data with respect to roll, pitch, and yaw is 0.024 (rad), 0.031 (rad), and 0.17 (rad) on average, respectively. In the result of yaw, the gap was large, because of limitation of function in iPhone, that is negative value of yaw angle is converted to zero automatically.
Position accuracy is also examined (Figure 8). The root mean square of the positions displacement is 5.67 (m) Figure 8. Result of position with POS sensor fusion

Experiment of POS and camera integration
As shown in Figure 8, accuracy of POS data filtering became worse at the area in a curve. Focusing on such area, result of POS and camera integration was examined. Figure 9 shows the accuracies of rotation angle by all sensors (POS and camera) integration. From the comparison between the accuracies of the proposed method and ones of POS data filtering, clear improvement was not confirmed. Indoor-Outdoor Seamless Modelling, Mapping and Navigation, 21-22 May 2015, Tokyo, Japan Figure 10 shows the result of all sensors integration. Compared with the previous result, the accuracy improvement can be confirmed. During the application area, the root mean square of the positions displacement with only POS data filtering is 3.59 (m). On the other hand, one with POS and camera integration improved to 3.26 (m).

MMS
iPhone POS iPhone POS + camera Figure 10. Result of position with all sensors integration

CONCLUSIONS
This paper develops the self-localization method using sensors, such as POS and cameras, on mobile devices simultaneously. The proposed method applies the Kalman filter in order to combine all sensors data except for camera (POS data filtering). Additionally, by using the position and rotation from POS data filtering as initial value of bundle adjustment, POS and camera integration method is achieved.
Through experiments with real data, the accuracy improvements of position and rotation by POS data filtering were confirmed. The results of final integration method improved the accuracy. It means that proposed self-localization method with POS and camera make the accuracy more sophisticated compared with only POS data filtering. Especially, the improvement at the area in a curve is noticeable. According to the experiments, the significance of the proposed method is confirmed.
As a further work, accuracy of three dimensional coordinates estimation of feature points will be evaluated by comparing with laser scanner data on MMS. Additionally, integrated filtering method between POS data filtering and bundle adjustment will become challenging investigation. As a result, promising method can be constructed, and then more impressive visualization will be accomplished.