DEVELOPMENT OF A SELF-LOCALIZATION METHOD USING SENSORS ON MOBILE DEVICES

Recently, the development of high-performance CPUs, cameras and other sensors on mobile devices has enabled a wide variety of applications. Most of these applications require self-localization of the mobile device. Since self-localization is based on GPS, a gyro sensor, an accelerometer and a magnetic field sensor (referred to as POS) of low accuracy, the applications are limited. On the other hand, image-based self-localization methods have been developed, and their accuracy is increasing. This paper develops a self-localization method that uses the sensors on mobile devices, namely the POS and the camera, simultaneously. The proposed method mainly consists of two parts: one is the accuracy improvement of the POS data itself by POS sensor fusion based on filtering theory, and the other is the development of a self-localization method that integrates POS and camera. The proposed method combines all POS data by using a Kalman filter in order to improve the accuracy of the exterior orientation factors. The exterior orientation factors obtained by POS sensor fusion are used as initial values in the image-based self-localization method, which consists of feature point extraction and tracking, coordinate estimation of the feature points, and updates of the orientation factors of the mobile device. The proposed method is applied to POS data and images taken in an urban area. Through experiments with real data, the accuracy improvement by POS sensor fusion is confirmed, and the proposed self-localization method with POS and camera further improves the accuracy compared with POS sensor fusion alone.


INTRODUCTION
Recently, the development of high-performance CPUs, cameras and other sensors on mobile devices has enabled a wide variety of applications. One of the most popular applications is augmented reality (AR). The AR technique superimposes various data on the scene that the user sees at the time, instead of relying on comprehensive and detailed three-dimensional modelling as in virtual reality. AR uses sequential images taken from the user's own viewpoint as the environmental scene, so the reality of the visualization increases compared with virtual reality. AR requires self-localization, namely orientation of the mobile device.
Since the self-localization of the mobile device is based on GPS, a gyro sensor, an accelerometer and a magnetic field sensor (called the position and orientation system (POS) in this paper), all of low accuracy, the applications are limited to superimposing tags on the sequential images.
On the other hand, image-based self-localization methods have been developed, and their accuracy is increasing. Such methods, however, require initial values of the orientation factors and the length of the baseline. GCPs (ground control points) can be used to specify the absolute coordinates, but collecting them takes a lot of time and effort, and the applicability is therefore restricted.
This paper develops a self-localization method that uses the sensors on mobile devices, namely the POS and the camera, simultaneously. The proposed method mainly consists of two parts: one is the accuracy improvement of the POS data itself by POS sensor fusion based on filtering theory, and the other is the development of a self-localization method that integrates POS data and images.

Overview
In this study, an iPhone 4s is used as the mobile device. The center of the mobile device is defined as the origin of its coordinate system. The Z axis is set along the direction of gravitational force, and the X axis is north-seeking. The rotations about the X, Y and Z axes (roll, pitch, yaw) are expressed as φ, θ and ψ, respectively.
As mentioned before, the POS includes GPS, a gyro sensor, an accelerometer and a magnetic field sensor. The corresponding data are the three-dimensional coordinates of the sensor (longitude E and latitude N in degrees, and height), angular velocities, accelerations and magnetic field intensities. Normally, each sensor is used separately (Daehne and Karigiannis, 2002) or only a subset of the sensors is used (Agrawal and Konolige, 2006; Kourogi and Kurata, 2005; Wagner et al., 2009). The proposed method combines the data of all sensors in order to improve the accuracy of the exterior orientation factors. Figure 1 shows the pipeline of the POS data fusion.

Filtering method of rotation
The initial rotation of the mobile device is calculated from the accelerometer and the magnetic field sensor, and the rotation during movement is calculated from the gyro sensor. The gyro sensor data contain drift error. In order to correct the drift error, a filtering method using the data of the accelerometer and the magnetic field sensor is applied; in this step, the data of these sensors are integrated by a Kalman filter (Schall et al., 2009). The assumption that the mobile sensor moves linearly is adopted here.

Calculation of rotation:
When the mobile device remains stationary, the accelerometer detects only the acceleration of gravity g. The relationship between the measured acceleration (a_x, a_y, a_z) and the initial rotation (φ_0, θ_0) can be written as

(a_x, a_y, a_z)^T = R(φ_0, θ_0) (0, 0, g)^T    (1)

From equation (1), the initial roll and pitch can be calculated as φ_0 = arctan(a_y / a_z) and θ_0 = arctan(-a_x / √(a_y² + a_z²)).
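As an illustration of this step, the initial roll and pitch can be recovered from the gravity vector, and a tilt-compensated yaw from the magnetic field vector. The sketch below is a generic formulation under assumed axis and sign conventions, which may differ from the actual iPhone sensor frames:

```python
import numpy as np

def initial_attitude(a, m):
    """Initial roll/pitch from the gravity vector measured at rest,
    then a tilt-compensated yaw from the magnetic field vector.
    Axis/sign conventions are assumptions, not the paper's exact ones."""
    ax, ay, az = a
    roll = np.arctan2(ay, az)
    pitch = np.arctan2(-ax, np.sqrt(ay ** 2 + az ** 2))
    # Rotate the magnetometer reading into the horizontal plane
    mx, my, mz = m
    mh_x = mx * np.cos(pitch) + mz * np.sin(pitch)
    mh_y = (mx * np.sin(roll) * np.sin(pitch) + my * np.cos(roll)
            - mz * np.sin(roll) * np.cos(pitch))
    yaw = np.arctan2(-mh_y, mh_x)
    return roll, pitch, yaw
```

For a level device with the magnetic field pointing along +X, all three angles come out as zero, which matches the coordinate definition above.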
The yaw should be transformed according to the coordinate system defined above. For the calculation of rotation during movement, the gyro sensor data are added to the initial rotation. The derivatives of the rotation with respect to time (φ', θ', ψ') are represented by using the angular velocity (ω_x, ω_y, ω_z):

[φ']   [1   sinφ tanθ   cosφ tanθ] [ω_x]
[θ'] = [0   cosφ        -sinφ    ] [ω_y]    (4)
[ψ']   [0   sinφ/cosθ   cosφ/cosθ] [ω_z]

By integrating equation (4), the rotation during movement can be acquired.
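The integration of equation (4) can be sketched as a simple forward-Euler loop. This is an illustrative implementation under an assumed fixed sampling interval dt, not the paper's code:

```python
import numpy as np

def euler_rates(phi, theta, omega):
    """Map body angular velocity (wx, wy, wz) to Euler-angle rates
    (roll, pitch, yaw derivatives) via the kinematic matrix of eq. (4)."""
    wx, wy, wz = omega
    phi_dot   = wx + np.sin(phi) * np.tan(theta) * wy + np.cos(phi) * np.tan(theta) * wz
    theta_dot = np.cos(phi) * wy - np.sin(phi) * wz
    psi_dot   = (np.sin(phi) / np.cos(theta)) * wy + (np.cos(phi) / np.cos(theta)) * wz
    return np.array([phi_dot, theta_dot, psi_dot])

def integrate_gyro(phi0, theta0, psi0, omegas, dt):
    """Integrate gyro samples (N x 3 array) starting from the initial rotation."""
    angles = np.array([phi0, theta0, psi0], dtype=float)
    for omega in omegas:
        angles += euler_rates(angles[0], angles[1], omega) * dt
    return angles
```

For a level device rotating about Z at a constant rate, the yaw grows linearly while roll and pitch stay at zero, as expected from the matrix structure.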

Filtering of rotation:
Through the integration, the error is accumulated. It is important to point out here that the accelerometer and the magnetic field sensor are free from such cumulative error. Therefore, the rotation during movement is corrected with the accelerometer and the magnetic field sensor, and filtering theory is applied to correct the cumulative error.
For efficient computation, the Kalman filter is adopted as the filtering method (Ristic et al., 2004). This filtering achieves the POS sensor fusion.
The Kalman filter consists of a dynamic equation and an observation equation:

x_t = F_t x_{t-1} + G_t v_t    (6)
y_t = H_t x_t + w_t            (7)

where x_t = state vector at time t, y_t = observation vector at time t, F_t, H_t, G_t = matrices of the linear functions at time t, v_t = system noise vector at time t ~ N(0, Q_t), and w_t = observation noise vector at time t ~ N(0, R_t).
In this study, these vectors and matrices are constructed from the rotation angles obtained above; the matrices H_t and G_t are set as identity matrices. In equation (8), the suffix a denotes data of the accelerometer, and the suffix m data of the magnetic field sensor.
Since the Kalman filter is based on linear equations with normally distributed noise, the equations can be solved analytically.
The filtered state vector x_{t|t} is the result of the filtering method for rotation.
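A minimal predict/update cycle of the linear Kalman filter of equations (6) and (7) might look as follows. This is a generic sketch; the paper's actual state construction and matrix choices are not reproduced:

```python
import numpy as np

def kalman_step(x, P, y, F, H, G, Q, R):
    """One predict/update cycle of the linear Kalman filter
    x_t = F x_{t-1} + G v_t,  y_t = H x_t + w_t."""
    # Predict
    x_pred = F @ x
    P_pred = F @ P @ F.T + G @ Q @ G.T
    # Update
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

Fed with repeated observations of a constant quantity, the state converges to that value and the covariance shrinks, which is the drift-correction behavior exploited here.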

Filtering method of position
Although the position of the mobile device is measured directly with GPS, the accuracy of the position can be improved by relating it to the other sensors used for the rotation calculation. Here the Kalman filter is also utilized with all POS data to estimate the position.

Calculation of position:
The GPS data are used directly as the initial position of the mobile device. For the measurement of position during movement, the accelerometer data are added to the initial position.
The relationship between the measured acceleration (a_x, a_y, a_z) and the second derivative of the position (X'', Y'', Z'') can be expressed as

(X'', Y'', Z'')^T = R (a_x, a_y, a_z)^T - (0, 0, g)^T

where R is the rotation matrix obtained from the rotation filtering. Additionally, the velocity (X', Y', Z') can be calculated from successive GPS fixes,
where Δt is the GPS data acquisition interval. By using both the velocity and the acceleration, the position at time t is calculated as

(X_t, Y_t, Z_t)^T = (X_{t-Δt}, Y_{t-Δt}, Z_{t-Δt})^T + (X', Y', Z')^T Δt + (1/2)(X'', Y'', Z'')^T Δt²    (16)
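The dead-reckoning step of equation (16), together with the GPS finite-difference velocity and the gravity removal, can be sketched as follows. The function names and sign conventions here are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def world_acceleration(R, a_body, g=9.81):
    """Rotate a body-frame accelerometer reading into the world frame and
    remove gravity (Z axis along gravity; signs depend on device conventions)."""
    return R @ a_body - np.array([0.0, 0.0, g])

def gps_velocity(p_curr, p_prev, dt):
    """Velocity from two successive GPS fixes (finite difference)."""
    return (p_curr - p_prev) / dt

def update_position(p_prev, v, a_world, dt):
    """Dead-reckoning step of equation (16): previous position plus
    velocity and acceleration terms."""
    return p_prev + v * dt + 0.5 * a_world * dt ** 2
```

For a stationary, level device the rotated accelerometer reading cancels gravity exactly, so the predicted position does not move between GPS fixes.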

Filtering of position:
In a similar fashion to the filtering of rotation, the Kalman filter is applied to correct the position. The dynamic and observation equations are the same as equations (6) and (7); the state vector, the observation vector and the matrix F_t are set for position filtering.

INTEGRATING METHOD OF IMAGES
The SURF algorithm uses a box filter, which approximates the Hessian-Laplace detector, together with integral images; the integral image improves the computational speed. Additionally, the points included within a circle of a certain radius contribute to the calculation of the norm, and the orientation with the maximum norm is adopted. Owing to these features, SURF is robust against scaling and rotation. Finally, the region around each feature point is divided into 4 x 4 blocks, and the gradient responses of each block (Σdx, Σ|dx|, Σdy, Σ|dy|) are concatenated into a 64-dimensional SURF descriptor. Figure 3 shows an example of feature point extraction by SURF in our experiments.

Even when SURF is applied to feature point extraction and tracking, incorrect matching points still exist. Therefore, the feature point tracking is refined by using not only adjacent frames but also sequential frames. First, the extracted feature points are searched in sequence between adjacent frames. After the tracking process has been conducted over a certain number of frames, the positions of the feature points are re-projected into the first frame. If the displacement between the first and last positions of a point is larger than a threshold, the feature point is discarded. From the matching result, the three-dimensional coordinates of the feature points can be calculated; if the depth of a point is larger than a threshold, the feature point is also discarded. The remaining points are accepted as feature points.
After the above thresholding process, incorrect matching points still remain (Figure 4), especially in urban areas, where similar textures cause such incorrect matching. In this study, RANSAC (Random Sample Consensus) (Fischler and Bolles, 1981), an outlier removal method, is applied (Figure 5).
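RANSAC itself is model-agnostic. To illustrate the consensus idea used here for match filtering, the following sketch fits a 2-D line among gross outliers; this is a toy model for exposition, not the geometric model used in the paper:

```python
import numpy as np

def ransac_line(points, n_iter=200, tol=0.1, rng=None):
    """Fit y = a*x + b by RANSAC: repeatedly fit on a random minimal
    sample (2 points) and keep the model with the largest inlier set."""
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iter):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if abs(x2 - x1) < 1e-12:
            continue                      # degenerate vertical sample
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        residuals = np.abs(points[:, 1] - (a * points[:, 0] + b))
        inliers = residuals < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the consensus set by least squares
    x, y = points[best_inliers, 0], points[best_inliers, 1]
    a, b = np.polyfit(x, y, 1)
    return a, b, best_inliers
```

The same sample-score-refit loop applies when the model is an epipolar or projective relation between frames, with matched point pairs in place of the 2-D points.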

Three dimensional coordinates estimation
The position and rotation of the mobile sensor have already been acquired through the POS sensor fusion described in the previous chapter.
With this position and rotation and the feature point tracking results, the three-dimensional coordinates of the feature points can be calculated. The camera calibration is conducted in advance using a checkerboard.
First of all, initial three-dimensional coordinates are built by intersection (Stewenius et al., 2006). For the optimization of the intersection, the RANSAC algorithm (Fischler and Bolles, 1981) is also applied.
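One common way to realize the intersection is linear (DLT) triangulation from two views. The sketch below assumes known 3x4 projection matrices and normalized image coordinates; it is illustrative rather than the paper's exact formulation:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen in two views.
    P1, P2 are 3x4 projection matrices; x1, x2 are image coordinates.
    The solution is the null vector of the stacked constraints."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                 # right singular vector of smallest singular value
    return X[:3] / X[3]        # dehomogenize
```

Because the POS supplies the camera positions in geodetic coordinates, the triangulated points come out directly in real scale, which is the point of the sensor fusion.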

Self-localization updates
Once the initial values of the orientation factors and the three-dimensional coordinates of the feature points have been acquired, the orientation factors are updated by bundle adjustment (Triggs et al., 2000).
Each feature point i has world coordinates p_i = (X_i, Y_i, Z_i), and each sequential frame j has three-dimensional coordinates q_j = (X_j, Y_j, Z_j) as the camera position. In frame j, feature point i has camera (image) coordinates (u_ij, v_ij). The transformation between the camera coordinate system and the world coordinate system is represented by the collinearity equations. There are two types of bundle adjustment: full bundle adjustment and local bundle adjustment. The local bundle adjustment uses only some recent frames. The full bundle adjustment is more accurate, but its computational load is heavier; in terms of computation, the local bundle adjustment is preferable at the expense of accuracy (Arth et al., 2009), and it can be applied recursively (Mclauchlan, 2000).
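The quantity minimized by the bundle adjustment is the re-projection error. A simplified sketch (pinhole model with a single focal length f; lens distortion and principal point omitted) is:

```python
import numpy as np

def reproject(X, R, q, f):
    """Collinearity-style projection of world point X into the camera at
    position q with rotation matrix R and focal length f."""
    Xc = R @ (X - q)               # point in camera coordinates
    return f * Xc[:2] / Xc[2]      # perspective division

def reprojection_error(points, cams, obs, f):
    """Sum of squared differences between projected and tracked positions;
    this is the objective the bundle adjustment minimizes over points and cameras."""
    e = 0.0
    for (i, j), uv in obs.items():
        R, q = cams[j]
        e += np.sum((reproject(points[i], R, q, f) - uv) ** 2)
    return e
```

In the full adjustment the sum runs over all frames; in the local variant only the most recent frames contribute, which is what keeps the cost bounded.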
E_{1:j} expresses the objective function using the 1st to the jth frames. With this recursive form, the bundle adjustment can be conducted efficiently. It is important to point out that, with the recursive form, the accuracy depends on the number of frames; we therefore examined the relationship between the number of frames and the computation time and sum of squared errors. In this study, based on the length of the baseline, the local bundle adjustment is applied to improve the accuracy.
In order to solve the bundle adjustment problem, the Levenberg-Marquardt method (Hartley and Zisserman, 2004) is applied. The objective function E is approximated around the current parameters by the quadratic form

E(x + δ) ≈ E(x) + g^T δ + (1/2) δ^T H δ

and the update δ is obtained by solving (H + λI) δ = -g, where g is the gradient and H the (Gauss-Newton) Hessian of E.
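A generic Levenberg-Marquardt iteration on a small nonlinear least-squares problem can be sketched as follows. This is an illustration of the damping scheme only; a real bundle adjustment would exploit the sparse block structure of the Jacobian:

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, x0, n_iter=50, lam=1e-3):
    """Minimize 0.5 * ||r(x)||^2 by Levenberg-Marquardt:
    solve (J^T J + lam * I) dx = -J^T r and adapt the damping lam."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        r = residual(x)
        J = jacobian(x)
        dx = np.linalg.solve(J.T @ J + lam * np.eye(len(x)), -J.T @ r)
        if np.linalg.norm(residual(x + dx)) < np.linalg.norm(r):
            x = x + dx        # accept step, trust the quadratic model more
            lam *= 0.5
        else:
            lam *= 10.0       # reject step, fall back toward gradient descent
    return x
```

Small λ gives near Gauss-Newton steps, large λ gives small gradient-descent-like steps; the adaptation lets the solver handle the poor initializations that pure Gauss-Newton would diverge on.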

EXPERIMENTS
The proposed method is applied to POS data and images taken in an urban area. An iPhone 4s, equipped with GPS, a gyro sensor, an accelerometer, a magnetic field sensor (POS) and a camera, is used as the mobile device. The camera calibration is conducted in advance. In order to evaluate the accuracy, the results are compared with the orientation data of a Trimble mobile mapping system (MMS), on which the iPhone 4s is also mounted. The experimental site is an urban area (Figure 6) in which the GPS reception conditions are appropriate. The displacements between the estimated values and the MMS data with respect to roll, pitch and yaw are 0.024 rad, 0.031 rad and 0.17 rad on average, respectively. The gap in yaw is large because of a limitation of the iPhone: negative yaw angles are converted to zero automatically.
The position accuracy is also examined (Figure 8). The root mean square of the position displacement is 5.67 m.

Figure 8. Result of position with POS sensor fusion

Experiment of POS and camera integration
As shown in Figure 8, the accuracy of the POS filtering became worse in the curved area. Focusing on this area, the result of the POS and camera integration was examined.
Figure 9 shows the accuracies of the rotation angles with the integration of all sensors (POS and camera). From the comparison between these accuracies and those of the POS sensor fusion, no clear improvement was confirmed. Figure 10 shows the position result of the all-sensor integration. Compared with the previous result, an accuracy improvement can be confirmed: over the application area, the root mean square of the position displacement with POS sensor fusion alone is 3.59 m, whereas with the POS and camera integration it improves to 3.26 m.

CONCLUSION
This paper develops a self-localization method that uses the sensors on mobile devices, namely the POS and the camera, simultaneously. The proposed method applies the Kalman filter in order to combine all sensors except the camera (POS sensor fusion). Additionally, by using the position and rotation from the POS sensor fusion as initial values for the bundle adjustment, the POS and camera integration method is achieved.
Through experiments with real data, the accuracy improvements of position and rotation by POS sensor fusion were confirmed, and the final integration method improved the accuracy further. This means that the proposed self-localization method with POS and camera achieves higher accuracy than POS sensor fusion alone; the improvement in the curved area is especially noticeable. These experiments confirm the significance of the proposed method.
As future work, the accuracy of the three-dimensional coordinate estimation of the feature points will be evaluated by comparison with the laser scanner data of the MMS. Additionally, an integrated filtering method combining the POS filtering and the bundle adjustment will be a challenging investigation; with such a method, more impressive visualization will be accomplished.

Figure 1. Flow of the POS data fusion

The position and rotation based on the POS sensor fusion are used as initial values of the orientation factors in the image-based self-localization method. Since the position is already represented in geodetic coordinates, the length of the baseline is also supplied in real scale. The image-based self-localization method consists of feature point extraction and tracking, three-dimensional coordinate estimation of the feature points, and updates of the orientation factors of the mobile device. Figure 2 shows an overview of the proposed method.

Figure 2. Flow of the POS and camera integration

Figure 3. Feature points extraction by SURF

Figure 4. Incorrect matching by SURF

In the collinearity equations, the interior orientation is given by the camera calibration and a_kl denotes the elements of the rotation matrix. The position and rotation updates are computed iteratively by minimizing a robust objective function of the re-projection error,
where x is the set of parameters. Iteratively reweighted least squares is used to allow the robust estimator to converge.

Figure 6. Experimental site

Experiment of POS sensor fusion
The first experiment is the application of the POS sensor fusion using only POS data. Figure 7 compares the estimation results of roll, pitch and yaw with the MMS data.
Figure 9. Result of rotation angle with all sensors integration

Figure 10. Result of position with all sensors integration