POSE ESTIMATION OF UNMANNED AERIAL VEHICLES BASED ON A VISION-AIDED MULTI-SENSOR FUSION

GNSS/IMU navigation systems offer a low-cost and robust solution for navigating UAVs. Since redundant measurements greatly improve the reliability of navigation systems, extensive research has been carried out to enhance the efficiency and robustness of GNSS/IMU systems with additional sensors. This paper presents a method for integrating reference data, images taken from UAVs, barometric height data and GNSS/IMU data to estimate accurate and reliable pose parameters of UAVs. We provide improved pose estimates by integrating multi-sensor observations in an EKF algorithm with an IMU motion model. The implemented methodology has proven to be very efficient and reliable for automatic pose estimation. The calculated position and attitude of the UAV, especially when GNSS was removed from the working cycle, clearly indicate the ability of the proposed methodology.


INTRODUCTION
One of the major research topics in the development of Unmanned Aerial Vehicles (UAVs) is improving the accuracy, coverage and reliability of navigation systems within the imposed weight and cost restrictions (Prazenica et al. 2005; Lemaire et al. 2007; Karlsson et al. 2008; Conte & Doherty 2009; Saeedi et al. 2009). In recent decades, different combinations of navigation sensors have been proposed to enhance the efficiency and robustness of automatic navigation systems. An Inertial Navigation System (INS) makes use of an Inertial Measurement Unit (IMU) to provide effective attitude, angular rate, and acceleration measurements, as well as position and velocity, at high-bandwidth output. However, the accuracy of an inertial navigation solution degrades with time due to high drift rates (El-Sheimy 2002; Kim & Sukkarieh 2004; Kim 2004; Groves 2008). Global Navigation Satellite Systems (GNSS) provide a three-dimensional positioning solution with high long-term position accuracy by passive ranging using radio signals transmitted from orbiting satellites. To combine the advantages of both technologies, GNSS-aided IMUs have been developed that provide a continuous, high-bandwidth and complete navigation solution with high long- and short-term accuracy (Lewantowicz 1992; Snyder et al. 1992; Greenspan 1994; Phillips & Schmidt 1996; Sukkarieh et al. 1999; Kim & Sukkarieh 2002; Kumar 2004; Groves 2008; Nemra & Aouf 2010).
Visual navigation techniques enhance the reliability and robustness of GNSS/IMU. They improve the pose parameters by measuring features in the environment and comparing them with a database (Kumar et al. 1998; Cannata et al. 2000; Wildes et al. 2001; Sim et al. 2002; Samadzadegan et al. 2007; Conte & Doherty 2009; Saeedi et al. 2009; Kamel et al. 2010; Sheta 2012; Hwangbo 2012; Sanfourche et al. 2012; Lee et al. 2013). However, visual navigation techniques require initialization with an approximate position solution in order to minimize the computational load and the number of ambiguities. Thus, visual navigation is usually not a stand-alone navigation technique; instead, it is integrated into multi-sensor navigation systems (Groves 2008). Visual navigation is rapidly developing as a cost-effective, accurate tool to improve the localization and pose estimation of UAVs. In this context, the research community has developed suitable vision-based systems to deal with short- and long-term GNSS outages (Sim et al. 2002; Gracias et al. 2003). Visual navigation techniques can be categorized into two classes: Simultaneous Localization and Mapping (SLAM), and position estimation of the camera only (Saeedi et al. 2009).
In the literature, many ways have been proposed to fuse navigation sensors, depending on the environment, dynamics, budget, accuracy requirements and the degree of robustness or integrity required. The most important challenge is the design of an integration architecture that is a trade-off between maximizing the accuracy and robustness of the navigation solution, minimizing the complexity and optimizing the processing efficiency. Moreover, the designed architecture can be severely constrained by the need to combine equipment from different manufacturers. Therefore, different architectures may be used for different sensors in the same integrated navigation system.
In this paper, a vision-aided multi-sensor fusion method is presented to determine reliable pose parameters. In the proposed methodology, an integrated architecture is designed to optimize the pose accuracy and the robustness of the navigation solution, and to improve the processing efficiency, by integrating multi-sensor observations in an Extended Kalman Filter (EKF) algorithm with an IMU motion model. In the next section, the concept of the proposed method is described. Then, the experiments and the results obtained by the proposed integration method are presented.

SENSOR FUSION METHOD
We propose a sensor fusion method for UAV pose estimation by integrating reference data, captured images, barometric height data, and GNSS/IMU data. In the proposed methodology, an integrated architecture is designed to optimize the pose accuracy and the robustness of the navigation solution, and to improve the processing efficiency. The proposed method is divided into a geospatial database, a process model, an observation model, and pose estimation (Figure 1). In the following, the main components of each step are described in more detail.
Figure 1. The proposed workflow for sensor fusion

Geospatial Database
The geospatial database contains geo-referenced significant points which are extracted from ortho-rectified satellite imagery. The goal is to automatically match points from the database with points extracted from UAV images. For this, significant features and their descriptor vectors are extracted using Speeded-Up Robust Features (SURF) (Bay et al. 2009). Significant points (salient points, region corners, line intersections, etc.) are understood here as features that are distinct, spread all over the image and efficiently detectable in both spaces (Tuytelaars & Mikolajczyk 2008). Finally, the derived coordinates and descriptor vectors of the significant features are stored in the geo-referenced database. With this workflow, the platform needs neither a large amount of memory to store ortho-rectified images nor time to extract significant features from the reference imagery during the mission phase.

Nonlinear Process Model
The Kalman filter, one of the most widely used fusion filters in aerial navigation applications, is an efficient approximation of the Bayesian recursive filter that estimates the state of a dynamic system from a series of noisy measurements (Bishop & Welch 2001; Grewal & Andrews 2001; Kleinbauer 2004; Groves 2008). In order to use the Kalman filter, the process model is written as a first-order vector difference equation in discrete time, in which f is the state transition function at time k that forms the current vehicle state, x(k), from the previous state, x(k-1), the current control input, u(k), and the process noise, w(k), which is usually assumed to be independent, white and normally distributed.
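A common way to write such a process model, assuming that the process noise enters additively (an assumption made here for illustration, not a statement of the exact form used in the paper), is:

```latex
\[
  x(k) = f\bigl(x(k-1),\, u(k),\, k\bigr) + w(k)
\]
```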
A strapdown INS can determine navigation parameters using inertial sensors. In this respect, gyroscope signals are used to determine the attitude parameters (Savage 1998a). Then, accelerometer signals are transformed to the reference navigation frame using the calculated attitude parameters. Finally, position and velocity can be determined by integrating the transformed accelerations (Savage 1998b). Therefore, the INS equation can be used as the process model to transform the previous state to the current state. In the earth-fixed local-tangent frame formulation with Euler angles as its attitude parameters, the vehicle model follows Sukkarieh (1999) and Kim (2004). In that model, the state comprises the position, the velocity and the attitude ψ^n(k) in the navigation frame; f^b(k) and ω^b(k) are the acceleration and rotation rates measured in the body frame; C^n(k) is the Direction Cosine Matrix; and E^n(k) is the matrix which transforms the rotation rates in the body frame to Euler angle rates.
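The mechanization described above can be sketched as a single first-order integration step. This is a minimal illustration, not the authors' implementation: the ZYX (roll, pitch, yaw) Euler convention, the North-East-Down navigation frame, the constant gravity vector, and all function names (`dcm_body_to_nav`, `euler_rate_matrix`, `ins_step`) are assumptions introduced here.

```python
import math

GRAVITY_NED = [0.0, 0.0, 9.80665]  # assumed constant gravity in a North-East-Down frame, m/s^2


def dcm_body_to_nav(roll, pitch, yaw):
    """Direction cosine matrix (body -> navigation frame), ZYX Euler convention."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cp * cy, sr * sp * cy - cr * sy, cr * sp * cy + sr * sy],
        [cp * sy, sr * sp * sy + cr * cy, cr * sp * sy - sr * cy],
        [-sp, sr * cp, cr * cp],
    ]


def euler_rate_matrix(roll, pitch):
    """Matrix mapping body rotation rates to Euler angle rates (singular at pitch = +/-90 deg)."""
    sr, cr = math.sin(roll), math.cos(roll)
    tp, cp = math.tan(pitch), math.cos(pitch)
    return [
        [1.0, sr * tp, cr * tp],
        [0.0, cr, -sr],
        [0.0, sr / cp, cr / cp],
    ]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def ins_step(pos, vel, att, f_b, w_b, dt):
    """Propagate position, velocity and Euler attitude by one time step dt.

    f_b: specific force measured in the body frame (m/s^2)
    w_b: rotation rates measured in the body frame (rad/s)
    """
    roll, pitch, _ = att
    # Rotate the measured specific force into the navigation frame and add gravity.
    f_n = mat_vec(dcm_body_to_nav(*att), f_b)
    accel_n = [a + g for a, g in zip(f_n, GRAVITY_NED)]
    # Convert body rates to Euler angle rates.
    att_rate = mat_vec(euler_rate_matrix(roll, pitch), w_b)
    new_pos = [p + v * dt for p, v in zip(pos, vel)]
    new_vel = [v + a * dt for v, a in zip(vel, accel_n)]
    new_att = [e + r * dt for e, r in zip(att, att_rate)]
    return new_pos, new_vel, new_att
```

For a level, stationary vehicle the accelerometers sense only the reaction to gravity, so `ins_step` with `f_b = [0, 0, -9.80665]` and zero rotation rates leaves position, velocity and attitude unchanged. Any sensor bias breaks this balance and accumulates over time, which is the drift that the aiding sensors are meant to constrain.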

Observation Model
An observation model represents the relationship between the state and the measurements. In this paper, it relates the state parameters to the aided navigation observations made at time k, where h is the observation model at time k and ν(k) is the observation noise, which is usually modelled as zero-mean Gaussian noise. In the following, the aided navigation system observations are addressed in more detail.
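Under the usual additive-noise assumption, an observation model of this kind can be written as follows, where the symbol z(k) for the measurement vector is introduced here for illustration:

```latex
\[
  z(k) = h\bigl(x(k),\, k\bigr) + \nu(k)
\]
```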

Visual Observation
Vision-aided navigation systems based on aerial images can improve the pose estimation of a UAV. We propose an automatic workflow for matching aerial images to a geo-referenced database (Figure 2). The matching workflow must cope with geometric and radiometric deformations caused by the diversity of image acquisition (different viewpoints, times, and sensors) and by various types of degradation (lighting conditions, occlusion, shadow, and relief).
Figure 2. The proposed vision-aided navigation system
Firstly, a modified SURF operator is used to detect and describe local features in the sensed imagery. It remains invariant to translation, rotation and scale while being faster and more reliable in the visual navigation system. In this respect, the modified SURF keeps only the strongest SURF keypoints (those with the highest determinant of the Hessian) within a pre-defined circular neighbourhood. Next, given a set of keypoints detected in the aerial image and in the geo-referenced satellite imagery, a simple matching scheme based on the nearest neighbours in the feature space of the SURF descriptors is utilized. This simple matching scheme considers only the SURF features and may produce outliers. Therefore, we employ RANdom SAmple Consensus (RANSAC) (Fischler & Bolles 1981) to efficiently reject outliers using homography equations (Hartley & Zisserman 2004). Finally, we use the collinearity equations to transform between 2D image points and 3D object space points from the geospatial database. Subsequently, the unknown Exterior Orientation Parameters (EOPs) are estimated using iterative least squares. Since two equations can be written for each conjugate pair of points, at least three well-distributed conjugate points are required to estimate the six unknown EOPs.
In the image matching process, the initial estimate of the camera position (provided by the process model) is used to narrow the search region while the point correspondences are being sought. Moreover, to perform robust image matching in cases where the geospatial database does not offer enough information content, a mosaic of several images is employed to provide more information. Then, the object space coordinates of the tie points are estimated using the linear form of the collinearity equations. Next, a single-image resection algorithm is used to estimate the EOPs of the scene which does not have enough information. Finally, we use bundle adjustment to simultaneously optimize the EOPs and the ground coordinates of the tie points. It is also possible to add the ground coordinates of the tie points to the geo-referenced database in order to update it (Kim 2004). Therefore, the proposed vision-aided navigation system can not only estimate the EOPs using the geo-referenced database, but also update the database simultaneously. The EOPs determined by the vision-aided navigation process are defined differently from the angles and rotations of the INS, which are defined according to the aviation standard norm "ARINC 705" (Bäumker 1995; Bäumker & Heimes 2001; Samadzadegan & Saeedi 2007). Moreover, the calibration parameters between the camera and body frames, estimated in the pre-mission phase, must be employed to transform the pose parameters from the camera frame into the body frame.
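The nearest-neighbour matching step can be illustrated with a small, self-contained sketch. It assumes descriptors are plain lists of floats compared by Euclidean distance. The distance-ratio check is an addition made here to show one common way of discarding ambiguous matches before RANSAC; the paper does not state that it is used. The function names and the `max_ratio` parameter are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_nearest_neighbour(query_descriptors, database_descriptors, max_ratio=0.8):
    """Return (query_index, database_index) pairs of nearest-neighbour matches.

    A match is kept only if the best distance is clearly smaller than the
    second-best one (ratio check, an assumption added for illustration).
    """
    matches = []
    for qi, q in enumerate(query_descriptors):
        ranked = sorted(
            (squared_distance(q, d), di) for di, d in enumerate(database_descriptors)
        )
        if not ranked:
            continue
        best_dist, best_idx = ranked[0]
        if len(ranked) > 1:
            second_dist = ranked[1][0]
            # Distances are squared, so the ratio threshold is squared as well.
            if best_dist > (max_ratio ** 2) * second_dist:
                continue
        matches.append((qi, best_idx))
    return matches
```

Matches that survive this step can still contain outliers, which is why a geometric consistency check such as RANSAC with a homography model is applied afterwards. A brute-force search like this one scales with the product of the two descriptor counts, which is one reason for narrowing the search region with the predicted camera position.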

GNSS Observation
A GNSS position solution is determined by passive ranging in three dimensions (Kaplan et al. 2006).The GNSS measured position of the UAV is transformed into the body frame based on pre-calibrated lever arms.

Barometer Observation
Height measurement sensors are among the most important aiding systems used in UAV navigation (Gray 1999).
A barometric altimeter, one of the most widely used height sensors, uses a barometer to measure the ambient air pressure. The height is then calculated from a standard atmospheric model (Kubrak et al. 2005; Groves 2008).
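As an illustration of the pressure-to-height conversion, the sketch below uses the tropospheric part of the International Standard Atmosphere with a constant temperature lapse rate. The constants are the usual ISA sea-level values; the function name and the default reference pressure are assumptions, and the paper does not specify which atmospheric model or calibration was used.

```python
def pressure_to_height(pressure_pa, sea_level_pressure_pa=101325.0):
    """Convert ambient pressure (Pa) to height (m) with the ISA troposphere model.

    Valid only below about 11 km, where a constant lapse rate is assumed.
    """
    temperature_0 = 288.15   # sea-level standard temperature, K
    lapse_rate = 0.0065      # temperature lapse rate, K/m
    gas_constant = 8.31446   # universal gas constant, J/(mol K)
    gravity = 9.80665        # standard gravity, m/s^2
    molar_mass = 0.0289644   # molar mass of dry air, kg/mol

    exponent = gas_constant * lapse_rate / (gravity * molar_mass)
    ratio = pressure_pa / sea_level_pressure_pa
    return (temperature_0 / lapse_rate) * (1.0 - ratio ** exponent)
```

Because the real atmosphere deviates from the standard model, the reference pressure is normally set from a known height before the mission, and the barometric height is treated as a noisy observation of the Down component in the filter.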

Pose Estimation
The final UAV pose is estimated by the proposed multi-sensor fusion method, which combines the aided navigation measurements with the standard IMU processor. Because the IMU mechanization equation is nonlinear, the EKF algorithm is employed to estimate the pose parameters of the UAV. The EKF algorithm is recursive and is broken into prediction and update steps. In the prediction step, the vehicle pose parameters are predicted forward in time with data supplied by the inertial sensors, and the state covariance is propagated forward in the same step. In the update step, the observation model runs at discrete time steps to correct the estimates of the process model using the aided navigation systems. By comparing the predicted values of the measurement vector with the actual measurements from the aided navigation systems, the EKF algorithm maintains the estimates of the IMU pose parameters through the gain matrix W(k) and the innovation υ(k). Thus, the proposed method not only uses the aided navigation system measurements in the EKF algorithm to precisely determine the pose parameters of the vehicle with the IMU motion model, but also investigates hybrid integration to combine equipment from different manufacturers.
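For reference, the prediction and update steps of a standard EKF with additive noise can be written as below. This is the textbook form rather than a transcription of the paper's equations. W(k) and υ(k) are the gain matrix and innovation named in the text; the remaining symbols are assumptions introduced here: the estimate \hat{x}, the covariance P, the Jacobians F(k) and H(k) of f and h evaluated at the current estimate, the noise covariances Q(k) and R(k), the measurement vector z(k), and the innovation covariance S(k).

```latex
% Prediction
\begin{align}
  \hat{x}(k \mid k-1) &= f\bigl(\hat{x}(k-1 \mid k-1),\, u(k)\bigr) \\
  P(k \mid k-1) &= F(k)\, P(k-1 \mid k-1)\, F(k)^{T} + Q(k)
\end{align}

% Update
\begin{align}
  \upsilon(k) &= z(k) - h\bigl(\hat{x}(k \mid k-1)\bigr) \\
  S(k) &= H(k)\, P(k \mid k-1)\, H(k)^{T} + R(k) \\
  W(k) &= P(k \mid k-1)\, H(k)^{T}\, S(k)^{-1} \\
  \hat{x}(k \mid k) &= \hat{x}(k \mid k-1) + W(k)\, \upsilon(k) \\
  P(k \mid k) &= P(k \mid k-1) - W(k)\, S(k)\, W(k)^{T}
\end{align}
```

Each aiding sensor (vision, GNSS, barometer) contributes its own z(k), H(k) and R(k), so an update can be run whenever a measurement arrives, and the prediction alone carries the state through an outage of any one sensor.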

EXPERIMENTS AND RESULTS
The potential of the proposed multi-sensor navigation method was evaluated through experimental tests conducted in an area with urban, industrial, agricultural and mountainous regions. A QuickBird satellite image with 60 cm ground resolution was used as the interface level to simulate aerial images using the collinearity equations, and was down-sampled four times to produce the reference imagery (Figure 3). Figure 4 illustrates the reference image of the area, the planned mission trajectory and the extracted SURF keypoints in the geospatial database. The resolution of the geo-panchromatic satellite imagery was about 2.5 m, and 18718 keypoints were extracted in the Region of Interest (ROI) of the image. In Figure 4a, the red and blue rectangles show the ROI and the ground coverage of the planned mission. The camera centres of the images are shown by green dots and the footprints by blue lines. In Figure 4b, the yellow crosses show the geo-database that was generated by the SURF operator. The properties of the navigation sensors, namely the GNSS, the IMU, the camera, and the barometric height measurement unit, are described in Table 1.
Figure 5 and Table 2 show the visual navigation results in comparison with the ground truth in order to demonstrate the feasibility and efficiency of the vision-aided navigation system. In Figure 5, the position and attitude results of the vision-aided navigation system are compared with the ground truth. In this figure, the continuous horizontal lines indicate the maximum uncertainties, while the continuous vertical lines mark the scene numbers for which the mosaic-aided navigation system was used to estimate the pose parameters. One error source is the poor distribution of keypoints in the geospatial database; in particular, the peaks of the pose errors are caused by poorly distributed correspondence points. The accuracy of the proposed visual navigation system is also reported in Table 2. According to Table 2, the vision-aided navigation system can be used as an alternative approach when the other navigation systems are not available to constrain the IMU drift over time.
Tables 3 and 4 show the vision-aided inertial navigation accuracies without and with the mosaic-aided information (VI and VIM) in comparison with the ground truth data. The visual navigation system can be used as an alternative approach when the GNSS signals are not available to constrain the IMU drift over time. The accuracy of the proposed VI navigation system is reported in Table 3. The results of the proposed VIM navigation system in comparison with the ground truth are shown in Table 4. According to the results, the mosaic-aided information improves the VI system pose parameters by about 2.6 percent in accuracy and 16.4 percent in precision.
Table 5 illustrates the effect on the accuracies of adding the barometric height measurements to the VIM navigation system (VBIM). According to the results, the barometric height measurements improve the VI Down parameter by about 0.7 percent in accuracy and 0.9 percent in precision.
The effect of augmenting the VBIM navigation system with GNSS measurements, in comparison with the ground truth, is shown in Table 6. According to the results, the GNSS measurements improve the VBIM position parameters by about 78.5 percent in accuracy and 55.7 percent in precision. Compared with the GNSS and barometer-aided inertial navigation (GBI) results, the visual navigation system improves the GBI East and Down parameters by about 46.8 percent in accuracy and 10.6 percent in precision, while it degrades the GBI North parameter by about 92 percent in accuracy and 24.8 percent in precision.
A schematic illustration of the results of the proposed multi-sensor navigation systems is given in Figure 6. In this figure, the first section indicates the pose accuracy of the different multi-sensor navigation systems, while the second section shows their pose precision. Based on the results, including more sensors generally improves the accuracies. The VGBIM navigation system is the most accurate and reliable positioning system among the proposed multi-sensor navigation systems, as the accuracies of its position and attitude parameters are about 2.5 metres and 0.7 degrees. The pose accuracy of the UAV in cases without GNSS position (VBIM) clearly indicates the potential of the proposed multi-sensor system.

CONCLUSIONS
This paper proposed a vision-aided multi-sensor fusion method to determine reliable pose parameters of UAVs. In the proposed methodology, an integrated architecture is designed to optimize the pose accuracy and the robustness of the navigation solution and to improve the processing efficiency. The described navigation solution is that of an INS reference system, corrected using the pose errors estimated by an EKF fusion filter. In this context, a visual navigation system is proposed that robustly aligns an aerial image to geo-referenced ortho-rectified satellite imagery in order to cope with GNSS outages. Different combinations of sensor systems were also evaluated to assess the influence of each sensor on the accuracies separately. The experiments and results show that redundant measurements greatly enhance the reliability of navigation systems. The accuracy of the pose parameters reached in cases with GNSS outage clearly indicates the potential of the proposed methodology.

Figure 5. The visual navigation system accuracy

Table 6. The VGBIM navigation system accuracy
