SECOND ITERATION OF PHOTOGRAMMETRIC PIPELINE TO ENHANCE THE ACCURACY OF IMAGE POSE ESTIMATION

In the classical photogrammetric processing pipeline, automatic tie point extraction plays a key role in the quality of the achieved results. Image tie points are crucial to pose estimation and have a significant influence on the precision of the computed orientation parameters; both the relative and the absolute orientation of the 3D model can therefore be affected. By improving the precision of image tie point measurement, one can enhance the quality of image orientation. The quality of image tie points depends on several factors, such as their multiplicity, their measurement precision and their distribution in the 2D images as well as in the 3D scene. In complex acquisition scenarios, such as indoor applications and oblique aerial imagery, tie point extraction is limited when only image information can be exploited. We therefore propose a method that improves the precision of pose estimation in complex scenarios by adding a second iteration to the classical processing pipeline: the result of a first iteration is used as a priori information to guide the extraction of new tie points of better quality. Evaluated on multiple case studies, the proposed method shows its validity and its high potential for precision improvement.


INTRODUCTION
Considered as the first step of digital photogrammetric processing, tie point extraction provides the main observations for pose estimation (bundle block adjustment). There are two common approaches to tie point detection over multiple images. The first (image-based) uses local search-and-match algorithms such as correlation and Least Squares Matching. The second (key-point-based) detects key points and matches them over multiple images using descriptors.
The research of [Shragai et al., 2011] shows that the image-based method gives good results when a priori information on the camera orientation is available, while the key-point-based method adapts to general cases. Key-point-based tie point extraction methods such as SIFT [Lowe, 2004] and SURF [Bay et al., 2008] have proved their robustness and reliability in various applications, from aerial to terrestrial scenarios [Lingua et al., 2009]. Tie points measured by these methods are used to obtain a reasonable photogrammetric production.
Although this approach is operational in many cases, occasionally, and especially in the presence of large depth variations as in complex indoor terrestrial scenarios, perspective deformations between neighboring images reduce its performance and limit the precision and accuracy of pose estimation. Approaches exist that cope with high geometric distortion between images, such as ASIFT [Morel and Yu, 2009], but at the price of computation time and at the risk of a large number of false matches.
When it comes to fine-metrology photogrammetric applications, tie points extracted from image information alone may not be precise enough for pose estimation. In photogrammetric processing, the quality of tie points includes the measurement precision, the multiplicity and the homogeneity of the distribution in the 2D images as well as in the 3D scene. To extract tie points of enhanced quality for photogrammetric use, we propose a method that combines the information of 3D scenes and 2D images. The 3D scene information is obtained from a first photogrammetric processing, and is used to predict and correct the perspective effect between images before extracting tie points. Finally, a second bundle block adjustment using these high-quality tie points significantly improves the precision of pose estimation.
Figure 1. A second iteration based on high-quality tie points extracted using the result of a first processing.

METHOD
Our algorithm takes as input an approximate 3D triangle mesh and the orientation of each image, both obtained from a first photogrammetric processing. Applied to each triangle of the mesh, the algorithm consists of three steps:
1. Selection of images.
2. Detection of interest points on affine-invariant regions.
3. Matching by correlation.
Each step is presented in detail below.

Images selection
For each 3D triangle in the mesh, a set of suitable images must be selected. First, the 3D triangle must be visible in the field of view of the image; a simple Z-buffer filter is used to check this condition.
Then, a unique master image for this 3D triangle is selected on the basis of an optimal resolution condition: the master image is chosen to maximize the resolution of observation in all directions within the plane of the mesh.
The optimal resolution condition can be modeled by the deformation of a unit circle lying in the plane of the 3D triangle into an ellipse in the image onto which it is back-projected. Due to the perspective deformation, a circle is transformed into an ellipse whose radius depends on the direction of observation. We therefore adopt a prudent strategy based on the minimal resolution obtained among all possible directions in the image. Each 3D triangle defines a 3D plane that is related to its back-projections in the 2D images by a homography; because the triangle is small, this relation can be approximated by an affine transformation, whose parameters are estimated from the triangle's vertices in the 3D scene and in the 2D image. Denoting by Vn(θ) the resolution of image n in direction θ for the unit circle C(O, 1), the value min_θ Vn(θ) represents the minimal resolution obtainable through all directions in image n.
The master image is then the image that maximizes this minimal resolution over all images: argmax_n min_θ Vn(θ). Finally, after selecting a master image for each 3D triangle, we obtain a partition of the scene between the different master images.
The set of secondary images is selected by applying the optimal resolution condition between the master image and the remaining images.
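The master selection criterion above can be sketched numerically. This is a minimal NumPy sketch, not the authors' implementation: it assumes the resolution in direction θ is the norm of the affine map applied to the unit direction, so that min_θ Vn(θ) equals the smallest singular value of the 2×2 linear part of the affine transformation; all function names are ours.

```python
import numpy as np

def affine_from_triangle(pts_plane, pts_image):
    """Estimate the 2x2 linear part of the affine map sending the
    triangle's 2D coordinates in its 3D plane to its back-projection
    in the image, from the three vertex correspondences."""
    src = np.hstack([pts_plane, np.ones((3, 1))])        # 3x3 design matrix
    sol, *_ = np.linalg.lstsq(src, pts_image, rcond=None)  # exact for 3 points
    return sol[:2].T                                      # 2x2 linear part

def min_resolution(A):
    """Minimal scale over all directions: min_theta |A u(theta)|
    is the smallest singular value of A."""
    return np.linalg.svd(A, compute_uv=False)[-1]

def select_master(pts_plane, projections_per_image):
    """Master image = argmax over images of the minimal resolution."""
    scores = [min_resolution(affine_from_triangle(pts_plane, p))
              for p in projections_per_image]
    return int(np.argmax(scores))
```

For example, an image seeing the triangle at an isotropic scale of 2 would be preferred over one with a strongly anisotropic scale of (3, 0.5), since the latter's worst-direction resolution is only 0.5.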

Detection of interest points
After the selection of images, we have a mesh composed of a set of 3D triangles, each with a master image and a set of secondary (slave) images. A detection of interest points is then performed on a region of interest (ROI) for each master-secondary image pair. Each 3D triangle is back-projected onto the images to form a corresponding triangular ROI. The ROI in the secondary image is then normalized to the master image geometry by an affine transformation estimated from the known corresponding coordinates of the back-projected triangles. Interest point detection is then performed on these affine-invariant ROIs. To optimize the matching by correlation, interest points are detected with three conditions:
• Local extrema (minimum/maximum) as candidates.
• A low-contrast filter.
• A non-repetitive filter with respect to adjacent patterns.
2.2.1 Low-contrast filter: First, local extrema are selected as candidate points. Then, low-contrast points are eliminated by a filter inspired by FAST [Rosten, 2006]. For each candidate point whose intensity value is denoted p, we examine its neighboring points 1...n, with intensity values p1...pn respectively. The neighboring points of a candidate are the pixels on a circle of radius R around it; a configuration of n = 12 or n = 24 neighboring points corresponds to R = 3 or R = 4, respectively. With an appropriate threshold t, a neighboring point i is valid if |pi − p| > t. By examining all neighboring points, a candidate is considered contrasted if:
• 75% of its neighboring points are valid, or
• 60% of its neighboring points are valid and contiguous.
2.2.2 Non-repetitive filter: The configuration of surrounding patches in the non-repetitive filter is the same as in the low-contrast filter: an appropriate radius is selected to form a circle of neighboring patches around the candidate point, against which the candidate's own patch is verified.
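The low-contrast filter can be sketched as follows. This is a minimal NumPy sketch under our own assumptions: the validity test is taken as |pi − p| > t, the circle offsets are sampled at equal angles (the exact pixel positions used by the authors are not specified), and all names are ours.

```python
import numpy as np

def circle_offsets(radius, n):
    """n pixel offsets sampled on a circle of the given radius
    (n = 12 for R = 3, n = 24 for R = 4, as in the text)."""
    angles = 2.0 * np.pi * np.arange(n) / n
    return [(int(round(radius * np.cos(a))), int(round(radius * np.sin(a))))
            for a in angles]

def longest_circular_run(flags):
    """Length of the longest run of True values on a circular sequence."""
    doubled = flags + flags          # unroll the circle to handle wrap-around
    best = cur = 0
    for f in doubled:
        cur = cur + 1 if f else 0
        best = max(best, cur)
    return min(best, len(flags))

def is_contrasted(img, y, x, t, radius=3, n=12):
    """Candidate at (y, x) passes the low-contrast filter if enough
    neighbours on the circle differ from it by more than t:
    75% valid, or 60% valid and contiguous."""
    p = float(img[y, x])
    valid = [abs(float(img[y + dy, x + dx]) - p) > t
             for dx, dy in circle_offsets(radius, n)]
    frac = sum(valid) / n
    if frac >= 0.75:
        return True
    return frac >= 0.60 and longest_circular_run(valid) >= 0.60 * n
```

An isolated bright pixel on a dark background passes (all neighbours valid), while a point in a flat region is rejected.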

Matching by correlation
Since we work on ROIs of invariant geometry (all secondary-image ROIs are normalized to the geometry of the master image), Zero-mean Normalized Cross-Correlation (ZNCC) is selected as the matching method. For each interest point in the master image, we search for the point with the highest correlation in the secondary images. First, the potential matching region is defined as a circle around the interest point of the master image. Then, possible match candidates are selected as the interest points in the secondary images that have the same characteristic (maximum/minimum) as the considered point in the master image. For each possible match pair, the correlator operates on three scales to speed up the process:
• n-times downsampled without interpolation (one out of n pixels is taken into account);
• full pixel resolution;
• sub-pixel, up to 0.01 of a pixel.
To remove matches of low reliability, an independent threshold is applied at each of these three scales.
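The core of the matching step, ZNCC scoring over candidate interest points with a rejection threshold, can be sketched as below. This is a minimal single-scale NumPy sketch of the idea, not the authors' three-scale implementation; function names and the threshold value are ours.

```python
import numpy as np

def zncc(a, b):
    """Zero-mean Normalized Cross-Correlation of two same-size patches."""
    a = a - a.mean()
    b = b - b.mean()
    d = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / d) if d > 0 else 0.0

def best_match(master_patch, roi, candidates, threshold=0.8):
    """Among candidate positions (y, x) in the normalized secondary ROI,
    keep the one whose surrounding patch correlates best with the master
    patch; scores below the threshold are rejected (returns None)."""
    h, w = master_patch.shape
    best, best_score = None, threshold
    for (y, x) in candidates:
        patch = roi[y:y + h, x:x + w]
        if patch.shape != (h, w):        # candidate too close to the border
            continue
        s = zncc(master_patch, patch)
        if s > best_score:
            best, best_score = (y, x), s
    return best, best_score
```

ZNCC is a natural choice here because the affine normalization has already removed the geometric deformation, so only radiometric differences (gain and offset, which ZNCC is invariant to) remain between the patches.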

EXPERIMENTS AND RESULTS
The proposed method aims to re-extract high-quality tie points, using information from the result of a first photogrammetric processing, in order to improve the precision of pose estimation through a second bundle block adjustment. To evaluate the performance of the method, we use two real case studies. For each case, a classical photogrammetric processing is first run to produce the input data for the proposed method; improved tie points are then extracted and a second bundle block adjustment is performed on them. The precision of pose estimation is assessed with ground control points, and the tie point projection error is evaluated as well. All photogrammetric processing is carried out with MicMac [Deseilligny and Clery, 2011], a free open-source project for photogrammetry.
Figure 11. The two photogrammetric processing chains used in the experiments.
Two photogrammetric processing chains are shown in Figure 11:
• Method 1: classical photogrammetric processing.
• Method 2: the proposed processing, with a second iteration based on high-quality tie points.
In all cases, tie points are the only measurement used in the bundle block adjustment. The 3D model is geo-referenced with a similarity transformation estimated from the CPs (control points). Two experiments on real case studies are described below.
3.0.1 Case 1 - a terrestrial scene: This is a terrestrial acquisition of a polygon calibration structure. It contains 56 images, 9 GCPs (ground control points) and 11 CPs. The images were acquired at ENSG (Ecole nationale des sciences geographiques, France) with a commercially available Sony RX-1 camera.
The comparison of accuracy on the CPs between method 1 and method 2 is given in Figure 13 and shows a significant enhancement: the residuals (Euclidean distances) decrease on 10 of the 11 CPs.

CONCLUSIONS AND PERSPECTIVES
In this paper, we presented a detailed description of a method that improves the precision of pose estimation. The method relies on the result of the classical photogrammetric processing (first iteration) and a supplementary high-quality tie point extraction algorithm.
The approach proves its efficiency on two real test cases with different scenarios. For a deeper evaluation, the method should be tested on large, complex indoor scenarios, in which it will probably perform even better.
Moreover, the method could be used to accelerate the tie point extraction procedure: a first photogrammetric processing with tie points extracted by SIFT at low resolution can generate an approximate model, after which the pose estimation can be refined with the proposed method.

Figure 2. Master image selection by the optimal resolution condition.

Figure 3. Different ellipses given by different directions of observation. The image on the right is selected as the master image for the concerned 3D triangle.

Figure 4. Partition of the scene between master images. Top left: 3D mesh; top right: each color represents one master image; bottom: master images with their colors represented in the scene.

Figure 5. Upper row: corresponding ROIs on different images; left: ROI of the master image. Lower row: ROIs normalized to the master image geometry.

Figure 7. Non-repetitive filter. Left and middle: risk of a false match with a linear pattern. Right: verification of the patch of one candidate (red) against the surrounding patches (yellow).

Figure 8. Detection of interest points. From the left: local extrema (minimum/maximum); low-contrast points (cyan); points at risk of repetition (yellow); final points.

Figure 9. Potential matching region (yellow circle in the secondary-image ROI, right) and possible match candidates (points of the same characteristic: max in red, min in blue) corresponding to the considered interest point in the master image (green, left).

Figure 17. Accuracy check on CPs, Case 2. The average residual decreases by a ratio of 2.42, from 1.55 mm down to 0.64 mm. Accordingly, the average image residual (reprojection error) also decreases, from 0.57 to 0.36 pixel.

Table 1. Residuals on the CPs in detail: Case 1, Method 1.

Table 2. Residuals on the CPs in detail: Case 1, Method 2.

Table 3. Residuals on the CPs in detail: Case 2, Method 1.

Figure 18. Tie point extraction on a diachronic satellite image pair. Left: with SIFT; right: with the proposed method.

Table 4. Residuals on the CPs in detail: Case 2, Method 2.