AUTOMATIC ORIENTATION OF LARGE BLOCKS OF OBLIQUE IMAGES

Abstract. Nowadays, multi-camera platforms combining nadir and oblique cameras are experiencing a revival. Due to their advantages such as ease of interpretation, completeness through mitigation of occluded areas, as well as system accessibility, they have found their place in numerous civil applications. However, automatic post-processing of such imagery still remains a topic of research. The configuration of the cameras poses a challenge to the traditional photogrammetric pipeline used in commercial software, and manual measurements are inevitable. For large image blocks this is a serious impediment. In the theoretical part of the work we review three common least squares adjustment methods and recap possible ways of orienting a multi-camera system. In the practical part we present an approach that successfully oriented a block of 550 images acquired with an imaging system composed of 5 cameras (Canon Eos 1D Mark III) with different focal lengths. The oblique cameras are rotated in the four looking directions (forward, backward, left and right) by 45° with respect to the nadir camera. The workflow relies only upon open-source software: a developed tool to analyse image connectivity and Apero to orient the image block. The benefits of the connectivity tool are twofold: in terms of computational time and success of the Bundle Block Adjustment. It exploits the georeferenced information provided by the Applanix system to constrain feature point extraction to relevant images only, and guides the concatenation of images during the relative orientation. Ultimately an absolute transformation is performed, resulting in mean re-projection residuals equal to 0.6 pix.


INTRODUCTION
The use of oblique images was first adopted in the 1920s for surveillance and reconnaissance purposes, and during the last century it was mainly used for military applications. Only in the last decade has oblique imagery become a standard technology for civil applications, thanks to the development of airborne digital cameras and in particular the development of multi-camera systems, as proposed by many companies (Leica RCD30, Pictometry, Midas, BlomOblique, IGI, UltraCam Osprey, etc.). The virtue of oblique photography lies in its simplicity of interpretation and understanding for inexperienced users. These qualities allowed the use of oblique images in very different applications, such as monitoring services during mass events and environmental accidents (Petrie, 2008, Grenzdörfer et al., 2008, Kurz et al., 2007), building detection and reconstruction (Xiao et al., 2012), building structural damage classification (Nyaruhuma et al., 2012), road land updating (Mishra et al., 2008) and administration services (Lemmens et al., 2008). What is more, oblique imagery can provide a great improvement in city modelling (Wang et al., 2008) and in some cases it can serve as a good alternative to mobile mapping surveys or airborne LiDAR acquisitions. The reason is the density and the accuracy of the point clouds that can be generated with matching techniques (Fritsch et al., 2012, Gerke, 2009, Besnerais et al., 2008). Moreover, a filtered point cloud can be used to produce reliable meshes for visualization purposes (e.g. AppleC3).

So far, most prominent cities in the world have been covered with georeferenced oblique images, and these flights are usually repeated every few years to keep the data up to date (Karbo and Simmons, 2007). The abundance and practical utility of this data is apparent; however, automated processing is still a challenge. This is primarily because complex image configurations are imposed by the acquisitions and because of shortcomings of the on-board direct orientation sensors, which do not satisfy the strict requirements of metric applications (point cloud generation, manual plotting). Although aerotriangulation has a long tradition, most commercial solutions were designed only for nadir images (Jacobsen, 2008), hence they cannot cope with different camera viewing angles and varying scale within images. Several papers dealing with the orientation of oblique images have already been presented (Wiedemann and More, 2012, Gerke and Nyaruhuma, 2009), proposing the use of additional constraints within the bundle adjustment (relative position between images, verticality of lines in the scene, etc.), or simply aligning the (not adjusted) oblique cameras to the (adjusted) nadir ones with the use of GNSS/IMU information. Only a single contribution has succeeded in automatically orienting a large block of oblique images with commercial software (Fritsch et al., 2012).

In this paper, a fully automatic methodology for the simultaneous orientation of large datasets of oblique and nadir images is described. The presented workflow relies only upon open-source software: a developed tool to analyse image connectivity and Apero (Pierrot-Deseilligny and Clery, 2011) to orient the image block. The connectivity tool was developed to exploit the georeferenced information provided by the Applanix system, constraining feature point extraction to relevant images only and guiding the concatenation of images during the relative orientation. In the following sections the theoretical background of different least squares approaches to bundle block adjustment will be given. Following this, possible ways of multi-camera orientation will be discussed. The methodology part will include a detailed description of the connectivity algorithm as well as the obtained results. Finally, the conclusions and the future outlook of this work will be presented.
There are different techniques by which least squares can be solved. Firstly, the model behind the problem must be identified with its number of observations, underlying parameters (unknowns) and the relationship between them. If the model is linear, the objective function is quadratic and its Hessian is independent of the unknowns, therefore it is convex. The solution is then obtained from the function's gradient with direct methods (Gauss elimination, Cholesky, QR, SVD, conjugate gradient).
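The linear case described above can be illustrated with a minimal sketch. It fits a straight line y ≈ a + b·t by solving the 2x2 normal equations in closed form; the function name `fit_line` and the sample data are illustrative assumptions, not part of the original work.

```python
def fit_line(t, y):
    """Fit y ~ a + b*t by solving the normal equations (A^T A) x = A^T y.

    A has rows [1, t_i], so A^T A = [[n, sum t], [sum t, sum t^2]]
    and A^T y = [sum y, sum t*y]. The 2x2 system is inverted directly.
    """
    n = len(t)
    st = sum(t)
    stt = sum(ti * ti for ti in t)
    sy = sum(y)
    sty = sum(ti * yi for ti, yi in zip(t, y))
    det = n * stt - st * st
    if det == 0:
        raise ValueError("normal matrix is singular (all t identical)")
    a = (stt * sy - st * sty) / det
    b = (n * sty - st * sy) / det
    return a, b


# Hypothetical observations lying exactly on y = 1 + 2*t
a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

Because the model is linear, no initial approximations and no iterations are needed; the single solve returns the minimum of the quadratic objective.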
In the case of non-linear least squares problems (e.g. the collinearity model), it is practical to obtain linear equations. Linearization implies that approximate values for all parameters are known and that the optimal values are computed in an iterative framework, such that with each iteration the estimates are updated and hopefully brought closer to the real solution. However, the objective function is no longer independent of the model parameters, so there is no certainty that it is convex. The existing algorithms for finding a minimum of such functions differ in how the structure and derivatives of this function are exploited (Nocedal and Wright, 2006). Essentially the difference is in how the unknowns are corrected from one iteration to the next. Within the photogrammetric community the most common approach is the iterative Newton's method (i.e. Gauss-Markov method, not reviewed hereafter), whereas in other fields, e.g. computer vision, Gauss-Newton or its 'cousin' Levenberg-Marquardt are in use.
The popularity of the Newton-like methods lies in their fast convergence near the minimum. The disadvantage is that the worse the initial approximations, the more costly the iterations and the weaker the guarantee that a global minimum is reached (Triggs et al., 2000).
To combat this ambiguity, the standard Newton method is accompanied by a step control policy that verifies whether convergence progresses along the descent direction. This combination of Newton and search direction is the foundation of the Gauss-Newton method. It is sometimes referred to as approximated because it approximates the Hessian of the cost function:

$J^{T} J \, dx_{GN} = J^{T} e$ (5)

Let us denote the relationship between observations $y$ and parameters as $f(x)$. Setting (3) in (2) results in (4). Next, inserting it in (1) and equating the derivative to zero gives (5) (Engels et al., 2006). Provided that the Jacobian is of full rank, the solution to (5) gives us the search direction. It also becomes clear from (5) that the equation represents the normal equations, i.e. the search direction is the solution of a linear least squares problem and can be found with direct algorithms. Once the direction is retrieved, an update $dx$ (a.k.a. the Gauss-Newton step) is calculated with (6), where the scalar $\alpha$ is called the step length and is chosen according to specified conditions (Nocedal and Wright, 2006).

The Levenberg-Marquardt method is a further modification of the Newton method. Here, the step generation is carried out by a trust-region method rather than a line search as in Gauss-Newton.
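For readers who want the Gauss-Newton step spelled out, a possible derivation is sketched below. It assumes the residual convention $e = y - f(x)$ and a cost of one half the squared residual norm; the original numbering of the intermediate equations is not reproduced, and the sign convention is an assumption rather than the authors' stated one.

```latex
\begin{align*}
F(x) &= \tfrac{1}{2}\, e(x)^{T} e(x), \qquad e(x) = y - f(x) \\
f(x + dx) &\approx f(x) + J\,dx, \qquad J = \frac{\partial f}{\partial x} \\
e(x + dx) &\approx e - J\,dx \\
\frac{\partial F}{\partial\, dx} &= -J^{T}\,(e - J\,dx) = 0
  \;\;\Rightarrow\;\; J^{T} J \, dx_{GN} = J^{T} e \\
x_{new} &= x + \alpha \, dx_{GN}
\end{align*}
```

Under these assumptions $J^{T}J$ plays the role of the approximated Hessian of $F$, which is why second derivatives of $f$ never need to be evaluated.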
The algorithm builds a spherical region around the current parameter estimates and looks simultaneously for the direction and the length of the step. In contrast, line-search methods look independently for the direction and for $\alpha$, which corresponds to the step length (Nocedal and Wright, 2006). In practice, the normal equations in (5) are iteratively augmented by a damping factor (7). Owing to that, the algorithm operates as Gauss-Newton for $\lambda \to 0$, as steepest descent for $\lambda \to \infty$, and as something in between for the remaining values of $\lambda$. The damping is raised if the cost (1) is not reduced, otherwise it is decreased. Through manipulation of the damping, rank-deficient Jacobians can also be handled (Lourakis and Argyros, 2005).
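The damping policy described above can be sketched on a toy problem. The example fits the hypothetical model y = a·exp(b·t) and assumes the common damped form (JᵀJ + λI)·dx = Jᵀe with e = y − f(x); the factor of 10 used to raise or lower λ, the starting values and the function names are illustrative assumptions, not values taken from the paper or from Apero.

```python
import math


def model(a, b, t):
    return a * math.exp(b * t)


def levenberg_marquardt(a, b, t, y, lam=1e-3, max_iter=100):
    """Estimate (a, b) of y = a*exp(b*t) with a basic Levenberg-Marquardt loop."""

    def cost(a_, b_):
        return 0.5 * sum((yi - model(a_, b_, ti)) ** 2 for ti, yi in zip(t, y))

    for _ in range(max_iter):
        e = [yi - model(a, b, ti) for ti, yi in zip(t, y)]           # residuals
        J = [(math.exp(b * ti), a * ti * math.exp(b * ti)) for ti in t]
        # Normal matrix J^T J and right-hand side J^T e
        n11 = sum(ja * ja for ja, jb in J)
        n12 = sum(ja * jb for ja, jb in J)
        n22 = sum(jb * jb for ja, jb in J)
        g1 = sum(ja * ei for (ja, jb), ei in zip(J, e))
        g2 = sum(jb * ei for (ja, jb), ei in zip(J, e))
        # Damped system (J^T J + lam*I) dx = J^T e, solved in closed form
        d11, d22 = n11 + lam, n22 + lam
        det = d11 * d22 - n12 * n12
        dx_a = (d22 * g1 - n12 * g2) / det
        dx_b = (d11 * g2 - n12 * g1) / det
        if abs(dx_a) + abs(dx_b) < 1e-12:
            break                                  # step is negligible
        if cost(a + dx_a, b + dx_b) < cost(a, b):
            a, b = a + dx_a, b + dx_b              # accept, move towards Gauss-Newton
            lam /= 10.0
        else:
            lam *= 10.0                            # reject, move towards steepest descent
    return a, b


# Hypothetical noise-free observations generated with a = 2, b = 0.5
t_obs = [0.0, 0.5, 1.0, 1.5, 2.0]
y_obs = [model(2.0, 0.5, ti) for ti in t_obs]
a_hat, b_hat = levenberg_marquardt(1.0, 0.1, t_obs, y_obs)
```

With a small λ the first trial step overshoots and is rejected, so the damping grows until a shorter, more gradient-like step reduces the cost; afterwards λ shrinks again and the iterations behave like Gauss-Newton.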

Multi-camera Bundle Block Adjustment
The Bundle Block Adjustment (BBA) in a multi-camera system must handle $n$ different cameras with different interior (IO) and exterior orientations (EO). From a theoretical point of view, camera orientations can be retrieved in the following ways:

Orientation without constraints
Each camera image is oriented using an independent EO for each acquisition and a common IO for a given camera. The design matrix is then a combination of the classical collinearity equations (in fact their derivatives), equation (8), where $x_{0n}$, $y_{0n}$, $f_n$ are the IO parameters of the $n$-th camera, $X^{i}_{0n}$, $Y^{i}_{0n}$, $Z^{i}_{0n}$ are the coordinates of the projection center for the $n$-th camera and $i$-th acquisition, and $R$ is the image rotation matrix from the local to the global coordinate system. In this case no additional constraints and no information about the lever arm between cameras are exploited. The drawback is that the number of equations quickly increases, as $n$ images are acquired for each position.
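As a reminder of the model being linearized, the collinearity equations are commonly written as below. This is the textbook form using the symbols named in the text; the sign convention and the use of $R^{T}$ (because $R$ maps from the local to the global system) are assumptions and may differ from the authors' exact formulation.

```latex
\begin{align*}
x &= x_{0n} - f_n \frac{Z_x}{N}, \qquad y = y_{0n} - f_n \frac{Z_y}{N} \\
\begin{pmatrix} Z_x \\ Z_y \\ N \end{pmatrix}
  &= R^{T} \begin{pmatrix} X - X^{i}_{0n} \\ Y - Y^{i}_{0n} \\ Z - Z^{i}_{0n} \end{pmatrix}
\end{align*}
```

Here $(X, Y, Z)$ is the object point observed at image coordinates $(x, y)$; the design matrix collects the partial derivatives of $x$ and $y$ with respect to the IO, EO and object point unknowns.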
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XL-1/W1, ISPRS Hannover Workshop 2013, 21-24 May 2013, Hannover, Germany

Orientation with additional constraints
The knowledge about reciprocal positions between images is integrated into the Bundle Block Adjustment. In particular, equations about the relative rotations and bases between cameras are added to the mathematical model. By doing so, we may not only lower the number of unknowns in our system but also stabilize the whole image block.
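The reduction in unknowns mentioned above can be made concrete with simple arithmetic. The sketch assumes 6 EO parameters per image, one fixed relative pose (6 parameters) per oblique camera with respect to the nadir camera, and ignores IO and tie point unknowns; the figure of 110 stations is an assumption derived from roughly 550 images taken by 5 cameras.

```python
def eo_unknowns_unconstrained(stations, cameras):
    """Every image carries its own 6 exterior orientation parameters."""
    return 6 * stations * cameras


def eo_unknowns_constrained(stations, cameras):
    """One 6-parameter pose per station plus one 6-parameter relative
    pose for each non-reference camera, shared by all stations."""
    return 6 * stations + 6 * (cameras - 1)


stations, cameras = 110, 5  # assumed: about 550 images / 5 cameras
free = eo_unknowns_unconstrained(stations, cameras)   # 3300
tied = eo_unknowns_constrained(stations, cameras)     # 684
```

Under these assumptions the EO unknowns drop from 3300 to 684, which illustrates why the constrained formulation can also stabilize the block.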
There is more than one way to include this information within the Bundle Block Adjustment. A straightforward way is to express the collinearity equations for the oblique cameras as a function of the nadir camera, extended with the respective displacements and rotations between the image coordinate systems, where $M$ is the rotation matrix from the oblique to the nadir image coordinate system, $\Delta x_n$, $\Delta y_n$, $\Delta z_n$ are the displacements of the camera projection centers, and $Z_x$, $Z_y$ and $N$ are as defined in equation (8). This approach is already familiar from multi-line sensor orientation on both satellite and airborne platforms (Ebner et al., 1992). Such systems are calibrated prior to the acquisition, and the BBA comes down to adjusting only the nadir-looking lines while treating the off-nadir ones as constants. An alternative is to introduce the constraints as observed unknowns and to control their influence on the BBA (as well as their ultimately adjusted values) with appropriate stochastic properties.
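The idea of expressing an oblique camera through the nadir camera can be sketched as a pose composition. The conventions here are assumptions made for illustration: `R_nadir` rotates nadir image coordinates into the object system, `M` rotates oblique into nadir image coordinates, and the displacement `d` is given in the nadir image system. All function names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a, v):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def rot_x(angle_deg):
    """Rotation about the x axis by the given angle in degrees."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def oblique_pose(R_nadir, C_nadir, M, d):
    """Derive the oblique camera pose from the nadir pose.

    R_oblique = R_nadir * M, C_oblique = C_nadir + R_nadir * d
    (assumed conventions, see the lead-in).
    """
    R_oblique = matmul(R_nadir, M)
    offset = matvec(R_nadir, d)
    C_oblique = [c + o for c, o in zip(C_nadir, offset)]
    return R_oblique, C_oblique


# Hypothetical values: level nadir camera, oblique camera tilted by 45 degrees
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
R_obl, C_obl = oblique_pose(identity, [100.0, 200.0, 1000.0],
                            rot_x(45.0), [0.1, 0.0, 0.0])
```

With such a parametrization only the nadir pose per station remains unknown if `M` and `d` are treated as constants, or they can be added as weighted observed unknowns as described in the text.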
Other methods (Gerke and Nyaruhuma, 2009) use additional constraints such as scene constraints to improve the Bundle Block Adjustment.
Direct orientation of oblique images
Nadir images are oriented in the traditional Bundle Block Adjustment, while the orientation of the oblique images is inferred from their known calibration. Compared to the previous methods, this approach is less accurate and the multi-sensor capabilities are not fully utilized. Even though calibration parameters are available, their stability over time is questionable. On the other hand, it delivers results using traditional software and may therefore be appreciated in applications with less stringent accuracy requirements (Wiedemann and More, 2012).

ORIENTATION
There are two orientation strategies available in Apero. Either only a relative orientation is performed and the coordinate system is fixed arbitrarily, or the images are oriented relatively and then transformed to an absolute reference frame if external data exist, i.e. ground control points (GCP), projection centers' positions or other knowledge about the scene (Pierrot-Deseilligny, 2012). In either case the Bundle Adjustment requires decent approximations for all unknowns. Initial approximations in Apero are computed with direct methods such as the essential matrix and spatial resection (Pierrot-Deseilligny and Clery, 2011). They are far from optimal because they work independently on one, two or three images, not taking into account the full image block. The danger of inconsistent parameter estimates is therefore high, especially when difficult image configurations are handled. A wise image ordering for the initial solution is therefore crucial, as it not only saves time but also reduces the risk of converging to a wrong solution or diverging. The in-house developed tool resolves this task.

Algorithm 1 illustrates the general scheme of the tool. It builds a connectivity graph that stores information about the incidence of the images' footprints (projected on the mean ground height). Incidence relationships are kept in a connectivityContainer struct, cf. Algorithm 3, and are accessed via a multi-index key. For instance, if all links to an image are of interest, multi-indexing allows a partial search to be carried out across the existing struct instances and returns all instances that include that image, ordered by descending overlap. Multiple indices are a very powerful feature of the Boost library and allow complex data structures to be specified in the way most suitable for the application.
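The partial search described above can be mimicked in a few lines. This is a simplified Python analogue and not the authors' Boost-based implementation: the names `Link`, `ConnectivityIndex` and `links_of` are hypothetical, and a plain dictionary stands in for the multi-index key.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One incidence relation between two image footprints."""
    image_a: str
    image_b: str
    overlap: float      # overlapping fraction of the footprints, 0..1
    features: int = 0   # number of tie points, filled in after matching


class ConnectivityIndex:
    """Stores links and indexes them by each participating image."""

    def __init__(self):
        self._by_image = defaultdict(list)

    def add(self, link):
        self._by_image[link.image_a].append(link)
        self._by_image[link.image_b].append(link)

    def links_of(self, image):
        """Partial search: all links that include `image`,
        ordered by descending overlap."""
        found = self._by_image.get(image, [])
        return sorted(found, key=lambda link: link.overlap, reverse=True)


# Hypothetical links for illustration
index = ConnectivityIndex()
index.add(Link("cam1_img1", "cam1_img2", 0.80))
index.add(Link("cam1_img1", "cam3_img7", 0.35))
index.add(Link("cam2_img4", "cam1_img1", 0.55))
index.add(Link("cam2_img4", "cam3_img7", 0.25))
ordered = [link.overlap for link in index.links_of("cam1_img1")]
```

A real multi-index container keeps the ordering inside the index itself, so the result arrives as an iterator range without the sort performed here on every query.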
In step one, links between images are created. To establish a link, images are tested against the overlapping area and the camera looking direction. Nadir camera images can be 'tied' to any image provided the overlap is satisfied and they do not have a common projection center. Oblique camera images can be 'tied' to nadir images when the overlap is satisfied, or to oblique images under the additional condition that their looking directions are similar. The graph creation works locally: for each aeroplane station (at each station all cameras register the scene) a neighbourhood is selected to extract incidence relations between the footprints projected from the station and the other footprints in the neighbourhood. For that purpose the algorithm uses a Kd-Tree that contains all stations, and at every position the neighbours within a search radius are retrieved (cf. Figure 1). Following the SIFT feature extraction (Apero module called Tapioca), the algorithm proceeds with the last step, image concatenation. The links created in the first step and stored in connectivityContainer are now enriched with the number of found features. The first two nadir images of the block initialize the ordering and specify the coordinate system of the relative orientation.
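The linking rules above can be sketched as a small predicate. Several simplifications are assumptions: the looking direction is reduced to an azimuth in degrees, the overlap is passed in as a precomputed fraction, and a brute-force radius search stands in for the Kd-Tree. The default thresholds of 20% overlap and 10° are the values reported for the test dataset; all names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    name: str
    camera: str               # "nadir" or "oblique"
    station: int              # index of the aeroplane station
    azimuth_deg: float = 0.0  # assumed looking direction of an oblique camera


def neighbours_within(stations, centre, radius):
    """Ids of stations no farther than `radius` from `centre`.
    Brute-force stand-in for a Kd-Tree radius query."""
    return [sid for sid, xyz in stations.items()
            if math.dist(xyz, centre) <= radius]


def can_link(a, b, overlap, min_overlap=0.20, max_angle_deg=10.0):
    """Decide whether two images may be 'tied' in the connectivity graph."""
    if overlap < min_overlap:
        return False
    if a.camera == "nadir" or b.camera == "nadir":
        # A nadir image links to any image from a different projection center
        return a.station != b.station
    # Two oblique images additionally need similar looking directions
    diff = abs(a.azimuth_deg - b.azimuth_deg) % 360.0
    diff = min(diff, 360.0 - diff)
    return diff <= max_angle_deg


# Hypothetical stations (metres) and images
stations = {0: (0.0, 0.0, 1000.0), 1: (300.0, 0.0, 1000.0), 2: (2000.0, 0.0, 1000.0)}
near = neighbours_within(stations, stations[0], 500.0)
nadir0 = Image("n0", "nadir", 0)
fwd0 = Image("f0", "oblique", 0, 0.0)
fwd1 = Image("f1", "oblique", 1, 5.0)
back1 = Image("b1", "oblique", 1, 180.0)
```

Restricting the test to stations returned by the radius query is what keeps graph creation local, so the number of footprint comparisons grows with the neighbourhood size rather than with the whole block.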
Any images that overlap with the first pair and for which enough SIFT features were extracted are oriented to the first pair. To maintain a constant scale with respect to that pair, every image (slave) is 'tied' to two images that were already oriented (masters).
The main events then take place. As the algorithm moves along the flight trajectory (stations), for every image its links to images that have fulfilled the previously imposed constraints are obtained. The result is a pair of iterators to the beginning and end of the list of links, a link being an object of the connectivityContainer class, see Figure 3 and Algorithm 2. Those elements for which the 2nd master could be found (camera1 imagei as the 1st master) are added to the concatenation queue. In order to be sure that images are interchangeably 'tied' to nadir and oblique images, the latter are prioritized during the search for the 2nd master. The described pipeline continues until all the images in the block are 'tied' to their master images (cf. Figure 2).
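The master/slave ordering can be sketched as a greedy loop. This is a simplified illustration and not the authors' `concatenate(...)` routine: it ignores the walk along the flight trajectory and the prioritization of oblique masters, picks the two masters with the most shared features, and uses a hypothetical threshold of 50 features.

```python
def concatenation_order(seed_pair, links, min_features=50):
    """Return (slave, master1, master2) triplets in orientation order.

    links: dict mapping frozenset({image_a, image_b}) to the number of
    shared features. Links below `min_features` are ignored.
    """
    neighbours = {}
    for pair, count in links.items():
        if count < min_features:
            continue
        a, b = tuple(pair)
        neighbours.setdefault(a, {})[b] = count
        neighbours.setdefault(b, {})[a] = count

    oriented = set(seed_pair)   # the first pair fixes the coordinate system
    order = []
    progress = True
    while progress:
        progress = False
        for image in sorted(neighbours):
            if image in oriented:
                continue
            masters = [m for m in neighbours[image] if m in oriented]
            if len(masters) >= 2:
                # Strongest links first; names break ties deterministically
                masters.sort(key=lambda m: (-neighbours[image][m], m))
                order.append((image, masters[0], masters[1]))
                oriented.add(image)
                progress = True
    return order


# Hypothetical feature counts between images
links = {
    frozenset({"n1", "n2"}): 200,
    frozenset({"n1", "o1"}): 80,
    frozenset({"n2", "o1"}): 120,
    frozenset({"o1", "n3"}): 90,
    frozenset({"n2", "n3"}): 60,
    frozenset({"n3", "x1"}): 10,   # too few features, never tied
}
order = concatenation_order(("n1", "n2"), links)
```

Requiring two already oriented masters for each slave is what propagates the scale of the first pair through the block; images that never reach two masters, such as `x1` here, stay outside the queue.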

PERFORMED TESTS
The tests were performed with a dataset over the city of Milano (Italy): the test areas included a dense urban neighbourhood with complex buildings and streets of different dimensions (cf. Figure 4). The block was acquired by the Midas-BlomOblique system composed of 5 cameras (Canon Eos 1D Mark III) with different focal lengths: the nadir camera with an 80 mm and the oblique ones with a 100 mm focal length. The oblique cameras were rotated in the four looking directions (forward, backward, left and right) by 45° with respect to the nadir.
A block of about 550 images extended across an area of 8 km in latitude and 3.5 km in longitude. The overlap between images acquired by the same camera was 80% along track and 50% across track. The system was accompanied by the Applanix GNSS/IMU that provided the first orientation of the images, sufficient for rough direct georeferencing but not accurate enough for image matching processes due to the persisting parallax problem between images. The image block had no ground control, so the rank deficiency in the final adjustment was removed thanks to the GNSS/IMU data. Direct georeferencing was also essential for the connectivity tool to define the sequence of images that have to be coupled in the tie point extraction. For each image, a number ranging from 8 (on …

The pipeline described in the previous section was then adopted to establish the images' concatenation. The minimum overlap between images was set to 20%, and the cameras' attitudes could differ by a maximum of 10° to be regarded as similarly looking. The concatenation succeeded in linking all the images of the block. The relative orientation was then performed in Apero (no constraints between the cameras' positions and attitudes were introduced), resulting in the whole block being correctly processed (cf. Figure 5). The mean value of all image sigmas (square root of the weighted quadratic residuals) was equal to 0.5 pix. The images were finally absolutely oriented exploiting the GNSS information and constraining the images in their projection centres (IMU information unexploited). It was assumed that the GNSS solutions define the position of each image with respect to the eccentricity of the system with an accuracy of 10-15 cm, and the observations' weighting was set accordingly. The mean value of all image sigmas for the georeferenced result reached 0.6 pix (cf. Figure 6).
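How an assumed accuracy translates into an observation weight can be shown with the usual inverse-variance rule. The paper does not state the formula used in Apero, so the relation w = σ₀²/σ² and the unit a priori sigma are assumptions for illustration only.

```python
def observation_weight(sigma, sigma0=1.0):
    """Inverse-variance weight of an observation with standard deviation
    `sigma`, relative to the a priori unit-weight sigma `sigma0`."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (sigma0 / sigma) ** 2


# Projection centres assumed accurate to 10-15 cm (values in metres)
w_best = observation_weight(0.10)    # about 100
w_worst = observation_weight(0.15)   # about 44.4
```

Under this rule a projection centre known to 10 cm constrains the adjustment a little more than twice as strongly as one known to 15 cm.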

CONCLUSIONS AND FUTURE DEVELOPMENTS
Multi-camera systems have proven to be a valid instrument for several applications. The interest in this kind of system has greatly increased in the last few years, as highlighted by the growing number of geomatics companies that are now producing and commercializing multi-camera systems. However, oblique image blocks differ from traditional ones in terms of geometric image configurations and the higher number of images to be processed. A conventional solution that allows photogrammetric processing in a quick and automated way has not yet reached maturity and remains a research topic.
In this paper, a new methodology to process large blocks of oblique images was presented.The benefits of the connectivity tool are twofold: in terms of computational time and success of the Bundle Block Adjustment carried out in Apero.The standard parameters set in the connectivity algorithm are connected to the flight plan parameters and, for this reason, are stable for a great variety of flight configurations.
The mean re-projection residuals (image sigmas) of the BBA fell below one pixel, but no ground truth information was available and a complete assessment of the orientation quality (with ground control points and check points) was impossible. In the future, further investigations on this topic are foreseen. Besides this, several tests will be performed in order to estimate the reliability of this methodology with other camera configurations and different test areas. The possibility of implementing additional constraints in Apero's BBA will be evaluated too. Lastly, the oriented images will serve as input to image matching processes in order to evaluate the reliability of this kind of images for dense 3D point cloud generation.

Figure 1: Neighbourhood selected for a given aeroplane station. Projection centers inside the sphere will be used to evaluate a connectivity graph.

Figure 2: Left: Blue dots represent oblique projection centers (PP), grey dots represent oblique and nadir PP; right: three stages of image block concatenation.

Figure 3: Top: result of a partial search on camera1 image1 (red), linked footprints are displayed in purple; bottom: the same result as contained in the data structure.

Figure 4: Test area in the Milano urban neighbourhood (red rectangle).

Figure 5: Detail of the relatively oriented camera positions in space.
The developed tool creates the graph and manipulates it with the help of the Boost libraries, specifically Boost Geometry, Boost Polygon and Boost MultiIndex (Boost libraries, 2013, Simonson and Suto, 2009), as well as the Point Cloud Library (PCL, 2013).

Pseudocode of concatenate(...).
input : image pair, CC container, ORI set, SIFT
output: image triplet, boolean return
// ORI - a set of already existing masters
for s = 1 in CC instances:
    check image pair in ORI;
    if both non-oriented: