CALIBRATION AND EPIPOLAR GEOMETRY OF GENERIC HETEROGENEOUS CAMERA SYSTEMS

Perspective camera systems are state of the art in photogrammetry and computer vision. In recent years, non-perspective and especially omnidirectional camera systems have been used increasingly in close-range photogrammetry. In general, the perspective camera model, i.e. the pinhole model, cannot be applied to non-perspective camera systems. However, several camera models for different omnidirectional camera systems have been proposed in the literature. Combining different types of cameras in a heterogeneous camera system can be advantageous: the strengths of the individual camera systems, e.g. field of view and resolution, result in a new, enhanced camera system. If these different kinds of cameras can be described by a unified camera model, the overall calibration process is simplified. Sometimes it is not possible to specify the camera model in advance. In these cases a generic approach is helpful. Furthermore, a simple stereo reconstruction becomes possible, for example with a fisheye and a perspective camera. In this paper, camera models for perspective, wide-angle and omnidirectional camera systems are evaluated. The crucial initialization of the model parameters is carried out with a generic method that is independent of the particular camera system. The accuracy of this generic calibration approach is evaluated by calibrating a dozen real camera systems. It is shown that a unified method of modeling, parameter approximation and calibration of interior and exterior orientation can be applied to derive 3D object data.


INTRODUCTION
The calibration of cameras is a basic task in photogrammetry and computer vision. Modeling the projection of object points onto image points is indispensable for deriving accurate measurements from camera images. Perspective projection is the most commonly used projection model in this context. In many cases this model yields camera calibrations with sub-pixel accuracy. However, it is inappropriate for omnidirectional or wide-angle camera systems, e.g. fisheye or catadioptric systems. Owing to their extended field of view, these camera systems are becoming increasingly popular: fewer cameras are needed to cover an area in object space than with conventional camera systems. Like perspective camera systems, omnidirectional ones can be used for stereoscopic tasks. However, compared to perspective cameras they have disadvantages such as lower resolution and a more complex optical design. Modeling such wide field-of-view systems requires particular models. The literature contains many camera models for wide-angle and fisheye lens camera systems, which require different calibration methods (Basu and Licardie, 1995, Fleck, 1995, Gennery, 2006, Kumler, 2000, Miyamoto, 1964). In contrast, a common approach to modeling catadioptric systems has been established that holds for an entire class of catadioptrics (Baker and Nayar, 1999, Geyer and Daniilidis, 2000). Hence, every particular camera system has its own appropriate camera model. Additionally, it is often difficult to decide in advance which model is appropriate for a particular camera system. In these cases a general model that can describe the majority of commonly used camera systems is useful.
Such a general model relates any image point to a unique direction in object space. There are different approaches to modeling this relation. The most general approach is the non-parametric or local one, which relates every single pixel to a direction in object space (Ramalingam et al., 2005). There are no explicit constraints between neighboring pixels; their relations can be entirely different, so almost any camera system can be modeled. In contrast, the parametric or global approach defines an image-to-object-point mapping that is valid for the whole image plane and depends only on the position of a pixel and a small set of parameters (Sturm et al., 2011). Because the camera systems commonly used in photogrammetry and computer vision project object points in a regular manner, i.e. analytically, the parametric approach is used here. Additionally, a generic model allows universal application and makes the prior selection of a particular camera model unnecessary. Generic parametric models covering at least two different camera systems have been introduced in the literature (Fitzgibbon, 2001, Gennery, 2006, Geyer and Daniilidis, 2000). These models make it possible to calibrate a range of non-perspective camera systems as well as perspective ones, and they are not restricted to a single camera. Despite the generic character of such a model, an additional distortion model is still necessary to compensate for lens effects.
The actual calibration, i.e. the determination of the model parameters, is carried out using object and image point correspondences. A maximum likelihood estimation (MLE) algorithm is used for the full-scale parameter estimation (Marquardt, 1963). Because the projection model is non-linear, it has to be linearized before the MLE algorithm is applied. Linearization requires an estimate of each parameter, and deriving these initial values is the challenging part of the overall method. Furthermore, the method of deriving the initial values has to be as universal as the overall method in order to calibrate different camera classes in a unified way. The radial alignment constraint (RAC) proposed by Tsai (Tsai, 1987) is invariant to the radial component of the projection model, and camera classes differ basically in their radial projection component. Therefore the RAC provides a unified method of deriving initial parameter values. The combination of a generic camera model and a universal initial value estimation yields a unified method of camera calibration that is applicable to the vast majority of camera systems used in close-range photogrammetry and computer vision. This means that perspective, wide-angle, telephoto, fisheye and catadioptric camera systems (with or without a single viewpoint) can be calibrated successfully with a common method. With such a generic calibration, two arbitrary cameras, e.g. wide-angle and fisheye or catadioptric and fisheye, can be used for stereo applications. To validate the proposed method, real single and stereo camera systems were calibrated. The results are presented in the respective sections of this paper.

CAMERA CLASSES AND MODELING
In this section, commonly used camera systems are classified according to their optomechanics. Each class and the camera models mainly used for it in the literature are introduced. There are three main classes: dioptrics, catoptrics and catadioptrics. Dioptrics are made solely of lenses and catoptrics solely of mirrors, whereas catadioptrics combine both. Generally only dioptrics and catadioptrics are used in close-range metrology. Therefore catoptrics are left out in the remainder of this paper. The remaining two classes can each be divided into two subclasses. One subclass consists of camera systems in which all object rays intersect approximately in one unique point, i.e. the projection center. This constraint is called single viewpoint (SVP). All other camera systems, which do not obey the SVP constraint and thus have many projection centers, form the other subclass. A projection center that depends on the incident angle is depicted in figure 1. Here, the intersection of the object ray with the optical axis depends on the inclination and is shifted along the optical axis accordingly. Parametric camera modeling defines a mathematical projection based on a small set of parameters. It maps a particular object ray, represented by its incident angle θ, onto a unique image point, represented by its radial distance r to the principal point, and vice versa.
This modeling is based on two assumptions. Firstly, the azimuth of the object ray is invariant to the projection and therefore constant. Secondly, the camera model is strictly monotonic. The second assumption is required because it ensures that each image point represents a unique object ray.

Dioptrics
Light-refracting optical systems consisting of lenses are called dioptrics. This class includes short and long focal length lens systems, which cover most of the camera systems commonly used in close-range metrology. For the physical background and further details, please refer to textbooks such as (Born et al., 1959, Hecht, 2002). The pinhole model is the camera model most often used for vision systems of this class. The relation between the inclination angle θ and the radius r of the projected image point is given by $r = c \tan\theta$, where c is the principal distance. This parameter represents the distance along the principal axis between the projection center O and the image plane. The principal axis is perpendicular to the image plane. The projection center represents the pinhole where all object rays meet. Therefore, this model obeys the SVP constraint. The pinhole model realizes the perspective projection.
Figure 2 depicts the functional relation of the perspective projection. It shows that either the image radius or the principal distance has to be small to realize wide fields of view with the perspective model. Additionally, typical sensor sizes are depicted. For each sensor size, the corresponding principal distance determines the largest possible inclination angle. The asymptote follows from the definition of the tangent. These characteristics underline that the perspective camera model is not appropriate for modeling wide-angle camera systems with long focal distances. According to the perspective model, the radius increases significantly as the inclination angle increases. This would lead to huge sensors for wide-angle camera systems or to physically impossible short focal lengths. To allow wide fields of view a different camera model is necessary, because wide-angle lenses generally do not obey the perspective camera model. These models reduce the radii at large inclinations compared to the perspective model. Furthermore, some of these models allow for fields of view larger than 180°. Such models are evaluated in (Fleck, 1995) and depicted in figure 3. The stereographic, equidistant and equi-solid angle models are best suited to model camera systems with extensive fields of view. These trigonometric models can be merged into a one-parameter model using Gennery's approach (Gennery, 2006), referred to as equation 3 in the following.
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B5, 2012, XXII ISPRS Congress, 25 August - 01 September 2012, Melbourne, Australia
A frequently used camera system of the class of dioptrics is the fisheye lens. Many models for fisheyes have been proposed in the literature, and no single model has become widely accepted. Fisheyes do not obey the SVP constraint. A schematic ray tracing is depicted in figure 4.
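The trigonometric models discussed above can be compared numerically. The following is a minimal sketch, assuming the textbook forms of the perspective, stereographic, equidistant, equi-solid angle and orthographic projections with a common principal distance c; the function and variable names are illustrative and not taken from the paper.

```python
import math

def radial_projection(theta, c, model):
    """Image radius r for incidence angle theta (radians) and principal distance c."""
    if model == "perspective":      # r = c * tan(theta)
        return c * math.tan(theta)
    if model == "stereographic":    # r = 2c * tan(theta / 2)
        return 2.0 * c * math.tan(theta / 2.0)
    if model == "equidistant":      # r = c * theta
        return c * theta
    if model == "equisolid":        # r = 2c * sin(theta / 2)
        return 2.0 * c * math.sin(theta / 2.0)
    if model == "orthographic":     # r = c * sin(theta)
        return c * math.sin(theta)
    raise ValueError(f"unknown model: {model}")

MODELS = ("perspective", "stereographic", "equidistant",
          "equisolid", "orthographic")

def compare_models(theta_deg, c=1.0):
    """Radii of all models at one inclination angle given in degrees."""
    theta = math.radians(theta_deg)
    return {m: radial_projection(theta, c, m) for m in MODELS}

if __name__ == "__main__":
    for angle in (10, 45, 80):
        radii = compare_models(angle)
        print(angle, {m: round(r, 3) for m, r in radii.items()})
```

For inclinations between 0° and 90° the perspective radius is always the largest of the five, which illustrates why the other models are preferred for wide fields of view.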
The imaginary non-refracted object rays are tangent to a caustic and intersect the optical axis accordingly. The shift of the projection center can be modeled following (Gennery, 2006), see formula 5. Our own experiments showed that the trigonometric part of formula 5 is insignificant. The shift of the projection center can be modeled accurately by a polynomial alone (formula 6). Often the actual amount of shift is rather small. That is why many authors approximate it by a single projection center. One example is the fisheye. In the literature the equidistant model is often used as an ideal model for fisheyes. This model is defined as $r = c\,\theta$. Another common approach to fisheye modeling is a polynomial (equation 8). Formula 6 can be rearranged, which results in the aforementioned equidistant model with supplemental modeling of the projection center shift as in formula 6. This means that polynomial modeling as in equation 8 implicitly includes the projection center shift. Hence, a polynomial is adequate as a generic model for fisheyes and many other camera systems with inclination-angle-dependent projection centers.
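How a low-order polynomial in θ can stand in for a specific trigonometric model can be sketched as follows. This is an illustration only: it assumes an odd two-term polynomial r = k1·θ + k2·θ³ (the exact form of equation 8 is not given here), an unweighted least-squares fit and the stereographic model as the target; all names are hypothetical.

```python
import math

def fit_odd_polynomial(thetas, radii):
    """Least-squares fit of r = k1*theta + k2*theta**3 (2x2 normal equations)."""
    s11 = sum(t ** 2 for t in thetas)
    s12 = sum(t ** 4 for t in thetas)
    s22 = sum(t ** 6 for t in thetas)
    b1 = sum(t * r for t, r in zip(thetas, radii))
    b2 = sum(t ** 3 * r for t, r in zip(thetas, radii))
    det = s11 * s22 - s12 * s12
    # Cramer's rule for the symmetric 2x2 system
    k1 = (b1 * s22 - b2 * s12) / det
    k2 = (s11 * b2 - s12 * b1) / det
    return k1, k2

def max_relative_error(k1, k2, thetas, radii):
    """Largest relative deviation of the polynomial from the target radii."""
    return max(abs(k1 * t + k2 * t ** 3 - r) / r
               for t, r in zip(thetas, radii))

# Target: stereographic model with c = 1, sampled from 1 to 90 degrees.
thetas = [math.radians(d) for d in range(1, 91)]
target = [2.0 * math.tan(t / 2.0) for t in thetas]

k1, k2 = fit_odd_polynomial(thetas, target)
err = max_relative_error(k1, k2, thetas, target)

if __name__ == "__main__":
    print(f"k1={k1:.4f}  k2={k2:.4f}  max relative error={err:.4%}")
```

The achievable error depends on the polynomial order, the angular range and the weighting of the fit, so the value printed by this sketch should not be read as a result of the paper.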

Catadioptrics
According to Hecht (Hecht, 2002), catadioptrics are imaging systems made of a combination of lenses and mirrors. Generally, these camera systems include one mirror shaped as a rotated conic section and a dioptric camera system that projects the mirror's surface onto an image plane. Baker and Nayar (Baker and Nayar, 1999) laid the foundation for the application of these camera systems in photogrammetry and computer vision by giving a formalization of this camera class. This formalization was the basis for the catadioptric model proposed by Geyer and Daniilidis (Geyer and Daniilidis, 2000). This analytic model describes the whole class of catadioptric systems that obey the SVP constraint. It is depicted schematically in figure 5. The object point is first projected onto a unit sphere. The sphere's center is the common projection center of all object rays. The intersections of the sphere and the object rays are then projected onto an image plane via a second projection center O. This is the projection center of the dioptric camera system, which in many cases is a perspective one. The functional relationship of this model is given by equation 10. The usage, calibration and modeling of catadioptrics are described comprehensively in the literature (Baker and Nayar, 1999, Geyer and Daniilidis, 2000).
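The two-step projection described above (unit sphere first, then a second projection center) can be sketched in the meridian plane. The parameter name `xi` for the distance between the sphere center and the second projection center, and the placement of the image plane at distance c from that second center, are assumptions of this sketch and need not match the parametrization of equation 10.

```python
import math

def unified_radius(theta, xi, c=1.0):
    """Image radius for incidence angle theta under the sphere model.

    The object ray meets the unit sphere at (sin(theta), cos(theta)),
    written as (radial, axial) coordinates. This point is projected from a
    second center located xi behind the sphere center onto an image plane
    at axial distance c from that second center.
    """
    return c * math.sin(theta) / (xi + math.cos(theta))

if __name__ == "__main__":
    theta = math.radians(60.0)
    # xi = 0 reproduces the perspective model, r = c * tan(theta)
    print(unified_radius(theta, 0.0), math.tan(theta))
    # xi = 1 gives r = c * tan(theta / 2), i.e. a stereographic-type mapping
    print(unified_radius(theta, 1.0), math.tan(theta / 2.0))
    # with xi = 1 inclinations beyond 90 degrees still map to finite radii
    print(unified_radius(math.radians(100.0), 1.0))
```

The single parameter thus moves the model continuously between a perspective and a stereographic-type mapping.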

GENERIC PROJECTION MODELS
In this section we introduce an approach that is able to model all camera systems described in section 2. It is called the generic camera model in the following. The models in equation 3 and equation 10 are both suitable as generic models. The polynomial in equation 8, as an approximation of other models, is also a suitable candidate. Furthermore, there is an approach in the literature (Fitzgibbon, 2001) that is often used in this context. The division model proposed by Fitzgibbon can be generalized to a rational function and is also evaluated as a generic model (equation 11). Table 1 summarizes the theoretical projection characteristics of these candidates in terms of modeling the specific camera systems of the introduced classes. Gennery's model is equivalent to the dioptric models. Its ability to model catadioptrics is limited and depends on the particular system. Generally speaking, it is not appropriate for modeling catadioptrics. The model of Geyer and Daniilidis (GaD) is equivalent to some of the dioptric models and approximates the rest with an error of less than 1%. For this reason, this model is also not adequate as a generic model. Both the polynomial and the rational function show a complexity-dependent ability to approximate the camera models introduced in this paper. In summary, both perform equally well in approximating specific models. Generally, they need two parameters to reach errors below 1%. Only for approximating the perspective model is the polynomial inappropriate when the field of view is wide.
Although the two approximating models are not equivalent to the majority of the specific models, they have some important advantages over Gennery's and GaD's models. Their accuracy and applicability proved to be independent of the camera class. Furthermore, they are able to cope with a mixture of the models evaluated in this paper. These conclusions are drawn for camera systems that obey one of the aforementioned models. Usually this does not hold in reality, and the conclusions are then not valid for real camera systems. An evaluation of these models using real camera systems is given in section 4.2. In most cases of calibrating real camera systems, a supplemental distortion model is needed. This is because the proposed generic models include only radially symmetric projection components. For the same reason, there is no need to add a radially symmetric distortion model: it is already covered by the generic model, and including it in the optimization may lead to an unstable solution or prevent convergence.

CALIBRATION
The calibration of camera systems using the models introduced in sections 2 and 3 is done in the classic manner. A photogrammetric target is set up and calibrated precisely. Figure 6 shows the calibration target used in this work. The centers of the ellipses were determined with a precision of 15 µm. By employing the ellipse centers in image and object space, the model parameters can be determined. The observed object points of the calibration target are projected into image space.
The projected position is compared to the actual (measured) position in image space. The parameters are optimized until the two positions coincide as closely as possible and the RMS is minimal.
Figure 6: Calibration target with calibrated bars
The actual optimization is conducted using the robust Levenberg-Marquardt algorithm (LMA). About a dozen images with different orientations and swing yield the point correspondences. These correspondences are used by the maximum likelihood estimation of the model parameters. In the case of the polynomial, the function to be optimized is defined by equation 12.
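The comparison of projected and measured image positions can be sketched as a reprojection RMS. The sketch below is a simplification under stated assumptions: the object points are already given in camera coordinates (the exterior orientation is left out), the radial model is a plain polynomial r = k1·θ + k2·θ² + ..., and all names are hypothetical. It is the quantity an optimizer such as the LMA would minimize, not the paper's objective function itself.

```python
import math

def project(point_cam, coeffs, principal_point=(0.0, 0.0)):
    """Project a point in camera coordinates with a radial polynomial r(theta)."""
    X, Y, Z = point_cam
    rho = math.hypot(X, Y)
    if rho == 0.0:                      # point on the optical axis
        return principal_point
    theta = math.atan2(rho, Z)          # inclination to the optical axis
    r = sum(k * theta ** (i + 1) for i, k in enumerate(coeffs))
    x0, y0 = principal_point
    # the azimuth of the object ray is preserved in the image
    return (x0 + r * X / rho, y0 + r * Y / rho)

def reprojection_rms(points_cam, measured, coeffs, principal_point=(0.0, 0.0)):
    """RMS distance between projected and measured image points."""
    sq = 0.0
    for p, (mx, my) in zip(points_cam, measured):
        px, py = project(p, coeffs, principal_point)
        sq += (px - mx) ** 2 + (py - my) ** 2
    return math.sqrt(sq / len(measured))

if __name__ == "__main__":
    pts = [(0.1, 0.2, 1.0), (-0.5, 0.3, 0.8), (1.0, -1.0, 0.2)]
    true_coeffs = (800.0, -20.0)        # illustrative values, in pixels
    observations = [project(p, true_coeffs) for p in pts]
    print(reprojection_rms(pts, observations, true_coeffs))
    print(reprojection_rms(pts, observations, (810.0, -20.0)))
```

In a full calibration the rotation and translation of each image would be additional unknowns in the same residual.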

Initial values
Since equation 12 is non-linear, it has to be linearized prior to parameter estimation. The linearization (Taylor approximation) depends on adequate initial values of the parameters. The determination of these initial values is the main challenge of the proposed algorithm for generic camera calibration. The estimation of initial values should be independent of the actual camera class. The radial alignment constraint (RAC) is an appropriate method for this purpose. The RAC makes use of the invariance of the object ray azimuth described before: the direction in the xy-plane is preserved. This relation is depicted in figure 7 and formulated mathematically in equation 13. A direct linear transformation of equation 13 yields part of the parameters to be optimized. Further details are given in (Tsai, 1987). The values of Z and c, which cannot be determined by this algorithm, are derived in a subsequent step. The generic polynomial and the parameter values derived so far are used to set up another MLE that solves for Z and c.
The combination of this initial value determination and the full-scale optimization using the LMA leads to a generic calibration method. This method can be used for the calibration of dioptric as well as catadioptric camera systems.
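Why the RAC is independent of the camera class can be illustrated with a few lines of code. The sketch assumes one common way of writing the constraint, x·Y − y·X = 0 for an image point (x, y) relative to the principal point and an object point (X, Y, Z) in camera coordinates; the exact form of equation 13 may differ, and all names are illustrative.

```python
import math

def image_point(point_cam, radial_model):
    """Image point for an arbitrary radial model r(theta); azimuth is preserved."""
    X, Y, Z = point_cam
    rho = math.hypot(X, Y)              # must be non-zero for this sketch
    theta = math.atan2(rho, Z)
    r = radial_model(theta)
    return (r * X / rho, r * Y / rho)

def rac_residual(point_cam, image_xy):
    """x*Y - y*X, zero when the image vector and (X, Y) are parallel."""
    X, Y, _ = point_cam
    x, y = image_xy
    return x * Y - y * X

if __name__ == "__main__":
    radial_models = {
        "perspective": lambda t: 500.0 * math.tan(t),
        "equidistant": lambda t: 500.0 * t,
        "polynomial": lambda t: 480.0 * t - 12.0 * t ** 3,
    }
    p = (0.3, -0.7, 1.5)
    for name, model in radial_models.items():
        print(name, rac_residual(p, image_point(p, model)))
```

The residual vanishes (up to rounding) for every radial model, so the constraint can be used to estimate part of the exterior orientation before the radial model is known.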

Results of calibration
The proposed calibration method and the generic models were used to calibrate a multitude of different camera systems. Such [...] Nevertheless, it has to be extended with a distortion model using two to four radial parameters. Contrary to the theoretical conclusions drawn under ideal conditions, both Gennery's and GaD's models have to be extended by an appropriate radial distortion model.
One to three additional parameters were included in the radial distortion model to obtain calibrations as accurate as with the trigonometric models. The overall number of necessary parameters then exceeds that of the trigonometric model approach. The polynomial and the rational function perform equally well. Incorporating these models yields accurate calibrations with the smallest number of parameters, at least in part. The reasons for the comparatively poor results of the second catadioptric system (CD 2) could not be identified.

STEREO WITH GENERIC PROJECTION MODELS
With camera calibrations obtained by the generic method proposed above, classic epipolar geometry can be set up and stereo vision becomes possible. In particular, completely different camera systems can be combined using a common model. The main difference from the classic approach to epipolar geometry lies in the z-component of the image point vectors, which have to comply with the epipolar constraint. The z-component is determined as follows. Inverting the generic camera model yields the inclination angle. The inversion of the polynomial or rational function is unique and permissible because the camera model has to be strictly monotonic, as motivated before. The actual angle is the smallest positive value of the inversion. Once that angle has been determined, the equation that gave it can be rearranged to yield the z-component.
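The steps above can be sketched for two cameras with different polynomial models. The sketch rests on several assumptions: a polynomial r(θ) that is strictly increasing on [0°, 90°], inversion by bisection, the z-component taken from tan θ = r / z, and the standard epipolar constraint written as ray2 · (t × R·ray1) = 0 for a relative orientation X2 = R·X1 + t. All names and numerical values are illustrative.

```python
import math

def radius(theta, coeffs):
    """Polynomial radial model r = k1*theta + k2*theta**2 + ..."""
    return sum(k * theta ** (i + 1) for i, k in enumerate(coeffs))

def project(point_cam, coeffs):
    """Forward projection; image point relative to the principal point."""
    X, Y, Z = point_cam
    rho = math.hypot(X, Y)
    r = radius(math.atan2(rho, Z), coeffs)
    return (r * X / rho, r * Y / rho)

def invert_radius(r, coeffs, theta_max=math.pi / 2, iters=60):
    """Bisection for theta, assuming radius() is strictly increasing."""
    lo, hi = 0.0, theta_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if radius(mid, coeffs) < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ray_from_image_point(x, y, coeffs):
    """Image point -> vector (x, y, z) along the object ray."""
    r = math.hypot(x, y)
    if r == 0.0:
        return (0.0, 0.0, 1.0)          # ray along the optical axis
    theta = invert_radius(r, coeffs)
    return (x, y, r / math.tan(theta))  # z from tan(theta) = r / z

def unit(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def mat_vec(M, v):
    return tuple(sum(M[i][j] * v[j] for j in range(3)) for i in range(3))

def epipolar_residual(ray1, ray2, R, t):
    """ray2 . (t x (R ray1)), i.e. ray2^T E ray1 with E = [t]x R."""
    c = cross(t, mat_vec(R, ray1))
    return sum(a * b for a, b in zip(ray2, c))

if __name__ == "__main__":
    coeffs1, coeffs2 = (300.0, 0.0, -5.0), (500.0, 20.0)
    a = math.radians(10.0)
    R = ((math.cos(a), 0.0, math.sin(a)),
         (0.0, 1.0, 0.0),
         (-math.sin(a), 0.0, math.cos(a)))
    t = (-1.14, 0.0, 0.05)
    X1 = (0.4, -0.3, 2.0)
    X2 = tuple(p + q for p, q in zip(mat_vec(R, X1), t))
    ray1 = unit(ray_from_image_point(*project(X1, coeffs1), coeffs1))
    ray2 = unit(ray_from_image_point(*project(X2, coeffs2), coeffs2))
    print(epipolar_residual(ray1, ray2, R, t))
```

Because the constraint is evaluated on ray vectors rather than on image coordinates, it holds regardless of which radial model each camera uses.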

Results
The method described in sections 3, 4 and 5 was applied to two different stereo systems as examples, and the achieved accuracies were analyzed. Stereo system 1 consists of a catadioptric system and a fisheye with a baseline of 1140 mm. The experimental setup is shown in figure 8. For illustration, the polynomial model was chosen.
Figure 8: Experimental design of heterogeneous stereo system 1
The absolute orientation of the stereo model could be transformed to the object coordinate system, so the accuracy of the stereo model can be evaluated. Similarly, a second setup consisting of two fisheye camera systems was chosen. Both stereo systems were calibrated successfully and a stereo model was derived.

SUMMARY AND OUTLOOK
In this paper, generic projection models were presented that are able to replace specific models of dioptric and catadioptric camera systems with comparable accuracy and complexity. This made it possible to develop a unified calibration method that applies to all cameras of the introduced camera classes. It is capable of modeling camera systems with a single center of projection as well as those with a center of projection that depends on the incident angle. Experiments have shown that the complexity rather than the particular model determines the applicability to a specific camera system. It has been shown that the ideal models cannot cope with camera geometries without additional radial distortion terms. With these additional terms, they have almost the same complexity as the generic models. This fact motivates the use of generic models, because the radial projection component is mandatory and can therefore be included in the imaging model itself. Generic mapping models, such as the polynomial or rational approximation, are universal and flexible and show an almost class-independent accuracy. They also allow camera systems from different classes to be used transparently in a multi-camera system and thus allow a scene description that conveniently combines the strengths of the respective camera systems.
In this paper we could not clarify why the presented method showed a relatively poor accuracy in the calibration of the two examined catadioptric camera systems. In (De Villiers et al., 2011) it was shown that elliptically shaped point markers are not suitable for the catadioptric projection, because the center of an ellipse is not invariant under highly non-perspective projections.

Figure 1: Shifting of projection center along optical axis

Figure 5: Catadioptric model according to Geyer and Daniilidis

Figure 9: Epipolar curves with the generic projection model (intersection as epipole)

Table 1: Theoretical ability of approximation of camera models (number of parameters)

Table 3: Accuracies of the stereo models
We achieved the accuracies shown in table 3. The comparatively poor calibration accuracies of the catadioptric camera systems are reflected in the stereo model. For system 1, the error (RMS) in object space was 3.4 mm at a distance of about one meter. System 2 showed a smaller error (RMS) of 2.5 mm in object space, which is a direct result of the better calibration of the fisheye. Projecting the object ray of a homologous point into the respective other image does not result in epipolar lines. Instead, depending on the complexity of the selected generic projection model, epipolar curves are generated. These epipolar curves are shown in figure 9.
Fisheye 0.04 (0.32) orth 4 0.10 (0.87) 5 0.06 (0.35) 5 0.05 (0.39) 4 0.05 (0.44) 4 Wide-angle (120°