POSE ESTIMATION AND MAPPING USING CATADIOPTRIC CAMERAS WITH SPHERICAL MIRRORS

Catadioptric cameras have the advantage of broadening the field of view and revealing otherwise occluded object parts. However, they differ geometrically from standard central perspective cameras because of light reflection from the mirror surface which alters the collinearity relation and introduces severe non-linear distortions of the imaged scene. Accommodating for these features, we present in this paper a novel modeling for pose estimation and reconstruction while imaging through spherical mirrors. We derive a closed-form equivalent to the collinearity principle via which we estimate the system’s parameters. Our model yields a resection-like solution which can be developed into a linear one. We show that accurate estimates can be derived with only a small set of control points. Analysis shows that control configuration in the orientation scheme is rather flexible and that high levels of accuracy can be reached in both pose estimation and mapping. Clearly, the ability to model objects which fall outside of the immediate camera field-of-view offers an appealing means to supplement 3-D reconstruction and modeling.


INTRODUCTION
Digital imaging makes photogrammetry relevant for a wide variety of applications. Yet, the limited field of view of standard cameras restricts scene coverage and necessitates the acquisition of a large number of images, even for modest scenes. In this regard, expanding the field of view by incorporating, or imaging through, mirrors (i.e., catadioptric cameras) can facilitate mapping of otherwise unseen or occluded objects or object parts. The literature shows a broad spectrum of such cameras, which differ from one another in the shape and number of their mirrors (Gluckman and Nayar, 2001; Yi and Ahuja, 2006; López-Nicolás and Sagüés, 2014; Jeng and Tsai, 2003; Geyer and Daniilidis, 2002; Luo et al., 2007).
Such imaging configurations broaden the field of view, but are governed by light reflection from the mirror surface. Thus, the collinearity relation that stands at the core of photogrammetric modeling no longer holds. The challenge is therefore to establish an object- to image-space relation in a manner that leads to estimation of the camera pose parameters and enables mapping. Our focus in this paper is on an imaging system which incorporates a camera and a spherical mirror. The latter is inexpensive and simple to manufacture, making it attractive to incorporate (Ohte et al., 2005). Its symmetric form also makes it advantageous from a modeling perspective.
In terms of system modeling, Ohte et al. (2005) model the reflection and projection of an object-point from spherical mirrors. Their focus lies on image formation rather than on pose estimation. Mičušík and Pajdla (2003) propose an approximation to the central perspective model with a single calibration parameter. However, the approximation error does not allow the mapping error to be estimated as a function of the image noise. Lanman et al. (2006) describe a system composed of an array of spherical mirrors which provides multiple views from a single image. The authors propose a bundle adjustment-like solution, but do not discuss accuracy. Agrawal (2013) uses a co-planarity constraint, based on the fact that an object-point and the sphere- and camera-centers are co-planar. Geometrical properties within the plane are not considered there, and eight or more control-points are required to estimate the sphere-center and camera parameters.
In this paper we propose a novel model for pose estimation and reconstruction while imaging through a spherical mirror. To do so, we first develop expressions that relate object- and image-space points and then model the imaging system as a whole. Our derivations yield a closed-form equivalent to the collinearity principle, which we then show can be developed into a linear one. Studying the requirements for estimating the imaging system parameters shows that a minimum of only three control-points is needed and that the estimates are stable and accurate. The paper then studies reconstruction models and evaluates their accuracy. Thus, it offers not only a study of the system geometry and its implications on modeling and accuracy, but also a viable framework for pose estimation and modeling.

SPHERICAL IMAGING CONFIGURATION MODELING
Geometric relations within spherical catadioptric systems - Images acquired by catadioptric cameras are formed by reflection of light from the mirror surface onto the image plane (Fig. 1). The law of reflection states that the incident ray, $\overrightarrow{x_i m_i}$, the reflected ray, $\overrightarrow{m_i c}$, and the normal to the mirror surface are coplanar and lie on the plane of reflection, and that the angles between each of the two rays and the normal are equal (Fig. 1). Furthermore, as the radius coincides with the normal to the sphere surface, the sphere center, $o$, also lies within the plane of reflection.
We also point to a general classification of catadioptric cameras according to the geometric properties of the extended rays (i.e., the incident rays continued as if they were not reflected). If all the extended rays intersect at a single point, the camera is termed central, similar to the pinhole camera. If all the extended rays intersect along a single line, the camera is termed axial, as is the case with our system (Ramalingam et al., 2006). Here the camera-axis is the vector $\overrightarrow{oc}$, which links the sphere center and the camera perspective center, with $\mu$ the distance between $o$ and $c$ (Fig. 1). The parameters we model in this setup are the camera projection center, $c$ (Fig. 1), and the spherical mirror center, $o$. The objective is to estimate these parameters using a set of reference-points, $x_i$, which are projected onto the image plane from the sphere surface.

Geometric quantities of planes of reflection
In order to express the relation between a control-point, its projection onto the image plane, and the direction of the extended ray, we first derive expressions that relate to both the axial camera configuration and the reflection of light from the sphere surface. These consist of four elements within the plane of reflection: i) the angle $\gamma_i$ between the camera-axis, $\overrightarrow{oc}$, and $\overrightarrow{m_i c}$; ii) the distance $d_i$ between the projection center, $c$, and $q_i$, the point at the intersection of the extended ray $\overrightarrow{x_i m_i}$ with $\overrightarrow{oc}$; iii) the angle $\delta_i$ between $\overrightarrow{oc}$ and the normal to the sphere's surface; and iv) the angle $\varphi_i$ between $\overrightarrow{x_i q_i}$ and $\overrightarrow{oc}$ (Fig. 1). To keep the model applicable to any type of central camera (e.g., one equipped with a fish-eye lens), we use angular quantities which are measurable in the image reference frame.
Computing $\gamma_i$ first requires defining the image-space direction of $\overrightarrow{oc}$. This direction must be estimated, as the mirror's center, $o$, does not show on the image. Existing methods are based on either placement of actual markers on the lens or projection of the sphere's boundary onto the image following the camera orientation (Kanbara et al., 2006; Francken et al., 2007). We develop an alternative that requires neither, and which is valid for any central camera.
We begin by observing that the projection of the mirror boundary onto the image relates to the rays that are tangent to the sphere's surface, $\overrightarrow{mc}$ (Fig. 1). The angle $\alpha$ between a tangent ray and $\overrightarrow{oc}$ therefore remains the same for any point on the boundary, which yields the constraint of Eq. (1), with $v_i$ the image-space direction towards the sphere boundary. As our interest is only in the direction of $\overrightarrow{oc}$, its scale can be fixed (Eq. 2), so that the constraint can be written in a form (Eq. 3) that extends to multiple observations and allows $\overrightarrow{oc}$ to be estimated linearly; the angle $\alpha$ is then computed by Eq. (2). The angle $\gamma_i$ can then be computed from the scalar product of $\overrightarrow{oc}$ and $\overrightarrow{cm_i}$. Additionally, as the radius vector is perpendicular to the tangent ray (Fig. 1), we also have $\sin\alpha = r/\mu$ (Eq. 4), suggesting both that the ratio $r/\mu$ is constant and known, and that if $r$ is known, $\mu$ can be derived.
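The linear estimation of the axis direction can be illustrated with a short sketch. It assumes unit image-space directions $v_i$ towards the mirror boundary and uses one normalization that is consistent with the description above, though not necessarily the one used in Eqs. (2) and (3): the unknown is taken as $n = a / \cos\alpha$, with $a$ the unit axis direction, so that every boundary direction satisfies $v_i^T n = 1$. The function names `solve3` and `estimate_axis` are hypothetical.

```python
import math

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for k in range(3):
        p = max(range(k, 3), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, 3):
            f = M[i][k] / M[k][k]
            for j in range(k, 4):
                M[i][j] -= f * M[k][j]
    x = [0.0, 0.0, 0.0]
    for k in (2, 1, 0):
        x[k] = (M[k][3] - sum(M[k][j] * x[j] for j in range(k + 1, 3))) / M[k][k]
    return x

def estimate_axis(boundary_dirs):
    """Estimate the unit axis direction and alpha from unit boundary directions.

    Assumed model: v_i . n = 1 with n = a / cos(alpha); solved by least squares
    through the normal equations (sum v v^T) n = sum v.
    """
    N = [[sum(v[i] * v[j] for v in boundary_dirs) for j in range(3)] for i in range(3)]
    t = [sum(v[i] for v in boundary_dirs) for i in range(3)]
    n = solve3(N, t)
    norm = math.sqrt(sum(c * c for c in n))
    alpha = math.acos(min(1.0, 1.0 / norm))   # cos(alpha) = 1 / |n|
    return [c / norm for c in n], alpha
```

With $\alpha$ estimated, the ratio $r/\mu$ follows from $\sin\alpha$, so a known radius gives $\mu = r / \sin\alpha$. At least three well-spread boundary directions are needed for the normal matrix to be invertible.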
We can now derive expressions for the following quantities within the plane of reflection: i) the angle $\gamma_i$ between $\overrightarrow{oc}$ and $\overrightarrow{m_i c}$; ii) the distance $d_i$ from the projection center, $c$, to the intersection of the extended ray $\overrightarrow{x_i m_i}$ with the camera-axis at $q_i$; iii) the angle $\delta_i$ between $\overrightarrow{oc}$ and the normal to the sphere's surface; and iv) the angle $\varphi_i$ between $\overrightarrow{x_i q_i}$ and $\overrightarrow{oc}$ (Fig. 1).
The angle $\gamma_i$ follows from the scalar product of $\overrightarrow{oc}$ and $\overrightarrow{cm_i}$, and the remaining three quantities, $d_i$, $\delta_i$, and $\varphi_i$, are given by Eqs. (5)-(7) (cf. the Appendix for their derivation). These geometric quantities describe both the reflection of a point onto image space and the direction of the extended ray, $\overrightarrow{x_i m_i}$, for any plane of reflection. The relation between two planes of reflection is given by a rotation, $\psi_{ij}$, about $\overrightarrow{oc}$ (Eq. 8), which is expressed through $\xi_{ij}$, the angle between the two planes of reflection, and whose derivation is given in Ilizirov and Filin (2016). The angle $\psi_{ij}$, together with the quantities derived within each plane of reflection, defines the axial camera.
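As an illustration of the quantities within a single plane of reflection, the sketch below derives $\delta_i$, $\varphi_i$, and $d_i$ from $\gamma_i$, $r$, and $\mu$ using the law of sines in the triangle $o$, $m_i$, $c$ and the law of reflection. It is an independent derivation under stated assumptions (the ray hits the near side of the sphere, $d_i$ is measured from $c$ towards $o$), so the expressions may be arranged differently from those in Eqs. (5)-(7). The function name `plane_quantities` is hypothetical.

```python
import math

def plane_quantities(gamma, r, mu):
    """Return (delta, phi, d) in the plane of reflection for image angle gamma.

    gamma : angle at c between the camera-axis and the ray towards m (radians)
    r, mu : sphere radius and distance between o and c (same length unit)
    """
    s = mu * math.sin(gamma) / r          # law of sines: sin(gamma + delta) = mu sin(gamma) / r
    if not 0.0 < s < 1.0:
        raise ValueError("ray along the axis or missing the sphere")
    delta = math.asin(s) - gamma          # angle at o between the axis and the normal o->m
    phi = gamma + 2.0 * delta             # extended ray: reflected ray mirrored about the normal
    # Intersection q of the extended ray with the axis, measured from o,
    # then converted to the distance from c towards o.
    d = mu - r * math.sin(gamma + delta) / math.sin(phi)
    return delta, phi, d
```

For the synthetic setup used later in the paper ($\mu$ = 500 mm, $r$ = 74 mm), rays are reflected only for $\gamma$ below $\alpha = \arcsin(r/\mu)$, which is the case the function accepts.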

Transformation between the plane of reflection and object-space
To establish the relation between the planes of reflection and object-space, we introduce an intermediate coordinate system, $M$, whose origin lies at $c$, whose x-axis is $\overrightarrow{oc}$, whose y-axis is orthogonal to the x-axis within an arbitrary plane of reflection (e.g., the one containing $x_1$), and whose z-axis completes a right-handed reference frame. A control point in the $M$ system is expressed by its position along the line $\overrightarrow{q_i x_i}$ (Fig. 1), namely $[x_i]_M = [q_i]_M + u\,[p_i]_M$ (Eq. 9), with $[\,\cdot\,]_M$ a vector in the $M$ system, $u$ a scale factor, $[q_i]_M = [d_i\ 0\ 0]^T$, and $[p_i]_M$ the direction of $\overrightarrow{q_i x_i}$, for which we use a conical parametrization (Eq. 10), where $v_i$ represents the rotation around $\overrightarrow{oc}$ and $w_i$ the line slope in the plane of reflection. The angle $v_i$ is derived from $\psi_{ij}$, and the parameter $w_i$ is derived from $\varphi_i$ (Fig. 1) by Eq. (11).
The transformation from the $M$ system to object-space is of Euclidean nature (see footnote 1) and is given by Eq. (12), where $c$ is the camera perspective center and $R = [r_1\ r_2\ r_3]$ is the rotation matrix, whose columns are defined in Eq. (13). Finally, combining Eq. (12) and Eq. (9) yields Eq. (14), which provides an equivalent to the collinearity relation, where a point in object-space, $x_i$, is linked to derivable image-space quantities (here, $\varphi_i$, $d_i$ and $v_i$). Similar to central perspective cameras, the collinearity is that of the control-point-to-projection-center direction in both image- and object-space. From Eq. (14), the rotation angles in $R$ and the camera position, $c$, can be derived directly. The mirror's center, $o$, is then derived by Eq. (15).
Parameter estimation using DLT - Notably, and while not elaborated here, our representation (Eq. 14) facilitates linear modeling of the system. By dividing the last two rows by the first, the parameter $u$ can be eliminated, thereby yielding a form equivalent to the DLT. Its appeal lies in the direct estimation, without the need for first approximations or iterations.
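The elimination of the scale factor $u$ can be illustrated as follows. The sketch assumes a specific form of the conical parametrization, $[p_i]_M = [1,\ w_i \cos v_i,\ w_i \sin v_i]^T$ with $w_i = \tan\varphi_i$, and a Euclidean transformation $x_i = c + R\,[x_i]_M$; both are assumptions made for illustration and may differ in form or sign convention from Eqs. (10)-(14). All function names are hypothetical.

```python
import math

def direction_in_M(phi, v):
    """Assumed conical parametrization: [p]_M = [1, w cos v, w sin v], w = tan(phi)."""
    w = math.tan(phi)
    return [1.0, w * math.cos(v), w * math.sin(v)]

def to_object_space(c, R_cols, d, phi, v, u):
    """Object-space point x = c + R([q]_M + u [p]_M); R_cols holds the columns r1, r2, r3."""
    p = direction_in_M(phi, v)
    m = [d + u * p[0], u * p[1], u * p[2]]            # the point in the M system
    return [c[k] + sum(R_cols[j][k] * m[j] for j in range(3)) for k in range(3)]

def u_free_residuals(x, c, R_cols, d, phi, v):
    """Two equations per point that remain after eliminating u.

    With a = R^T (x - c), the first row gives u = a[0] - d; substituting it into
    the other two rows leaves residuals that vanish for a consistent point.
    """
    dx = [x[k] - c[k] for k in range(3)]
    a = [sum(R_cols[j][k] * dx[k] for k in range(3)) for j in range(3)]
    p = direction_in_M(phi, v)
    return [a[1] - p[1] * (a[0] - d), a[2] - p[2] * (a[0] - d)]
```

Each control point thus contributes two equations that do not involve $u$, which is what makes a direct, DLT-like estimation possible.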

Mapping
Two or more images can be used to estimate the position of a point in object-space. Using Eq. (9), we describe the rays towards $x$ in two $M$-systems ($M_1$ and $M_2$, respectively), for a point that appears in two images, by a pair of equations in which $u_1$ and $u_2$ are unknown scalars. Using the estimated model parameters, each of the rays can be transformed into object space, yielding Eqs. (20) and (21), which form two lines in object space that intersect at $x$. We have six equations and five unknowns ($x$, $u_1$ and $u_2$), and the point can be estimated by least-squares adjustment.

Footnote 1: Under the assumption that the radius is given.
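The two-ray intersection can be sketched as an unweighted least-squares problem. Assuming each object-space line is given by a point $q_k$ on the camera-axis and a direction $p_k$, minimizing the six residuals of $x = q_k + u_k p_k$ over $x$, $u_1$ and $u_2$ reduces to finding the closest points on the two lines and taking their midpoint. The function name `intersect_rays` is hypothetical, and the paper's adjustment may weight the observations differently.

```python
def intersect_rays(q1, p1, q2, p2):
    """Least-squares intersection of the lines q1 + u1*p1 and q2 + u2*p2.

    Returns (x, u1, u2), where x is the midpoint of the shortest segment
    joining the two lines.
    """
    dot = lambda a, b: sum(s * t for s, t in zip(a, b))
    w = [b - a for a, b in zip(q1, q2)]               # q2 - q1
    # Normal equations of min |q1 + u1 p1 - q2 - u2 p2|^2 in (u1, u2)
    a11, a12, a22 = dot(p1, p1), -dot(p1, p2), dot(p2, p2)
    b1, b2 = dot(p1, w), -dot(p2, w)
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12 * a11 * a22:
        raise ValueError("rays are (nearly) parallel")
    u1 = (b1 * a22 - a12 * b2) / det
    u2 = (a11 * b2 - a12 * b1) / det
    x = [0.5 * ((q1[i] + u1 * p1[i]) + (q2[i] + u2 * p2[i])) for i in range(3)]
    return x, u1, u2
```

With noise-free rays the two closest points coincide and the midpoint is the exact intersection; with noisy rays the length of the connecting segment gives a quick indication of the misclosure.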

ANALYSIS AND RESULTS
Having established the parameter estimation models, their performance is now analyzed. The analysis is carried out over different control configurations, at different levels of noise.
The model is tested using both simulated experiments, under realistic imaging configurations, and real-world data. The imaging system consists of a standard pinhole camera and a spherical mirror, where the camera intrinsic parameters are assumed to be calibrated in advance. Parameters similar to those of the real-world experiments were used for the synthetic tests, with µ = 500 mm and r = 74 mm.

Influence of control point configurations on the parameter estimation
To test the quality of the estimation and the influence of the control-point distribution, we observe that the distribution of points in image-space has the greater influence on the solution (cf. Sec. 2.2). Thus, we describe the points by their angular image-related quantities, v and γ. Three different configurations are evaluated: i) a random distribution of points; ii) an X-shaped arrangement, which represents an even distribution of the angle γ; and iii) a circular arrangement of points, which represents an even distribution of the angle v (Fig. 2). Regarding the necessary number of control points, the resection model (Eq. 14) yields three equations per point, which include an unknown scale factor, so each point contributes two independent equations. Therefore, only three control points are needed to estimate the six positional parameters. To ensure sufficient redundancy and distribution in image-space, 20 control points are used in each experiment. All models are tested with noise levels ranging from σ = 0.1 to 1.5 pixels. Accuracy estimates of the model parameters are the mean of 100 trials for each noise level.
Applying the resection model to the random point distribution (Fig. 2a) yields sub-millimeter accuracy estimates for noise levels of 0.5 pixels or lower, estimates better than 2 mm for a 1 pixel noise level, and better than 2.5 mm for a 1.5 pixel noise level (Table 1). The condition number of the Gramian matrix is 830. Notably, throughout the analysis σX of the mirror is omitted, as the camera and mirror positions along the central-line are related by the distance µ, which is computed in advance as a function of α (Eq. 4). The angle α is the outcome of a separate adjustment process and, being accurately estimated, its influence on the accuracy of µ is negligible. As an example, for a 1 pixel measurement noise, σα = ±1×10⁻⁴ rad, which translates to σµ = ±0.01 mm for µ = 500 mm.
The X-shaped point arrangement offers an even distribution in γ along two planes of reflection (Fig. 2b). Estimates for the resection model are listed in Table 3.

Table 3. Parameter accuracy measures as a function of the noise level, using X-shaped arrangement (Fig. 2b)

The circular point arrangement offers fixed γ values and an even distribution in v (Fig. 2c). Accuracy measures for the resection model are somewhat higher than for the previous two arrangements, though not significantly so (Table 4), while the condition number dropped to 103.
These results show that as long as the points are spread throughout the image, their arrangement has a lesser effect on the pose estimation. The correlations among the estimated parameters are not extreme. However, relatively high correlations can be observed between the Y values and between the Z values (∼88%). Analysis of the rate of convergence shows patterns similar to those of the conventional central projection model.

Real world experiment
To test the model on real-world data, a spherical mirror was placed inside a box with three surrounding vertical planes covered by a checkerboard pattern. The camera pose was measured using the checkerboard-related control points and their reflections from the sphere surface. While not listed here, the accuracy reached was equivalent to that obtained in the simulated tests. These results suggest that the proposed model indeed reflects real-world settings.

Mapping
Finally, and as noted in Sec. 2.3, once the camera and mirror positions are estimated, mapping can be performed by intersection of the extended rays $\overrightarrow{q_1 x}$ and $\overrightarrow{q_2 x}$ (Eqs. 20 and 21). In that respect, the line connecting the intersections of the extended rays with the camera axes, at e1 and e2, is equivalent to the central perspective baseline, which here becomes e1e2 (Fig. 3).
With this observation in mind, and with the understanding that higher accuracy can be reached with wider baselines (e1e2), we study two mapping scenarios: the first is the conventional one, in which two images are acquired through a stationary mirror; the second is designed so that the baseline is extended. For the latter, we move not only the camera but also the spherical mirror between acquisitions (Fig. 3b).
To test the expected accuracy of both settings, we study a setup in which two images are acquired with a distance of c1c2 = 770 mm between them, but where the mirror is also shifted by o1o2 = 280 mm in the second setup. Comparing the baselines under both setups shows that in the first one e1e2 is 114 mm, while in the second it reaches 366 mm, namely roughly three times wider. From here on, the accuracy estimate for a reconstructed point follows that of the standard pinhole camera model. We study the reconstruction accuracy of a fixed point x for both scenarios. For the stationary case (Fig. 3a), the parallactic angle is narrower than in the second scenario by ∆θ = 45° (∼15° vs. 60°), leading to a less accurate reconstruction. This is expressed in Table 5 for varying noise levels. The accuracy of the stationary scenario is two-fold lower than that obtained by the second scenario.

CONCLUSIONS
The paper studied pose estimation and mapping from a catadioptric system that consists of a camera and a spherical mirror. As demonstrated, this system forms an axial camera in which all extended rays intersect an axis linking the camera's perspective center and the sphere center. Notably, the system remains axial irrespective of the relative position or orientation between the camera and the sphere. Through derivation of measures within and then between planes of reflection, a closed form similar to the collinearity principle has been derived, which was then extended into a linear model. Further analysis of the system's geometry has led to an alternative, trilateration-based model that yielded better estimates and proved robust to outliers. Results and analysis show that as long as the control configuration does not introduce degeneracies, high levels of accuracy can be reached in estimating the pose parameters. Furthermore, the system radius can be calibrated, even at a sub-millimeter level of accuracy. Evaluation of the reconstruction with this system has drawn a resemblance to central perspective cameras, thereby allowing known principles to be applied in assessing the reconstruction accuracy. This has led to an alternative modeling approach that helps broaden the 'imaging' baseline, thereby reaching high accuracy levels.

Figure 1. Geometry of the image formation using a catadioptric spherical camera, with focus on the plane of reflection

Figure 2. Three control configurations. Each configuration consists of 20 points

Figure 3. Mapping using reflection from spherical surfaces

Table 2. Accuracy measures for the position and orientation estimation of the central-line

Table 4. Parameter accuracy measures as a function of the noise level, using two circles arrangement (Fig. 2c). For this arrangement the condition number dropped to 103.

Table 5. Estimated position accuracy measures as a function of the noise level