TOWARDS ROBUST SELF-CALIBRATION FOR HANDHELD 3D LINE LASER SCANNING

This paper studies self-calibration of a structured light system, which reconstructs 3D information using video from a static consumer camera and a handheld cross line laser projector. Intersections between the individual laser curves and geometric constraints on the relative position of the laser planes are exploited to achieve dense 3D reconstruction. This is possible without any prior knowledge of the movement of the projector. However, inaccurately extracted laser lines introduce noise in the detected intersection positions and therefore distort the reconstruction result. Furthermore, when scanning objects with specular reflections, such as glossy painted or metallic surfaces, the reflections are often extracted from the camera image as erroneous laser curves. In this paper we investigate how robust estimates of the parameters of the laser planes can be obtained despite noisy detections.


INTRODUCTION
Triangulation-based laser sensors are a popular technique for low-cost rangefinders in mobile robotics (Konolige et al., 2008) and 3D scanning for fabrication (Engelmann, 2011; Winkelbach et al., 2006). To calibrate these systems, calibration fixtures or objects with known shape are typically used to find the parameters of the laser planes. In this study, we look at a self-calibration technique for handheld 3D line laser scanning. The proposed structured light scanning system consists of a fixed calibrated camera and a hand-held cross line laser projector as depicted in Fig. 1. Two different colors are employed to facilitate separation of the two laser lines in the camera image.
We calibrate the camera to find the standard parameters for modeling the camera intrinsics and we align the two line projectors such that the laser planes are orthogonal to each other. Each image from the consumer video camera shows two laser curves. First, finding the intersections of these curves from multiple images and using the orthogonality constraint between all pairs of laser curves that are captured in the same image yields the parameters of the laser planes. Then, by intersecting these calculated planes with the image rays we create a dense 3D point cloud up to scale. Our reconstruction approach is based on self-calibration techniques proposed by Furukawa and Kawasaki (2009).

Figure 1: Fixed consumer camera and handheld cross line projector with a blue and a green laser.
The proposed method is applicable without knowledge of the position of the cross line laser projector in 3D space. It is a multi-shot technique since reconstruction from only two laser curves in a single image is not possible. To capture the full 3D geometry many images are necessary, such that the whole scene in the field of view of the camera is illuminated by the lasers. The scene must remain static during capture and the camera needs to be fixed since the reconstruction algorithm depends on the properties associated with a point in 3D space being illuminated from different positions.
Generally, uncalibrated scanning with an unrestrictedly moving laser projector makes it more difficult to obtain accurate scans due to noisy estimates of the laser plane parameters. However, the accuracy of triangulation-based depth estimation also depends on the baseline. The proposed method has the advantage that we are not limited to a fixed distance and scanning with very large baselines is possible. A suitable baseline for the depth range of the scene can be chosen by simply moving the projector further away from the camera. This allows recording details that do not show up in scans with a fixed small baseline.
A disadvantage of the proposed method is that the quality of the individual extracted laser line points significantly affects the accuracy of the whole 3D reconstruction. Inaccurate intersection positions or erroneously detected intersections distort the reconstruction result or make the method fail. Especially in the presence of glossy surfaces, laser line extraction from images is degraded due to reflections. In this work we investigate how the self-calibration techniques for line laser scanning by Furukawa and Kawasaki (2009) can be applied in spite of noisy laser line detections. This work tries to mitigate these problems by explicitly detecting outliers and improving accuracy by combining the information reconstructed from multiple laser lines. For 3D scanning using the proposed method only a single consumer video camera and two line laser projectors are necessary, which makes the system very cost-efficient and affordable.

RELATED WORK
Employing planarity constraints to recover 3D shape has been studied for diverse applications, such as automatic calibration of structured light scanners (Furukawa et al., 2008), single image 3D reconstruction (Van den Heuvel, 1998) and shape estimation from cast shadows (Bouguet et al., 1999). In this work we look at self-calibrating line laser scanning, which recovers 3D information from the projection of planar curves. Previous work typically either employs a fixed camera and tries to estimate the plane parameters of the laser planes (Zagorchev and Goshtasby, 2006) or works with a setup where camera and laser are mounted rigidly relative to each other and automatically estimates the extrinsic parameters (Jokinen, 1999).
Some methods solve the online calibration problem by placing fixtures or known reference planes in the scene. For example, Winkelbach et al. (2006) proposed a method for a hand-held laser line scanning system which estimates the laser planes by placing the object in front of a corner with two known reference planes. By intersecting the image rays with the two reference planes, the 3D point positions of the laser projection on the reference target are reconstructed. Then, by fitting a plane to these points, the plane parameters of the line laser are computed. This approach later became popular as the Davidscanner. Furukawa and Kawasaki (2006) demonstrated that hand-held 3D laser scanning is possible without the requirement of placing any special objects in the scene. However, their approach requires a known geometric configuration of the laser planes of the projector. Their approach exploits coplanarity constraints and additional metric constraints, e.g., the angle between laser planes, to perform 3D reconstruction. Later, Furukawa and Kawasaki (2009) extended the approach and showed how additional unknowns, such as the parameters of a pinhole model (without distortion), can be estimated if a suitable initial guess is provided. The work in this paper follows this approach.
The underlying plane parameter estimation problem leads to a linear system of coplanarity constraints for which a direct least-squares approach does not necessarily yield a unique solution. For example, projecting all points of all planes into the same common plane fulfills the coplanarity constraints. However, this does not describe the real scene geometry. Ecker et al. (2007) showed how additional constraints on the distance of the points from the best fitting plane can be incorporated in the optimization problem to avoid meaningless solutions.

METHODOLOGY
The configuration for creating a 3D scan is visualized in Fig. 2. We scan the scene with a fixed calibrated camera and move the hand-held laser projector in order to project laser crosses in the scene from different positions. In each image of the video camera we observe two laser curves, which we know to have plane normals that are perpendicular to each other due to the cross configuration of the employed laser line projector. By aggregating a sequence of images over time, we extract many different laser curves on the image plane.
Solving for the plane parameters is a two-step process: First, we exploit coplanarity constraints from intersections between laser curves. Since the camera is fixed and the scene is static, intersection points correspond to the same 3D point. By extracting many laser curves, we will obtain many more intersections than the number of laser planes. This allows us to build a linear system to solve for the plane parameters up to a scale and an offset.
In the second step we solve for the additional degrees of freedom (DOF) of the parameters by considering the orthogonality constraint between the laser planes in the cross configuration. The plane parameters are found up to a scale by solving a non-linear optimization problem.
Finally, the 3D point positions of each laser curve are computed by intersecting the camera rays with the laser planes. The individual steps and employed models are explained in more detail in the following sections.

Camera Model
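We approximate the camera projection function based on the pinhole model with distortion. The point $\mathbf{X} = (X, Y, Z)^T$ in world coordinates is projected on the image plane according to

$$x = f_x \frac{X}{Z} + p_x, \qquad y = f_y \frac{Y}{Z} + p_y \tag{1}$$

where $\mathbf{x} = (x, y)^T$ are the image coordinates of the projection, $\mathbf{p} = (p_x, p_y)^T$ is the principal point and $f_x$, $f_y$ are the respective focal lengths. Using the normalized pinhole projection $\mathbf{x}_n = (x_n, y_n)^T = (X/Z, Y/Z)^T$ we include radial and tangential distortion defined as follows

$$\tilde{\mathbf{x}} = \left(1 + k_1 r^2 + k_2 r^4 + k_5 r^6\right) \mathbf{x}_n + \mathbf{d}_x \tag{2}$$

$$\mathbf{d}_x = \begin{pmatrix} 2 k_3 x_n y_n + k_4 \left(r^2 + 2 x_n^2\right) \\ k_3 \left(r^2 + 2 y_n^2\right) + 2 k_4 x_n y_n \end{pmatrix} \tag{3}$$

where $(k_1, k_2, k_5)$ are the radial and $(k_3, k_4)$ are the tangential distortion parameters. Here, $\tilde{\mathbf{x}} = (\tilde{x}, \tilde{y})^T$ are the real (distorted) normalized point coordinates and $r^2 = x_n^2 + y_n^2$.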
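As an illustration, the projection can be sketched in Python with NumPy. This is a minimal sketch and not the implementation used for the experiments: the function name `project` and the parameter ordering `k = (k1, k2, k3, k4, k5)` are assumptions made for this example, and the distortion is assumed to follow the common polynomial model with (k1, k2, k5) as radial and (k3, k4) as tangential coefficients.

```python
import numpy as np

def project(X, fx, fy, px, py, k=(0.0, 0.0, 0.0, 0.0, 0.0)):
    """Project 3D points (N, 3) in camera coordinates to pixel coordinates (N, 2).

    k = (k1, k2, k3, k4, k5) is a hypothetical parameter layout:
    k1, k2, k5 radial and k3, k4 tangential coefficients.
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    k1, k2, k3, k4, k5 = k
    xn = X[:, 0] / X[:, 2]  # normalized pinhole coordinates
    yn = X[:, 1] / X[:, 2]
    r2 = xn**2 + yn**2
    radial = 1.0 + k1 * r2 + k2 * r2**2 + k5 * r2**3
    dx = 2.0 * k3 * xn * yn + k4 * (r2 + 2.0 * xn**2)  # tangential terms
    dy = k3 * (r2 + 2.0 * yn**2) + 2.0 * k4 * xn * yn
    xd = radial * xn + dx
    yd = radial * yn + dy
    return np.column_stack((fx * xd + px, fy * yd + py))
```

With all distortion coefficients set to zero the function reduces to the plain pinhole projection.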
We calibrate the camera using Zhang's method (Zhang, 2000) with a 3D calibration fixture with AprilTags (Olson, 2011) as fiducial markers. This has the advantage that calibration points are extracted automatically even if only part of the structure is visible in the image. In general, a larger calibration structure is beneficial since it can be detected over larger distances, which allows one to take calibration data in the whole measurement range.
After performing laser line extraction we undistort all image coordinates of the detected line points. Therefore, we do not have to consider the distortion parameters during the 3D reconstruction step, which simplifies the equations presented in the following sections.
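Undistorting the detected line points requires inverting the distortion model, which has no closed form. One common option, assumed here for illustration and not necessarily the procedure used in this work, is a fixed-point iteration on the normalized coordinates. The function names and the parameter layout `k = (k1, k2, k3, k4, k5)` are hypothetical.

```python
import numpy as np

def _distortion_terms(xn, k):
    """Radial factor and tangential offset for normalized points (N, 2)."""
    k1, k2, k3, k4, k5 = k
    x, y = xn[:, 0], xn[:, 1]
    r2 = x**2 + y**2
    radial = 1.0 + k1 * r2 + k2 * r2**2 + k5 * r2**3
    dx = 2.0 * k3 * x * y + k4 * (r2 + 2.0 * x**2)
    dy = k3 * (r2 + 2.0 * y**2) + 2.0 * k4 * x * y
    return radial, np.column_stack((dx, dy))

def distort(xn, k):
    """Apply the assumed distortion model to normalized points (N, 2)."""
    radial, d = _distortion_terms(xn, k)
    return radial[:, None] * xn + d

def undistort_pixels(pixels, fx, fy, px, py, k, iterations=20):
    """Map distorted pixel coordinates to ideal pinhole pixel coordinates."""
    pixels = np.atleast_2d(np.asarray(pixels, dtype=float))
    xd = np.column_stack(((pixels[:, 0] - px) / fx, (pixels[:, 1] - py) / fy))
    xn = xd.copy()  # initial guess: no distortion
    for _ in range(iterations):
        radial, d = _distortion_terms(xn, k)
        xn = (xd - d) / radial[:, None]  # fixed-point update
    return np.column_stack((fx * xn[:, 0] + px, fy * xn[:, 1] + py))
```

The iteration converges quickly for moderate distortion; for strongly distorted wide angle lenses more iterations or a different solver may be needed.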

[Figure 2 labels: Object, Image Plane, Laser Planes, Laser Lines, Optical Center, Cross Laser Projector]
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W8, 2017. 5th International Workshop LowCost 3D - Sensors, Algorithms, Applications, 28-29 November 2017, Hamburg, Germany.

Laser Line Extraction
A simple approach to extracting laser lines from an image is to use maximum detection along horizontal or vertical scanlines in the image. However, this requires a high contrast between the bright pixels of the laser line and the background. Furthermore, in the case of uncalibrated scanning this does not work in all cases since the orientation of the laser line in the image is arbitrary and no clear predominant direction exists. Therefore, we employ a ridge detector for the extraction of the laser lines in the image. For this work we apply Steger's line algorithm (Steger, 1998) since it is very robust and traces the center of the lines with sub-pixel accuracy.
The idea of this algorithm is to find curves in the image that have, in the direction perpendicular to the line, a characteristic 1D line profile, i.e., a vanishing gradient and high curvature. We apply the line detector to a gray image created by averaging the color channels. If scans are captured in strong ambient illumination, we apply background subtraction to make the laser lines easier to distinguish from the background.
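The direction of the line in the two-dimensional image is estimated locally by computing the eigenvalues and eigenvectors of the Hessian matrix

$$H(x, y) = \begin{pmatrix} \frac{\partial^2 g_\sigma(x, y)}{\partial x^2} & \frac{\partial^2 g_\sigma(x, y)}{\partial x \partial y} \\ \frac{\partial^2 g_\sigma(x, y)}{\partial y \partial x} & \frac{\partial^2 g_\sigma(x, y)}{\partial y^2} \end{pmatrix} * I(x, y) = \begin{pmatrix} r_{xx} & r_{xy} \\ r_{yx} & r_{yy} \end{pmatrix} \tag{4}$$

where $g_\sigma(x, y)$ is the 2D Gaussian kernel with standard deviation $\sigma$, $I(x, y)$ is the image and $r_{xx}$, $r_{xy}$, $r_{yx}$, $r_{yy}$ are the partial derivatives. The direction perpendicular to the line is the eigenvector $(n_x, n_y)^T$ with $\lVert (n_x, n_y)^T \rVert_2 = 1$ corresponding to the eigenvalue with the largest absolute value. For bright lines the eigenvalue needs to be smaller than zero.
Instead of searching directly for the zero-crossing, a second-order Taylor expansion is employed to determine the location $(q_x, q_y)^T$ where the first derivative in the direction perpendicular to the line vanishes with sub-pixel accuracy:

$$(q_x, q_y)^T = (t n_x, t n_y)^T \tag{5}$$

where

$$t = -\frac{r_x n_x + r_y n_y}{r_{xx} n_x^2 + 2 r_{xy} n_x n_y + r_{yy} n_y^2}. \tag{6}$$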
For valid line points the position must lie within the current pixel. Therefore, (qx, qy) ∈ [−0.5, 0.5] × [−0.5, 0.5] is required. Individual points are then linked together to line segments. This is done by choosing starting points with high responses and tracing along the detected ridge points to form line segments until all detected ridge points have been processed. Double responses are explicitly detected and removed from the final output.
The response of the ridge detector, given by the value of the maximum absolute eigenvalue, is a good indicator for the saliency of the extracted line points. Only line points with a sufficiently high response are considered.
To distinguish between the two laser lines we use the color information and apply thresholds in the HSV color space. This is implemented using look-up tables to speed up color segmentation. We apply only very low thresholds for saturation and brightness of the laser line, since depending on the object surface the laser lines can be barely visible and appear desaturated in the image.
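The per-pixel part of the ridge detection described above can be sketched as follows. This is a simplified illustration in the spirit of Steger's algorithm and not the implementation used in this work: it omits linking and double-response removal, uses zero padding at the image border, and the function names, the kernel radius of 4σ and the `min_response` parameter are assumptions of this example.

```python
import numpy as np

def gaussian_kernels(sigma):
    """Sampled 1D Gaussian and its first and second derivatives."""
    radius = int(np.ceil(4.0 * sigma))
    u = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-u**2 / (2.0 * sigma**2))
    g /= g.sum()
    return g, -u / sigma**2 * g, (u**2 - sigma**2) / sigma**4 * g

def separable_filter(img, kernel_x, kernel_y):
    """Convolve rows with kernel_x and columns with kernel_y (zero padding).

    The image must be at least as large as the kernels in both directions.
    """
    tmp = np.apply_along_axis(np.convolve, 1, img, kernel_x, mode="same")
    return np.apply_along_axis(np.convolve, 0, tmp, kernel_y, mode="same")

def ridge_points(img, sigma=2.0, min_response=0.0):
    """Return (x, y, response) for bright ridge points with sub-pixel position."""
    img = np.asarray(img, dtype=float)
    g, g1, g2 = gaussian_kernels(sigma)
    rx = separable_filter(img, g1, g)
    ry = separable_filter(img, g, g1)
    rxx = separable_filter(img, g2, g)
    rxy = separable_filter(img, g1, g1)
    ryy = separable_filter(img, g, g2)
    points = []
    for y, x in np.ndindex(img.shape):
        H = np.array([[rxx[y, x], rxy[y, x]], [rxy[y, x], ryy[y, x]]])
        w, v = np.linalg.eigh(H)
        i = int(np.argmax(np.abs(w)))
        if w[i] >= -min_response:  # bright lines need a strongly negative eigenvalue
            continue
        nx, ny = v[:, i]  # unit vector perpendicular to the line
        denom = rxx[y, x] * nx**2 + 2.0 * rxy[y, x] * nx * ny + ryy[y, x] * ny**2
        t = -(rx[y, x] * nx + ry[y, x] * ny) / denom
        qx, qy = t * nx, t * ny
        if abs(qx) <= 0.5 and abs(qy) <= 0.5:  # extremum lies inside this pixel
            points.append((x + qx, y + qy, -w[i]))
    return points
```

The explicit loop over all pixels keeps the sketch close to the equations; a practical implementation would vectorize the eigen-decomposition.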

3D Reconstruction Using Light Section
If we know the parameters of the laser plane, we can find the 3D coordinates of the detected laser points by intersecting the image rays with the laser planes. A line laser can be considered as a tool to extract points on the image plane that are projections of object points that lie on the same plane in 3D space. We describe the laser plane $\pi_i$ using the general form

$$a_i X + b_i Y + c_i Z + 1 = 0 \tag{7}$$

where $(a_i, b_i, c_i)$ are the plane parameters and $\mathbf{X} = (X, Y, Z)^T$ is a point in world coordinates. Using the perspective camera model described in Eq. 1 this is expressed as

$$a_i \frac{x - p_x}{f_x} + b_i \frac{y - p_y}{f_y} + c_i + \frac{1}{Z} = 0 \tag{8}$$

where $\mathbf{x} = (x, y)^T$ are the image coordinates of the projection of $\mathbf{X}$ on the image plane, $\mathbf{p} = (p_x, p_y)^T$ is the principal point and $f_x$, $f_y$ are the respective focal lengths.
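If we know the plane and camera parameters, we can compute the coordinates of a 3D object point $\mathbf{X} = (X, Y, Z)^T$ on the plane from its projection on the image plane $\mathbf{x} = (x, y)^T$ by intersecting the camera ray with the laser plane:

$$Z = \frac{-1}{a_i \frac{x - p_x}{f_x} + b_i \frac{y - p_y}{f_y} + c_i}, \qquad X = Z\, \frac{x - p_x}{f_x}, \qquad Y = Z\, \frac{y - p_y}{f_y} \tag{9}$$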
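The ray-plane intersection can be illustrated with a short NumPy sketch. It assumes that the plane is written as a X + b Y + c Z + 1 = 0 and that the image coordinates are already undistorted; the function name `triangulate` is hypothetical.

```python
import numpy as np

def triangulate(points_px, plane, fx, fy, px, py):
    """Intersect camera rays through undistorted pixels (N, 2) with a laser plane.

    plane = (a, b, c) is assumed to satisfy a*X + b*Y + c*Z + 1 = 0.
    Returns the 3D points (N, 3) in camera coordinates.
    """
    a, b, c = plane
    pts = np.atleast_2d(np.asarray(points_px, dtype=float))
    u = (pts[:, 0] - px) / fx  # ray direction is (u, v, 1)
    v = (pts[:, 1] - py) / fy
    Z = -1.0 / (a * u + b * v + c)
    return np.column_stack((u * Z, v * Z, Z))
```

For example, the plane Z = 2 corresponds to (a, b, c) = (0, 0, -0.5), so every pixel is reconstructed at depth 2. Rays that are nearly parallel to the plane lead to a denominator close to zero and should be discarded in practice.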

Self-calibration Approach
Self-calibration in this work is considered as the problem of estimating the parameters of all observed laser planes. From the recorded sequence of images the laser curves are extracted as polygonal chains. We find the points that exist on multiple laser planes by intersecting the polylines. This computation is accelerated by spatial sorting, such that only line segments that possibly intersect are tested for intersections. Moreover, we simplify the polylines to reduce the number of line segments. However, we need to do this with a very low threshold (less than half a pixel) in order not to degrade the accuracy of the extracted intersection positions.
The plane parameters are estimated in a two-step process based on the approach described in (Furukawa and Kawasaki, 2009). First, by solving a linear system of coplanarity constraints the laser planes are reconstructed up to 4-DOF indeterminacy. Second, further indeterminacies are recovered from the orthogonality constraints between laser planes in the cross configuration in a non-linear optimization.
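Intersecting two polylines can be sketched as follows. For brevity this example tests all segment pairs and only uses a bounding-box check to skip pairs that cannot intersect; it does not implement the spatial sorting mentioned above. The function names are hypothetical.

```python
import numpy as np

def segment_intersection(p0, p1, q0, q1, eps=1e-12):
    """Return the intersection point of segments p0-p1 and q0-q1, or None."""
    p0, p1, q0, q1 = (np.asarray(v, dtype=float) for v in (p0, p1, q0, q1))
    d1, d2 = p1 - p0, q1 - q0
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < eps:  # parallel or degenerate segments
        return None
    w = q0 - p0
    s = (w[0] * d2[1] - w[1] * d2[0]) / denom  # position along p0-p1
    t = (w[0] * d1[1] - w[1] * d1[0]) / denom  # position along q0-q1
    if 0.0 <= s <= 1.0 and 0.0 <= t <= 1.0:
        return p0 + s * d1
    return None

def polyline_intersections(poly_a, poly_b):
    """Return all intersection points between two polylines given as (N, 2) arrays."""
    a = np.asarray(poly_a, dtype=float)
    b = np.asarray(poly_b, dtype=float)
    hits = []
    for p0, p1 in zip(a[:-1], a[1:]):
        lo, hi = np.minimum(p0, p1), np.maximum(p0, p1)
        for q0, q1 in zip(b[:-1], b[1:]):
            # Skip pairs whose bounding boxes do not overlap.
            if np.any(np.maximum(q0, q1) < lo) or np.any(np.minimum(q0, q1) > hi):
                continue
            hit = segment_intersection(p0, p1, q0, q1)
            if hit is not None:
                hits.append(hit)
    return hits
```

An intersection that falls exactly on a shared vertex of two consecutive segments is reported once per segment, so duplicates would need to be merged in practice.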
First, using Eq. 8 the coplanarity constraint between two laser planes $\pi_i$ and $\pi_j$ is expressed in the perspective system of the camera for an intersection point $\mathbf{x}_{ij} = (x_{ij}, y_{ij})^T$ as

$$(a_i - a_j) \frac{x_{ij} - p_x}{f_x} + (b_i - b_j) \frac{y_{ij} - p_y}{f_y} + (c_i - c_j) = 0. \tag{10}$$

We combine these linear equations in a homogeneous linear system:

$$A \mathbf{v} = \mathbf{0} \tag{11}$$

where $\mathbf{v} = (a_1, b_1, c_1, \ldots, a_N, b_N, c_N)^T$ is the combined vector of the planes' parameters and $A$ is a matrix whose rows contain $\pm (x_{ij} - p_x) f_x^{-1}$, $\pm (y_{ij} - p_y) f_y^{-1}$ and $\pm 1$ at the appropriate columns to form the linear equations of Eq. 10.
This problem has a trivial solution for $\mathbf{v}$, which is the zero vector. Therefore, we solve the system under the constraint $\lVert \mathbf{v} \rVert = 1$ using Singular Value Decomposition (SVD). If the system is solvable and it is not a degenerate condition, we obtain the perspective solution of the plane parameters $(a_p, b_p, c_p)$ with 4-DOF indeterminacy. The solution is represented by an arbitrary offset $\mathbf{o}$ and an arbitrary scale $s$:

$$(a, b, c) = s\,(a_p, b_p, c_p) + \mathbf{o}. \tag{12}$$

A degenerate condition can be caused, e.g., by planes with only collinear intersection points. For example, consider the intersection points one, two and three (visualized by circles) in Fig. 3 that lie on the dashed curve. These intersections are collinear on the image plane, which means that they lie in the same plane in 3D space. In this case the plane parameters of the dashed curve cannot be recovered even if all other planes are determined. Therefore, it is necessary to remove these curves before trying to solve the linear system.
The described problem does not only occur with single laser curves but also with groups of curves. For example, consider the solid lines in Fig. 3 as one group and the dashed and dotted lines as a second group. These two groups are only tied together by the intersection points four, five and six, which are collinear. In this case the group of dashed and dotted planes has indeterminacies even if all solid planes are determined.
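The linear step can be sketched with NumPy as follows. Each intersection contributes one row with the entries ±(x − px)/fx, ±(y − py)/fy and ±1 in the columns of the two planes involved. Adding the same offset to all planes never changes a row's residual, so this sketch removes the three offset directions before the SVD in order to obtain a unique unit vector. That projection step, the function name `solve_coplanarity` and the tuple layout `(i, j, x, y)` are assumptions of this example and not taken from the original method.

```python
import numpy as np

def solve_coplanarity(intersections, n_planes, fx, fy, px, py):
    """Estimate plane parameters up to scale and offset from curve intersections.

    intersections: iterable of (i, j, x, y), the undistorted pixel (x, y)
    where the curves of plane i and plane j intersect.
    Returns (planes, singular_values); planes has shape (n_planes, 3).
    """
    intersections = list(intersections)
    A = np.zeros((len(intersections), 3 * n_planes))
    for row, (i, j, x, y) in enumerate(intersections):
        m = np.array([(x - px) / fx, (y - py) / fy, 1.0])
        A[row, 3 * i:3 * i + 3] = m
        A[row, 3 * j:3 * j + 3] = -m
    # Offset directions: the same (a, b, c) added to every plane.
    T = np.tile(np.eye(3), (n_planes, 1))
    _, _, vt = np.linalg.svd(T.T)
    B = vt[3:].T  # orthonormal basis of the complement of the offsets
    _, s, wt = np.linalg.svd(A @ B)
    v = B @ wt[-1]  # unit vector minimizing |A v| without offset component
    return v.reshape(n_planes, 3), s
```

The returned planes play the role of the perspective solution: the true parameters follow after applying a common scale and adding a common offset. Inspecting the smallest singular values gives a hint whether the chosen subset of planes is degenerate.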
The second step is finding all plane parameters up to scale by solving a non-linear optimization problem. With the cross line laser configuration we obtain an additional orthogonality constraint between the two laser planes of each cross, which is used to recover the offset vector. The offset is computed such that the error of the orthogonality constraints is minimized. We find the offset vector $\hat{\mathbf{o}}$ that minimizes the sum of the squared inner products between the normals of the planes in the set $C = \{(i, j) \mid \pi_i \perp \pi_j\}$ of orthogonal laser planes:

$$\hat{\mathbf{o}} = \arg\min_{\mathbf{o}} \sum_{(i, j) \in C} \left( \mathbf{n}_i(\mathbf{o})^T \mathbf{n}_j(\mathbf{o}) \right)^2 \tag{13}$$

where $\mathbf{n}_i(\mathbf{o})$ is the normal of plane $\pi_i$ computed from the plane parameters and offset vector. The scale cannot be recovered with only two laser planes and needs to be estimated from other measurements, such as a known distance in the scene.
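A possible way to carry out this optimization is a damped Gauss-Newton iteration with a numerical Jacobian, sketched below. The solver choice, the function names and all default values are assumptions of this example; any general non-linear least-squares solver could be used instead. Since the normal direction does not change when all parameters are multiplied by a common factor, the scale is fixed to one here.

```python
import numpy as np

def orthogonality_residuals(offset, planes, pairs):
    """Inner products of the unit normals of all plane pairs that should be orthogonal."""
    n = np.asarray(planes, dtype=float) + np.asarray(offset, dtype=float)
    n = n / np.linalg.norm(n, axis=1, keepdims=True)
    return np.array([n[i] @ n[j] for i, j in pairs])

def refine_offset(planes, pairs, offset0, iterations=50, damping=1e-3, h=1e-6):
    """Search the offset that minimizes the sum of squared inner products."""
    o = np.asarray(offset0, dtype=float).copy()
    r = orthogonality_residuals(o, planes, pairs)
    for _ in range(iterations):
        # Forward-difference Jacobian with respect to the three offset components.
        J = np.column_stack([
            (orthogonality_residuals(o + h * e, planes, pairs) - r) / h
            for e in np.eye(3)
        ])
        step = np.linalg.solve(J.T @ J + damping * np.eye(3), -J.T @ r)
        r_new = orthogonality_residuals(o + step, planes, pairs)
        if r_new @ r_new < r @ r:  # accept only steps that reduce the cost
            o, r = o + step, r_new
            damping = max(damping * 0.5, 1e-9)
        else:
            damping *= 10.0
    return o, r
```

Because steps are only accepted when they reduce the cost, the result is never worse than the initial guess, but convergence to the global minimum is not guaranteed and depends on the starting value.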
We only use a subset of the laser planes to solve for the plane parameters. The other planes are then reconstructed by fitting a plane to the intersection points with planes of the already solved subset of laser planes. Although it is possible to compute a 3D reconstruction with fewer planes, we found empirically that 100 to 200 planes are necessary to find a robust solution.
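Fitting a plane to reconstructed intersection points can be written as a small linear least-squares problem. The sketch below assumes the plane form a X + b Y + c Z + 1 = 0, which cannot represent planes through the camera center; the function name `fit_plane` is hypothetical.

```python
import numpy as np

def fit_plane(points):
    """Least-squares fit of (a, b, c) with a*X + b*Y + c*Z + 1 = 0 to 3D points (N, 3)."""
    pts = np.asarray(points, dtype=float)
    rhs = -np.ones(len(pts))
    params, *_ = np.linalg.lstsq(pts, rhs, rcond=None)
    return params
```

At least three non-collinear points are needed for a unique fit, which is one reason why curves with only collinear intersection points cannot be recovered this way.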

Computing Robust Laser Plane Estimates
In order to choose a solvable subset of planes we remove all planes that have only collinear intersection points, which we test using principal component analysis (PCA). Moreover, we apply heuristics to select planes that have distinct orientations and positions in the image. To do this we pick planes spread apart in time, since consecutive image frames are very similar if we move the projector slowly. Additionally, we reject planes that have more than one intersection with each other. There are situations where it is valid that two planes have multiple intersections. However, in practice this mostly happens for almost identical planes or due to erroneous or noisy intersection detections. We address the problem of erroneous intersections by explicitly detecting these outliers among the intersections and labeling them as invalid. To do this we compute the intersection point in 3D space using the plane parameters of both of the intersecting planes. Since the plane parameters are noisy there is an error between the two computed point positions. For each laser curve we label as invalid all intersections whose error exceeds a threshold based on the median error of all intersections of that particular curve. Then we recompute the self-calibration without taking these invalid intersection constraints into account.
Moreover, the line points associated with a reflection cannot be reconstructed correctly, because they do not lie in the original laser plane. These reflections are typically detected as distinct line segments by the laser line extraction algorithm. Removing line segments that have only invalid intersection points is an effective technique for removing these erroneous points from the final point cloud result.
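The collinearity test and the outlier labeling can be sketched as follows. The singular value ratio used for the PCA test, the multiplicative `factor` on the median and the plane form a X + b Y + c Z + 1 = 0 are assumptions of this example, since the exact thresholds are not specified here; all function names are hypothetical.

```python
import numpy as np

def is_collinear(points, ratio_threshold=1e-3):
    """PCA test: True if the 2D points (N, 2) lie approximately on one line."""
    pts = np.asarray(points, dtype=float)
    if len(pts) < 3:
        return True
    s = np.linalg.svd(pts - pts.mean(axis=0), compute_uv=False)
    return bool(s[0] == 0.0 or s[1] / s[0] < ratio_threshold)

def intersection_error(pixel, plane_i, plane_j, fx, fy, px, py):
    """Distance between the 3D points obtained from the two intersecting planes."""
    m = np.array([(pixel[0] - px) / fx, (pixel[1] - py) / fy, 1.0])
    z_i = -1.0 / (np.asarray(plane_i, dtype=float) @ m)
    z_j = -1.0 / (np.asarray(plane_j, dtype=float) @ m)
    return abs(z_i - z_j) * np.linalg.norm(m)

def flag_outliers(errors, factor=3.0):
    """Label intersections of one curve whose error exceeds factor * median."""
    errors = np.asarray(errors, dtype=float)
    return errors > factor * np.median(errors)
```

A curve whose intersections are all flagged would then be treated like a curve without valid constraints.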
In general, it is difficult to verify that a valid 3D reconstruction is found. We cannot discern if the solution is good or bad by only looking at the residuals of Eq. 11 and Eq. 13 since we are optimizing for these values and they are expected to be small. Therefore, we look at the error of the planes that we did not include in the plane parameter optimization step. Specifically, we compute the root-mean-square angular error for all orthogonal laser planes.
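The root-mean-square angular error can be computed from the deviation of the angle between each pair of cross planes from 90 degrees. The following NumPy sketch illustrates this; the function name and the choice of degrees as unit are assumptions of this example.

```python
import numpy as np

def rms_orthogonality_error_deg(planes, pairs):
    """RMS deviation from 90 degrees of the angle between paired plane normals."""
    n = np.asarray(planes, dtype=float)
    n = n / np.linalg.norm(n, axis=1, keepdims=True)
    cosines = np.array([n[i] @ n[j] for i, j in pairs])
    angles = np.degrees(np.arccos(np.clip(cosines, -1.0, 1.0)))
    return float(np.sqrt(np.mean((angles - 90.0) ** 2)))
```

Evaluating this measure on planes that were not part of the optimized subset gives an independent check of the solution.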
A disadvantage of the presented method compared to other structured light approaches, e.g., gray code projector based systems, is that a high number of images is necessary since only two lines can be reconstructed from a single image. However, in order to achieve real-time reconstruction it is only feasible to compute self-calibration for a subset of the detected laser lines in the images. Moreover, not all laser lines are estimated directly using the proposed method, such as curves with collinear intersections. All other planes are only determined by the intersection points with the subset of solved laser planes.
In future work this initial solution could be further improved by iteratively optimizing an error function which takes the constraints of all planes into account. This is computationally less expensive than directly solving all constraints. However, first experiments show that the influence of noisy detections skews the solution.
Possibly, the influence of these noisy or inaccurate intersections, which do not agree well with the computed estimate, can be reduced by applying a loss function that weights intersection constraints based on the distance from the computed plane solution.
For the experiments a consumer camera with an APS-C sized sensor was used in video mode. The scenes were captured with a wide angle lens with a focal length of 16 mm. We record video at 30 fps in Full-HD resolution (1920 x 1080 pixels). A high shutter speed is beneficial since we move the laser by hand.

The projector is built from two line lasers (450 nm and 520 nm) with a fan angle of 90° and an adjustable output power of up to 40 mW. The lasers were mounted in a custom-built frame constructed from laser cut wood as depicted in Fig. 6. It is also possible to use diffractive optical elements (DOE) to project a cross with a single laser. However, line lasers typically emit a significantly thinner line, which improves accuracy, and using two lasers with different colors simplifies the separation of the two laser curves in the image. We set the laser focus such that the line is as thin as possible over the whole depth range of the scene.

CONCLUSIONS
In this paper we investigated how uncalibrated structured light scanning using line lasers can be applied in the presence of noisy detections. We showed how outliers, e.g., from reflections, are detected and removed from the computation of self-calibration in order to improve the results. Choosing good parameters for the reconstruction step is challenging because the scale of the scene is unknown. Scaling the parameters, e.g., by the depth range of the reconstructed point cloud, is not possible for all values. Therefore, automatically determining good parameters remains to be investigated in future work.

Figure 2: Configuration of the cross line laser projector and a fixed calibrated camera.

Figure 4: Reflections of the laser line due to glossy surfaces and resulting distorted point cloud reconstruction.
Incorrect intersection constraints are problematic for the self-calibration. This problem occurs quite often when scanning glossy surfaces. Depending on the incidence angle of the laser light, reflections are visible as depicted in the top image in Fig. 4. It is not always possible to reduce this effect by discarding detected laser lines with low brightness because this also discards lines on darker surfaces. As a result, additional line segments are detected that form erroneous intersections with other laser curves. This significantly distorts the 3D reconstruction as shown in the bottom image in Fig. 4.

Figure 5: Subset of the extracted laser lines on the left and reconstructed point cloud on the right.
Fig. 5 shows examples of the achievable results. On the left a subset of the extracted laser lines is depicted. On the right the final reconstructed point cloud is shown with RGB color mapped to the points. The top scene showing the table tennis balls and Lego bricks was reconstructed from 4 minutes of video with 9,065 valid laser planes detected. The final point cloud created from all valid laser curves has a size of 5,522,983 points. The bottom scene showing the hand was reconstructed from 3.5 minutes of video with 7,817 valid laser planes detected. The final point cloud created from all valid laser curves has a size of 11,613,200 points. For both examples the self-calibration was computed using a subset of 400 laser curves.

Figure 6: Cross line laser projector in wooden frame employed for the experiments.

An exposure time in the range of 5 ms to 10 ms was used in the experiments in order to reduce motion blur.