On Object Extraction Using Airborne Laser Scanner Data and Digital Images for 3D Modelling

Airborne laser scanners are effective at capturing micro-topography and the ground surface under trees, which cannot be detected by aerial photogrammetry, and are suitable for many applications, such as city modelling, DTM generation, monitoring of electrical power lines, and detection of forest areas. The most remarkable aspect of these systems is their ability to acquire the 3D coordinates of a huge number of object points in real time. There are many studies on object extraction using point clouds from airborne laser scanner data, where the recoverable shape of an object depends on the point density. However, such extraction generally yields only rough shapes or fitted geometric shapes. It is difficult to reconstruct detailed object shapes without many edge points, even if high-density point clouds are obtained. On the other hand, detailed object edges can be acquired from digital camera images if the airborne laser scanner system is equipped with a digital camera. The procedures investigated in this paper for improving rough object shapes obtained from airborne laser scanner data are as follows. Firstly, camera calibration is performed to integrate point clouds and digital images by simultaneous adjustment, such as bundle adjustment with self-calibration, using distance data taken directly from the airborne laser scanner data. Secondly, the rough 3D object shape is extracted from the point cloud using normal vectors. Moreover, visualization of normal vectors is used for operator interpretation. Thirdly, the rough 3D object shape is converted into the image coordinates of multiple images by a collinearity condition. The 2D coordinates of detailed image shapes are acquired using characteristic image quantities from around the rough shape. Finally, the detailed 3D shape is computed using the spatial intersection of the 2D coordinates of detailed shapes and the orientation parameters.
This paper describes fundamental studies for extracting object shapes for 3D modelling using airborne laser scanner data and digital images.


INTRODUCTION
Approximately 1200 GPS-based control stations have been established by the Geospatial Information Authority of Japan (GSI). This infrastructure was used in various fields to observe the wide-area crustal deformation caused by "The 2011 off the Pacific coast of Tohoku Earthquake", which occurred on March 11, 2011. Airborne laser scanner (ALS) systems are also used after disasters for topographic surveys that rely on GPS-based control stations. The ALS system has the advantage of acquiring detailed terrain data; however, objects such as buildings are generally represented only as rough shapes by the discretely sampled point clouds. There are many studies that use images to improve rough shapes. For example, Hu et al. (2004) demonstrated the extraction of buildings from LIDAR data and used edges extracted from high-resolution aerial images to refine the accuracy of the laser data model. This approach is limited to buildings with primitive models. Chen et al. (2005) performed building reconstruction using aerial orthoimages and airborne laser scanner data; however, there were issues with the use and creation of orthoimages.
On the other hand, it is possible to acquire digital images if the ALS system is equipped with a digital camera. Such a digital camera is almost always non-metric and needs camera calibration for accurate three-dimensional measurements. When a non-metric digital camera is used, its interior orientation parameters are generally computed beforehand using a test sheet or test target. However, if the digital camera is operated in severe conditions, e.g. high altitudes and low temperatures, camera calibration should be performed sequentially. The authors have been concentrating on developing a practical 3D measurement system for close-range photogrammetry using consumer-grade digital cameras. The Image Based Integrated Measurement (IBIM) system is our photogrammetric system, which uses digital cameras and a hand-held laser distance meter (Nakano and Chikatsu, 2010). In this system, the unknown orientation parameters of the triplet images and the pseudo-GCPs are simultaneously calculated using the collinearity condition, distance condition, and geometric constraint condition. It is possible to integrate point clouds and digital images by applying the concept of the IBIM system to the camera calibration of an ALS system. With this motive, simultaneous adjustments, such as bundle adjustments with self-calibration, are proposed in this paper so that the exterior orientation parameters obtained from the GNSS/IMU system, the distances and 3D object coordinates acquired from the laser scanner, and the interior orientation parameters are simultaneously adjusted. Combined block adjustments were proposed in the late 20th century (Ackermann et al., 1972, EL-Hakim & Faig, 1981, Chikatsu et al., 1988). The proposed adjustment is expected to enable the use of the airborne laser scanner in generating large-scale maps and to make aerial photogrammetry more efficient by removing the need for geodetic data such as ground control points and for aerial triangulation. Therefore, this paper uses calibration of non-metric digital cameras to integrate point clouds and digital images.
The object extraction procedures using ALS data and digital images are performed in three steps.
1) A rough 3D object shape is extracted using a normal vector map that is created from a TIN generated from the point clouds. Visualization of normal vectors is useful for operator interpretation.
2) The rough object shapes are converted into the image coordinates of multiple images by a collinearity condition. The 2D coordinates of detailed image shapes are acquired using image characteristics from around the rough shape.
3) The detailed 3D shape is computed using the spatial intersection of the detailed 2D shape coordinates and the orientation parameters.
A flowchart of the object extraction procedure is shown in Figure 1.
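Step 3 relies on spatial intersection. As a minimal sketch of the idea, assuming only two images and assuming each detailed 2D point has already been turned into a ray (a perspective center plus a direction in object space), the 3D point can be taken as the midpoint of the shortest segment joining the two rays. The function name `intersect_rays` and the numeric values are hypothetical and are not taken from the paper, which uses triplet images and a full adjustment.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Sequence[float], b: Sequence[float]) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def intersect_rays(c1: Vec3, d1: Vec3, c2: Vec3, d2: Vec3) -> Vec3:
    """Return the midpoint of the shortest segment between the rays
    c1 + t1*d1 and c2 + t2*d2 (least-squares intersection of two rays)."""
    w = _sub(c1, c2)
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; no unique intersection")
    # Parameters of the closest points on each ray.
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    p1 = tuple(c1[k] + t1 * d1[k] for k in range(3))
    p2 = tuple(c2[k] + t2 * d2[k] for k in range(3))
    return tuple((p1[k] + p2[k]) / 2.0 for k in range(3))


# Hypothetical example: two perspective centers 50 m apart at 100 m height,
# both observing the object point (10, 20, 5).
point = intersect_rays((0.0, 0.0, 100.0), (10.0, 20.0, -95.0),
                       (50.0, 0.0, 100.0), (-40.0, 20.0, -95.0))
```

With three images, the same idea extends to a least-squares solution over all rays, which adds redundancy against mismatched edges.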

CAMERA CALIBRATION
The authors have been concentrating on developing a close-range measurement system for consumer-grade digital cameras using triplet images (Chikatsu et al., 2006). The measurement system was adopted for digital aerial photogrammetry in this paper because triplet images have the following characteristics.
- Triplet images have advantages in generating stereo pairs.
- Triplet images have the flexibility for multiple images.
- Triplet images have the ability to increase geometric restriction.
Moreover, the IBIM system, on which the camera calibration concept is based, is characterized by a distance condition and also uses pseudo ground control points (GCPs), which are virtual points. Figure 2 shows the measurement concept used in this paper. On the other hand, lens distortion is the most important interior orientation parameter, and many distortion models have been proposed (Brown, 1971, Murai, Matsuoka, Okuda, 1984). This paper uses Brown's 1971 model, which takes the radial polynomial up to the 7th degree and the tangential distortion into account, and which has been widely used in close-range photogrammetry.
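For reference, a commonly used form of Brown's model with three radial coefficients (K_1, K_2, K_3) and two tangential coefficients (P_1, P_2) is sketched below. It is a standard textbook form, not a copy of the authors' printed equation: the reduced coordinates \bar{x}, \bar{y} and the ordering of P_1 and P_2 are assumptions, and conventions differ between implementations. The term \bar{x} K_3 r^6 is the 7th-degree radial term mentioned in the text.

```latex
\begin{aligned}
\bar{x} &= x - u_0, \qquad \bar{y} = y - v_0, \qquad r^2 = \bar{x}^2 + \bar{y}^2 \\
\Delta x &= \bar{x}\,\bigl(K_1 r^2 + K_2 r^4 + K_3 r^6\bigr)
          + P_1\,\bigl(r^2 + 2\bar{x}^2\bigr) + 2 P_2\,\bar{x}\,\bar{y} \\
\Delta y &= \bar{y}\,\bigl(K_1 r^2 + K_2 r^4 + K_3 r^6\bigr)
          + 2 P_1\,\bar{x}\,\bar{y} + P_2\,\bigl(r^2 + 2\bar{y}^2\bigr)
\end{aligned}
```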

The unknown parameters are the exterior parameters (X_0, Y_0, Z_0, ω, φ, κ) and the interior parameters (f [focal length], u_0, v_0 [principal point], a, b [scale factor, shear factor], K_1, K_2, K_3, P_1, P_2 [lens distortion]) of the multiple images, and the object coordinates (X_i, Y_i, Z_i) of the pseudo-GCPs. These unknown parameters are simultaneously calculated by the collinearity condition, distance condition, and geometric constraint condition under the local coordinate system. Here, the collinearity condition is given by Equation (2) and the distance condition by Equation (3), where
D = distance from exposure station to pseudo-GCP
X, Y, Z = pseudo-GCP object coordinates
X_0, Y_0, Z_0 = perspective center
ΔX, ΔY, ΔZ = differences between the laser scanner irradiation point and perspective center

The trifocal tensor is a geometric relation between three images containing the same objects from different perspectives (Hartley, 1993). The trifocal tensor is expressed by three 3×3 matrices T_1, T_2, and T_3, whose components are t_1ij, t_2ij, and t_3ij, and the image coordinates of matched points in the three images are (x_1, y_1, z_1), (x_2, y_2, z_2), and (x_3, y_3, z_3). Equation (4) is obtained from this geometric relation. The image coordinate (x_2, y_2) is the intersection of two epipolar lines on the second image (centre image) and is calculated by Equation (5), which is derived from Equation (4). Therefore, the geometric constraint condition uses Equation (5).
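The three conditions are combined in a single function H that is minimized by least squares. The form below is a reconstruction under assumptions rather than the authors' printed equation: it assumes that the weights p_1 to p_4 apply to the collinearity, distance, geometric constraint, and pseudo-GCP coordinate terms in that order, and it leaves the summation ranges of the distance and centre-image terms open because the text does not state them.

```latex
H = p_1 \sum_{i=1}^{m} \sum_{j=1}^{n} \left( \Delta x_{ij}^{2} + \Delta y_{ij}^{2} \right)
  + p_2 \sum_{j} \Delta D_{j}^{2}
  + p_3 \sum \left( \Delta x_{c}^{2} + \Delta y_{c}^{2} \right)
  + p_4 \sum_{i=1}^{m} \left( \Delta X_{i}^{2} + \Delta Y_{i}^{2} + \Delta Z_{i}^{2} \right)
```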
where
Δx_ij, Δy_ij = residuals for image coordinates
ΔD_j = residuals for distance
Δx_c, Δy_c = residuals for the centre image of the geometric constraint condition
ΔX_i, ΔY_i, ΔZ_i = residuals for pseudo-GCP object coordinates
m = number of pseudo-GCPs
n = number of images
p_1, p_2, p_3, p_4 = weight for each condition

OBJECT EXTRACTION PROCEDURES
Object extraction was performed in two phases, an ALS data phase and an image phase. The detailed procedures are as follows.

Visualization of normal vectors
There have been many approaches to building extraction from ALS data since the late 1990s, such as using height data and normalized DSMs (obtained by subtracting the DTM from the DSM), region growing, slope maps, and normal vectors. In particular, normal vectors are used to obtain information on terrain or road surfaces, rooftops, trees, and so on. Normal vectors are calculated from a TIN generated from the random point clouds. For efficiency, the normal vectors are managed on a grid: the normal vectors falling within each grid cell are combined and normalized into a single value. A normal vector map was produced by assigning the X, Y, and Z components to the R, G, and B channels. The normal vector map is shown in Figure 3. The normal vector map indicates the shapes of houses as well as the slopes of roofs by colour gradation. Each component shows a different characteristic, e.g., the X and Y components appear as lighting from the north and east on the map (Figure 4) and the Z component shows object shapes (Figure 5).
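The construction of the normal vector map can be illustrated with a small sketch. It assumes that each TIN triangle is given by three (X, Y, Z) vertices, that normals are oriented upward, and that each component in [-1, 1] is mapped linearly to an 8-bit channel; the function names and this particular colour mapping are assumptions for illustration and are not taken from the paper.

```python
import math


def triangle_normal(p0, p1, p2):
    """Unit normal of one TIN triangle, flipped if needed so that z >= 0."""
    ux, uy, uz = (b - a for a, b in zip(p0, p1))
    vx, vy, vz = (b - a for a, b in zip(p0, p2))
    # Cross product of the two edge vectors.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("degenerate triangle")
    if nz < 0:
        nx, ny, nz = -nx, -ny, -nz
    return (nx / length, ny / length, nz / length)


def combine_normals(normals):
    """Combine the normals that fall in one grid cell and renormalize."""
    sx = sum(n[0] for n in normals)
    sy = sum(n[1] for n in normals)
    sz = sum(n[2] for n in normals)
    length = math.sqrt(sx * sx + sy * sy + sz * sz)
    if length == 0.0:
        raise ValueError("normals cancel out")
    return (sx / length, sy / length, sz / length)


def normal_to_rgb(n):
    """Map the X, Y, Z components from [-1, 1] to the R, G, B range 0..255."""
    return tuple(int(round((c + 1.0) * 127.5)) for c in n)


flat_ground = triangle_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))
sloped_roof = triangle_normal((0, 0, 0), (1, 0, 1), (0, 1, 0))
print(normal_to_rgb(flat_ground), normal_to_rgb(sloped_roof))
```

Under this mapping, horizontal surfaces share one colour, while roof planes with different slopes or aspects receive different colours, which is the colour gradation described in the text.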

Rough shape extraction
The 3D rough object shape is extracted from the normal vector map via image processing. The image processing procedures are as follows.
- Binarizing using Otsu's threshold method (Otsu, 1980)
- Noise reduction for small objects such as telegraph poles and cars
- Thinning and line tracing
Figure 6 shows the 3D rough shape resulting from the image processing.

Figure 6. 3D rough shape
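The first step of the list above, binarization with Otsu's method, can be sketched in plain Python as follows. The sketch assumes 8-bit integer pixel values supplied as a flat list and picks the threshold that maximizes the between-class variance; the function names are hypothetical, and a production workflow would normally use an image processing library instead.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance,
    where class 0 holds values <= t and class 1 holds values > t."""
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = 0          # number of pixels in class 0
    sum0 = 0.0      # sum of grey values in class 0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Label each pixel 1 if it exceeds the threshold, otherwise 0."""
    return [1 if v > threshold else 0 for v in pixels]


# Hypothetical two-class sample: dark background and bright roof pixels.
sample = [10] * 50 + [200] * 50
t = otsu_threshold(sample)
mask = binarize(sample, t)
```

Noise reduction and thinning would then operate on the resulting binary mask.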

Image shape extraction
The rough object shapes are converted into the image coordinates of multiple images by the collinearity conditions. Images around the object are clipped in order to limit edge extraction processing (Figure 7, Figure 8 (a, b)). The flight direction is from west to east and the clipped images are rotated 90 degrees, as shown in Figures 7 and 8. The Canny operator is used to extract object edges from the clipped digital images. The Canny operator result is shown in Figure 9(a). Note that the Canny operator result is not binarized at this stage so that the threshold can be chosen depending on the situation. The edge potential map (Figure 9(b)) was created according to the distance from the positions where an edge is expected to exist, that is, from the laser-derived rough shape points converted into image coordinates. Edge candidates (Figure 9(c)) were computed from the edge potential map and the Canny operator result using Otsu's threshold method and were then thinned. The thinned edge candidates are reliable; however, the edges of some objects are disconnected. Therefore, all edges are extracted with a minimum threshold value in order to connect the edges (Figure 9(d)). The endpoints of two edge candidates are connected when both are included in the same complete edge.
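The conversion of rough 3D shape points into image coordinates at the start of this step relies on the collinearity condition. The sketch below assumes one common photogrammetric convention for the rotation matrix built from ω, φ, κ and the standard form of the collinearity equations without lens distortion; the paper's own convention is not recoverable from the text, and the focal length of 0.06 m and the other numeric values are hypothetical.

```python
import math


def rotation_matrix(omega, phi, kappa):
    """Rotation matrix elements m_ij for angles in radians
    (sequential omega-phi-kappa rotation; assumed convention)."""
    so, co = math.sin(omega), math.cos(omega)
    sp, cp = math.sin(phi), math.cos(phi)
    sk, ck = math.sin(kappa), math.cos(kappa)
    return (
        (cp * ck, so * sp * ck + co * sk, -co * sp * ck + so * sk),
        (-cp * sk, -so * sp * sk + co * ck, co * sp * sk + so * ck),
        (sp, -so * cp, co * cp),
    )


def project(point, center, m, f):
    """Project object point (X, Y, Z) into image coordinates (x, y)
    for a camera with perspective center (X0, Y0, Z0) and focal length f."""
    dx, dy, dz = (p - c for p, c in zip(point, center))
    q = m[2][0] * dx + m[2][1] * dy + m[2][2] * dz
    if abs(q) < 1e-12:
        raise ValueError("point lies in the plane of the perspective center")
    x = -f * (m[0][0] * dx + m[0][1] * dy + m[0][2] * dz) / q
    y = -f * (m[1][0] * dx + m[1][1] * dy + m[1][2] * dz) / q
    return x, y


def project_shape(points, center, angles, f):
    """Project every vertex of a rough 3D shape into one image."""
    m = rotation_matrix(*angles)
    return [project(p, center, m, f) for p in points]


# Hypothetical vertical image taken 820 m above a flat roof outline.
roof = [(100.0, 50.0, 0.0), (110.0, 50.0, 0.0), (110.0, 60.0, 0.0)]
image_points = project_shape(roof, (0.0, 0.0, 820.0), (0.0, 0.0, 0.0), 0.06)
```

The resulting coordinates are metric image-plane coordinates; converting them to pixel positions for clipping would additionally use the interior orientation parameters.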
In this paper, an object is extracted from triplet images. Occlusion changes the shapes of objects in images with different perspective centers. Therefore, the intersection image was calculated using the triplet images in order to reduce the edge mismatch caused by occlusion. The intersection image is created from images that are affine-transformed using the converted rough-shape image coordinates. The intersection image is shown in Figure 10. A blunder mask is computed for each image using the blue part, which indicates the occlusion location and changes in the intersection image. Finally, a 3D model is created using the extracted edges and feature points, which are calculated using the collinearity conditions.

The test field had 117 GCPs, and seven GCPs could be utilized in the triplet area. The GCPs were obtained by static GPS observations at points set on the edges of road paint. Table 1 shows the data components used in this investigation and Figure 11 shows the centre image used in this investigation. It was taken at an altitude of about 820 m and the image scale is about 1/13,700. Therefore, the ground sampling distance (GSD) was about 12 cm.

It can be seen that the vertical coordinate value, Z, is better than the horizontal coordinate values because the vertical coordinates of the pseudo-GCPs are constrained by laser distances, which is a characteristic of the proposed method. On the other hand, it can be found that the relative calibration accuracy is less than 1/500 and more than 1/1000 in comparison with the permissible errors. It is consequently concluded that camera calibration using pseudo-GCPs is practical.
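The altitude, image scale, and GSD quoted above for the centre image are tied together by simple relations for a vertical image over flat terrain. The sketch below works backwards from the three published figures to the focal length and pixel pitch they imply; these two derived values are arithmetic consequences under that assumption (and under the assumption that 820 m is the height above ground), not specifications reported in the paper.

```python
def implied_focal_length(flying_height_m, scale_denominator):
    """Focal length implied by image scale = f / H for a vertical image."""
    return flying_height_m / scale_denominator


def implied_pixel_pitch(gsd_m, scale_denominator):
    """Sensor pixel pitch implied by GSD = pixel pitch * scale denominator."""
    return gsd_m / scale_denominator


flying_height_m = 820.0       # "about 820 m" in the text
scale_denominator = 13700.0   # image scale of about 1/13,700
gsd_m = 0.12                  # GSD of about 12 cm

focal_length_m = implied_focal_length(flying_height_m, scale_denominator)
pixel_pitch_m = implied_pixel_pitch(gsd_m, scale_denominator)

print(f"implied focal length: {focal_length_m * 1000:.1f} mm")
print(f"implied pixel pitch: {pixel_pitch_m * 1e6:.2f} micrometres")
```

Because the published figures are rounded, the derived values are only indicative. They are nevertheless useful for checking that a one-pixel pointing error corresponds to roughly one GSD on the ground.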

Object extraction
The object extraction procedures applied to the experimental data are described in Chapter 3. Object shape extraction using the conversion between object space and image space obtained from the camera calibration was successful. Figure 12 shows the 3D modelling of the object extraction result. The blue line is the shape obtained by manual plotting, the red dots are the rough shape obtained from the laser data, and the green line is the object extraction result. The object extraction procedures can create 3D models; however, they produce distorted shapes in some places owing to mismatches.

CONCLUSIONS
A camera calibration technique and object extraction procedures were developed in this study in order to achieve 3D modelling using ALS data and digital camera images. It is confirmed that camera calibration using pseudo-GCPs and simultaneous adjustment shows GSI restrictions of less than 1/500 and more than 1/1000 in generating each scale map. Therefore, it is concluded that simultaneous adjustment using pseudo-GCPs, distance conditions, and geometric constraint conditions is practical because the simultaneous adjustment performs interior and exterior orientation without any GCPs or aerial triangulation. The object extraction procedure was established using ALS data and digital images. The normal vector map is a useful tool for operator interpretation and rough object shape extraction. Moreover, image processing guided by ALS data was effective at extracting object shapes. However, there are some issues requiring further investigation. These include accuracy improvement and the automatic generation of pseudo-GCPs for camera calibration and object shape extraction.

Figure 1. Object extraction flowchart

Camera calibration is performed by calculating these unknown parameters, which are obtained by minimizing the following function, H (Equation (6)), under the least squares method.

Figure 7. Clipped digital camera centre images

X_0, Y_0, Z_0 = perspective center
m_ij = rotation matrix elements

Table 1. Data components

Table 2 shows the root mean square errors for the seven check points. The permissible errors are the restrictions for check points established by GSI for generating maps at each scale. RA (Relative Accuracy) is the value computed from Equation (7). The standard error was computed under the assumption that image-coordinate pointing is accomplished with one-pixel accuracy.

Table 2. RMSE for check points