A METHOD FOR THE REGISTRATION OF HEMISPHERICAL PHOTOGRAPHS AND TLS INTENSITY IMAGES

Terrestrial laser scanners generate dense and accurate 3D point clouds with minimal effort, which represent the geometry of real objects, while image data contains texture information of object surfaces. Based on the complementary characteristics of both data sets, a combination is very appealing for many applications, including forest-related tasks. In the scope of our research project, independent data sets of a plain birch stand have been taken by a full-spherical laser scanner and a hemispherical digital camera. Previously, both kinds of data sets have been considered separately: Individual trees were successfully extracted from large 3D point clouds, and so-called forest inventory parameters could be determined. Additionally, a simplified tree topology representation was retrieved. From hemispherical images, leaf area index (LAI) values, as a very relevant parameter for describing a stand, have been computed. The objective of our approach is to merge a 3D point cloud with image data in such a way that RGB values are assigned to each 3D point. So far, segmentation and classification of TLS point clouds in forestry applications were mainly based on geometrical aspects of the data set. However, a 3D point cloud with colour information provides valuable cues exceeding simple statistical evaluation of geometrical object features and thus may facilitate the analysis of the scan data significantly.


INTRODUCTION
In the last decade, terrestrial laser scanning (TLS) has become a valuable technique to capture the complex 3D geometry of real objects as a 3D point cloud. In contrast, digital cameras record colour information with a high visual interpretability and a high resolution. The combination of point clouds and images offers new opportunities for an integrated data analysis in terms of geometry and texture, which will be beneficial for segmentation or classification tasks, for instance.
In the case of a laser scanning system with an integrated digital camera, captured images can be linked to the 3D point cloud immediately due to the fixed relative orientation between scanner and camera. If separate instruments are used, the combined evaluation requires a co-registration of the collected data sets based on correspondences. Because of differences in data characteristics, resolution, and perspective, determining correspondences between image data and terrestrial laser scans becomes a rather challenging task.
In this paper, we present a method to generate a mapping of the 3D point cloud to a hemispherical 2D image using laser scanner intensity values. Here, each individual 3D point is projected onto a plane via equidistant projection. By performing image matching between an RGB image and a laser scanner intensity image, the problem of establishing correspondences between 2D and 3D data is reduced to a 2D-2D problem.
The paper is organized as follows: Section 2 gives a brief overview of existing methods to combine RGB images and terrestrial laser scanner data, based on image-to-image registration. In Section 3 the study site is introduced and the data capturing is illustrated. Our method is described in detail in Section 4. Section 5 discusses the conducted experiment and its results. Finally, the paper closes in Section 6 with the conclusion.

RELATED WORK
The integrated analysis of TLS data and photographic imagery is a well-researched topic. Co-registration based on artificial markers and central perspective imagery is a common procedure and produces highly accurate results (Kersten, 2006). If the photos are taken with a fisheye lens, the geometric model has to be adapted accordingly, as described in (Schwalbe et al., 2009). As shown experimentally in (Schneider and Schwalbe, 2008), the integration of 3D point clouds with fisheye images yields results as precise as those obtained with standard central perspective imagery. The data sets were co-registered based on artificial markers in a test field environment. In (Meierhold and Schmich, 2009), experiments were conducted to orient images to TLS data sets on the basis of line features extracted from the image. The methods focused on facades of buildings and performed successfully in several tests. Similarly, in (Forkuo and King, 2005) data sets of buildings were used, but the image-to-image registration was conducted with artificial intensity images from the 3D data. The artificial intensity images were central perspective mappings of the 3D point cloud. Point correspondences were determined by the Harris corner detector and RANSAC. The error was reported to be within two pixels. A similar approach was tested in (Meierhold et al., 2010), but point correspondences were established via SIFT and RANSAC employing the fundamental matrix. Again, the data sets were recordings of urban scenery.
The task of co-registration of 3D data sets and images of forest scenery is more challenging, because the geometry is less stable than that of buildings, for instance due to wind. Therefore, our experiment is a first step towards exploiting the benefits of an integrated analysis of TLS data sets and digital imagery in forest-related applications as well.

International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B5, 2012. XXII ISPRS Congress, 25 August - 01 September 2012, Melbourne, Australia

STUDY SITE AND INSTRUMENTS
In the scope of our research project, we investigate the three-dimensional forest structure in order to gain knowledge of the interaction between biosphere and atmosphere. For this, two different recording methods are utilized: laser scanners and digital cameras. Terrestrial laser scanner systems capture the three-dimensional information of the object geometry as a very detailed point cloud. Cameras map the texture information of the object surface as high-resolution image data. Below, the two recording methods are explained in detail and the test site is introduced.

Study Site
Our test site comprises a plain stock of more than 350 birch trees (Betula pendula). The area has a size of 1.3 ha (160 m × 80 m) and is located in Wilmsdorf near Dresden, Germany (50° 58.634′ N, 13° 41.884′ E). The birch stock is over 55 years old and has been observed for 40 years by the Chair of Silviculture (TU Dresden). Manual measurements of diameter at breast height (DBH) and tree height have been conducted on selected trees. The DBH ranges from 16.6 cm to 34.4 cm; tree height varies from 27.8 m to 28.8 m. All birch trees show an almost straight trunk up to half of the entire tree height.

Data Capture
For data acquisition, a high-resolution camera with a hemispherical lens, a so-called fisheye lens, has been used in addition to a full-spherical laser scanner. The data collection has been performed with both devices independently at the same twelve viewpoints on the site. Care has been taken to record both kinds of data set without delay, so that the data actually reflect the growth phases.
The chosen angular scan resolution results in 20,000 measured points per 360°. The field of view of the rotating sensor head amounts to 360° in the horizontal direction; in the vertical direction it is limited to 310°. The test site has been captured with consistent scan settings at the twelve predefined viewpoints, and the scans have been manually registered to each other with the Z+F LaserControl software. For this purpose, 40 spherical targets were mounted permanently on selected trees. In 2010, the test site was scanned four times: in leafless condition, while leaves were unfolding, with full leaves, and during leaf fall. Despite the efforts to restrict the data recording to calm weather, there are artefacts in the data sets due to movements caused by wind. Subsequently, each separate scan was limited to a radius of 37 m around the scanner; the average size of a point cloud is ca. 145 million points.
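The scan geometry can be illustrated with a short Python sketch. It covers three things: the arithmetic linking the 0.018° angular resolution to 20,000 points per revolution, the conversion of a polar measurement (distance r, vertical angle θ, horizontal angle φ) to Cartesian coordinates, and the 37 m range limit. The function names are hypothetical, and the convention that θ is measured from the upward vertical axis is an assumption, as the scanner software's actual convention is not stated here.

```python
import math

def points_per_revolution(angular_resolution_deg):
    """Number of measurements in a full 360 degree sweep."""
    return round(360.0 / angular_resolution_deg)

def polar_to_cartesian(r, theta, phi):
    """Convert distance r and angles theta, phi (radians) to X, Y, Z.

    Assumption: theta is the angle from the upward vertical axis,
    phi the horizontal angle in the X-Y plane.
    """
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z

def within_range(point, max_range=37.0):
    """True if the point lies within max_range metres of the scanner origin."""
    x, y, z = point
    return math.sqrt(x * x + y * y + z * z) <= max_range
```

With an angular resolution of 0.018°, `points_per_revolution` gives the 20,000 points per 360° stated above.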

Hemispherical Projection Intensity Image
For mapping the raw 3D points to a suitable 2D representation, the geometric relation between the object space and the image plane is established.
The geometry of fisheye lenses does not comply with the central perspective geometry. For fisheye lenses, the incident angle α is not mapped with the same angle β to the image plane (α ≠ β). This means that the projection rays are refracted in the projection centre towards the optical axis. Our fisheye lens is modelled on the basis of an equidistant camera model, as detailed in (Schneider et al., 2009). In this model, the resulting radial distance from the principal point is proportional to the angle of incidence, according to equation 1:

$$ r' = c \cdot \alpha \qquad (1) $$

where
r′ = distance between image point and optical axis
α = angle of incidence
c = focal length.
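For each laser scanner point, the incident angle α is determined with equation 2, and thus also the corresponding image coordinates x′ and y′ on the virtual image plane with equation 3:

$$ \alpha = \arctan\left(\frac{\sqrt{x^2 + y^2}}{z}\right) \qquad (2) $$

$$ x' = c \cdot \alpha \cdot \frac{x}{\sqrt{x^2 + y^2}}, \qquad y' = c \cdot \alpha \cdot \frac{y}{\sqrt{x^2 + y^2}} \qquad (3) $$

where x, y, z = point coordinates in the camera coordinate system.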
Instead of the laser scanner, a virtual camera is placed in the origin of the 3D point cloud in the scanner coordinate system with the optical axis pointing upwards. Subsequently, the 3D points are converted into the coordinate system of the virtual camera, using these transformation equations:

$$ x = a_{11}(X - X_0) + a_{21}(Y - Y_0) + a_{31}(Z - Z_0) $$
$$ y = a_{12}(X - X_0) + a_{22}(Y - Y_0) + a_{32}(Z - Z_0) $$
$$ z = a_{13}(X - X_0) + a_{23}(Y - Y_0) + a_{33}(Z - Z_0) $$

where
x, y, z = object points in camera system
X, Y, Z = object points in scanner system
X0, Y0, Z0 = coordinates of projection centre
aij = elements of rotation matrix.
The Cartesian coordinates of each laser scanner point in front of the camera are projected onto the virtual image plane by equidistant projection as explained above.
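As a minimal sketch of the mapping described above, the following Python functions transform a scanner point into the virtual camera frame and apply the equidistant projection (radial image distance proportional to the angle of incidence). The function names, the transposed-rotation index convention, the use of `atan2`, and the omission of principal point offset and lens distortion are assumptions made for illustration and are not taken from the authors' implementation.

```python
import math

def to_camera_frame(point, origin, rot):
    """Scanner coordinates -> camera coordinates.

    Assumed convention: x_j = sum_i rot[i][j] * (P_i - P0_i),
    i.e. the transposed rotation matrix is applied to the shifted point.
    """
    d = [p - o for p, o in zip(point, origin)]
    return tuple(sum(rot[i][j] * d[i] for i in range(3)) for j in range(3))

def equidistant_project(x, y, z, c):
    """Project a camera-frame point with the equidistant model r' = c * alpha.

    Returns (x_img, y_img) in the units of c, or None for points that
    are not in front of the camera (z <= 0).
    """
    if z <= 0.0:
        return None
    rho = math.hypot(x, y)          # distance from the optical axis
    if rho == 0.0:
        return (0.0, 0.0)           # point on the optical axis
    alpha = math.atan2(rho, z)      # angle of incidence
    r_img = c * alpha               # radial distance in the image
    return (r_img * x / rho, r_img * y / rho)
```

For the virtual camera placed at the scanner origin with the optical axis pointing upwards, `rot` would be the identity matrix and `origin` the zero vector, so the camera frame coincides with the scanner frame.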
The size of the image plane and thus the predefined pixel size are crucial for the resulting intensity image. In the true intensity image, each pixel corresponds to only one 3D point, because the image width is given by the obtained horizontal angle resolution and the image height by the vertical angle resolution (which is reflected in an image size of 20,000 × 10,000 pixels). The size of the real image is limited by the resolution of the camera sensor. It is therefore natural to adapt the size of the new image to the size of the real image. Since several 3D points are then mapped to the same pixel, the point with the highest intensity value of this subset has been selected for each pixel, based on the assumption that this point is closest to the viewpoint. In figure 4 a first result of a hemispherical intensity image is shown. As mentioned earlier, only laser scanner points within a maximum distance of 37 m to the origin of the virtual camera have been used to generate the hemispherical projection image.
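The per-pixel selection rule can be sketched as follows. This hypothetical Python function rasterizes camera-frame points with intensity values into an image grid and keeps the highest intensity per pixel, as described above. Placing the principal point at the image centre, the square pixel size parameter, and the row/column orientation are assumptions for illustration only.

```python
import math

def render_intensity_image(points, c, width, height, pixel_size):
    """Rasterize (x, y, z, intensity) camera-frame points.

    Each pixel keeps the highest intensity of all points mapped to it;
    pixels without any point stay None.
    """
    image = [[None] * width for _ in range(height)]
    for x, y, z, intensity in points:
        if z <= 0.0:
            continue                      # only points in front of the camera
        rho = math.hypot(x, y)
        r_img = c * math.atan2(rho, z)    # equidistant model
        x_img = r_img * x / rho if rho > 0.0 else 0.0
        y_img = r_img * y / rho if rho > 0.0 else 0.0
        # Assumed pixel convention: principal point at the image centre,
        # columns grow with x_img, rows grow downwards.
        col = math.floor(x_img / pixel_size + width / 2.0)
        row = math.floor(height / 2.0 - y_img / pixel_size)
        if 0 <= row < height and 0 <= col < width:
            current = image[row][col]
            if current is None or intensity > current:
                image[row][col] = intensity
    return image
```

A depth buffer that keeps the nearest point per pixel would be an alternative selection rule; the sketch follows the highest-intensity rule stated in the text.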

Image-to-Image Registration by Spatial Resection
With a known set of corresponding image points, the image orientation between the real hemispherical image (RHI) and the hemispherical projection image (HPI) can be obtained by spatial resection. As a result, the exterior orientation parameters, i.e. image position (X0, Y0, Z0) and orientation (ω, φ, κ), of the real camera are obtained in the scanner coordinate system.
The procedure of the spatial resection needs to be adapted to the fisheye camera model, based on the model equations (equation 3). We have applied a software tool, presented in (Schwalbe, 2005; Schneider et al., 2009), that allows the precise orientation of fisheye images based on 2D-3D point correspondences.
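To make the quantity minimized by a spatial resection concrete, the following Python sketch computes reprojection residuals for a candidate exterior orientation under the equidistant model. It is not the software tool cited above. The rotation order R = Rx(ω) · Ry(φ) · Rz(κ), the transposed-rotation convention, the neglect of principal point offset and lens distortion, and all function names are assumptions for illustration.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_matrix(omega, phi, kappa):
    """Assumed convention: R = Rx(omega) * Ry(phi) * Rz(kappa), angles in radians."""
    so, co = math.sin(omega), math.cos(omega)
    sp, cp = math.sin(phi), math.cos(phi)
    sk, ck = math.sin(kappa), math.cos(kappa)
    rx = [[1, 0, 0], [0, co, -so], [0, so, co]]
    ry = [[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]]
    rz = [[ck, -sk, 0], [sk, ck, 0], [0, 0, 1]]
    return matmul(matmul(rx, ry), rz)

def project(point, origin, rot, c):
    """Scanner-frame point -> equidistant image coordinates."""
    d = [p - o for p, o in zip(point, origin)]
    x, y, z = (sum(rot[i][j] * d[i] for i in range(3)) for j in range(3))
    rho = math.hypot(x, y)
    if rho == 0.0:
        return (0.0, 0.0)
    r_img = c * math.atan2(rho, z)
    return (r_img * x / rho, r_img * y / rho)

def residuals(params, object_points, image_points, c):
    """Observed minus projected image coordinates for all correspondences.

    params = (X0, Y0, Z0, omega, phi, kappa)
    """
    x0, y0, z0, omega, phi, kappa = params
    rot = rotation_matrix(omega, phi, kappa)
    res = []
    for obj, (u, v) in zip(object_points, image_points):
        u_hat, v_hat = project(obj, (x0, y0, z0), rot, c)
        res.extend([u - u_hat, v - v_hat])
    return res

def rms(values):
    """Root mean square of a list of residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))
```

A nonlinear least-squares solver would adjust the six parameters until the root mean square of these residuals is minimal; with at least three well-distributed correspondences the problem is determined, and additional points provide redundancy.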

EXPERIMENT AND RESULTS
For a first experiment, a data set taken from the study site of the plain birch stand has been used. The data is from spring 2010 and shows the forest during the phase of leaves unfolding.
The experiment has been conducted as follows: First, point correspondences between the RHI and the HPI have been established. Second, a spatial resection has been computed to obtain the image orientation between the images. Last, RGB colour values have been assigned to the corresponding 3D points.

Image Matching
The values in the RGB image are determined by the light intensity and the colour of the objects, while the HPI is defined by the recorded intensity values. The two images are rather different in nature regarding composition and spectral wavelength. These differences make it difficult to obtain reliable correspondences using automatic feature matching methods.
For the determination of image orientation in the work presented here, the point correspondences are so far restricted to manually measured points in both images. Up to 50 distinct points have been measured for each image pair. Points have been selected preferentially on tree trunks, and care has been taken to ensure that the points are distributed evenly on the image plane. The corresponding 3D coordinates have been identified using the known relation between intensity image and scanner point cloud.
As additional input variables for the spatial resection, the interior orientation parameters and the radial lens distortion, which are summarized in table 1, have been determined by a camera calibration, as described in detail in (Schwalbe, 2005). The precision of the spatial resection indicates that the colouring of the point cloud suffices for the necessary accuracy in forest-related tasks.

Colourized Point Cloud
After the exterior orientation in the scanner coordinate system has been computed, RGB colour values from the RHI can be assigned to the 3D point cloud. The procedure is similar to the generation of the HPI, as already detailed in section 4.1.
All points of the 3D data set are transformed into the camera coordinate system. Furthermore, the known camera orientation parameters are employed in the equidistant projection model. Here, the image plane is the RHI. Subsequently, all 3D points that lie in front of the camera are projected. At the obtained image coordinates, RGB colour values of the RHI are accessed and assigned to the corresponding 3D point. Consequently, points lying behind the camera do not receive RGB colour values. In that way, the TLS data set can be coloured with the colour information from the RHI.
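The colouring step can be sketched in Python as follows. The function is hypothetical: it takes the exterior orientation as a projection centre and a rotation matrix, projects each scanner point with the equidistant model, and looks up the RGB value at the resulting pixel. The image is represented as a nested list of RGB tuples, and the pixel convention (principal point at the image centre, no lens distortion) is an assumption for illustration.

```python
import math

def colourize(points, origin, rot, c, image, pixel_size):
    """Assign an RGB value from a fisheye image to each scanner point.

    Returns a list of (point, rgb) pairs; rgb is None for points behind
    the camera or projected outside the image.
    """
    height, width = len(image), len(image[0])
    coloured = []
    for point in points:
        # Scanner frame -> camera frame (transposed rotation, assumed convention).
        d = [p - o for p, o in zip(point, origin)]
        x, y, z = (sum(rot[i][j] * d[i] for i in range(3)) for j in range(3))
        rgb = None
        if z > 0.0:                               # point in front of the camera
            rho = math.hypot(x, y)
            r_img = c * math.atan2(rho, z)        # equidistant model
            x_img = r_img * x / rho if rho > 0.0 else 0.0
            y_img = r_img * y / rho if rho > 0.0 else 0.0
            col = math.floor(x_img / pixel_size + width / 2.0)
            row = math.floor(height / 2.0 - y_img / pixel_size)
            if 0 <= row < height and 0 <= col < width:
                rgb = image[row][col]
        coloured.append((point, rgb))
    return coloured
```

Nearest-pixel lookup is used here for simplicity; interpolating between neighbouring pixels would be a possible refinement.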

Results
In order to evaluate the resulting coloured point cloud visually, a 2D projection of the 3D data set has been utilized again. The 3D point cloud has been mapped to the 2D image plane exactly as before, but here the assigned RGB colour values are used in the image. The image content is the same as in figure 4, but the colour information should resemble the expected natural appearance of trees as in the RHI. The resulting coloured mapping is shown in figure 5. The movement of the tree crowns, because of wind during the scanning, is most likely a main reason for this effect. Currently, we have no means of compensating for this. The fact that the manually measured points are located mostly in the lower parts of the trunk affects the quality of the projection geometry for those points further away. That means, the further the 3D points are away from the camera, the greater the deviation becomes. Furthermore, equation 2 of the equidistant projection model suggests that the maximum distance to the optical axis is achieved by an object with an incidence angle of 90°, being projected to the outer border of the image circle. However, the actual opening angle of the employed fisheye lens is slightly larger than 180°. Although the results of the spatial resection attest to a sufficiently precise co-registration of the data sets given manual point measurements, results might be further improved by utilizing a camera model which represents the actual incidence angle more accurately.
Nevertheless, the experimental results confirm that a co-registration of time-independently recorded fisheye imagery and TLS data is feasible. A next step will be the automatic determination of point correspondences. Preliminary tests showed that feature detection and matching methods, such as SIFT (Lowe, 1999), do not yield reliable correspondences when applied to the HPI and the greyscale RHI directly. However, we are confident that a semi-automatic establishment of correspondences is feasible: If it is ensured that the orientation of both images is similar, approximate correspondences could be obtained based on prominent tree trunks in the images. In the context of an iterative optimization process, even effects due to wind movements in the data might be reduced if the actual trunk geometry could be included in the computation.

CONCLUSION
In this paper, a method has been presented which allows the matching between independent hemispherical images and generated 2D representations of terrestrial laser scanner point clouds.
Independent data sets of a plain birch stand have been taken by a full-spherical laser scanner and a hemispherical digital camera. On the study site, there were no artificial markers that could be used for the image-to-image registration. The matching has been performed on the basis of natural points only. In contrast to an urban environment, it is significantly more demanding to find appropriate points in a forest scene. Furthermore, the establishment of correspondences between 2D and 3D data sets is a difficult task. Therefore, the 3D point cloud has been mapped to a 2D intensity image, so that its geometry coincides with the hemispherical image.
The experiment with a data set of a forest stand has shown that the registration between the real hemispherical image and the hemispherical projection image is possible in principle. Based on the results of the image orientation, the point cloud has been coloured with the RGB values from the RHI.

Figure 1: Instruments for data acquisition

3.2.1 Terrestrial Laser Scanner Data: For the 3D recordings we use the terrestrial laser scanner Imager 5006i by Zoller+Fröhlich, shown in figure 1(a). The scanner employs the phase comparison technique and has a maximum possible range of 79 m (Zoller+Fröhlich, 2009). The scanner software records the vertical angle θ and the horizontal angle φ of the direction, the measured distance r, and the intensity value i of the backscattered laser impulse. Afterwards, the obtained polar coordinates are transformed by the scanner software into 3D Cartesian coordinates. Scans were taken with an angular resolution of 0.018°.

Photographs: For image acquisition, the digital camera Nikon D700 with the special lens Nikkor Fisheye 8mm has been used, pictured in figure 1(b). The Nikon D700 is characterized by a high-resolution CMOS sensor (Nikon FX format) having a sensor size of 36.0 mm × 23.9 mm. The effective resolution is specified as 12.1 million pixels (Nikon, 2008), which is reflected in an image size of 4256 × 2832 pixels. The result of this hardware configuration is a so-called circular-frame fisheye image where the image sensor format completely contains the image circle, as demonstrated in figure 2. Full circular fisheye lenses have an opening angle of 180° or more, therefore the image geometry cannot be modelled with the central perspective projection. The employed lens type is based on an equidistant projection. In order to take hemispherical forest crown images, the camera is placed on the forest floor with the optical axis pointing upwards, levelled, and oriented to north. For optimal light exposure, the exposure settings are measured above the canopy with an opening angle of 7.5°. In case of back light, sky appears overexposed while vegetation gets underexposed. Considering the limited colour information present in vegetation, the setting has been optimized by taking an exposure series at every viewpoint. Images have been taken preferably under clear sky or homogeneous overcast conditions.

Figure 5: 2D representation of the colorized point cloud

Table 1: Parameters of the exterior orientation

For several image pairs, the spatial resection has been computed. Table 2 lists the parameters of the exterior orientation for one image pair. The standard deviations are very low, which suggests a good fit of the RHI and the HPI based on the manual measurements. The image position is determined with sufficient precision. Compromises in accuracy have to be accepted for the orientation angles.

Table 2: Parameters of the exterior orientation