ACCURACY OF 3D RECONSTRUCTION IN AN ILLUMINATION DOME

The accuracy of 3D surface reconstruction was compared from image sets of a Metric Test Object taken in an illumination dome by two methods: photometric stereo and improved structure-from-motion (SfM), using point cloud data from a 3D colour laser scanner as the reference. Metrics included pointwise height differences over the digital elevation model (DEM), and 3D Euclidean differences between corresponding points. The enhancement of spatial detail was investigated by blending high frequency detail from photometric normals, after a Poisson surface reconstruction, with low frequency detail from a DEM derived from SfM.


INTRODUCTION
The technology of non-contact optical surface recording is well suited to conservation documentation and complements analytical imaging techniques in heritage science. Sustainable 3D spatial and colour imaging of museum objects requires a standardised measurement protocol against which long-term outputs can be judged. Professional 3D recording technology is capable of extremely high quality metric outputs, but there are currently no applicable guidelines for evaluation of 3D colour digital data suited to the needs of heritage users. This indicates the need for a suitable test object and associated protocol to verify recording capabilities and the resulting 3D image quality.
Sensor performance is typically evaluated using quantities such as resolution, uncertainty and repeatability, with attention to object material and local surface features. Testing should take into account existing standards and geometric features (Beraldin et al., 2007). Previous research has studied the performance of test objects for the scientific evaluation and verification of geometric accuracy of optical 3D imaging systems: Boehler et al. (2005) investigated laser scanner accuracy with dedicated and calibrated geometric features; Tuominen and Niini (2008) verified a real-time optical 3D sensor in a production line; Teutsch et al. (2005) developed methods for geometric inspection and automated correction for laser point clouds. Luhmann (2011) identified the relevant parameters as the physical representation of the object surface, orientation strategies, image processing of homologous features, the representation of object or workpiece coordinate systems, and object scale. He also discussed strategies for obtaining the highest accuracy in object space in state-of-the-art close-range photogrammetry for technical applications.
The illumination dome at UCL enables sets of images of an object to be captured from a fixed zenithal camera position, with illumination from 64 flash lights at known coordinate positions on the hemisphere. Image sets acquired by this device, in which all 64 images are in pixel register, have been used primarily for visualisation of cultural heritage objects by the polynomial texture mapping (PTM) technique (Malzbender et al., 2001), but have also proved to be viable for estimation of the surface angular reflectance distribution function (MacDonald, 2014) and 3D reconstruction of a digital elevation model (DEM) by photometric stereo or 'shape from shading' (MacDonald, 2015).
Usually in multi-view photogrammetry the camera is moved to multiple positions around a fixed object. Given the geometric constraints of the illumination dome, however, the question arises whether the same accuracy can be obtained by moving the object systematically within the field of view of the fixed camera. In a previous study two methods of dense surface reconstruction were compared, using sets of images of an ancient Egyptian artefact captured in the dome. The reference dataset was obtained from a 3D colour laser scanner. DEMs generated from stereo image pairs by photogrammetric techniques exhibited surface convexity ('doming'), caused by the parallel imaging geometry, indicating the need to tilt the object in addition to lateral translation. Surface normals derived from dense reconstructions were found to be inferior to normals derived from photometric stereo using the median of selected triplets of lamps. The present study compared the accuracy of 3D reconstruction from an improved structure-from-motion (SfM) workflow with the DEM obtained by a Poisson reconstruction technique.
We employed a 3D Metric Test Object, which was developed at UCL on the basis of engineering metrology guidelines, and includes known surface and geometric properties to enable comparison of the performance of different 3D recording systems (Hess and Robson, 2012). Situated around the 235 mm square baseplate is an irregular array of six 20 mm diameter tooling balls with matte white surfaces, mounted on conical aluminium bases, which provide independent datum points for spatial registration. Three secondary plates can be fitted onto the baseplate, for museum artefacts, 2D photographic targets, and 3D geometric forms (Fig. 1). The geometric plate contains known geometries, and all components (geometric features such as steps, gaps, angles and length gauges) are made of Alcoa aluminium alloy T6061 with an etched surface. The object has previously been used for the quantitative assessment of commercially available photogrammetry technologies, with a 3D point cloud of this object (captured by a colour laser scanner) as a reference dataset (Hess et al., 2014). Its design was based on the stated needs of heritage professionals for an object that is portable and usable within their own institutions. Typical questions answered during the evaluation procedure were: "With what confidence can this technology identify the smallest recordable step, such as a brush stroke on a painting, or the smallest gap such as a crack on an object?" and "How do these sensor characteristics compare to other methods or technologies?"
Three image sets were taken with a Nikon D200 camera. Raw images of 3900x2600 pixels were captured in NEF format. The 17-55 mm 1:2.8G ED Nikkor DX zoom lens was set to a nominal focal length of 35 mm, focal distance 0.65 m and aperture f/8, and the zoom and focus rings were taped to prevent movement during the photography in all three phases.
The three image sets were:
(1) 27 images of a large 'Manhattan' target object, moving the camera freely around the object and using the built-in camera flash to illuminate the retroreflective targets;
(2) 40 images in the dome, with the 3D test object systematically turned, tilted and moved laterally, illuminated by the 16 flash lights in Tier 3 of the dome at 45° elevation in 'ring flash' configuration (Fig. 2);
(3) 64 images in the dome, with the 3D test object in a fixed position at the centre of field and illuminated by each one of the 64 dome flash lights in sequence (standard PTM image capture procedure).
All images were converted from the Nikon raw format (NEF) to 16-bit linear TIFF via the utility DCRAW, in the 'Adobe RGB' colour space. The object width in the images was 2384 pixels, so the spatial resolution on its baseplate was 10.14 pixels/mm.
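The spatial resolution quoted above follows directly from the object width in pixels and the 235 mm baseplate width given earlier. A minimal check of that arithmetic, with illustrative variable names that do not come from the original workflow:

```python
# Values quoted in the text; the variable names are illustrative only.
object_width_px = 2384       # width of the baseplate in the image, in pixels
baseplate_width_mm = 235.0   # physical width of the square baseplate, in mm

resolution_px_per_mm = object_width_px / baseplate_width_mm
sample_spacing_um = 1000.0 / resolution_px_per_mm  # size of one pixel on the baseplate

print(f"{resolution_px_per_mm:.2f} pixels/mm")
print(f"{sample_spacing_um:.1f} micrometres per pixel")
```

On the baseplate each pixel therefore covers a little under 0.1 mm, which is of the same order as the 0.1 mm sampling grid of the reference laser scanner.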

IMPROVED DENSE MATCHING METHOD
The first processing step required the creation of masks to isolate the test object from the background in each of the 40 images. This was necessary because, in order to acquire the images inside the dome, the test object was systematically turned, tilted and moved laterally while the camera was kept fixed. This was equivalent to moving the camera freely to viewpoints around a stationary object, as shown in Fig. 3, with maximum angles of about ±35° horizontal and ±25° vertical. Tie points were automatically extracted using the SfM software application (Agisoft PhotoScan), but the resulting sparse point cloud was highly noisy. There were several possible reasons: (a) the camera network was constrained by the dome geometry and hence not optimal for photogrammetric reconstruction; (b) the test object, made of shiny elements on a dark planar surface, is very challenging for photogrammetry (Toschi et al., 2015); (c) because of the particular acquisition geometry, with the camera fixed in the dome and lens aperture f/8, the depth of field was rather limited and, consequently, the images were not uniformly sharp. Therefore, the tie points were filtered in order to retain only the most reliable observations. The filtering procedure was carried out with a tool developed internally at FBK-3DOM (Nocerino et al., 2014), which reduces the number of image observations so that they can be efficiently handled by classical photogrammetric bundle adjustment. Furthermore, the tool regularises the point distribution in object space, while preserving connectivity and high multiplicity between observations. In particular, the following criteria were adopted: (i) re-projection error less than 1 pixel; (ii) intersection angle greater than 5 degrees; and (iii) multiplicity (number of intersecting rays) greater than two.
The corresponding image observations of filtered tie points were imported into the PhotoModeler software package and the target centroid coordinates were automatically identified. The scale ambiguity was resolved by using a distance between two targets, far away from each other on the test object, previously determined by photogrammetric processing (Toschi et al., 2015). A self-calibrating bundle adjustment was performed using both types of observations simultaneously, adequately weighted.
Table 1. Interior orientation parameters from PhotoModeler, resulting from the self-calibrating bundle adjustment.
[Table 1 body not recovered; surviving fragment: 0.00E+00, 0.00E+00; P1: -1.54E-05, 6.00E-07]
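The three tie-point filtering criteria listed above can be illustrated with a short sketch. It is not the FBK tool: the function names and the data layout are hypothetical, and the intersection angle is assumed here to be the largest pairwise angle between the rays that observe a point, which the text does not specify.

```python
import math
from itertools import combinations

def angle_between_deg(ray_a, ray_b):
    """Angle in degrees between two 3D ray direction vectors."""
    dot = sum(a * b for a, b in zip(ray_a, ray_b))
    norm_a = math.sqrt(sum(a * a for a in ray_a))
    norm_b = math.sqrt(sum(b * b for b in ray_b))
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))

def keep_tie_point(reproj_error_px, rays,
                   max_error_px=1.0, min_angle_deg=5.0, min_rays=3):
    """Apply the three criteria quoted in the text to one tie point.

    rays: list of direction vectors from each observing camera to the point.
    min_rays=3 encodes 'multiplicity greater than two'.
    """
    if reproj_error_px >= max_error_px:
        return False
    if len(rays) < min_rays:
        return False
    # Assumption: use the widest pairwise intersection angle among the rays.
    widest = max(angle_between_deg(a, b) for a, b in combinations(rays, 2))
    return widest > min_angle_deg

# Hypothetical observations of a single point from three camera stations.
wide_rays = [(0.0, 0.0, 1.0), (0.2, 0.0, 1.0), (0.0, 0.2, 1.0)]
narrow_rays = [(0.0, 0.0, 1.0), (0.01, 0.0, 1.0), (0.0, 0.01, 1.0)]
print(keep_tie_point(0.5, wide_rays))
print(keep_tie_point(0.5, narrow_rays))
```

A real filter would also regularise the distribution of points in object space, as described above; that step is omitted from this sketch.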

CORRECTION OF LENS DISTORTION
The internal geometry of the camera-lens combination was determined by the Vision Metrology System (VMS), using retroreflective targets on a calibration test object in a set of images. A 3D test object ('Manhattan') was employed, consisting of a 550x550 mm aluminium baseplate of thickness 10 mm, onto which are affixed 39 anodized aluminium rods of diameter 8 mm with lengths varying from 20 to 305 mm, all perpendicular to the base. Approximately 100 circular retroreflective targets of 2.5 mm diameter are distributed over the baseplate and on the top of each rod. The targets form a rigid array of points in a 3D coordinate space. Under flash illumination the targets are visible in the image from any viewpoint within an incidence angle limit of 50-60°. Eight machine-readable codes are also fixed onto the baseplate to facilitate automatic orientation of the target array in image processing. The targets are conspicuous in the image when the illumination direction is close to the optical axis (Fig. 4).
Images of the large Manhattan test object were processed by VMS to determine the 10 lens distortion model parameters (Table 2). All values are those of the parameters themselves, not corrections. For example, radial distortion is specified as the actual distortion from the ideal location to the distorted location. The principal distance (PD) is the separation between the lens perspective centre and the focal plane. The main contributor to lens distortion is radial distortion, as can be seen from the displacement vectors in Fig. 5. With image coordinates (x, y) measured from the principal point and r² = x² + y², radial lens distortion is computed using the following formula:
$$\Delta r = K_1 r^3 + K_2 r^5 + K_3 r^7 \tag{1}$$
The components of the decentring (tangential) distortion are:
$$\Delta x = P_1 (r^2 + 2x^2) + 2 P_2 x y \tag{2}$$
$$\Delta y = P_2 (r^2 + 2y^2) + 2 P_1 x y \tag{3}$$
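As an illustration of how such a distortion model is evaluated, the sketch below computes the displacement of one image point. It assumes the widely used Brown form, with an odd-power radial polynomial in K1 to K3 and decentring terms in P1 and P2; the coefficient values are arbitrary placeholders, not the fitted values of Table 2.

```python
import math

def lens_distortion(x, y, k1, k2, k3, p1, p2):
    """Displacement (dx, dy) from ideal to distorted image location.

    x, y are image coordinates relative to the principal point (mm).
    Assumed model: dr = k1*r^3 + k2*r^5 + k3*r^7 (radial) plus
    Brown decentring terms in p1 and p2.
    """
    r2 = x * x + y * y
    r = math.sqrt(r2)
    dr = k1 * r**3 + k2 * r**5 + k3 * r**7
    # Resolve the radial displacement into x and y components.
    dx_radial = x * dr / r if r > 0.0 else 0.0
    dy_radial = y * dr / r if r > 0.0 else 0.0
    dx_decentring = p1 * (r2 + 2.0 * x * x) + 2.0 * p2 * x * y
    dy_decentring = p2 * (r2 + 2.0 * y * y) + 2.0 * p1 * x * y
    return dx_radial + dx_decentring, dy_radial + dy_decentring

# Placeholder coefficients chosen only to exercise the function.
dx, dy = lens_distortion(10.0, 0.0, k1=1e-4, k2=0.0, k3=0.0, p1=0.0, p2=0.0)
print(dx, dy)
```

Correcting an image with such a model means resampling: each output pixel takes its value from the distorted location given by the ideal location plus this displacement.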

PHOTOMETRIC STEREO PROCESSING
Using the bounded regression 'shape from shading' technique (MacDonald, 2014), albedo and normal vectors were determined from the set of 64 images taken in the dome, after processing to correct for lens distortion. For each pixel the 64 intensity values were sorted into ascending order and a subset of the cumulative distribution selected to avoid both shadow and specular regions. Fig. 6 (top pair) shows that reflected intensity for one pixel on the surface of one of the white tooling balls follows closely the cosine of the angle of incident light, whereas for one pixel of the top horizontal step of the staircase it is much higher for some angles. The explanation is that the white ball is matte (approximately Lambertian) whereas the metal has a substantial specular component. The adaptive method selects a subset of intensities, shown in blue in Fig. 6 (bottom pair), where the slope of the cumulative distribution is similar to the cumulative cosine, before the specular component causes it to increase rapidly. The resulting albedo (Fig. 7 top) is nearly achromatic, except for the beige surface of the tooling balls. The surface normal vectors (Fig. 7 bottom), shown in conventional false colour coding, appear constant in each planar surface of the step targets, and only the spherical tooling balls exhibit a large range of angles.
A further correction to the normals is needed to compensate for the wide angle of view of the lens, which with a sensor width of 24.6 mm and principal distance of 35.04 mm is 38.7° across the full image width. The projective imaging geometry means that the normals are computed with respect to the rays converging from the object through the perspective centre of the lens (Fig. 8). To transform them to a parallel imaging geometry, each normal needs to be rotated outwards from the Z axis by an angle corresponding to the distance of the pixel from the centre of the image plane. The operation is conveniently applied as a vector rotation, using the Rodrigues formula, and can be optimised in Matlab by treating the whole image as an array.
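The whole-image Rodrigues rotation can be sketched in array form as follows. The text refers to Matlab; this is an equivalent NumPy sketch, not the original code. It assumes that the rotation angle at each pixel is atan(d/f), where d is the distance of the pixel from the image centre and f the principal distance, and that the rotation axis is perpendicular to both the Z axis and the radial direction. The sign convention (image versus object orientation) is also an assumption.

```python
import numpy as np

def rotate_normals_outward(normals, pixel_mm, principal_distance_mm):
    """Rotate an (H, W, 3) array of unit normals away from the Z axis.

    Uses the Rodrigues formula n' = n cos(t) + (k x n) sin(t) + k (k.n)(1 - cos(t))
    with a per-pixel axis k and angle t, evaluated for the whole image at once.
    """
    rows, cols, _ = normals.shape
    v, u = np.mgrid[0:rows, 0:cols].astype(float)
    u = (u - (cols - 1) / 2.0) * pixel_mm          # x offset from centre, mm
    v = (v - (rows - 1) / 2.0) * pixel_mm          # y offset from centre, mm
    d = np.hypot(u, v)
    theta = np.arctan2(d, principal_distance_mm)   # assumed rotation angle
    safe_d = np.where(d > 0.0, d, 1.0)             # avoid division by zero at centre
    # Axis k = z x e, where e is the unit radial direction in the image plane.
    k = np.stack([-v / safe_d, u / safe_d, np.zeros_like(d)], axis=-1)
    cos_t = np.cos(theta)[..., None]
    sin_t = np.sin(theta)[..., None]
    k_dot_n = np.sum(k * normals, axis=-1, keepdims=True)
    return normals * cos_t + np.cross(k, normals) * sin_t + k * k_dot_n * (1.0 - cos_t)

# Angle of view quoted in the text: sensor width 24.6 mm, principal distance 35.04 mm.
fov_deg = 2.0 * np.degrees(np.arctan(24.6 / (2.0 * 35.04)))

# Toy example: a 3x5 image of normals all pointing along +Z.
flat = np.zeros((3, 5, 3))
flat[..., 2] = 1.0
rotated = rotate_normals_outward(flat, pixel_mm=1.0, principal_distance_mm=2.0)
print(round(float(fov_deg), 1), rotated[1, 4])
```

In the toy example the right-hand edge pixel lies 2 mm from the centre with a 2 mm principal distance, so its normal is tilted by 45° towards +X, while the centre pixel is unchanged.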

DENSE POINT CLOUD ANALYSIS
Internal and external orientation parameters computed with PhotoModeler (Section 2) were imported into PhotoScan, where the dense image matching was subsequently performed. This was carried out using the second-level image pyramid, corresponding to a quarter of the original full image resolution. The derived point cloud contained 2,342,510 points (Fig. 9), with an inter-point spacing or lateral resolution of about 0.15 mm. Because this was constructed from images taken from a zenith viewpoint, the point cloud is dense and well defined on horizontal surfaces, but sparse on vertical surfaces. The reconstruction of the six spheres is incomplete and their surfaces are eroded and very noisy due to their absence of features and texture. As expected, dense image matching fails in areas poor in texture and when the signal-to-noise ratio is low (e.g. blurred and out-of-focus images), producing noisy point clouds or no reconstruction. Comparison with the reference point cloud from the laser scanner showed a close match everywhere except in the six white balls, with an overall RMS error of 1.61 mm. When the spheres were excluded (i.e. all points with errors exceeding 1 mm) the RMS error for the remainder of the two point clouds was 0.15 mm, with a mean distance between points of 0.23 mm and stdev of 0.17 mm (Fig. 10).
The comparatively larger errors in the lower half of the angle fan, indicated by the green region in Fig. 10, led us to investigate the accuracy of the fan and staircase structures. The segmented surfaces on the 16 steps and 11 angles, subtending different distances and angles to the baseplate, were compared through calculated best-fit planes, by a least-squares method (Gaussian fit, 3 Sigma), using GOM Inspect software. When comparing standard deviation, residuals and maximum absolute deviation for both datasets, some revealing trends can be observed. Values of the seven steps closer to the baseplate show an average stdev of 0.09 mm and maximum of 0.14 mm, whereas the top eight steps show higher values of stdev 0.14 mm and max 0.41 mm (Fig. 11). A similar trend can be observed for angle direction in relation to the baseplate: angles between 0° and 4° show an average stdev of 0.06 mm and max 0.24 mm, whilst angles between 5° and 30° have an average stdev of 0.14 mm and max 0.45 mm. This leads to the conclusion that elements closer to the baseplate were imaged with more accuracy, probably because the lens was focussed at that distance and the higher elements were less sharp in the images.
Taking a cross-sectional slice of thickness 0.5 mm for constant Y through the point clouds from the scanner and the dense matching process gave subsets of 8,652 and 5,118 points respectively, plotted together in Fig. 12 as elevations of Z vs X. The alignment of the forms is close, and the staircase, gap gauge and angle fan elements are conspicuous. Enlarging a detail of the horizontal top surface of the staircase (left) shows that points from the dense matcher are scattered in Z, with a stdev of 0.079 mm, whereas the reference data from the scanner is much closer to a straight line, with a stdev of only 0.013 mm. There are evidently problems for the dense matching process with the definition of the vertical faces of the steps of the angle fan (right).
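The best-fit plane statistics reported above (standard deviation and maximum absolute deviation of the residuals) can be reproduced in outline with an orthogonal least-squares fit. The sketch below is not the GOM Inspect algorithm: it fits a plane by singular value decomposition and applies a single 3-sigma rejection pass, which is only one plausible reading of "Gaussian fit, 3 Sigma".

```python
import numpy as np

def fit_plane(points):
    """Orthogonal least-squares plane through an (N, 3) array of points.

    Returns the unit normal and the signed point-to-plane residuals.
    """
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]                       # direction of least variance
    residuals = (points - centroid) @ normal
    return normal, residuals

def plane_statistics(points, n_sigma=3.0):
    """Fit, reject residuals beyond n_sigma once (assumed), then refit."""
    normal, residuals = fit_plane(points)
    keep = np.abs(residuals) <= n_sigma * residuals.std()
    if keep.sum() >= 3 and not keep.all():
        normal, residuals = fit_plane(points[keep])
    return {
        "normal": normal,
        "stdev": float(residuals.std()),
        "max_abs": float(np.abs(residuals).max()),
    }

# Synthetic patch: a 4x4 grid on a horizontal plane with +/-0.01 mm checkerboard noise.
gx, gy = np.meshgrid(np.arange(4.0), np.arange(4.0))
gz = 0.01 * (-1.0) ** (gx + gy)
patch = np.column_stack([gx.ravel(), gy.ravel(), gz.ravel()])
stats = plane_statistics(patch)
print(stats["stdev"], stats["max_abs"])
```

Applied to each segmented step or angle surface in turn, statistics of this kind give the per-surface values that are plotted against step height and angle in Fig. 11.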

DIGITAL ELEVATION MODELS
As a pre-requisite to the merging operation described in the next section, a digital elevation model (DEM) was needed to match the view of the camera that produced the 2D surface normals (Fig. 7). The DEM was generated from the respective 3D point clouds for both the scanner and the dense matcher, by projecting the point data onto a pixel grid on the X-Y plane with a spatial resolution of 10 pixels/mm. At each pixel the highest of the candidate points was selected to represent the upper surface. The projection geometry (Fig. 13) simulated the image formation process of the Nikon lens in the fixed geometry of the dome, with the test object at the centre of the baseboard. The effect on the image was to displace the higher points (closer to the camera) outwards from the optical axis relative to points on the baseplane. By similarity of triangles in Fig. 13, the point P with coordinates (x, z) is projected onto the sensor to a pixel at address i by:
$$i = i_0 + \frac{f}{s} \cdot \frac{x - x_0}{z_0 - z} \tag{6}$$
where: $i_0$ is the centre of the image plane; $s$ is the pixel size; $f$ is the principal distance; $x_0$ is the centre of the baseplane (intersection of optical axis); and $z_0$ is the height of the baseplane, taken as its distance from the perspective centre, with $z$ measured upwards from the baseplane. In the dome with the Nikon zoom lens set to 35 mm focal length, the height of the perspective centre above the baseplane was 645 mm and the pixel size on the sensor was 6.05 µm. The resulting DEM image for the point cloud from the dense matcher (Fig. 14) shows many 'holes' in the surfaces because of the sparseness of the point cloud. A subsequent filtering operation, replacing each pixel by the maximum of its neighbours, served to fill the holes. The outward displacement of the higher points can be seen in Fig. 14 by comparison with Fig. 12 (top), in which the mapping for all points was parallel to the Z axis.
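The projection and hole-filling steps described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions: a distortion-free pin-hole camera looking straight down, object coordinates centred on the optical axis, heights measured upwards from the baseplane, and a 3x3 neighbourhood for the maximum filter (the text does not give the neighbourhood size). Function names are hypothetical.

```python
import numpy as np

def project_points_to_dem(points_mm, shape, f_mm, pixel_mm, camera_height_mm):
    """Project (N, 3) points onto a sensor grid, keeping the highest z per pixel.

    Empty pixels are left at -inf so that they can be recognised as holes.
    """
    rows, cols = shape
    x, y, z = points_mm[:, 0], points_mm[:, 1], points_mm[:, 2]
    # Similar triangles: magnification grows as the point approaches the camera.
    scale = f_mm / (pixel_mm * (camera_height_mm - z))   # pixels per mm at height z
    col = np.rint((cols - 1) / 2.0 + scale * x).astype(int)
    row = np.rint((rows - 1) / 2.0 + scale * y).astype(int)
    inside = (col >= 0) & (col < cols) & (row >= 0) & (row < rows)
    dem = np.full(shape, -np.inf)
    np.maximum.at(dem, (row[inside], col[inside]), z[inside])
    return dem

def fill_holes_max(dem):
    """Replace each pixel by the maximum over its 3x3 neighbourhood (assumed size)."""
    rows, cols = dem.shape
    padded = np.pad(dem, 1, mode="edge")
    shifted = [padded[r:r + rows, c:c + cols] for r in range(3) for c in range(3)]
    return np.max(shifted, axis=0)

# Two points at the same lateral position but different heights.
points = np.array([[10.0, 0.0, 0.0], [10.0, 0.0, 50.0]])
dem = project_points_to_dem(points, (101, 2001), f_mm=35.04,
                            pixel_mm=0.00605, camera_height_mm=645.0)
filled = fill_holes_max(dem)
print(np.argwhere(np.isfinite(dem)))
```

The raised point lands several pixels further from the image centre than the point on the baseplane, which is the outward displacement described in the text.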

ALIGNMENT AND MERGING OF DATA SETS
The photographic image from the camera in the dome and the pseudo-image DEM generated by projecting the 3D point cloud from the dense matcher are in general of different sizes and at different orientations. In order to be able to merge and compare the images, an efficient method is needed to determine both scale factor and rotation angle to bring them into alignment. Although this could be done on the 2D images by a search-and-correlate algorithm such as SIFT, the method preferred in this study was to identify the outline of the baseplate of the test object, which was square with rounded corners (Fig. 1). The algorithm for generating the outline for the photographic images taken in the dome, as illustrated in Fig. 15, made use of the good edge contrast against the white card placed underneath. First, the mean of the 16 images taken with Tier 3 illumination, i.e. lamps 33 to 48, was computed. The green channel was extracted and a 5x5 median filter applied. The mean intensities in the outside region (corner) and inside region (centre) were computed and the intensity threshold established. This enabled a binary mask to be made at the same size as the original image, with 0 for background (intensity < threshold) and 1 for foreground (intensity > threshold). Coordinate points around the image outline were then determined by scanning all rows from both left and right and all columns from both top and bottom. The resulting point sets were filtered to remove duplicate points and sorted into order of angle. This gave 9167 pixels, which at a resolution of 10.14 pixels/mm represented a perimeter length of 904 mm.
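The outline-generation steps can be illustrated on a toy image. In this sketch the threshold is assumed to lie midway between the two mean intensities (the text does not state how it was established), the foreground is taken to be whichever side of the threshold the inside region falls on, and the averaging and median-filter steps are omitted. Function names are hypothetical.

```python
import numpy as np

def binary_mask(image, outside_mean, inside_mean):
    """Foreground mask; threshold assumed midway between the two region means."""
    threshold = 0.5 * (outside_mean + inside_mean)
    if inside_mean > outside_mean:
        return image > threshold
    return image < threshold

def outline_points(mask):
    """Scan rows from left and right and columns from top and bottom.

    Returns the unique boundary points as (row, col), sorted by angle
    about their centroid.
    """
    found = set()
    for r in range(mask.shape[0]):
        cols = np.flatnonzero(mask[r])
        if cols.size:
            found.update({(r, int(cols[0])), (r, int(cols[-1]))})
    for c in range(mask.shape[1]):
        rows = np.flatnonzero(mask[:, c])
        if rows.size:
            found.update({(int(rows[0]), c), (int(rows[-1]), c)})
    pts = np.array(sorted(found), dtype=float)
    centre = pts.mean(axis=0)
    angles = np.arctan2(pts[:, 0] - centre[0], pts[:, 1] - centre[1])
    return pts[np.argsort(angles)]

# Toy image: a dark 6x6 square on a bright background.
image = np.full((10, 10), 200.0)
image[2:8, 2:8] = 20.0
mask = binary_mask(image, outside_mean=200.0, inside_mean=20.0)
outline = outline_points(mask)

# Perimeter quoted in the text: 9167 outline pixels at 10.14 pixels/mm.
perimeter_mm = 9167 / 10.14
print(len(outline), round(perimeter_mm))
```

For the 6x6 square the scan finds the 20 boundary pixels once duplicates at the corners are removed, and the quoted pixel count converts to the 904 mm perimeter given above.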
To 'square up' the outline so that straight edges were aligned with X and Y axes, a Hough transform was applied, giving the angle for rotation of the image. The outlines of the photometric image and the DEM image were correlated by interpolating the radius of each as a function of angle relative to the centroid, then 'sliding' one against the other in increments of 0.01°. The maximum correlation coefficient gave the angle of best fit, while the ratio of the outline lengths gave the scaling factor, from which a 2x2 matrix was constructed. Finally, lines of best fit through the outline masks of the transformed images were used to obtain translational offsets to align the two images to the nearest pixel.
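The radius-versus-angle correlation can be sketched as below. Instead of literally sliding one signature against the other, this version evaluates the circular cross-correlation for all shifts at once with an FFT, which gives the same maximum. The function names, the coarse 0.5° step used in the demonstration, and the asymmetric test shape are assumptions for illustration; a square outline would be ambiguous in steps of 90°, which the preceding Hough step resolves.

```python
import numpy as np

def radial_signature(points_xy, step_deg):
    """Radius as a function of angle about the centroid, on a regular angular grid."""
    offsets = points_xy - points_xy.mean(axis=0)
    angle = np.degrees(np.arctan2(offsets[:, 1], offsets[:, 0])) % 360.0
    radius = np.hypot(offsets[:, 0], offsets[:, 1])
    grid = np.arange(0.0, 360.0, step_deg)
    return np.interp(grid, angle, radius, period=360.0)

def best_rotation_deg(sig_a, sig_b, step_deg):
    """Angle by which outline b must be rotated to match outline a."""
    a = (sig_a - sig_a.mean()) / sig_a.std()
    b = (sig_b - sig_b.mean()) / sig_b.std()
    # Circular cross-correlation over all shifts, normalised to a coefficient.
    corr = np.fft.irfft(np.fft.rfft(a) * np.conj(np.fft.rfft(b)), n=a.size) / a.size
    shift = int(np.argmax(corr))
    return shift * step_deg, float(corr[shift])

def similarity_matrix(angle_deg, scale):
    """2x2 matrix combining rotation and isotropic scaling."""
    t = np.radians(angle_deg)
    return scale * np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

# Asymmetric closed outline and a copy rotated by 30 degrees.
t = np.linspace(0.0, 2.0 * np.pi, 720, endpoint=False)
r = 10.0 + 2.0 * np.cos(t) + np.cos(2.0 * t + 0.5)
shape = np.column_stack([r * np.cos(t), r * np.sin(t)])
rotated = shape @ similarity_matrix(30.0, 1.0).T

step = 0.5
angle, coeff = best_rotation_deg(radial_signature(rotated, step),
                                 radial_signature(shape, step), step)
print(angle, round(coeff, 3))
```

With a 0.01° step the signatures have 36,000 samples, for which the FFT form is considerably faster than an explicit loop over shifts.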
The two datasets were then merged, using a technique previously demonstrated for a terracotta relief (MacDonald, 2015). The low spatial frequencies from the point cloud generated by the dense matcher were combined with the high spatial frequencies from the photometric surface normals. First the DEM created from the projected point cloud, as described in the previous section, was treated as a monochrome image and transformed into the spatial frequency domain by a 2D FFT. Then the horizontal and vertical gradients computed from the photometric normals were transformed by FFT and integrated by the Frankot-Chellappa function. The log(power) spectra (Fig. 16) show more noise in the spectrum of the DEM. The merging used a smooth function based on the Hann filter to make a bilinear interpolation in radial frequency between the two distributions. The reconstructed DEM was obtained by an inverse FFT of the merged power spectrum. A cross-section through the 'gap gauge', the vertical structure in the centre of the Metric Test Object, shows the additional high frequency detail derived from the photometric normals, compared with the DEM derived from the dense matcher and a slice of the point cloud from the Arius laser scanner (Fig. 17).
The gap gauge is constructed from eight individual blocks of the same height, which present seven slots with reference depth (pit) of 7.5 mm and varying widths of: 0.1, 0.2, 0.3, 0.5, 1.0, 2.0 and 3.0 mm. In effect the photometric high frequency detail has modulated the underlying geometric structure represented by the point cloud from the dense matcher. This modulation has a tendency to overshoot, producing a 'ringing' at the corners of the gaps, and it does not penetrate down to the bottom of any of the pits in the gap gauge. However, it adds definition to the edges, even for the narrowest gap (0.1 mm) in the gauge, and thus enhances the visibility of fine detail, akin to an unsharp masking (USM) filter in image reproduction. Thus it produces a better-looking rendering of the test object, even though it contributes little to metrological accuracy.
Figure 17. Elevation along a vertical section of the gap gauge in reconstructed height (black), compared with the DEM from the dense matcher (green) and the reference height from the laser scanner (red). The section in the rectangle is enlarged above.
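The frequency-domain merging can be sketched as follows. The sketch assumes that the blending is applied to the complex Fourier spectra (not to power spectra alone, which would discard phase), that the weight falls from 1 to 0 as a raised-cosine (Hann-shaped) function of radial frequency between two cut-offs, and that gradients are available as p = dz/dx and q = dz/dy (from unit normals, p = -nx/nz and q = -ny/nz). The cut-off frequencies and function names are placeholders.

```python
import numpy as np

def frankot_chellappa_spectrum(p, q):
    """Spectrum of the surface whose gradients best match p = dz/dx, q = dz/dy."""
    rows, cols = p.shape
    u = 2.0 * np.pi * np.fft.fftfreq(cols)[None, :]   # angular frequency along x
    v = 2.0 * np.pi * np.fft.fftfreq(rows)[:, None]   # angular frequency along y
    denom = u**2 + v**2
    denom[0, 0] = 1.0                                  # avoid division by zero at DC
    spectrum = (-1j * u * np.fft.fft2(p) - 1j * v * np.fft.fft2(q)) / denom
    spectrum[0, 0] = 0.0                               # mean height is undetermined
    return spectrum

def hann_weight(shape, f_low, f_high):
    """Weight for the DEM: 1 below f_low, 0 above f_high, raised cosine between."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    rho = np.hypot(fx, fy)                             # radial frequency, cycles/pixel
    t = np.clip((rho - f_low) / (f_high - f_low), 0.0, 1.0)
    return 0.5 * (1.0 + np.cos(np.pi * t))

def merge_dem_with_gradients(dem, p, q, f_low=0.02, f_high=0.1):
    """Low frequencies from the DEM, high frequencies from integrated gradients."""
    alpha = hann_weight(dem.shape, f_low, f_high)
    merged = alpha * np.fft.fft2(dem) + (1.0 - alpha) * frankot_chellappa_spectrum(p, q)
    return np.real(np.fft.ifft2(merged))

# Consistency check on a periodic synthetic surface with analytic gradients.
n = 64
yy, xx = np.mgrid[0:n, 0:n].astype(float)
z = np.sin(2 * np.pi * 2 * xx / n) + 0.5 * np.cos(2 * np.pi * 5 * yy / n)
p = (2 * np.pi * 2 / n) * np.cos(2 * np.pi * 2 * xx / n)
q = -0.5 * (2 * np.pi * 5 / n) * np.sin(2 * np.pi * 5 * yy / n)
merged = merge_dem_with_gradients(z, p, q)
print(float(np.abs(merged - z).max()))
```

When the DEM and the gradients describe the same surface, as in this check, the merged result reproduces it regardless of the cut-offs. With real data the two sources disagree at high frequencies, and the weight determines which one dominates in each band.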

CONCLUSION
This study has shown that it is possible to make a reasonable 3D reconstruction of a test object, from a single camera at a fixed viewpoint in an illumination dome, by taking a series of images while systematically translating and tilting the object within the field of view. The network of images is equivalent to those obtained when moving a camera around a stationary object, and can be processed by a dense matching workflow. Factors limiting the quality of the point cloud included non-uniformity of the illumination, limited depth of field of the lens, and constraints on physical object movement within the hemisphere. Moreover, the Metric Test Object used in this study proved to be very challenging because of its metallic surface finish and fine surface structures down to 10 micron steps and 0.5 degree angles.
Illumination domes, which are intended mainly for 2.5D visualisation methods such as PTM, RTI and photometric stereo, are generally not used in conjunction with photogrammetric image matching techniques, and are not commonly considered for these strategies. The results of this experiment show that high-quality point clouds can be achieved, albeit with a preference for upward-facing geometric features and planes. This technique could produce repeatable 3D digital representations of small artefacts with controlled lighting, for dimensional monitoring. The enhancement of spatial detail by blending the high spatial frequencies obtained from surface normals by photometric stereo processing gives best results for small objects where the spatial resolution from the camera is significantly higher than, i.e. at least double, the resolution of a laser scanner. In the present study the photometric detail at 10 pixels/mm (5 line pairs per mm) was effectively added as an overlay to modulate the geometric surface from an improved dense matching method.
The benefits of using the Metric Test Object are to facilitate quality control and verification of specifications for 3D imaging methodologies through a rigorous procedure for the non-engineering user. This opens the way for integration of 3D imaging into museum workflows and so can assist heritage professionals, documentation specialists and practitioners in the creative industries. Although specific sensing devices have short development cycles, their underlying physical principles (light, transduction, electronics and signal processing) will endure. Both the test object and the methodology for its use will therefore remain relevant for the evaluation of emerging state-of-the-art sensors for close-range imaging. Currently under development is a new Metric Test Object v2.0, optimised for 3D image-matching and processing methods in cultural heritage, such as photogrammetry and Structure from Motion.

Figure 2. Positioning 3D test object on baseboard of dome.

Figure 3. Equivalent camera network for movements of object.

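Figure 5. Radial distortion in pixels (top) and distortion vectors of length x5 (bottom) for the Nikkor zoom lens set to 35 mm.
The actual image location (x', y') relative to the ideal location (x, y) is:
$$x' = x + x\,\Delta r / r + \Delta x, \qquad y' = y + y\,\Delta r / r + \Delta y$$
where $\Delta r$ is the radial distortion of Eq. (1), $\Delta x$ and $\Delta y$ are the decentring components of Eqs. (2) and (3), and $r$ is the radial distance from the principal point.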

Fig. 5 (top) shows the radial distortion of the lens calculated by Eq. (1) as a function of distance from the centre of the image plane, with values at the extremities of the Y and X axes of 8.6 and 30.8 pixels respectively. The radial distortion is the dominant component of the overall lens distortion, as shown by the distortion vectors in Fig. 5 (bottom), magnified by a factor of 5 for clarity. Geometric correction of the image requires inward movement, i.e. the value of each pixel in the output image has to be interpolated from the nearest neighbouring pixels at the outer end of the distortion vector.

Figure 6. Intensity distributions for one pixel across 64 images for: (top) white ball and (bottom) planar metallic surface. Cosine for a Lambertian surface of same albedo is shown in magenta.

Figure 7. Albedo (top) and normal vectors (bottom) for the 3D test object, generated by the bounded regression technique.

Figure 8. (top) Cosine value and (bottom) angle of projection vector across the image width.

Figure 9. 3D point cloud produced by PhotoScan from the improved SfM procedure.
The test object was scanned with an Arius3D Foundation model 150 laser scanner (mounted on a CMM), in order to produce a reference 3D dataset for the accuracy evaluation. The sampling grid is 0.1x0.1 mm with measurement uncertainty of ±0.035 mm in depth. After a best-fit alignment, using ICP, of the reference point cloud to the computed point cloud in CloudCompare, geometric analysis was performed on the point cloud. No prior filtering process was performed on the photogrammetric dense point clouds. The only pre-processing step consisted of an automated segmentation of patches, resulting in consistent point clouds of identical size in identical locations, to enable a fair comparison with the reference dataset.
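Once two point clouds are aligned, the comparison reduces to nearest-neighbour distances and summary statistics, with an optional cut-off such as the 1 mm limit used to exclude the spheres. The sketch below is a brute-force illustration with hypothetical function names, not the CloudCompare implementation; for clouds of millions of points a spatial index such as a k-d tree would be needed.

```python
import numpy as np

def nearest_distances(cloud, reference):
    """Distance from each point in `cloud` to its nearest point in `reference`.

    Brute force over all pairs, so suitable for small examples only.
    """
    diff = cloud[:, None, :] - reference[None, :, :]
    return np.sqrt((diff**2).sum(axis=2)).min(axis=1)

def rms(values):
    """Root-mean-square of an array of values."""
    return float(np.sqrt(np.mean(np.square(values))))

def summarise(distances, cutoff_mm=1.0):
    """RMS over all points, and statistics for points within the cut-off."""
    kept = distances[distances <= cutoff_mm]
    return {
        "rms_all": rms(distances),
        "rms_kept": rms(kept),
        "mean_kept": float(kept.mean()),
        "std_kept": float(kept.std()),
    }

# Tiny synthetic example: two close points and one gross outlier.
reference = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [20.0, 0.0, 0.0]])
cloud = np.array([[0.0, 0.0, 0.1], [10.0, 0.0, 0.1], [20.0, 0.0, 2.0]])
summary = summarise(nearest_distances(cloud, reference))
print(summary)
```

The example shows how a few gross outliers can dominate the overall RMS while the statistics for the remaining points stay small.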

Figure 10. Colour coded error map and histogram from CloudCompare, with error values limited to 1 mm.

Figure 11. Best-fit plane parameters plotted against (top) step heights and (bottom) angle directions in relation to the baseplate of the Metric Test Object.

Figure 12. (top) DEM derived from top-down parallel projection (flattening) of scanner point cloud, represented as a grey-scale image; (middle) cross-sectional slices through point clouds from scanner (red) and dense image matching (black); (bottom) details of horizontal surface of staircase (left) and angle fan (right).

Figure 13. Geometry of projection of points onto image sensor.

Figure 14. Point cloud from Dense Matcher projected to DEM by camera geometry, before filling.
The operation is equivalent to a resection of the point cloud, assuming a pin-hole lens with no distortion and external orientation of the camera aligned with the axes of the object.

Figure 15. Stages in generating an outline: (top left) mean image with cross-section; (top right) elevation of intensity; (bottom left) binary mask; (bottom right) outline point set.

Figure 16. (top left) log power spectrum of DEM generated from dense matcher point cloud; (top right) log power spectrum of photometric gradients; (bottom) slices of log spectra and alpha.

Table 2. Values of lens distortion parameters fitted by VMS.