SFM TECHNIQUE AND FOCUS STACKING FOR DIGITAL DOCUMENTATION OF ARCHAEOLOGICAL ARTIFACTS

Digital documentation and high-quality 3D representation are increasingly requested in many disciplines, thanks to the large number of technologies and data available for fast and detailed documentation. This work investigates medium and small sized artefacts and presents a fast, low-cost acquisition system that produces 3D models with a high level of detail, making the digitization of cultural heritage a simple and fast procedure. The 3D models of the artefacts are created with the photogrammetric technique Structure from Motion, which yields, in addition to three-dimensional models, high-definition images for in-depth study and understanding of the artefacts. For the survey of small objects (only a few centimetres) a macro lens is used together with focus stacking, a photographic technique that consists in capturing a stack of images at different focus planes for each camera pose so that a final image with a greater depth of field can be obtained. The acquisition with the focus stacking technique was finally validated against an acquisition with a Minolta triangulation laser scanner, which shows that the deviation between the two is compatible with the allowable error in relation to the expected precision.


INTRODUCTION
As digital technologies advance rapidly, digital libraries (DL) for cultural heritage (CH) assets have evolved. A DL of high-quality three-dimensional models would constitute a great improvement. The main benefits are that DLs are durable and unalterable; they can be used for heritage protection and to create digital museum archives documenting a great number of pieces; they allow experts to check for possible accidental or man-made alterations of a work; and, finally, they can be adopted to create novel models of cultural heritage fruition. The documentation of CH with digital techniques aimed at obtaining 3D models of high photo-realistic quality is now considered a necessary and vital practice for a complete monitoring, protection and maintenance plan.
Small artefacts have always been a real challenge for 3D modelling, as they usually present severe difficulties for 3D reconstruction. Lately, the demand for 3D models of small artefacts has increased dramatically. Today many different techniques are available for this task. The success rate of all these methods depends on the equipment used, on the methodology itself and, of course, on the object properties. Currently we can distinguish three main approaches to optical recording, documentation and visualisation: image-based, range-based and a combination of the two. In this paper we report an experience with the photogrammetric technique Structure from Motion with focus stacking image acquisition, and its validation by triangulation laser scanning.
Imaging cameras, particularly those with long focal lengths, usually have only a finite depth of field. In an image captured by such a camera, only the objects within the depth of field are in focus, while other objects are blurred. To obtain an image that is in focus everywhere, i.e., an image with every object in focus, we usually need to fuse images taken from the same point of view under different focal settings. The aim of image fusion is to integrate complementary and redundant information from multiple images to create a composite that contains a 'better' description of the scene than any of the individual source images. Image fusion plays important roles in many different fields such as remote sensing, biomedical imaging, computer vision and defence systems. Multi-focus image fusion is a key research field of image fusion.

STATE OF THE ART
Nowadays, the most common techniques used for three-dimensional (3D) modelling are based on images (including photogrammetry) and range data (structured light, coded light or laser light). Some authors combine both techniques to enhance the qualities of each one, achieving interesting results when applied to cultural heritage. In (Rizzi, et al., 2007), the objective is to give an overview of the different techniques and the methodology used for the digital recording of detailed objects, monuments and large structures. Regarding the role of focus in the photogrammetric process, (Huang & Zhongliang, 2007) proposed a method to assess focus measures according to their capability of distinguishing focused image blocks from defocused ones. In (Sanz, et al., 2010), some digital image correlation techniques for 3D modelling are briefly reported and discussed; that work describes in detail the low-cost photo-scanner methodology employed for the recording, modelling and virtual visualization of objects.
For small objects, (Girardi, 2011) tested digital photogrammetry techniques with macro lenses and laser scanners, in particular of the triangulation type, in order to assess their potential for cultural heritage. For three-dimensional reconstruction, (Atsushi, et al., 2011) employ the shape-from-silhouette (SFS) method to construct a voxel-based 3D model from silhouette images. The results of (Gallo, et al., 2012) show that, by multi-focus image fusion, it is possible to obtain high-quality textured 3D models of objects with dimensions ranging from a few millimetres to a few centimetres, usable both for interactive measurements and for virtual presentations. Later, the same authors (Gallo, et al., 2014) presented a new methodology for the 3D reconstruction of small sized objects based on a multi-view passive stereo technique applied to a sequence of macro images. Their approach addresses the limited depth of field of macro images by using an image fusion algorithm to extend the depth of field of the images used in the photogrammetric process. In (Plisson & Zotkina, 2015) the authors tested, at open-air and underground sites in France, Portugal and Russia, the potential of photogrammetry and focus stacking for 3D recording of millimetric and submillimetric details of prehistoric petroglyphs and paintings, along with original simple optical solutions. (Evgenikou & Georgopoulos, 2015) show different methodologies for the three-dimensional reconstruction of small artefacts, taking into account the special properties of the objects, such as their complex geometry and shape and their material and colour properties. Even if the potential of the image-based 3D reconstruction approach is nowadays very well known in terms of reliability, accuracy and flexibility, there is still a lack of low-cost, open-source and automated solutions for collecting masses of archaeological findings, especially if we consider the real contextual aspects of a digitization campaign in situ (Gattet, et al., 2015).

ACQUISITION AND MODELING
This section presents the Structure from Motion (SFM) technique and the GSD definition with the rotation of the artefact; the focus stacking and multi-image acquisition; and the laser scanning processing. The validation method is described in section 4 and we conclude this work in section 5.

Structure from Motion
SFM makes it possible to build three-dimensional models semi-automatically from a set of photographs. However, the image acquisition phase can require a very significant amount of time, depending on the location of the artefact to be surveyed, on the expected scale of the digital model and on the instrumentation used. Digital three-dimensional models can be used for different purposes: to communicate the findings in the museum or via the web, or as instruments for the protection and preservation of cultural heritage. In an archaeological museum, where the number of artefacts is very high and their digitization can take a long time, the acquisition phase must be optimized and made expeditious in order to reduce the subsequent data processing times. We have therefore developed a quick and low-cost acquisition system that guarantees excellent photographic quality. This system is suitable for the acquisition of medium-small sized items that can be moved from their usual location, and it is composed of a white photographic box (80 x 80 x 80 cm), two lamps (13 W, 5500 K) and a turntable (diameter 39 cm) (Figure 1). The archaeological find to be digitized is placed inside the photographic box, which diffuses the light of the two lamps so as not to create sharp shadows on the artefact. The box is also used to isolate the find from the objects and the space around it, so that the digital reconstruction takes place more quickly and easily. Once the acquisition has been planned, the turntable makes it possible to rotate the artefact between two consecutive pictures while keeping the camera fixed. The turntable is equipped with a graduated scale to measure the angle of rotation between pictures, two metric scales used to orient and scale the final model, and a ColorChecker for the white balance and the colour calibration of the camera.
It is essential in the planning phase to choose the expected scale of the final 3D model. It depends on the distance of the camera from the artefact, on the focal length used and on the pixel size, and hence on the characteristics of the camera sensor. It is necessary to determine the Ground Sample Distance (GSD). The digital image is measured in pixels, and the GSD is the distance between two consecutive pixel centres measured on the artefact. Once we have chosen the acquisition distance and the focal length to obtain a certain GSD, and have set the camera parameters to obtain a sharp, in-focus picture, all that remains is to decide the number of shots needed for total coverage of the artefact, so that the final 3D model is complete in all its parts. Therefore, we need to choose by how many degrees to rotate the turntable between two consecutive pictures and at what heights to place the camera.
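As a worked illustration of the GSD definition, the following minimal sketch assumes the usual pinhole relation (pixel size multiplied by distance and divided by focal length). The function and parameter names are illustrative. The sensor width (35.9 mm) and image width (7360 px) are nominal values for a Nikon D810 and are an assumption, not data given in the text; the distance and focal length are those reported later for the Venus of Frasassi survey.

```python
def ground_sample_distance(sensor_width_mm, image_width_px, distance_mm, focal_length_mm):
    """Return the GSD in mm/pixel under a simple pinhole-camera approximation."""
    pixel_size_mm = sensor_width_mm / image_width_px
    return pixel_size_mm * distance_mm / focal_length_mm


# Assumed nominal sensor data for a 36-megapixel full-frame camera:
# 35.9 mm sensor width and 7360 pixels across (not stated in the text).
gsd_mm = ground_sample_distance(35.9, 7360, distance_mm=430, focal_length_mm=105)
print(f"GSD = {gsd_mm:.3f} mm/pixel")  # prints GSD = 0.020 mm/pixel
```

Under these assumptions the result agrees with the GSD of 0.020 mm reported for the survey.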
SFM needs a large amount of data to reconstruct a good-quality model, but too many photos lead to very long calculation times without adding useful information. Since two consecutive pictures must overlap by at least about 60%, we optimize the acquisition phase by calculating the angle of rotation required to satisfy this requirement.
A simplified model of the acquisition phase was developed in which the artefact is represented as a cylinder, the shape that best approximates the morphology of the surveyed artefacts (Figure 2). Taking into account the characteristics of the camera sensor, the acquisition distance, the focal length, the size of the artefact and the overlap between two pictures, the number of degrees by which to rotate the turntable is calculated automatically (1).
where
α = degrees of rotation between two pictures
D = distance from the artefact
r = radius of the artefact
% = percentage of overlap between two consecutive pictures
The percentage of overlap in architectural-scale surveys must be at least 60%, but for small objects such as archaeological finds an overlap of at least 80% is recommended. In fact, in the side areas, where the straight lines of the theoretical model are tangent, the faces of the object are almost parallel to the direction of shooting and the details are difficult to distinguish. Furthermore, the artefact will certainly have morphological irregularities that the theoretical model does not possess: in these areas it is difficult to find homologous points for the alignment and the reconstruction of the model. It is also possible to notice that the portion of the artefact visible from the camera (the circumferential arc determined by the central angle β) only depends on the distance D from the object: the closer we are to the artefact, the smaller the visible portion. We have therefore built a spreadsheet which, given as input the size and resolution of the camera sensor, the acquisition distance, the focal length used, the size of the artefact and the desired percentage of overlap between shots, returns the GSD and the angle of rotation of the turntable.
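The kind of calculation performed by such a spreadsheet can be sketched as follows. This is only one plausible reading of the cylindrical model and not a reproduction of Equation (1): it assumes that D is measured from the camera to the surface of the cylinder, that the visible arc β is bounded by the two tangent lines from the camera, and that consecutive views must share the given fraction of that arc. All function names are hypothetical.

```python
import math


def visible_arc_deg(radius, distance):
    """Central angle (degrees) of the arc between the two tangent lines from the camera.

    Assumption: `distance` is measured from the camera to the cylinder surface,
    so the camera is `radius + distance` away from the cylinder axis.
    """
    return 2.0 * math.degrees(math.acos(radius / (radius + distance)))


def rotation_step_deg(radius, distance, overlap):
    """Turntable step (degrees) that leaves `overlap` (0..1) of the visible arc shared."""
    return visible_arc_deg(radius, distance) * (1.0 - overlap)


def total_views(step_deg, n_heights):
    """Number of camera poses for one full revolution at each camera height."""
    return math.ceil(360.0 / step_deg) * n_heights


# Illustrative values: a cylinder of 13 mm radius seen from 430 mm with 80% overlap.
max_step = rotation_step_deg(13.0, 430.0, 0.80)
print(f"largest step under these assumptions: {max_step:.1f} degrees")

# A 15 degree step at three camera heights gives 24 x 3 poses.
print(total_views(15.0, 3))  # prints 72
```

In this simplified geometry the visible arc shrinks as the camera approaches the object, in line with the remark above, and any step smaller than the computed one satisfies the chosen overlap.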
Figure 3. Some of the artefacts digitized at the National Archaeological Museum of Marche
The vertical movement of the camera between two revolutions around the artefact was chosen depending on the morphology of the object: objects with concave and convex or protruding parts require a greater number of photographs to cover the shadow areas. In this way the acquisition phase is very expeditious, with a photographic acquisition time of about 20 minutes for each artefact (Figure 3).
For the survey of small objects (only a few centimetres) it is necessary to increase the focal length and to bring the camera close to the artefact. Two main problems arise when operating in this way: standard lenses have a minimum working distance that does not allow the camera to get close enough, and the depth of field in these conditions is so small that only a small portion of the image appears sharp enough to be used for 3D reconstruction. The first problem can be solved by using a macro lens, which has a smaller working distance than standard lenses, while the problem related to the depth of field is solved with the focus stacking technique.

Focus stacking
Focus stacking is a photographic technique that consists in capturing a stack of images at different focus planes for each camera pose (Figure 4). The main problem encountered in macro photography is the short depth of field (DOF): even with the lens set to the minimum available aperture, only a small portion of the artefact is in focus. The problem depends on the ratio between the physical size of the subject and that of the sensor, and it becomes more pronounced with increasing magnification. Furthermore, the zone of sharp focus is parallel to the sensor plane, so when photographing a subject that is not parallel to the sensor, the sharp area of the artefact will be even smaller. To increase the DOF while keeping the GSD unchanged, it is possible to increase the acquisition distance from the object and use a sensor with a very high resolution. This, however, in addition to an increase in costs, involves a loss of image quality due to diffraction (the smaller the photodiodes of the sensor, the more visible the effect of diffraction at the same aperture).
To overcome the problem of the short DOF the only solution is, in fact, to use focus stacking. The acquisition, and therefore the movement of the focal plane between the various shots, can take place with two different methods:
- keeping the camera fixed on a tripod and changing the focal plane, manually or automatically with software;
- keeping the focus of the lens fixed and rigidly moving the entire camera/lens system on a micrometric slide, so that the focal plane is moved as well.
In the first case, each change of focus brings a variation of the focal length, which changes the field of view and the perspective. In the second, by moving the lens on a slide, the field of view does not change, which increases the final image quality and reduces the occurrence of image problems. On the other hand, the first method is cheaper and faster because everything can be automated by software that, nowadays, also manages to eliminate the majority of the problems created by the stitching of the photos. The number of photos needed to cover the distance between the nearest and the farthest focal plane depends on the extent of the depth of field of each shot. The fusion methods implemented in such software are based on pixel variance algorithms, edge detection, contrast measurements and multi-resolution approaches. They align the images by performing scale transformations (to compensate for camera movements towards the object or changes in focus settings), translations and rotations (to compensate for camera shake). In the acquisition phase, the camera was connected to a notebook running the Helicon Remote software, which has a Live View option and makes it possible to set the near and far focal planes for the focus stack and then shoot automatically. We used this technique for the acquisition of the Venus of Frasassi (Figure 5), a small calcareous statue (8.7 x 2.6 cm) of the Palaeolithic and today one of the most important artefacts of the National Archaeological
Museum of Marche. The fortuitous discovery of the "suffering Venus" statue dates back to 2007, when a speleologist, during an excursion, found the small statue in the cave of the "Beata Vergine". This cave is also called the Shrine's cave because of a small 19th-century temple by the architect Valadier. The site is located in the central part of the Frasassi gorge, which was carved by the Esino river in the Apennines of the Marche. The presence of the statue, which belongs to the series of female figures called "Venus" because of their iconographic features, indicates that the site had a religious value in the past. The small statue is one of the rare examples of Upper Palaeolithic art. Objects of this kind are called "arte mobiliare" (portable art) because of the small dimensions of the pieces, carved from stones, bones and pebbles. The statue was carved from a stalactite and, despite its small dimensions, it appears majestic, probably because of the proportions of its parts. As already pointed out, this extraordinary statue, although it is included in a common classification, has unique characteristics. As regards style, we notice a unique mix of naturalistic elements, abstraction and sketchiness. Clearly, displaying this find in a museum poses serious problems, because its unique elements cannot be appreciated with the unaided eye owing to their small dimensions. That is why this statue requires the study and use of new technologies. The pictures were taken with a Nikon D810, a full-frame reflex camera with 36 megapixels, equipped with a Nikkor 105 mm macro lens, designed for magnification up to 1:1 with a minimum working distance of 31 cm. The pictures were taken from a distance of 43 cm, with a focal length of 105 mm, obtaining a GSD of 0.020 mm. The rotation of the turntable was set to 15° and three different heights were chosen, for a total of 72 images for the photogrammetric project within Agisoft PhotoScan 1.1.6. All the pictures were taken at ISO 100, with an exposure time of 1.6 s and
aperture f/32, with a resulting depth of field of 2.44 cm. Although a larger aperture would have guaranteed sharper and more detailed pictures by avoiding or reducing diffraction problems, it was decided to use f/32 to considerably reduce the number of shots and make the acquisition phase more expeditious. In Helicon Remote, once the near and far focal planes are set, the number of pictures to take at different focal planes, which will then be fused together, must be chosen. The number of pictures depends on two factors: the depth that has to be covered to have the whole artefact in focus and the effective DOF of each picture taken. The depth to be covered depends on the dimensions and morphology of the artefact and on the position from which the pictures are taken. The DOF of each shot depends on the camera sensor, the focal length, the aperture and the distance from the artefact. These parameters do not change during the whole acquisition phase, so the DOF of each shot is always the same.
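The dependence of the per-shot DOF on focal length, aperture and distance can be illustrated with the standard thin-lens depth-of-field formulas. The sketch below is an approximation under stated assumptions: a circle of confusion of 0.03 mm (a common full-frame convention, not given in the text), the nominal f-number, and the 43 cm acquisition distance treated as the focus distance from the lens. The helper `frames_per_stack` and its `overlap` parameter are hypothetical and only show how the number of frames follows from the depth to be covered.

```python
import math


def depth_of_field_mm(focal_mm, f_number, focus_distance_mm, coc_mm=0.03):
    """Total DOF (mm) from the thin-lens near and far limits of acceptable sharpness."""
    h = focal_mm ** 2 / (f_number * coc_mm)  # hyperfocal distance minus focal length
    s = focus_distance_mm
    near = s * h / (h + (s - focal_mm))
    far = s * h / (h - (s - focal_mm))  # valid while s - focal_mm < h
    return far - near


def frames_per_stack(depth_mm, dof_mm, overlap=0.0):
    """Frames needed so that consecutive sharp zones cover `depth_mm`.

    `overlap` is the fraction of each sharp zone shared with the next frame.
    """
    if depth_mm <= dof_mm:
        return 1
    step = dof_mm * (1.0 - overlap)
    return 1 + math.ceil((depth_mm - dof_mm) / step)


dof = depth_of_field_mm(105, 32, 430)
print(f"DOF per shot: {dof:.1f} mm")  # prints DOF per shot: 24.4 mm

# Generic example: 60 mm to cover with 20 mm sharp zones overlapping by half.
print(frames_per_stack(depth_mm=60, dof_mm=20, overlap=0.5))  # prints 5
```

With these assumptions the computed value is close to the 2.44 cm stated above. If the 6 and 10 shots mentioned in the following paragraph are read as frames per camera pose, 24 poses per revolution give 24 x (6 + 10 + 10) = 624 frames, in line with the roughly 600 shots reported.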
To make the acquisition phase more expeditious we decided not to choose the near and far focal planes for each shooting position, but to choose the worst condition for each revolution, where the depth to be covered is largest. This resulted in a higher number of shots than necessary but led to a considerable speeding up of the work. The total number of shots taken for the whole acquisition of the Venus is around 600, with 6 shots per stack for the revolution perpendicular to the Venus and 10 shots per stack for the upper and lower revolutions. Once the acquisition phase had ended, the shots were fused and rendered to obtain the whole artefact sharp and in focus (Figure 6). After careful analysis of the results obtained with different software packages, we chose Helicon Focus for the image fusion because the quality of the resulting images was better and there was less blur due to stitching. Helicon Focus has three different methods for image fusion, and each one works best depending on the type of image, the number of images in the stack, and whether the images were shot in random or consecutive order:
- method A: computes the weight for each pixel based on its contrast, after which all the pixels from all the source images are averaged according to their weights;
- method B: finds the source image where the sharpest pixel is located and creates a "depth map" from this information. This method requires that the images be shot in consecutive order from front to back or vice versa;
- method C: uses a pyramid approach to image representation. It gives good results in complex cases (intersecting objects, edges, deep stacks) but increases contrast and glare.
Method C was chosen for the image fusion because the other methods created problems of blur and stitching in the border area between the Venus and the background of the photographic box. The result of the fusion of all the images was 72 images exported from Helicon Focus in DNG format (Figure 7).
Table 1. Parameters about the artefact dimensions, the photographic acquisition and the point cloud
The photographic box and the small depth of field made it possible to import the JPEG files into PhotoScan without creating masks to eliminate disturbing elements. Even the dense cloud created does not need to be cleaned of scattered points, except in the area of the Venus pedestal. The creation of the 3D digital model is therefore completely automated. The alignment of the images, which produced a sparse cloud of 98 k points and 250 k projections, has an average alignment error of 0.674 pix (max 0.722 and min 0.630 pix). The dense cloud, obtained using the highest quality parameters and therefore the full-resolution images, has 42 mln points with a ground resolution of 1.44 x 10^-5 m/pix (Table 1; Figure 8).
Figure 8. The 42 mln dense point cloud
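To make the idea behind a depth-map style of fusion (as in the description of method B earlier in this section) more concrete, the following toy sketch picks, for every pixel, the frame of the stack with the highest local contrast. It is not the algorithm of Helicon Focus, whose internals are not described here: the Laplacian focus measure, the function names and the tiny test images are all assumptions made for illustration, and the frames are assumed to be already aligned.

```python
def local_contrast(img, y, x):
    """Focus measure: absolute 4-neighbour Laplacian at (y, x), edges replicated."""
    h, w = len(img), len(img[0])

    def px(yy, xx):
        return img[min(max(yy, 0), h - 1)][min(max(xx, 0), w - 1)]

    return abs(4 * px(y, x) - px(y - 1, x) - px(y + 1, x) - px(y, x - 1) - px(y, x + 1))


def fuse_stack(stack):
    """Return (fused image, depth map) for a list of aligned greyscale frames.

    Each frame is a list of rows; the depth map stores the index of the frame
    judged sharpest at each pixel.
    """
    h, w = len(stack[0]), len(stack[0][0])
    fused = [[0] * w for _ in range(h)]
    depth_map = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best = max(range(len(stack)), key=lambda k: local_contrast(stack[k], y, x))
            depth_map[y][x] = best
            fused[y][x] = stack[best][y][x]
    return fused, depth_map


# Toy stack: a checkerboard texture that is sharp on the left half of frame 0
# and on the right half of frame 1; the out-of-focus half is a flat grey.
sharp = [[100 * ((x + y) % 2) for x in range(4)] for y in range(4)]
frame0 = [[sharp[y][x] if x < 2 else 50 for x in range(4)] for y in range(4)]
frame1 = [[sharp[y][x] if x >= 2 else 50 for x in range(4)] for y in range(4)]
fused, depth_map = fuse_stack([frame0, frame1])
print(depth_map[0])  # prints [0, 0, 1, 1]
```

In this toy case the fused image recovers the full checkerboard. A per-pixel selection of this kind is sensitive to noise and to borders between object and background, which is consistent with the border problems reported above for the non-pyramidal methods.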

Triangulation laser scanning
The focus stacking technique is not, in theory, compatible with photogrammetric theory: the input image to the SFM calculation is the result of a combination of several pictures, and the shooting parameters (EXIF data) can no longer be considered valid for the interior orientation. Even though we did not change the camera and lens parameters between the pictures of the same stack, we have to take into account that a macro lens behaves similarly to a zoom lens, so the internal parameters change from one picture to another. The acquisition with the focus stacking technique was therefore validated against an acquisition with a Konica Minolta Range 7 triangulation laser scanner, to demonstrate that the deviation is compatible with the allowable error in relation to the expected precision. The accuracy of the laser scanner is ±40 µm and its precision is 4 µm. The exposure and the focus were set to automatic mode, and three passes of the scanner over each component were chosen for each single scan.
Figure 9. The overlap between different scans after the alignment
We made 13 scans from a variable distance (min 50 cm, max 68 cm). During the acquisition, a first three-point alignment was also made directly in Range Viewer, the software that controls the scanner. In this way, we obtained a preview of the scans to verify the surface portions acquired up to that moment. After the acquisition, the clouds were aligned within the Polyworks software using best-fit alignment, an iterative algorithm that calculates the alignment minimizing the distance between overlapping surfaces in a group of scans whose points of acquisition are unknown (Figure 9). The alignment was made using only 10 of the 13 scans and produced a mean standard deviation of 0.0257 mm (min 0.0198; max 0.0338). The result of the alignment of the scans is a point cloud of 1.53 mln points. The overlap between different scans was then reduced, obtaining a final point cloud of 898 k points with a mean point spacing of 0.0928 mm. Only at this point was a mesh model generated, in which we decided to close all the holes smaller than 0.4 mm (Figure 10).

VALIDATION
The point cloud from SFM and the mesh model have very different resolutions, so the point cloud was decimated to have a number of points comparable to the laser model. Within the open-source software CloudCompare 2.6.2 the point cloud from SFM was decimated using the space method, setting the minimum distance between points. This distance was set to 0.0928 mm, the same as the mean spacing of the laser point cloud. The SFM point cloud was thus decimated from 42 mln to 900 k points. Only then were the point cloud and the mesh model aligned in CloudCompare. First, they were manually aligned by picking the same four points on both entities, and a roto-translation matrix was applied to the SFM point cloud. Then the iterative ICP algorithm was used to refine the alignment. In this phase, in addition to a roto-translation matrix, the point cloud was also scaled to best fit the laser mesh model. The scale factor applied is 0.9972. The colour map (Figure 11) and the Gauss distribution (Figure 12) of the distances between the models show that the mean distance is -0.010 mm and the standard deviation (σ) is 0.075 mm. These values, however, are negatively affected by the areas of the Venus that the laser could not acquire (Figure 13). In fact, the points with the largest deviations correspond to the areas where the mesh is missing due to the slightly reflective material of the Venus. These points were therefore excluded from the calculation of the distances. Considering that the range (-3σ, +3σ) contains 99.73% of the points, the distances were calculated again, setting as the maximum distance 3σ from the mean value, i.e. 0.235 mm (= 3 * std. dev. + |mean|). Keeping only these points, the mean distance decreases to -0.009 mm and the standard deviation improves further to 0.050 mm (Figure 14, Figure 15). The mesh model is very detailed and makes it possible to discover details otherwise difficult or impossible to detect. By illuminating the mesh model without texture it was possible to find special engravings on the Venus that seem to indicate a face. These are just suppositions, but only with a high-precision digitization was it possible to study the artefact in depth.
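The outlier rejection used in this validation can be sketched as follows. This is a minimal illustration and not the CloudCompare procedure itself: it assumes that the signed cloud-to-mesh distances are available as a plain list, and it applies the usual rule of keeping the values within 3σ of the mean before recomputing the statistics. The function name and the synthetic distances are hypothetical.

```python
import statistics


def three_sigma_filter(distances):
    """Keep the distances within 3 standard deviations of the mean.

    Returns the retained values together with the mean and the standard
    deviation computed on the unfiltered input.
    """
    mean = statistics.fmean(distances)
    sigma = statistics.pstdev(distances)
    kept = [d for d in distances if abs(d - mean) <= 3.0 * sigma]
    return kept, mean, sigma


# Synthetic signed distances (mm): small deviations plus one gross outlier,
# standing in for a point that falls where the reference mesh is missing.
distances = [-0.05, -0.02, -0.01, 0.0, 0.0, 0.01, 0.02, 0.04] * 5 + [2.0]
kept, mean, sigma = three_sigma_filter(distances)

print(len(distances), len(kept))  # prints 41 40
print(f"sigma before: {sigma:.3f} mm, after: {statistics.pstdev(kept):.3f} mm")
```

As in the comparison above, removing the few points that lie far from the reference surface reduces the standard deviation while leaving the mean close to zero.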

CONCLUSIONS
The proposed study shows the efficacy of a semi-automatic, low-cost and high-speed system for the acquisition phase of the photogrammetric survey of medium-small sized artefacts with the SFM technique. It also tested the use of a particular photographic technique, focus stacking, for the survey of small objects, obtaining good results and high-definition 3D models.
Today the preservation and promotion of cultural heritage cannot do without technological updating and digital communication, and in a campaign of digital documentation of the archaeological heritage it is important to acquire a large amount of data with quick and low-cost systems. This work shows that the main scale for 3D modelling in archaeological studies is macroscopy, which can be addressed with photogrammetry and consumer digital and optical devices. High magnification, when relevant, requires only one specialized element to be added to the ordinary photo equipment (a macro lens) but involves another imaging process (focus stacking), which generates a completely focused image and a 3D model by compiling the elevation data of the surface relief.

Figure 1. Instrumentation used for the photographic acquisition

Figure 2. Model to calculate the rotation of the artefact during the acquisition phase

Figure 4. Movement of the focal plane during the acquisition with focus stacking

Figure 6. A single shot with the focus on the upper part of the Venus, and the result of the fused pictures

Figure 7. The pictures are taken with focus stacking, then fused together and finally used in PhotoScan
After processing the DNG files with the Adobe Photoshop Camera Raw plug-in, all 72 images were saved in JPEG format at maximum quality and imported into Agisoft PhotoScan 1.1.6. Here the model was built following the typical pipeline for 3D reconstruction: alignment of the images, scaling and orientation of the model, construction of the dense cloud, mesh model and final texturing.

Figure 11.Colour map of the distances between point cloud from SFM and model from laser

Table 2. Comparison between point clouds