THE FUSION OF EXTERNAL AND INTERNAL 3D PHOTOGRAMMETRIC MODELS AS A TOOL TO INVESTIGATE THE ANCIENT HUMAN/CAVE INTERACTION: THE LA SASSA CASE STUDY

Caves have been used by humans and animals for thousands of years up to the present but, over these time scales, their structure can change rapidly due to erosion and concretion processes. For this reason, precise 3D models improve the quality and quantity of the data, allowing the reconstruction of a cave's ancient appearance, structure and origin. However, caves are usually characterised by lack of light, high relative humidity, narrow spaces and complex morphology, so traditional topographic instruments often cannot be employed. In the La Sassa cave (Sonnino, Italy) a large deposit ranging from the Pleistocene to the Second World War has been found, and stratigraphic evidence suggests that the shape of the cave and its entrance might once have been different. This paper presents the fusion of the internal and external 3D photogrammetric models of the La Sassa cave, made to support the archaeological excavations. A Nikon camera with a fisheye lens and a smartphone camera were used to survey the interior of the cave, while an aerial drone was employed for the external area. The two models were georeferenced and scaled using GCPs acquired with a dual-frequency GNSS (GPS and GLONASS) receiver. A low-resolution DTM derived from a previous aerial laser scanning survey and the 3D models were processed in CloudCompare to highlight the complete morphology of the cave and its surroundings.


INTRODUCTION
Caves represent a unique challenge to scientific study. The systematic exploration and documentation of these structures provide an essential foundation for cave research. These environments are typically difficult to access and their entrances are usually not visible on maps or on satellite and aerial images. Karst features can be considered as windows into the subsurface, providing the opportunity to study the environment for many purposes. From an archaeological point of view, they often preserve exceptional evidence of human presence in an area. Caves have been used as burial places or temporary shelters for millennia, and their complex structures often raise several questions regarding, for example, their relationship with the surface features or the surrounding caves. Moreover, the morphology of a cave derives from a combination of lithological and structural settings, as well as from possible water flows. However, the appearance of a cave is subject to relatively rapid changes due to erosion and concretion processes (Ford and Williams, 2007), and hypotheses about its development are essential in research on ancient human/cave interaction. The entire underground environment, together with its morphologies and deposits, has to be considered and analysed to provide a complete and accurate definition of the cave structure, origin and development. Consequently, the data have to be acquired in three dimensions and with appropriate equipment: a high-resolution 3D model can improve the quality and quantity of the data and support a fully developed study. It is also important to acquire the fullest possible documentation of the area in a short time, typically during the archaeological excavation campaign, which is usually a seasonal activity. Building a three-dimensional model of a cave, however, usually poses some substantial issues.
The most challenging problems are related to the physical obstacles of the environment (Trimmis, 2018). The operators usually work in harsh scenarios, characterized by lack of light, humidity, cold temperatures and possibly dangerous areas which require specific training. DGPS and geodetic stations usually employed in surface surveys cannot be used in caves: GPS does not work properly inside a cave, and the use of total stations is limited by space and environmental constraints. Modern technologies are now available to survey these environments, in addition to the traditional methods based on compass, tape, etc. Those based on tacheometric surveying instruments are the most complex, due to their intensive acquisition and data processing. The first scanning devices, whose use in challenging cave environments was still troublesome and restricted to easily accessible caves, have since been improved to give high-resolution and high-precision results (Mohammed Oludare and Pradhan, 2016). However, the generally high cost of Terrestrial Laser Scanners (TLS) and the difficulties of using them in restricted spaces have made digital photogrammetry increasingly common in these environments. This technology is based on digital camera images and constitutes a cheaper and more flexible method that can be applied in various environments (Stocchi et al., 2017). The cameras are lightweight and can be mounted on mobile and remote shooting systems (e.g. aerial drones), and the results can be processed with commercial software packages such as Agisoft Metashape and Pix4D Mapper Pro (Caroti et al., 2015; Colomina and Molina, 2014; Masiero et al., 2014; Nex and Remondino, 2014).

Figure 1 - The GCPs acquired for the internal (orange) and external (blue) surveys
The digitalization of this process also allows the 3D model to be continuously updated by adding new acquisitions to the initial outcome. This advantage proved to be fundamental both when new branches become accessible and during archaeological excavations, when the cave literally changes its shape due to the removal of deposits and rock debris. This is the case of the La Sassa Cave, where a Pleistocene fauna deposit, a Copper Age burial place and (supposedly) cultic Middle Bronze Age activities have been investigated (Alessandri et al., 2019a; Alessandri and Rolfo, 2015). In the cave, the inner part, the so-called Area RA, could not originally be acquired due to its limited accessibility, and the early model was updated at a later stage with further acquisitions. In this paper, we detail the overall acquisition process which leads to a complete 3D model of the La Sassa Cave through the fusion of the external and internal 3D models of the area. The 3D model, together with a low-resolution DTM obtained from an older aerial laser scanning survey, has finally been processed in CloudCompare 2.10.2 to highlight the complete morphology of the cave and its external surroundings.

The ground control points network
Both the internal and the external survey needed photogrammetric markers placed on the surfaces to be surveyed: the walls of the cave and the ground surface, respectively. Unfortunately, in both cases, the photogrammetric markers could not be guaranteed to remain stable over time because of possible external disturbances (e.g. atmospheric events). In such a case, the markers are usually placed and surveyed simultaneously, but for organizational and logistic reasons the two photogrammetric surveys could not be carried out at the same time. Thus, particular attention had to be paid to the geodetic datum, which has to be the same in both surveys. For the internal survey (which was carried out first) three points were acquired with a Topcon Legacy-E GNSS receiver on tripods (Figure 1, orange dots). They were used to georeference the photogrammetric cloud of the internal survey, which started from the outside area where the three GNSS points were placed. For the drone flight, 10 GCPs were acquired (Figure 1, blue dots) using the same Legacy-E GNSS receiver, again in post-processing mode because of the reduced coverage of the 4G mobile network in that area. Some points were temporarily materialized with markers in the proximity of the cave. Further away, natural points were identified on man-made features such as fence walls. These points could be reused for subsequent surveys.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2020, 2020 XXIV ISPRS Congress (2020 edition)
In both cases, the post-processing was carried out with RTKLIB (ver 2.4.2) using the network of permanent GNSS stations managed by the Lazio Region. The closest permanent GNSS station is located in Fondi (FOND), about 17 km from the survey site (Figure 2). The permanent stations of the GNSS network of the Lazio Region are set in the ETRF-89 datum, framed in the national network (EPSG: 6708). Where possible, the VRS (Virtual Reference Station) approach was used, obtaining fixed solutions with final accuracies on baseline components in the order of 50 mm.

The cave environment and survey setup
Carrying out a survey in a cave presents many issues which can affect the outcome and lead to unpredictable effects. Subterranean, deep and/or narrow spaces are usually characterized by high humidity, dripping water, lack of light and absence of GNSS signal, especially in the innermost areas. Some problems may arise in the acquisition phase (instrument failures are not considered here) and affect the processing stages. The operator may come across obstacles while passing through tunnels or low chambers, so care should be taken while performing the acquisition. Quite often, limited resources and time are available for the survey, so the use of classic methods, very reliable but expensive, could be prohibitive. Nonetheless, cave exploration and study require a proper representation of the complexity of the entire environment (Trimmis, 2018). Thanks to more advanced technologies, the effect of these problems can be mitigated, and accurate and reliable 3D models can be obtained. Terrestrial laser scanning (TLS) and close-range photogrammetry constitute a substantial improvement over traditional methods, often representing a faster solution while keeping the output unbiased and scientifically repeatable (Jordan, 2017). TLS is recognized as a precise and reliable method to scan caves. LiDAR systems allow the scanning of wide areas in total darkness and with very high precision. However, this technology is very expensive and not easy to use in narrow spaces.
More versatile methods based on digital photogrammetry help overcome these problems, assuring remarkable robustness and versatility in many fields. The use of photogrammetric software to process the images acquired by a camera can speed up the process, providing a scalable 3D model. In this context, the recently developed Structure from Motion (SfM) algorithms implement a highly redundant auto-calibration through the automatic detection of points of interest on the set of images (Alessandri et al., 2019a). In this way, the external orientation parameters of the images and the 3D coordinates of the extracted points can be estimated simultaneously, reducing the processing time (Del Pizzo and Troisi, 2011). Close-range photogrammetry, and specifically the SfM algorithms, have proven their reliability, especially in underground environments (Troisi et al., 2015; Troisi et al., 2017). SLAM (Simultaneous Localization and Mapping) techniques also provide optimal results in real time, both in terms of speed and accuracy. The improvement behind this solution is an optimization of the SfM algorithms (Mouragnon et al., 2009), which has recently been extended to integrate sensors such as range cameras and laser scanners (Biber et al., 2004; Cole et al., 2006). Even though certain problems are still to be resolved, this latter methodology is becoming widely employed, especially in harsh environments, since the acquisition can be made with lightweight cameras and is usually faster because it does not need significant preliminary preparation. In this regard, some considerations on the camera lens should be made. If a standard lens is used, very accurate 3D models can be obtained, but many images are required to guarantee an adequate overlap between adjacent shots since the field of view (FOV) is limited. A fisheye lens, on the contrary, enlarges the FOV (up to 180°) and consequently the image overlap, allowing the acquisition of more information in a single image.
In this case, the drawback is the introduction of an inevitable distortion. The images are processed by software packages which use well-established algorithms to extract the geometry of the 3D model from the points of interest shared between adjacent images. Since information can be lost during the acquisition, a minimum overlap of 65-85% should be assured (Alessandri et al., 2020). Some targets can be placed inside and outside the cave to help the orientation of the images. The distances measured between them can be used to constrain the model deformations, as well as to strengthen the camera network (Alessandri et al., 2019b). The software processes the tie points to build the sparse point cloud. After the densification process, the final 3D model is obtained, defined in an arbitrary reference system with an arbitrary scale. Further processing of the GCPs and the CPs (automatically recognized by the software) allows the final model to be georeferenced and scaled, ready to be analysed. This efficient and robust technique is widely used since it automates the photogrammetric procedures and lowers the costs, especially if compared to other methods.
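The role of the measured target distances in fixing the arbitrary scale of the model can be illustrated with a minimal sketch. It assumes that the scale is estimated as a single least-squares factor between the distances computed in the model frame and those measured on site; the target names, coordinates and distances are hypothetical, and photogrammetric packages apply such constraints inside the bundle adjustment rather than as a separate step.

```python
import math

def scale_factor(model_targets, measured_distances):
    """Least-squares scale s minimising sum((s * d_model - d_measured)^2)."""
    num = 0.0
    den = 0.0
    for (a, b), d_measured in measured_distances.items():
        d_model = math.dist(model_targets[a], model_targets[b])
        num += d_model * d_measured
        den += d_model * d_model
    return num / den

# Hypothetical target coordinates in the arbitrary model frame (model units).
model_targets = {
    "T1": (0.0, 0.0, 0.0),
    "T2": (2.0, 0.0, 0.0),
    "T3": (0.0, 1.5, 0.0),
}

# Hypothetical distances measured between the targets on site (metres).
measured_distances = {
    ("T1", "T2"): 4.0,
    ("T1", "T3"): 3.0,
    ("T2", "T3"): 5.0,
}

s = scale_factor(model_targets, measured_distances)

# Apply the scale to bring the model coordinates to metric units.
scaled = {name: tuple(s * c for c in xyz) for name, xyz in model_targets.items()}
print(f"scale factor: {s:.3f}")
```

The residuals between the scaled model distances and the measured ones give a simple check on the deformation of the model.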

Drone survey setup
The outside area was surveyed using the photogrammetric approach, employing a DJI Phantom 4 Pro quadcopter drone. This vehicle was chosen according to the size of the survey area, the mean terrain slope, and the required flight time and range. The flight altitude was set at 50 metres to avoid trees and other obstacles, such as power lines, and to assure a Ground Sample Distance of 1.4 cm. The drone was equipped with a digital camera optimized for aerial acquisitions with the following characteristics: CMOS sensor with a pixel size of 2.4 μm and wide-angle lens (equivalent focal length of 24 mm). Because of the presence of obstacles such as trees and a power line, the flight was carefully planned using the software tool Pix4DCapture. At the end of the planning process, the survey area was fully covered using about 350 images with a longitudinal overlap of 80% and a transverse overlap of 20%. To overcome the low accuracy of the drone GNSS receiver, the coordinates of 10 non-coded cross targets were acquired using a Topcon Legacy-E dual-frequency GNSS (GPS and GLONASS) receiver and its antenna (Figure 1). The GCPs, previously processed in post-processing mode, were inserted in Pix4DMapperPRO software (ver. 3.1.23) to georeference and correctly scale the model.
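The relation between flight altitude, pixel size and Ground Sample Distance can be checked with a short sketch. It uses the usual nadir-image relation GSD = H · p / f. The pixel size and altitude are those given above, while the physical focal length of 8.8 mm is an assumption (the text reports only the 24 mm equivalent), and the image size of 5472 x 3648 pixels is the one reported for the drone images.

```python
def ground_sample_distance(height_m, pixel_size_m, focal_length_m):
    """GSD of a nadir image: flight height times pixel size over focal length."""
    return height_m * pixel_size_m / focal_length_m

FLIGHT_HEIGHT_M = 50.0      # from the survey setup
PIXEL_SIZE_M = 2.4e-6       # from the survey setup
FOCAL_LENGTH_M = 8.8e-3     # ASSUMPTION: physical focal length, not stated in the text

gsd = ground_sample_distance(FLIGHT_HEIGHT_M, PIXEL_SIZE_M, FOCAL_LENGTH_M)

# Ground footprint of a single image, from the image size in pixels.
footprint_long_m = 5472 * gsd
footprint_short_m = 3648 * gsd

print(f"GSD: {gsd * 100:.2f} cm")
print(f"footprint: {footprint_long_m:.1f} m x {footprint_short_m:.1f} m")
```

Under these assumptions the result rounds to the 1.4 cm quoted for the flight.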

Cave survey
The 3D model of the cave (Figure 3) was obtained from two photogrammetric surveys carried out in two different periods. The first survey of the La Sassa Cave was made with a full-frame DSLR Nikon D800E camera equipped with a Nikkor 16 mm fisheye lens. The use of a fisheye lens is particularly recommended when the length of the environment outweighs the other two dimensions, for example in tunnels (Troisi et al., 2017; Perfetti et al., 2017). The camera was set in 1080p video mode, which both allows the frames to be extracted automatically from the recording (substantially reducing the survey time) and enlarges the effective pixel size, improving the per-pixel signal-to-noise ratio. A first check was made to evaluate the proper arrangement of the light sources, since the light conditions are crucial to the overall result: shadows can be caused by lights placed behind the moving operator, and saturated images can result from a light directed towards the camera. Several targets were then placed inside and outside the cave, and some distances between them were measured. The GCPs were acquired in the first part of the video to georeference the model. The transition between the bright outer environment and the dark inner one was performed gradually to avoid abrupt light variations. To obtain convergent images in line with the close-range configuration, the operator followed specific instructions during the recording phase.
About one year after the first survey, a new important room (Area RA) was discovered by the archaeologists. A new survey was then planned to add the Area RA to the 3D model. The operator could not reach this area since the access was very tight and proper speleological training was essential. Moreover, due to the limited operating space, a Xiaomi Mi9 smartphone camera was used instead of the Nikon camera. It was set in video mode (16:9, 1080p, 30 fps) and the survey was made using only the light produced by the smartphone lamp.

Internal images elaboration
The acquired videos were processed by extracting single frames at an interval chosen to obtain a minimum overlap of 80% between two adjacent images. The resulting dataset was then processed in free network adjustment using Agisoft Metashape 1.6.0. The bundle block adjustment of the first survey was then rerun including the previously acquired GCP coordinates and the scale constraints provided by the distances between the targets measured inside the cave. The second block was aligned to the first one using some manually recognized natural points and two common circular coded targets, whose position had not changed over time.
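How the frame interval relates to the required overlap can be sketched with a simple model. It assumes the operator moves at a constant speed along the gallery and that each frame covers a known length along the walking direction; both values below are hypothetical, as are the function names, and only the 80% overlap and the 30 fps frame rate come from the surveys described here.

```python
import math

def frame_step(overlap, footprint_m, speed_m_s, fps):
    """Largest frame interval that still keeps at least the requested overlap."""
    max_shift_m = (1.0 - overlap) * footprint_m
    return max(1, math.floor(max_shift_m / speed_m_s * fps))

def achieved_overlap(step, footprint_m, speed_m_s, fps):
    """Overlap between two frames extracted `step` frames apart."""
    shift_m = step / fps * speed_m_s
    return 1.0 - shift_m / footprint_m

MIN_OVERLAP = 0.80    # from the text
FPS = 30              # video frame rate of the smartphone survey
FOOTPRINT_M = 2.0     # HYPOTHETICAL: scene length covered by one frame
SPEED_M_S = 0.35      # HYPOTHETICAL: walking speed of the operator

step = frame_step(MIN_OVERLAP, FOOTPRINT_M, SPEED_M_S, FPS)
overlap = achieved_overlap(step, FOOTPRINT_M, SPEED_M_S, FPS)
print(f"extract one frame every {step} frames (overlap {overlap:.1%})")
```

In a real cave the footprint changes with the distance to the walls, so the interval would be set conservatively for the narrowest passages.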
The alignment of the two resulting dense point clouds was subsequently refined using the ICP (Iterative Closest Point) algorithm on specific overlapping areas. The image blocks were aligned using the ETRF-89 system.
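The principle of the ICP refinement can be illustrated with a deliberately simplified sketch. It works in 2D, where the best rotation for matched points has a closed form, and uses brute-force nearest-neighbour matching on a small hypothetical point set. The refinement described here was run on 3D dense clouds, where the rotation step is usually solved with an SVD-based method; this sketch is not that implementation.

```python
import math

def nearest(p, cloud):
    """Brute-force nearest neighbour of point p in cloud."""
    return min(cloud, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def best_rigid_2d(src, dst):
    """Closed-form least-squares rotation and translation for paired 2D points."""
    n = len(src)
    cxs = sum(x for x, _ in src) / n
    cys = sum(y for _, y in src) / n
    cxd = sum(x for x, _ in dst) / n
    cyd = sum(y for _, y in dst) / n
    s_cos = s_sin = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        ax, ay = xs - cxs, ys - cys
        bx, by = xd - cxd, yd - cyd
        s_cos += ax * bx + ay * by
        s_sin += ax * by - ay * bx
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    return theta, cxd - (c * cxs - s * cys), cyd - (s * cxs + c * cys)

def transform(points, theta, tx, ty):
    """Rotate by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def rms_to_nearest(points, cloud):
    """RMS of the distances from each point to its nearest neighbour in cloud."""
    sq = [math.dist(p, nearest(p, cloud)) ** 2 for p in points]
    return math.sqrt(sum(sq) / len(sq))

def icp_2d(source, target, iterations=10):
    """Alternate nearest-neighbour matching and rigid fitting."""
    current = list(source)
    for _ in range(iterations):
        matches = [nearest(p, target) for p in current]
        theta, tx, ty = best_rigid_2d(current, matches)
        current = transform(current, theta, tx, ty)
    return current

# Hypothetical overlapping area: the source is the target slightly misaligned.
target = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0),
          (3.0, 1.0), (3.0, 2.0), (1.5, 1.2)]
source = transform(target, math.radians(3.0), 0.05, -0.03)

rms_before = rms_to_nearest(source, target)
aligned = icp_2d(source, target)
rms_after = rms_to_nearest(aligned, target)
print(f"RMS before: {rms_before:.3f} m, after: {rms_after:.6f} m")
```

As in the procedure above, ICP only refines an alignment that is already approximately correct; the initial alignment on common targets and natural points provides that starting position.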

Drone image elaboration
The drone acquisition provided 343 nadir images with a resolution of 5472 x 3648 pixels. The orientation of the image set was computed using Pix4D Mapper PRO software. Specifically, the bundle block adjustment was performed using the coordinates of 5 GCPs with an accuracy of 0.05 metres (Figure 4). The process used 363262 tie points to align all images, with an RMS reprojection error of 1.90 pixels and an average point multiplicity of 2.8. The RMS reprojection error on the GCPs is 0.79 pixels, and the RMS on the solution is 0.03 metres. The other five GNSS points were added to the project as Check Points (CPs), giving an RMS reprojection error of 0.82 pixels and a spatial RMSE of 0.09 metres. Afterwards, point cloud densification was performed, producing a dense cloud of about 25 million points (Figure 4).
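The spatial RMSE on the check points is the root mean square of the 3D distances between the surveyed coordinates and those estimated in the oriented model. A minimal sketch follows; the coordinates are hypothetical and chosen only to make the arithmetic easy to follow, they are not the residuals of this survey.

```python
import math

def spatial_rmse(surveyed, estimated):
    """Root mean square of the 3D distances between paired points."""
    sq = [math.dist(p, q) ** 2 for p, q in zip(surveyed, estimated)]
    return math.sqrt(sum(sq) / len(sq))

# HYPOTHETICAL check point coordinates (E, N, h) in metres.
surveyed = [
    (100.00, 200.00, 50.00),
    (150.00, 220.00, 52.00),
    (130.00, 260.00, 55.00),
]
# HYPOTHETICAL coordinates of the same points read from the oriented model.
estimated = [
    (100.03, 200.04, 50.00),
    (150.00, 220.00, 52.05),
    (129.96, 260.03, 55.00),
]

rmse = spatial_rmse(surveyed, estimated)
print(f"check point RMSE: {rmse:.3f} m")
```

Because the check points are not used in the bundle block adjustment, this value is an independent estimate of the model accuracy, unlike the residuals on the GCPs.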

Figure 4 - The image block oriented with Pix4D
The DTM was extracted from the dense point cloud using the CANUPO plugin of CloudCompare (Brodu and Lague, 2012). The final DTM was linked to the ETRF-89 reference frame.

RESULTS
The procedure described in the previous section produced three 3D models. A detailed inspection of the evolution of the cave was performed by merging the two internal models. This operation was made using the common internal markers of the overlapping area, identified both on the two image sets and on the corresponding dense point clouds (Figure 5); the distances measured between the internal targets were used to control the deformations. The alignment refinement made with the ICP algorithm yielded an RMS of the differences of only 12 mm. A further merging step linked this resulting model to the external one, using the previously employed GCPs to refer the outcome to the ETRF-89 frame.
To extend the investigation area, a low-resolution Digital Terrain Model (DTM) obtained from an older aerial laser scanning survey, available from the Italian Ministero dell'Ambiente e della Tutela del Territorio e del Mare, was inserted in the project (Figure 6). A more detailed analysis of the morphology of the cave and its surroundings was made in CloudCompare, where a standard tool was used to obtain several cross-sections and slices of the project. Figure 7 shows one of the planes used to slice the model and Figure 8 shows the corresponding slice, two metres wide.
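The idea behind a slice of given width can be sketched as a simple point filter. The sketch assumes the slice is defined by a plane (a point and a normal vector) and keeps the points whose distance from the plane is at most half the slice width; the coordinates are hypothetical, and this is only an illustration of the geometry, not the CloudCompare tool itself.

```python
import math

def slice_points(points, plane_point, plane_normal, width_m):
    """Keep the points lying within width_m / 2 of the plane."""
    norm = math.sqrt(sum(c * c for c in plane_normal))
    unit = tuple(c / norm for c in plane_normal)
    half_width = width_m / 2.0
    kept = []
    for p in points:
        # Signed distance of p from the plane along the unit normal.
        d = sum((pc - oc) * nc for pc, oc, nc in zip(p, plane_point, unit))
        if abs(d) <= half_width:
            kept.append(p)
    return kept

# HYPOTHETICAL points (metres); the slicing plane is x = 0.
cloud = [
    (-2.0, 1.0, 0.5),
    (-0.5, 3.0, 1.0),
    (0.0, 0.0, 0.0),
    (0.9, 2.0, 4.0),
    (1.5, 1.0, 2.0),
]

section = slice_points(cloud, plane_point=(0.0, 0.0, 0.0),
                       plane_normal=(2.0, 0.0, 0.0), width_m=2.0)
print(f"{len(section)} of {len(cloud)} points fall in the 2 m slice")
```

Projecting the kept points onto the plane then gives the cross-section profile.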

CONCLUSION
In this paper, an approach to merge several 3D models in a single reference frame is presented. The availability of a single 3D model constituted a major step in the research process around the La Sassa cave. Some points can be made.
• The 3D model, coupled with archaeological and geological observations, allows a detailed interpretation of the origin and development of the cave and its archaeological deposit;
• The availability of the model, together with the possibility of a near real-time update during the excavation, constituted a valid aid to plan the type and extent of the subsequent investigations;
• The use of a consumer camera allows the 3D model to be built in the inner portion of the cave, where neither a TLS nor even a reflex camera can be used because of space constraints;
• The use of SfM algorithms minimizes the acquisition time and allows the acquisition to be delegated to operators with no 3D expertise. This is a major advantage since the 3D experts do not need to be on-site to update the model.

Figure 7 - One of the slice planes used to investigate the cave evolution