COMPARISON OF DIVER-OPERATED UNDERWATER PHOTOGRAMMETRIC SYSTEMS FOR CORAL REEF MONITORING

Underwater photogrammetry is a well-established technique for measuring and modelling the subaquatic environment in fields ranging from archaeology to marine ecology. While for simple tasks the acquisition and processing of images have become straightforward, applications requiring relative accuracy better than 1:1000 are still considered challenging. This study focuses on the metric evaluation of different off-the-shelf camera systems for making high resolution and high accuracy measurements for monitoring coral reefs through time, where the variations to be measured are in the range of a few centimeters per year. High quality and low-cost systems (reflex and mirrorless vs action cameras, i.e. GoPro) with multiple lenses (prime and zoom), different fields of view (from fisheye to moderate wide angle), pressure housing materials and lens ports (dome and flat) are compared. Tests are repeated at different camera-to-object distances to investigate distance-dependent errors and assess the accuracy of the photogrammetrically derived models. An extensive statistical analysis of the different systems is performed and comparisons against reference control points measured through a high precision underwater geodetic network are reported.


INTRODUCTION

Underwater photogrammetry
Underwater photogrammetry is widely employed for exploration and mapping of the marine environment. Its flexibility and low cost, along with the availability of easy-to-use processing tools, have made it very popular among scientists and practitioners in several fields, including archaeology (Menna et al., 2018) and marine ecology (Figueira et al., 2015; Storlazzi et al., 2016). Numerous examples in the literature describe research and experiments presenting the results of photogrammetry carried out by divers (Piazza et al., 2018; Capra et al., 2015; Guo et al., 2016), as well as by remotely operated or autonomous underwater vehicles (Drap et al., 2015). Some of these studies have examined factors limiting the wider use of photogrammetry in underwater environments, including water turbidity, which may significantly affect image quality, and light absorption in water, which influences the colour appearance of the images (Mangeruga et al., 2018). The presence of an underwater port in front of the lens alters the image formation geometry, introducing optical aberrations (Menna et al., 2016) and, in the case of flat ports, also refraction effects and distortions, which translate into a departure from the classic photogrammetric mathematical model (Maas, 2015). In addition, practical limitations arise when working underwater. Previous papers (Neyer et al., 2018; Capra et al., 2017; Skarlatos et al., 2017) have discussed the issues of establishing highly accurate geodetic networks underwater.

Background
This study is a continuation of the work presented in Guo et al. (2016) and Neyer et al. (2018), and represents a portion of the Moorea Island Digital Ecosystem Avatar (IDEA) project (https://mooreaidea.ethz.ch/). Promoted by an inter-disciplinary and international team of researchers, the IDEA project aims to digitize an entire island ecosystem at different scales, from island to microbes. Within this broad context, photogrammetry is carried out at different epochs to provide not only a digital representation of the underwater ecosystem, but also to add time as the fourth dimension to the classic 3D representation. The multi-temporal modelling approach constitutes the basis for studying how physical, chemical, biological, economic and social processes interact.

Study outline
The paper will present the results of efforts made to significantly improve the measurements of the underwater reference network. It will also focus on a comparative analysis of different underwater camera systems, with the aim of investigating the accuracy potential of single vs multi-camera systems and high quality off-the-shelf (i.e., digital single-lens reflex -DSLR and 5-GoPro: 5-head camera system with GoPro cameras named GoPro41 to GoPro45, where GoPro45 is the nadir looking camera.
Table 1 summarises the full specifications of the employed camera systems, which are shown in Figure 1.
Results from a test area (Figure 2) with a size of roughly 5 m × 5 m, a maximum height difference of about 1 m and an average depth of about 12 m will be presented. The plot has been surveyed with all the systems at two different heights above the reef, or working distances (2 m and 5 m, except for the D300, which was used at a working distance of 2 m only), with the purpose of investigating distance-dependent errors and assessing the accuracy of the photogrammetrically derived products.

ESTABLISHMENT OF THE COORDINATE REFERENCE SYSTEM
Nine points have been established within the test area using photogrammetric coded targets (Figure 3a), placed on top of 30 cm high poles to ensure their visibility during the image recording process and to allow automatic recognition and image coordinate measurement. The poles are screwed into threaded, stainless steel anchors drilled into the coral reef matrix. Due to the limited working time underwater, only five points (reference points RP-1 to RP-5 in Figure 2) were measured (Capra et al., 2017) through trilateration (Figure 3b) and relative height differences (Figure 3c).
These five points were then used to establish a local coordinate reference system. The coordinates of the additional four points (P-21 to P-24) were not measured within the geodetic network, but were used to compare the results of the photogrammetric processing (Section 4). The local geodetic network was solved using Trinet+ software (Guillaume et al., 2008), as a free network solution. This approach provides optimal results in terms of inner coordinate accuracy, minimizing the mean variance of point coordinates (i.e., the cofactor matrix Qxx has minimal trace compared to all other adjustments with minimum datum). In an additional step, the results of the free network adjustment were transformed via a rigid 3D Helmert transformation onto a control point network computed with minimal constraints to define a consistent common datum (for details see Neyer et al., 2018). This procedure removes the bias of the free network result but preserves the optimal inner coordinate accuracy. We were able to obtain final average standard errors of 1.3 mm in planimetry and 1.5 mm in height.
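
As a compact reminder of the model behind this datum step, a rigid 3D transformation can be written as below. This is a generic sketch, not the formulation used in Trinet+ or in Neyer et al. (2018): the symbols (x_i^free, x_i^ctrl, R, t, n_c) are introduced here for illustration only, the scale is assumed to be fixed to 1 (consistent with the term "rigid"), and a least-squares estimate of the six parameters is assumed.

```latex
\begin{align}
  \mathbf{x}_i^{\mathrm{dat}} &= \mathbf{R}\,\mathbf{x}_i^{\mathrm{free}} + \mathbf{t},
  \qquad \mathbf{R}^{\top}\mathbf{R} = \mathbf{I}, \quad \det\mathbf{R} = 1 \\
  (\hat{\mathbf{R}}, \hat{\mathbf{t}}) &= \arg\min_{\mathbf{R},\,\mathbf{t}}
  \sum_{i=1}^{n_c} \left\| \mathbf{x}_i^{\mathrm{ctrl}} - \mathbf{R}\,\mathbf{x}_i^{\mathrm{free}} - \mathbf{t} \right\|^2
\end{align}
```

Because a rotation and a translation leave all inter-point distances unchanged, a transformation of this form moves the free network solution into the common datum without altering its inner geometry.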

CAMERA SYSTEMS SET-UP AND IMAGE PRE-PROCESSING
Table 2 reports the main photographic settings selected to ensure optimal image quality with the different cameras employed.
The three single camera systems (PL41, N750 and N300) were configured in single shot mode. To collect synchronised data, the PL51-PL52 stereo and 5-GoPro systems were used in time lapse and video mode, respectively. The stereo system synchronization was achieved by manually and simultaneously initialising the image acquisition for the two cameras.
For the 5-GoPro system, the video mode was selected because a waterproof multi-camera hardware-based synchronization approach would require the modification of the factory pressure housing and the development of an in-house system. Video synchronization was then achieved via cross-correlation of the audio signals. The nadir looking camera (GoPro45) was selected as master and the delays of the other four cameras were estimated. Frames were extracted from each video stream at a fixed rate (1 fps) in the lossless png format. The png frames were then converted to jpg at the highest possible quality; relevant exif tags were also embedded, allowing photogrammetric software applications to automatically recognise images coming from different cameras and estimate the initial values for camera calibration (Nocerino et al., 2018). To improve the colour appearance and contrast of the GoPro frames, a red filter was used (Figure 4a and b). Images from the high-quality off-the-shelf cameras (PL41, PL51, PL52, N750 and N300) were acquired in RAW format. RAW files contain uncompressed and minimally processed data captured by the image sensor, making it possible to perform white balance adjustment before converting the images to the jpg format (Figure 4c and d). The black part of the photogrammetric coded targets visible in the images is employed as neutral reference for the white balance process. For the PL51, PL52, N750 and N300 cameras, each image acquisition was carried out with fixed focus, set for the first image of the sequence. A +4 dioptre was mounted on the N300 to allow the camera to properly focus underwater at the shortest focal length (i.e., 18 mm).
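
The audio-based synchronization described above can be illustrated with a minimal sketch. It is not the code used in this study: the function names `estimate_delay` and `lag_to_seconds`, the brute-force search and the `max_lag` parameter are assumptions made for illustration. It takes two audio tracks as plain lists of samples recorded at the same sampling rate and returns the lag that maximises their cross-correlation.

```python
def estimate_delay(master, other, max_lag):
    """Return the lag (in samples) by which `other` is delayed relative to `master`.

    A positive result means the same sound appears later in `other`.
    Brute-force cross-correlation over lags in [-max_lag, max_lag].
    """
    best_lag = 0
    best_score = float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, m in enumerate(master):
            j = i + lag
            if 0 <= j < len(other):
                score += m * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def lag_to_seconds(lag, sample_rate):
    """Convert a lag in samples to seconds for a given audio sampling rate."""
    return lag / sample_rate


# Synthetic example: a short pulse that arrives 7 samples later in the second track.
master_track = [0.0] * 100
master_track[30] = 1.0
master_track[31] = 0.5
delayed_track = [0.0] * 7 + master_track[:-7]

lag = estimate_delay(master_track, delayed_track, max_lag=20)
print(lag, lag_to_seconds(lag, sample_rate=48000))
```

For real recordings with millions of samples, an FFT-based correlation would be preferable to this quadratic search. The estimated delay of each camera can then be used to offset the time stamps at which frames are extracted from its video stream.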

PHOTOGRAMMETRIC PROCESSING
The collected image datasets were processed following a free network self-calibrating bundle adjustment approach, using both Agisoft Metashape (V 1.5) and DBAT¹ (V 0.8.5.1; Börlin and Grussenmeyer, 2013). The two software tools produced results that were not significantly different. Eight different cases for two working distances, 2 m and 5 m, were considered, i.e. the five high quality off-the-shelf cameras, the nadir looking GoPro, and the stereo and 5-GoPro systems.
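The coordinates of the five RPs were used a posteriori to define the datum and served as check points (CPs) to assess the achieved accuracy. The accuracy of the object space coordinates was then computed empirically as follows:

$$RMSE_X = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_{Photo,i} - X_{RP,i}\right)^2} \tag{1}$$

$$RMSE_Y = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_{Photo,i} - Y_{RP,i}\right)^2} \tag{2}$$

$$RMSE_Z = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Z_{Photo,i} - Z_{RP,i}\right)^2} \tag{3}$$

$$RMSE_{XY} = \sqrt{RMSE_X^2 + RMSE_Y^2} \tag{4}$$

$$RMSE_{XYZ} = \sqrt{RMSE_X^2 + RMSE_Y^2 + RMSE_Z^2} \tag{5}$$

Where:
• the subscripts Photo and RP indicate the photogrammetrically derived and geodetic network point coordinates, respectively
• X and Y define the horizontal plane and Z is along the vertical direction
• n is the number of points compared and i is the point index.

¹ https://github.com/niclasborlin/dbat/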
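
The empirical accuracy measures can be sketched in a few lines of code. The sketch assumes the usual root-mean-square definition (squared coordinate differences averaged over the n check points, with the horizontal and 3D values obtained by combining the per-axis values in quadrature); the function name `rmse_components` and the coordinates in the usage example are hypothetical and are not taken from the study.

```python
import math


def rmse_components(photo, reference):
    """Per-axis, horizontal (XY) and 3D (XYZ) RMSE between two lists of (X, Y, Z) points.

    `photo` holds the photogrammetrically derived coordinates, `reference`
    the geodetic network coordinates, in the same order and units.
    """
    n = len(photo)
    if n == 0 or n != len(reference):
        raise ValueError("both point lists must be non-empty and of equal length")
    squared_sums = [0.0, 0.0, 0.0]
    for p, r in zip(photo, reference):
        for axis in range(3):
            squared_sums[axis] += (p[axis] - r[axis]) ** 2
    rmse_x, rmse_y, rmse_z = (math.sqrt(s / n) for s in squared_sums)
    return {
        "X": rmse_x,
        "Y": rmse_y,
        "Z": rmse_z,
        "XY": math.sqrt(rmse_x ** 2 + rmse_y ** 2),
        "XYZ": math.sqrt(rmse_x ** 2 + rmse_y ** 2 + rmse_z ** 2),
    }


# Hypothetical coordinates in metres, for illustration only.
photo_points = [(1.0012, 2.0000, -12.0005), (4.0000, 5.0009, -11.4990)]
reference_points = [(1.0000, 2.0000, -12.0000), (4.0000, 5.0000, -11.5000)]
print(rmse_components(photo_points, reference_points))
```

The same function can be applied to any two sets of corresponding points, for example the coordinates of the coded targets obtained from two different camera systems.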

Image observation residuals
The maps in Fig. 5 show the magnitude of the image observation residuals (or reprojection errors) r:

$$r = \sqrt{\left(x - \bar{x}\right)^2 + \left(y - \bar{y}\right)^2} \tag{6}$$

where (x, y) are the image observation coordinates in the image plane and (x̄, ȳ) are the re-projections of the 3D coordinates estimated within the adjustment procedure.
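
A residual map of this kind can be approximated by averaging the residual magnitude over a regular grid of image cells. The following sketch assumes that r is the Euclidean distance between an observed image point and its re-projection, and that image coordinates are given in pixels with the origin at the top-left corner; the function names, the cell size and the sample values are illustrative assumptions, not the procedure used to produce Fig. 5.

```python
import math


def residual_magnitude(observed, reprojected):
    """Euclidean distance between an observed image point and its re-projection."""
    return math.hypot(observed[0] - reprojected[0], observed[1] - reprojected[1])


def rms_reprojection_error(observations, reprojections):
    """Root mean square of the residual magnitudes over all observations."""
    squared = [residual_magnitude(o, p) ** 2 for o, p in zip(observations, reprojections)]
    return math.sqrt(sum(squared) / len(squared))


def residual_map(observations, reprojections, width, height, cell):
    """Mean residual magnitude per grid cell; cells without observations are None.

    Assumes pixel coordinates with 0 <= x < width and 0 <= y < height.
    """
    cols = math.ceil(width / cell)
    rows = math.ceil(height / cell)
    sums = [[0.0] * cols for _ in range(rows)]
    counts = [[0] * cols for _ in range(rows)]
    for obs, rep in zip(observations, reprojections):
        col = min(int(obs[0] // cell), cols - 1)
        row = min(int(obs[1] // cell), rows - 1)
        sums[row][col] += residual_magnitude(obs, rep)
        counts[row][col] += 1
    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else None for c in range(cols)]
        for r in range(rows)
    ]


# Two hypothetical observations in a 200 x 100 pixel image, 100 pixel cells.
observed_points = [(50.0, 50.0), (150.0, 50.0)]
reprojected_points = [(47.0, 46.0), (150.0, 50.0)]
print(residual_map(observed_points, reprojected_points, width=200, height=100, cell=100))
```

Systematic patterns, such as rings of larger values around the image centre, become visible once enough observations fall into each cell.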
A similar systematic pattern is observed for the PL41 and N750, with higher residuals arranged in a circular shape around the image centre and towards the borders. The residual systematic effect for the N750 was already reported in Menna et al. (2017) and it is assumed to be related to optical effects introduced by the dome port.
Although PL51 and PL52 are nominally the same camera system, they show very different residual maps, with higher values for the PL52. This behaviour is also consistently observed in sections 5.2 and 5.3.
A peculiar residual systematic effect is visible for the N300 and is very likely due to local defects of the optical elements (the lens, the dome, the dioptre or a combination of them).
The image residuals are quite high in magnitude for the GoPro. This is not surprising, given the poorer image quality caused by a combination of the cheaper sensor and lens and the presence of a flat port (Menna et al., 2017). The reprojection errors of the GoPros were greater than those of the higher quality systems by a factor of 2. This is in agreement with the results in Guo et al. (2016).

Table 4: Comparison of photogrammetric BA for different camera systems at the two working distances: RMSE_XY | RMSE_Z | 3D RMSE_XYZ of the differences computed on the 3D coordinates of the nine coded targets (values are in mm).

Errors with respect to the geodetic network and standard deviations of the object space points
Table 3 summarizes the results of the free network self-calibrating bundle adjustment for the different camera systems at the two working distances. Although the networks are not the same, high consistency among the photogrammetric systems can be observed. The horizontal errors are larger than the vertical component at both working distances, with the exception of PL52, PL51-PL52 and N300. This is not in accordance with theory (see the standard deviations in Table 3) and can be attributed to small systematic errors that remain in the system even after self-calibration (see Fig. 5). With two exceptions, the maximum error consistently occurs at the same reference point. Standard deviations of the object space points (σX, σY, σZ) were computed in DBAT following a soft-constrained BA approach by introducing the RPs with their standard deviations as obtained from the geodetic network adjustment (Section 3). The standard deviations are not reported for the multi-camera systems because it is not possible to perform a multi-camera BA in DBAT. As expected, σZ is generally larger than σX and σY and the highest values are observed for the GoPro camera. Interestingly, the values are roughly the same for the two working distances across all of the camera systems.

Comparisons of the different camera systems
Table 4 shows the results of the comparisons between the different camera systems for the working distances of 2 m and 5 m. Table 5 reports the results of the comparison for each camera system at the two working distances. In this case, the analysis is performed on the photogrammetrically derived coordinates of the nine coded targets (RPs + Ps, Figure 2). The RMSEs are then computed according to equations 1 to 6, where the point coordinates from the same camera systems at the two working distances (Table 4) or two different camera systems (Table 5 in Appendix) are introduced. As expected, greater differences are observed in the vertical direction and the differences are smaller at the shorter working distance.

DISCUSSION AND OUTLOOK
In this study we evaluated the metric performance of several different off-the-shelf camera systems for underwater photogrammetry when used under the real environmental conditions of scientific diving campaigns carried out for the purpose of quantifying coral growth over several years. An extensive statistical analysis of comparisons against reference control points measured through a high precision underwater geodetic network is reported. Multiple lenses (prime and zoom) with different fields of view (from fisheye to moderate wide angle), pressure housing materials, ports and sensor sizes (from the smallest 1/2.3-inch GoPro action camera sensor to full frame) were utilized. Some systems (PL51-PL52 and GoPro) were utilized in a multi-camera rig configuration. As a general trend, high consistency was observed among the photogrammetric systems, especially for the higher quality camera systems (PL41 and D750). For the PL51 and PL52, image data were also processed in single camera configuration to evaluate the effect of differences arising from manufacturing tolerances (centering of both camera lens elements and alignment of the camera within the pressure housing). Interestingly, the PL51 and PL52, which share the same camera model and lens, showed different results. Surprisingly, the 5-GoPro system performed well in comparison with the other higher quality cameras as far as the RMSEs from the check points are concerned. However, the reprojection errors of the GoPros were greater than those of the other systems by a factor of 2. The N300 generally showed the highest errors, likely due to a combination of the lens, dioptre and port. Tests were repeated at different distances (2 m and 5 m) from the coral reef to investigate distance-dependent errors. Nevertheless, based on the maps of image observation residuals shown in Figure 5, the general behaviour of the residuals did not change between the two tested distances and, excluding the GoPros, the RMS reprojection error improved at 5 m.
Contrary to theoretical expectations, the accuracy as computed from check points and the object point standard deviations differ considerably from one another. This is an indication that there are still systematic errors in the systems. The different systems all performed within the accuracy required for quantifying the growth of several species of corals commonly found on coral reefs in the South Pacific. It must be noted that the tests performed used a redundant network of images acquired with nadir and oblique optical axes and covered a relatively small area of approximately 5 m × 5 m. Under these circumstances the use of a stereo or multi-camera system does not seem to further improve the triangulation results. The authors are currently investigating how remaining systematic effects, not compensated by the camera mathematical model, will affect the results. We would also like to extend our investigations to larger areas. The accuracy achieved by the different systems is assessed for circular targets triangulated from several viewpoints and, ideally, it represents the potential accuracy achievable by the systems under the described conditions. Further photogrammetric products, such as point clouds generated through dense image matching, may not necessarily achieve the same accuracy, especially in areas where corals are self-occluding. This specific topic is currently under investigation. The quality control for measurements of natural, non-signalized points is a serious problem. Another remaining challenge is the establishment of a highly accurate geodetic control field at the same (or even better) level of accuracy as the expected photogrammetric observations (~1-2 millimetres).

Menna, F., Agrafiotis, P. and Georgopoulos, A., 2018

Figure 2. (a) Orthoimage of the test area no. 18. The nine signalized points (coded targets) are marked. (b, c, d) Details of the dense point cloud.

Accurate reference networks are required for environmental change detection and monitoring. They are crucial when the variations to be measured are in the range of a few centimeters per year, typical of highly dynamic environments such as oceanic coral reefs where 3D landscape elements are continuously changing over time. Corals may grow or shrink; sand is accumulated or dispersed. Divers and underwater vehicles may themselves cause changes to the reef architecture.
Figure 3. (a) Underwater reference point (RP).(b) RP-to-RP distance measurement (image acquired a moment before the tape was straightened for measurement reading); (c) RP-to-RP relative height difference measurement by leveling, using an underwater green laser pointer mounted on a tripod.

Figure 4. Frames extracted from a GoPro video recorded without (a) and with (b) the red filter.RAW images before (c) and after (d) WB process.

Figure 5. Maps of image observation residuals

Table 1: Key parameters of the underwater camera systems used.

Table 3: Results from self-calibrating BA in free network mode.

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W10, 2019. Underwater 3D Recording and Modelling "A Tool for Modern Applications and CH Recording", 2-3 May 2019, Limassol, Cyprus.