IMAGE QUALITY IMPROVEMENTS IN LOW-COST UNDERWATER PHOTOGRAMMETRY

This study presents an evaluation of a cheap consumer-grade camera used for modelling a coral reef section. We evaluate the quality of a coral reef reconstructed from GoPro images against one reconstructed from a high-end camera, using data from an actual coral reef. We also investigate components of the processing pipeline (such as image quality) separately from the final results. Because our GoPro images suffer from severe chromatic aberration, we apply different image pre-processing steps to improve their quality and show the effects on the reconstructed object points. Bundle adjustment is carried out as a free network in all cases, followed by a rigid 3D Helmert transformation onto a geodetic control network to define the common datum and to remove the bias from the free network results.


INTRODUCTION
Underwater photogrammetry is a popular and relatively cheap method for modelling underwater areas at different scales and at different levels of accuracy. One reason this technique is used for such varied applications is that the quality of camera sensors and their optical components (lenses and domes) has improved rapidly in recent years. While DSLR or high-end mirrorless cameras are the typical choices for demanding tasks, the quality of video cameras has increased accordingly, so they might also be an option for underwater reconstruction tasks. Action cameras such as GoPro or Sony models equipped with underwater protection covers are cheap alternatives to high-end cameras such as full-frame DSLR cameras with dedicated underwater housings. Because action cameras typically have a considerably smaller entrance pupil, smaller pixel dimensions (e.g., 1.55 µm for the GoPro Hero 7), and a flat dome port, DSLR cameras in dedicated underwater housings, chosen for optimal performance under low-light conditions, seem to be better suited for the task of high-accuracy underwater modelling. In a previous study, the accuracy of GoPro and Lumix cameras was compared in air and underwater (Guo et al., 2016). This was done by comparing the final results of a controlled object reconstruction using a calibration frame with signalized points. Some problematic aspects of dealing with high-accuracy underwater control point frames were addressed in Neyer et al. (2018).
Our study site is located in Moorea, Tahiti, French Polynesia. It is part of the Moorea Island Digital Ecosystem Avatar (IDEA) project (https://mooreaidea.ethz.ch/) with an international team of researchers. While the IDEA project includes many aspects of digitizing the whole island ecosystem, our task and ultimate goal here is to provide an easy-to-use procedure for underwater coral reef change detection at the cm to mm scale.
In our evaluation of GoPro image quality improvements, we use a high-end camera (Panasonic Lumix GH4) as reference. All measurements were acquired in August 2018.

Reference Frame
For change detection, a control network was set up at one of our test sites (5 x 5 m, Figure 1a). The control network was established using a dedicated construction of aluminum targets, anchored into dead corals or rocks. We use coded targets for automatic detection of the corresponding image coordinates in our datasets. For estimating the object coordinates in a local reference frame, multiple distance and leveling measurements (with a green light laser pointer) were used as raw observations. Applying the principles of trilateration, the geodetic network was optimized as a free network using the Trinet+ software (Guillaume et al., 2008). Careful evaluation yielded an accuracy on the order of 1.5 mm for all components. Details of the procedure are given in Neyer et al. (2018) and Nocerino et al. (2019).
The geodetic reference frame is used to anchor the different models such that comparisons can be carried out without any additional alignment step of the dense photogrammetric point clouds.

Image preparation
The two principal image datasets used in this study are images taken with a GoPro camera (average GSD of 1.2 mm) and with a Panasonic Lumix GH4 camera (average GSD of 0.6 mm), hereafter referred to as the reference data. An experienced diver acquired both datasets of the 5 x 5 m test field, one after the other. In both cases, the cameras were pointing in nadir direction. Table 1 summarizes the acquisition details.
Besides the difference in GSD, obvious image quality differences are visible (Figure 2). GoPro images suffer from
• severe chromatic aberration
• image compression artifacts (visible at full resolution)
• lack of contrast in some areas
Because the reference images were stored as RAW files, a global white balance adjustment could be applied digitally before converting the images to the JPEG format.

Reduction of Chromatic Aberration
Image quality in GoPro cameras is most severely affected by chromatic aberration (CA) and by blur that increases with distance from the image center (Figure 2a). Blurred image parts cannot be fully recovered, so we concentrate only on reducing CA. For the sole purpose of bundle adjustment and 3D modelling, a single color channel can be used to avoid this error. Usually, however, color is important information (for example for classification) and generally contains more structural information for the matching process than a single color channel can provide. CA correction can in principle be conducted by an individual calibration of the different color channels. Images can thereafter be undistorted channel-wise and recombined into a full RGB image. An alternative is to correct two color channels with respect to the third. Because the latter option is more flexible, i.e., an independent CA correction model can be applied, we chose to align the red and blue channels with respect to the green. This task involves three steps:
(1) Displacements of the red and blue channels with respect to the green channel are estimated.
(2) A correction model is defined. The displacements between the channels are used to estimate the parameters (an independent set of parameters for the red and blue channel):
a) Brown model, Brown (1971)
b) Collocation model, Moritz (1973)
(3) The red and blue channels are corrected and the color image is rebuilt.
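Step (3) can be illustrated with a minimal sketch. It assumes that the displacement of a channel is available as a dense per-pixel field (dx, dy) giving where the content seen at a green-channel pixel appears in the red or blue channel; this sign convention, the function names, and the use of bilinear resampling are assumptions made for illustration and are not taken from the processing described here.

```python
import numpy as np

def bilinear_sample(channel, xs, ys):
    """Sample a 2D array at float pixel coordinates (xs, ys), clamped to the image."""
    h, w = channel.shape
    xs = np.clip(xs, 0, w - 1)
    ys = np.clip(ys, 0, h - 1)
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    fx = xs - x0
    fy = ys - y0
    top = channel[y0, x0] * (1 - fx) + channel[y0, x1] * fx
    bottom = channel[y1, x0] * (1 - fx) + channel[y1, x1] * fx
    return top * (1 - fy) + bottom * fy

def align_channel(channel, dx, dy):
    """Pull a channel back onto the green grid.

    Assumed convention: content at green pixel (x, y) appears at
    (x + dx, y + dy) in `channel`.
    """
    h, w = channel.shape
    gx, gy = np.meshgrid(np.arange(w, dtype=float), np.arange(h, dtype=float))
    return bilinear_sample(channel.astype(float), gx + dx, gy + dy)

def correct_ca(rgb, red_disp, blue_disp):
    """Rebuild an RGB image with red and blue aligned to the unchanged green channel."""
    red = align_channel(rgb[..., 0], *red_disp)
    blue = align_channel(rgb[..., 2], *blue_disp)
    return np.stack([red, rgb[..., 1].astype(float), blue], axis=-1)
```

Pixels whose source position falls outside the image are clamped to the border in this sketch; a production implementation would more likely mask them.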

Estimation of relative displacements
Relative displacements among the channels are estimated using the optical flow procedure introduced by Farneback (2003). Pixel displacements of the red and blue channels with respect to the green channel are estimated for all images in the survey (431 images). In a next step, the median displacements for all image coordinates are computed for both the red and the blue channel. This dataset is then reduced to 4000 uniformly distributed measurement coordinates for each channel. More measurements were not necessary to reliably estimate the parameters of the different correction models.
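The reduction step can be sketched as follows. The sketch assumes that a dense flow field per image is already available (OpenCV's `cv2.calcOpticalFlowFarneback` is one available implementation of the Farneback method; which implementation was used here is not stated), reads the median as a per-pixel median across all images, and picks the reduced set on a regular grid. These readings and all function names are assumptions.

```python
import numpy as np

def median_displacement(flows):
    """Per-pixel median over a stack of flow fields.

    flows: array of shape (n_images, H, W, 2) holding (dx, dy) per pixel.
    """
    return np.median(flows, axis=0)

def subsample_uniform(field, n_target):
    """Select roughly n_target points on a regular grid.

    Returns pixel coordinates (N, 2) as (x, y) and the displacements (N, 2).
    """
    h, w = field.shape[:2]
    step = max(1, int(np.floor(np.sqrt(h * w / n_target))))
    ys = np.arange(step // 2, h, step)
    xs = np.arange(step // 2, w, step)
    gx, gy = np.meshgrid(xs, ys)
    coords = np.column_stack([gx.ravel(), gy.ravel()])
    disp = field[gy.ravel(), gx.ravel()]
    return coords, disp
```

The median makes the field robust against single images with poor texture or moving objects, at the cost of the block-like artifacts that a median can introduce in nearly constant regions.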

Collocation
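Least-squares collocation is a well-known method for the differentiation between measurement noise and signal based on assigned neighborhood relations (i.e., correlations). In the geodetic context, this procedure is well described in Moritz, 1970 and Moritz, 1973. A linear estimator, in our case of the displacements $\mathbf{l}(x, y)$ at the different image coordinates, is combined with an empirical estimate of correlations in a stochastic field $\mathbf{s}$:
$$\mathbf{l} = \mathbf{A}\mathbf{x} + \mathbf{s} + \mathbf{n} \qquad (1)$$
If the noise component $\mathbf{n}$ can be assumed to be uncorrelated with the signal and $\mathbf{s} \sim (0; \mathbf{C}_{ss})$ and $\mathbf{n} \sim (0; \mathbf{C}_{nn})$, the following solution can be formulated:
$$\begin{aligned} \hat{\mathbf{x}} &= \left(\mathbf{A}^T (\mathbf{C}_{ss} + \mathbf{C}_{nn})^{-1} \mathbf{A}\right)^{-1} \mathbf{A}^T (\mathbf{C}_{ss} + \mathbf{C}_{nn})^{-1}\, \mathbf{l} \\ \hat{\mathbf{s}} &= \mathbf{C}_{ss} (\mathbf{C}_{ss} + \mathbf{C}_{nn})^{-1} (\mathbf{l} - \mathbf{A}\hat{\mathbf{x}}) \\ \hat{\mathbf{n}} &= \mathbf{C}_{nn} (\mathbf{C}_{ss} + \mathbf{C}_{nn})^{-1} (\mathbf{l} - \mathbf{A}\hat{\mathbf{x}}) \end{aligned} \qquad (2)$$
with $\hat{(\cdot)}$ indicating an estimated component and $\mathbf{A}\mathbf{x}$ being a deterministic model. The least-squares system minimizes $\mathbf{s}^T \mathbf{C}_{ss}^{-1} \mathbf{s} + \mathbf{n}^T \mathbf{C}_{nn}^{-1} \mathbf{n}$. More details can be found in Neyer, 2016.
The solution in (2) can be computed once the deterministic model $\mathbf{A}\mathbf{x}$ and the stochastic matrices $\mathbf{C}_{ss}$ and $\mathbf{C}_{nn}$ are defined. $\mathbf{A}\mathbf{x}$ in its most primitive form may be chosen as a simple mean to center the measurements (required due to $\mathbf{s} \sim (0; \mathbf{C}_{ss})$ and $\mathbf{n} \sim (0; \mathbf{C}_{nn})$). Here we chose a polynomial of 2nd order. The choice of the deterministic model is not critical here as we have continuous dense measurements. $\mathbf{C}_{nn}$ is a diagonal matrix (i.e., no correlations between noise) and equal for all displacement measurements.
$\mathbf{C}_{ss}$, the correlation matrix, is empirically estimated using the correlation function of equation (3), which depends on the measurement locations, i.e., pixel coordinates, expressed as position vectors. One of its parameters is the correlation parameter, which is related to the correlation length given in (4); the correlation length represents the distance with 50% correlation. The choice of this function is not arbitrary, as it has to fulfill a series of properties (details in Geiger, 1996 or Neyer, 2016). In (3) there are three parameters to be estimated: $\sigma^2$, $u$, and the correlation parameter. These parameters are computed based on a least-squares adjustment of the autocorrelation of the computed relative displacements between the color channels. The model assumes isotropy, i.e., no directional dependency.
With all parameters of (3) determined (see result section), relative displacements of the red and blue channel with respect to the green can be computed as:
$$\hat{\mathbf{l}}' = \mathbf{A}'\hat{\mathbf{x}} + \mathbf{C}_{s's} (\mathbf{C}_{ss} + \mathbf{C}_{nn})^{-1} (\mathbf{l} - \mathbf{A}\hat{\mathbf{x}}) \qquad (6)$$
$\hat{\mathbf{l}}'$ is the estimated displacement at positions $x'$, $y'$, being the sum of the deterministic part (with $\mathbf{A}'$ the deterministic model evaluated at these positions) and the stochastic part. $\mathbf{C}_{s's}$ gives the link between measured and interpolated pixel coordinates.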
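The collocation prediction can be sketched for one displacement component as below. The sketch follows the standard least-squares collocation formulation (2nd-order polynomial trend, signal covariance plus uncorrelated noise). The covariance function used here, sigma2 / (1 + (d/d0)^2), is only an assumed isotropic stand-in and is not the correlation function of this paper; `sigma2`, `d0`, `noise_var`, and all function names are hypothetical.

```python
import numpy as np

def covariance(d, sigma2, d0):
    """ASSUMED isotropic covariance model (stand-in, not the paper's function)."""
    return sigma2 / (1.0 + (d / d0) ** 2)

def pairwise_dist(p, q):
    """Euclidean distances between two sets of 2D points, shape (len(p), len(q))."""
    return np.linalg.norm(p[:, None, :] - q[None, :, :], axis=2)

def design_matrix(xy):
    """Deterministic model: full 2nd-order polynomial in x and y."""
    x, y = xy[:, 0], xy[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])

def collocate(xy, l, xy_new, sigma2, d0, noise_var):
    """Predict one displacement component at xy_new from measurements l at xy."""
    A = design_matrix(xy)
    C_ll = covariance(pairwise_dist(xy, xy), sigma2, d0) + noise_var * np.eye(len(l))
    # Generalized least squares for the trend parameters
    Cinv_A = np.linalg.solve(C_ll, A)
    Cinv_l = np.linalg.solve(C_ll, l)
    x_hat = np.linalg.solve(A.T @ Cinv_A, A.T @ Cinv_l)
    # Stochastic part: signal predicted from the trend-reduced measurements
    resid = l - A @ x_hat
    C_new = covariance(pairwise_dist(xy_new, xy), sigma2, d0)
    signal = C_new @ np.linalg.solve(C_ll, resid)
    return design_matrix(xy_new) @ x_hat + signal
```

In practice the coordinates would be scaled (for example to the unit square) before building the polynomial, and the function would be called once for the x and once for the y component of each channel.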

Photogrammetric Network
We use Agisoft Metashape (2019) for processing the different image datasets in a standard approach: Image features are detected, matched, and used in bundle adjustment with self-calibration to create a sparse point cloud. The point cloud was further filtered to retain only points seen in at least three images and with reprojection errors not larger than one pixel. Following this processing, coded targets were detected and the free network was transformed (similarity transformation) onto the geodetic reference frame. Finally, a dense point cloud in its highest resolution with mild filtering was computed for a 2 x 2 m section. The procedure was applied to five datasets:
(1) The reference dataset (Lumix images)
(2) The GoPro dataset without CA correction
(3) The GoPro dataset using only the green channel
(4) The GoPro dataset corrected with the Brown model
(5) The GoPro dataset corrected with the collocation model
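The transformation of the free network onto the control points can be illustrated with a closed-form similarity fit. The sketch below uses the well-known SVD-based solution (Umeyama, 1991) on matched 3D target coordinates; it is not the routine used inside Metashape, and the function names are hypothetical. Fixing the scale to 1 would give the rigid variant.

```python
import numpy as np

def fit_similarity(src, dst):
    """Least-squares scale s, rotation R, translation t with dst ~ s * R @ src + t.

    src, dst: arrays of shape (n, 3) with matched points (n >= 3, not collinear).
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    cov = xd.T @ xs / len(src)
    U, S, Vt = np.linalg.svd(cov)
    D = np.eye(3)
    # Guard against a reflection so that R is a proper rotation
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0
    R = U @ D @ Vt
    var_s = (xs ** 2).sum() / len(src)
    s = (S * np.diag(D)).sum() / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t

def apply_similarity(points, s, R, t):
    """Transform points of shape (n, 3) into the target frame."""
    return s * points @ R.T + t
```

The residuals `apply_similarity(src, s, R, t) - dst` at the coded targets then indicate how well the photogrammetric network fits the geodetic one.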

Dense Point Cloud Comparison
The comparison of the point clouds is performed in CloudCompare (2017). The dense point clouds are triangulated into polygonal mesh models using the Poisson algorithm implemented in CloudCompare, preserving the original point spacing (better than 1 mm).
The geometric difference between the models is measured as mesh-to-point distance, i.e., distances are computed for each vertex of one model relative to the polygons of the other (reference) mesh.
We compare the differences in two selected areas, i.e., a coral and a sandy ocean floor, both in the 2 x 2 m section (Figure 1).
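The distance measure can be illustrated for a single vertex and a single reference triangle. The sketch below uses the standard closest-point-on-triangle construction and signs the distance with the triangle normal; it is only an illustration of the principle, not CloudCompare's implementation, and the function names are hypothetical. A full comparison would additionally search for the nearest triangle of the reference mesh.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def axpy(a, v, k):
    """Return a + k * v."""
    return (a[0] + k * v[0], a[1] + k * v[1], a[2] + k * v[2])

def closest_point_on_triangle(p, a, b, c):
    """Closest point to p on triangle (a, b, c), via Voronoi region tests."""
    ab, ac, ap = sub(b, a), sub(c, a), sub(p, a)
    d1, d2 = dot(ab, ap), dot(ac, ap)
    if d1 <= 0 and d2 <= 0:
        return a                                  # vertex region a
    bp = sub(p, b)
    d3, d4 = dot(ab, bp), dot(ac, bp)
    if d3 >= 0 and d4 <= d3:
        return b                                  # vertex region b
    vc = d1 * d4 - d3 * d2
    if vc <= 0 and d1 >= 0 and d3 <= 0:
        return axpy(a, ab, d1 / (d1 - d3))        # edge ab
    cp = sub(p, c)
    d5, d6 = dot(ab, cp), dot(ac, cp)
    if d6 >= 0 and d5 <= d6:
        return c                                  # vertex region c
    vb = d5 * d2 - d1 * d6
    if vb <= 0 and d2 >= 0 and d6 <= 0:
        return axpy(a, ac, d2 / (d2 - d6))        # edge ac
    va = d3 * d6 - d5 * d4
    if va <= 0 and (d4 - d3) >= 0 and (d5 - d6) >= 0:
        w = (d4 - d3) / ((d4 - d3) + (d5 - d6))
        return axpy(b, sub(c, b), w)              # edge bc
    denom = 1.0 / (va + vb + vc)
    return axpy(axpy(a, ab, vb * denom), ac, vc * denom)  # interior

def signed_distance(p, a, b, c):
    """Distance from p to the triangle, negative if p lies behind the normal ab x ac."""
    diff = sub(p, closest_point_on_triangle(p, a, b, c))
    dist = math.sqrt(dot(diff, diff))
    return -dist if dot(diff, cross(sub(b, a), sub(c, a))) < 0 else dist
```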

Reduction of Chromatic Aberration
Presented here are the solutions obtained for correcting the chromatic aberration with (1) the Brown model parameterization and (2) the collocation approach. The estimated chromatic aberration, i.e., the relative displacements of the red and blue channels with respect to the green, is shown in Figure 3.
First, the parameters of the Brown correction model are estimated directly from the estimated displacement components among the channels. Parameters are tested for their significance; only the shift parameters and the radial distortion parameters are found to be significant.
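A reduced parameterization of this kind can be fitted by linear least squares. The sketch below assumes a model with two shift parameters and two radial coefficients (k1, k2) acting on coordinates normalized about an assumed principal point; the exact parameter set, the normalization, and the function name are assumptions. It also returns the a posteriori standard deviation of unit weight.

```python
import numpy as np

def fit_shift_radial(xy, disp, center, scale):
    """Fit dx = a0 + xb*(k1*r^2 + k2*r^4), dy = b0 + yb*(k1*r^2 + k2*r^4).

    xy:     (n, 2) pixel coordinates of the displacement measurements
    disp:   (n, 2) measured displacements (dx, dy) in pixels
    center: assumed principal point in pixels
    scale:  normalization length in pixels (keeps the system well conditioned)
    Returns (a0, b0, k1, k2) and the a posteriori sigma_0 in pixels.
    """
    xb = (xy[:, 0] - center[0]) / scale
    yb = (xy[:, 1] - center[1]) / scale
    r2 = xb ** 2 + yb ** 2
    n = len(xy)
    zeros, ones = np.zeros(n), np.ones(n)
    # One block of observation equations for dx, one for dy
    A = np.vstack([
        np.column_stack([ones, zeros, xb * r2, xb * r2 ** 2]),
        np.column_stack([zeros, ones, yb * r2, yb * r2 ** 2]),
    ])
    l = np.concatenate([disp[:, 0], disp[:, 1]])
    params, *_ = np.linalg.lstsq(A, l, rcond=None)
    v = A @ params - l                      # residuals
    sigma0 = np.sqrt(v @ v / (len(l) - len(params)))
    return params, sigma0
```

Without the normalization by `scale`, the r^4 column would be many orders of magnitude larger than the shift columns for pixel coordinates in the thousands, and the solution would become numerically unreliable.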
Second, the parameters of the correlation function are estimated for the red and blue channels separately. A deterministic trend (polynomial of 2nd order) is removed prior to the computation of the autocorrelation. Figure 4 shows that the respective correlation lengths are between 300 and 520 pixels for the two channels. An additional feature that can be seen in Figure 4 is the total variance (indicated by the brown data point at zero distance) and the signal variance given by the first parameter of the correlation function, i.e., $\sigma^2$. The closer $\sigma^2$ is to the total variance, the less noise can be expected in the vector field to be collocated.
Here we see that the red channel has a larger relative noise contribution, or in other words, its remaining stochastic signal is much weaker than that of the blue channel. The blue channel, on the other hand, has a much higher absolute total variance, which indicates more residual signal (and noise).
Using equations (2) and (6), the relative displacements are estimated for the red and blue channels respectively.
Figure 3. Estimated displacements of red (left) and blue (right) channel features with respect to the green channel. Note the scale difference: a maximum displacement of about 4 pixels was obtained for the red channel, whereas a maximum displacement of more than 20 pixels was found for the blue channel. The block-like structure in the red channel is an effect of the median filtering of all estimated displacements in this dataset.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W10, 2019 Underwater 3D Recording and Modelling "A Tool for Modern Applications and CH Recording", 2-3 May 2019, Limassol, Cyprus
Figure 5 shows the residuals obtained for the Brown and collocation models. In the case of the Brown model, residuals of up to 2 pixels remain, whereas for the collocation model, residuals in the subpixel regime are obtained. The Brown correction model resulted in an a posteriori $\sigma_0$ of 0.31 and 0.54 pixels for the red and blue channels respectively. Using the collocation approach, the respective a posteriori $\sigma_0$ values of the red and blue channels were 0.01 and 0.02 pixels. While there is an obvious difference between the two models, in practice the corrected images are hard to tell apart. Because our GoPro images have a soft and (in the corners) blurred appearance in general, differences at the level of 1 to 2 pixels cannot be detected visually. An example of the correction effect is shown in Figure 6, with a clear reduction of CA.

Photogrammetric Network
All models are found to perform equally well in the bundle adjustment. Table 2

Dense Point Cloud Comparisons
In contrast to the empirical results obtained from bundle adjustment, obvious differences exist between the dense point clouds of the different datasets. We select two representative regions to illustrate the results in more detail. Figure 7 shows the differences obtained for a (living) coral. The coral image is shown in Figure 8. Among the presented discrepancies (mapped on the respective meshes), there are systematic variations of up to 1.2 mm with standard deviations of 1.3 mm. Because the geometrical setting is identical in all cases presented in Figure 7, these differences can be attributed directly to the effects of the color channels and their CA. Using only the green channel for building the model seems to result in steeper gradients, which can be observed as positive differences on coral branches (Figure 7a). Point clouds based on the collocated images as well as on the images corrected by the Brown model show similar differences with respect to the uncorrected dataset (Figure 7b and 7c). The difference between the latter two also has a standard deviation on the order of 1 mm (Figure 7d).

Ocean Floor
Differences between the point clouds for a model section consisting mostly of ocean floor are given in Figure 9. The area is within the blue rectangle seen in Figure 1. Because the entire model is a densely populated coral area, the ocean floor section is not flat but rather characterized by an accumulation of various items (sand, rocks, dead coral debris, etc.). A pattern similar to that of the coral comparison is visible, although there are more areas of extreme differences (≥ 5 mm colored in red, ≤ −5 mm colored in blue). Again, the results of the two correction models show some level of agreement. The point cloud generated using the collocated images shows higher extremes: positive differences are primarily seen on top of model peaks, whereas negative differences are located mostly in valleys (Figure 9c).

Comparison with the reference model
While comparisons among the GoPro datasets with different correction models only show differences due to effects of the color channel combinations, a comparison with our reference model turned out to be difficult to interpret (Figure 10). Although the reference dataset has better image quality and a smaller GSD, the photogrammetric network is also different: due to the nature of the acquisition, images were not captured at the same locations, with the same orientations, or with the same field of view. Consequently, differences of more than 5 mm are visible, especially near vertical structures (side of coral, Figure 10a and 10b) or at small structures (e.g., ocean floor debris, Figure 10c and 10d). Blue colored areas in Figure 10 represent locations where the reference model is more extended (or higher) and red areas represent locations where the GoPro model indicates higher elevation. While there is no significant change in the coral differences between the uncorrected and corrected GoPro models, a shift of the offset and a slight increase in the standard deviation (and RMS) are observed for the ocean floor differences when comparing the uncorrected and the collocated models.
Figure 7. Comparison of dense point clouds for a single coral. a) to c) compare models generated by the green channel, the Brown dataset, and the collocated dataset with the uncorrected GoPro dataset. In d), the difference between the collocation and Brown datasets is shown. All numbers in millimeters.
Figure 9. Comparison of dense point clouds for an ocean floor area. As in Figure 7, a) to c) compare models generated by the green channel, the Brown dataset, and the collocated dataset with the uncorrected GoPro dataset. In d), the difference between the collocation and Brown datasets is shown. All numbers in millimeters.
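The three summary figures quoted for such comparisons (offset, standard deviation, RMS) are linked: with the population standard deviation, RMS² = offset² + standard deviation², so a change in offset alone already changes the RMS. A minimal sketch, assuming the signed mesh-to-point distances are available as a plain list in millimeters (the function name is hypothetical):

```python
import math

def summarize(distances):
    """Return (offset, standard deviation, RMS) of signed distances.

    The offset is the mean; the standard deviation is the population form
    (divides by n), so that rms**2 == offset**2 + std**2.
    """
    n = len(distances)
    offset = sum(distances) / n
    std = math.sqrt(sum((d - offset) ** 2 for d in distances) / n)
    rms = math.sqrt(sum(d * d for d in distances) / n)
    return offset, std, rms
```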

DISCUSSION
In this study, we first presented an approach for correcting the severe chromatic aberration seen in underwater images of GoPro cameras. Two models (Brown and collocation) were used to correct (align) the blue and red channels with respect to the green. While estimated differences reached up to 20 and 4 pixels for the blue and red channels respectively, the Brown model left a remaining systematic error of 1-2 pixels. With the presented collocation approach, the residuals of the estimated channel differences reached the sub-pixel domain in all areas of the image.
We also noted in all our computations small but significant residual error patterns in image space after bundle adjustment of the type seen in Figure 5. These are caused by unknown factors of the optical system and cannot be compensated by the parameters of the Brown model. Therefore, the theoretical expectations (standard deviations of object space coordinates) could not be reached. We are still working on this issue.
In the second part, the study presented the effects of CA correction on the resulting 3D models. Interestingly, no significant difference in the bundle adjustment could be observed between the different correction methods. With an average GSD of 1.2 mm, a CA correction of 10 pixels (the average in the outer areas of the blue channel) implies a shift of 12 mm in object space. This, however, is only true for image border areas in the blue channel. Because image quality towards the image borders is severely degraded anyway, most of the contrast information used to detect and match tie points may be retrieved from the green and red channels. The exact procedure used by the software, however, is not accessible.
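The object-space figure follows from a simple scaling, assuming nadir images of a locally planar scene so that one pixel corresponds to one GSD in object space:

```latex
\Delta_{\text{object}} \approx \Delta_{\text{image}} \cdot \text{GSD}
  = 10\,\text{px} \times 1.2\,\tfrac{\text{mm}}{\text{px}}
  = 12\,\text{mm}
```

By the same scaling, the maximum blue-channel displacement of more than 20 pixels would correspond to more than 24 mm, and the maximum red-channel displacement of about 4 pixels to about 5 mm.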
For dense image matching, the situation looks different: model deviations on the order of a few millimeters can be observed at various locations. Judging which model is closer to reality turned out to be difficult, as there is no known ground truth. The comparison with our (photogrammetric) reference model shows similar deviations for both the uncorrected and the corrected models. One of the principal problems in dense image matching is the unknown uncertainty of the generated point clouds. As such, we cannot directly judge how much the quality of the dense point clouds changes due to the improvements in image quality (and contrast). The main differences visible in Figure 10 are mostly related to the difference in resolution (finer details result in higher model peaks or deeper valleys), image contrast (less smoothing between peaks and valleys), and acquisition geometry (different coverage). In addition, there is also an uncertainty in the generation of the dense point cloud itself: when comparing two dense point clouds generated from an identical processing stage (a simple re-computation), the results are not identical (see Figure 11). Although most deviations are in the sub-millimeter range, deviations of up to 1 mm can be observed at some isolated locations.
While it remains unclear which GoPro dataset proves to be the most accurate representation of the object space, deviations in the order of four times the GSD were found at sharp object edges. We therefore conclude that the presence of CA significantly influences the dense point clouds, i.e., the estimated 3D models.
In summary, the achievement of very high (subpixel) accuracies in underwater photogrammetry, comparable with "in air" applications, seems not to be possible, at least not at present. There are many factors responsible for this fact. Neither humans nor equipment are made for underwater work. Nevertheless, photogrammetry can play an important role in Ocean Science in different ways, if applied with expertise and with a realistic sense of what is possible.
Figure 10. Comparison of dense point clouds for a single coral and ocean floor. The differences of the uncorrected GoPro and the collocated GoPro models with respect to the reference model are shown in a) and b). The same respective differences for the ocean floor are illustrated in c) and d).
Figure 11. Comparison of point clouds generated by repeating the dense image matching procedure for a) the selected coral, and b) the ocean floor. The blue circle in b) highlights an area where deviations on the order of 1 mm can be observed.