MODELLING COLOUR ABSORPTION OF UNDERWATER IMAGES USING SFM-MVS GENERATED DEPTH MAPS

The problem of colour correction of underwater images concerns not only surveyors, who primarily use images for photogrammetric purposes, but also archaeologists, marine biologists, and many other domain experts whose aim is to study objects and lifeforms underwater. Different methods exist in the literature. Some of them provide outstanding results but rely on physical models that require additional information and variables (light conditions, depths, camera-to-object distances, water properties) that are not always available, or that must be measured with expensive equipment or calculated using more complicated models. Other methods have the advantage of working with virtually any dataset, but they do not consider any geometric information and therefore apply corrections that work only in very generic conditions, which most of the time differ from real-world applications. This paper presents an easy and fast method for restoring the colour information in images captured underwater. The central idea is to model light backscattering and absorption variation according to the distance of the surveyed object. This information is always obtainable in photogrammetric datasets, as the model utilises the scene's 3D geometry by creating and using SfM-MVS generated depth maps, which are crucial for implementing the proposed methodology. The results, presented visually and quantitatively, are promising: the method offers a good compromise, providing a straightforward and easily adaptable workflow for restoring the colour information in underwater images.


INTRODUCTION
Underwater images are often affected by inconsistency in radiometry. Due to the optical properties of water, when light propagates in a body of water, all colours, but most significantly the longer wavelengths, are affected by a degradation in intensity. This degradation depends on the examined wavelength and is mainly a function of the acquisition depth, the camera-to-object distance, and the physical characteristics and conditions of the water at a given site at the specific acquisition time. In different disciplines, such as (but not only) underwater archaeology and marine biology, domain experts and scientists aim to obtain images whose colours are consistent with those of the real objects in the scene. These needs fostered research into automatic or semi-automatic colour correction methods and algorithms for underwater images, mainly used in the pre- or post-processing phases of the photogrammetric process. Objects photographed underwater appear to have a false blue/green tone. This results from several effects caused by the water that are not present in air. Water causes significant attenuation of light passing through it, making its intensity exponentially weaker the further it travels (Jaffe, 1990). The attenuation of light underwater is wavelength-dependent, meaning that red light is attenuated over much shorter distances than blue light, which, together with the backscattering of blue and green light, results in a change in the observed colour of an object at different distances from the camera and light source (Bryson et al., 2016). Many researchers have addressed this issue and have developed algorithms to counter this effect and restore the 'true' colour in Underwater (UW) images (Akkaynak and Treibitz, 2019; Bianco et al., 2015; Bryson et al., 2016, 2013; Roznere and Li, 2019; Wu et al., 2017).
* marinos.vlachos@cut.ac.cy
All the aforementioned studies, and many more, tried to resolve the issue of UW colour attenuation with the use of specific calibrated equipment and a priori knowledge derived from measurements, such as reflectance measurements at the surface, measurements of water attenuation coefficients, and image intensity reference values obtained using calibrated colour charts, spectrometers, etc. But what happens when this kind of information and equipment is not available, or when we deal with archive datasets? Algorithms such as the gray world assumption, Lab and CLAHE exist for correcting the colours of UW imagery, all with pros and cons. The biggest disadvantage of these methods is that their performance is heavily dependent on the amount of colour that is present in the scene, since they rely on assumptions about the average colour of the overall scene.
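To make this dependence on scene colour concrete, the following minimal Python sketch implements a plain gray world white balance. It is an illustration only and not code from any of the cited works; the function name `gray_world` and the sample pixel values are hypothetical.

```python
def gray_world(pixels):
    """Gray world white balance for a list of (r, g, b) floats in the 0-1 range.

    Each channel is scaled so that its mean matches the mean of the three
    channel means, i.e. the scene average is assumed to be neutral gray.
    """
    n = len(pixels)
    means = [sum(p[ch] for p in pixels) / n for ch in range(3)]
    gray = sum(means) / 3.0
    # Guard against an empty channel; a zero mean gets a neutral gain.
    gains = [gray / m if m > 0 else 1.0 for m in means]
    return [tuple(min(1.0, p[ch] * gains[ch]) for ch in range(3)) for p in pixels]


# Hypothetical blue/green dominated pixels, as is typical underwater.
sample = [(0.1, 0.4, 0.6), (0.3, 0.6, 0.8)]
balanced = gray_world(sample)
print(balanced)
```

Because the gains come from the channel means alone, the same gain is applied to every pixel regardless of its distance from the camera, which is the limitation discussed above.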

Optical properties of water
As described by Wang et al. (2019), UW images always show a green-bluish colour cast, which is driven by the different attenuation ratios of red, green and blue light. The water properties that control light attenuation, and thus the scene appearance, are scattering and absorption. Attenuation coefficients control how light decays exponentially as a function of the distance that it travels (Bekerman et al., 2020). Pure waters are optically clear media with no suspended particles; only the interaction of light with molecules and ions causes light to be absorbed in pure water (Morel, 1974). Long visible wavelengths, such as red, are absorbed first, followed by green and then blue. As a result, just 1% of the light reaching the water's surface reaches a depth of 100 meters (Menna et al., 2018).
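As a hedged illustration of the exponential decay mentioned above, the attenuation along a path of length $d$ can be written in Beer-Lambert form. The symbols $I_0$, $\beta$ and $d$ are notation assumed here for illustration, and a single effective coefficient is a simplification, since real attenuation is wavelength dependent.

```latex
I(d) = I_0 \, e^{-\beta d}
\qquad\Longrightarrow\qquad
\beta = -\frac{1}{d}\,\ln\frac{I(d)}{I_0}

% Worked example using the figure quoted above (1% of surface light at 100 m):
\beta = -\frac{\ln(0.01)}{100\ \mathrm{m}} \approx 0.046\ \mathrm{m^{-1}}
```

Under this simplification, an effective coefficient of roughly 0.046 per metre would reproduce the quoted 1% at 100 m; per-channel coefficients for red would be considerably larger.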
This research aims to understand how the intensity of light in the three RGB channels changes with the camera-to-object distance. Based on that, a fast method for restoring most of the true colours of the scene is proposed, built on a mathematical model that describes how colour intensity is absorbed and backscattered. The model utilises the camera-to-object distance for every image pixel for which this information is available, in contrast to image enhancement methods like the Lab or CLAHE algorithms. The camera-to-object distances are obtained from SfM-MVS derived depth maps of the images.

RELATED WORK
Various image enhancement and restoration methods have been proposed in the past decade. This section will give an overview of such methods and their relevance to our work.
A crucial topic that has kept scientists engaged over the years is the absorption and scattering coefficients of water. Jerlov classified waters into three distinct oceanic kinds and five unique coastal kinds in 1951 (Jerlov et al., 1951). Following Jerlov's contributions, various methodologies aim to determine the inherent optical properties of Jerlov water types (Akkaynak et al., 2017; Solonenko and Mobley, 2015).
A mathematical model for spectral analysis of water characteristics was proposed in 2014, instead of a colour correction technique in the RGB space (Blasinski et al., 2014). Akkaynak et al. (2017) utilized natural water bodies and categorized them to determine the positions of all physically important RGB attenuation coefficients for UW imaging. The authors showed that the range of wideband attenuation coefficients in the ocean is restricted and demonstrated that the transition from wavelength-dependent attenuation β(λ) to wideband attenuation β(c) is more complex than assumed by standard image formation models. Ancuti et al. (2012) suggested a simple fusion-based approach for enhancing UW photos using a single input while blending multiple well-known filters. According to the authors, this method successfully improved UW footage of dynamic scenes.
A first proposal for colour correction of UW images using the lαβ colour space is presented in (Bianco et al., 2015). To increase image contrast, the distributions of the chromatic components are white balanced, and histogram cut-off and stretching of the luminance component are performed. Their results show that this pipeline performs well under the assumptions of a grey world and homogeneous lighting of the scene. These assumptions are acceptable only for close-range acquisition in a downward direction, such as seabed mapping or UW photography, and under situations with slight light changes. Bryson et al. (2013) proposed an automated method for rectifying colour discrepancies in UW photos gathered from diverse angles while building 3D structure-from-motion derived models. This contribution aimed to image large-scale biological environments, which prohibited the use of colour charts due to the sensitivity of marine ecosystems to seabed disturbances. The authors adopted a "gray-world" colour distribution to focus on colour constancy, that is, they assumed that surface reflectance has a gray-scale distribution that is independent of scene geometry. The same authors in 2016 proposed an image formation model for calculating the true colour of scenery captured from a UW automated vehicle with strobes. This methodology required a unique setup of camera and strobes, which in turn allowed the authors to propose an image formation model tailored to this setup in order to provide the necessary colour restoration to the images. Akkaynak and Treibitz (2018) modified the current UW image formation model. They derived the physically valid space of backscatter using oceanographic measurements, demonstrating that the wideband coefficients of backscatter differ from those of direct transmission, even though the current model treats them as the same. As a result, they proposed a revised UW image formation model that takes these deviations into account and validated it using in situ UW experiments.
The same authors implemented their work in (Akkaynak and Treibitz, 2019) creating the Sea-thru pipeline for colour reconstruction. While the revised model is physically more accurate, it contains more parameters that make it challenging to use. This methodology explains how to estimate these parameters for improved scene recovery.

MATERIALS AND METHODS
In the framework of this research, a compact underwater camera (Olympus Stylus TG-6) and a couple of colour calibration charts (ColorChecker® Classic | X-Rite) mounted on a PVC parallelepiped structure have been used (Figure 1).

In Situ Tests
To investigate how different wavelengths degrade in intensity, two tests were performed, both involving the acquisition of RAW images. The camera-to-object distance (COD) test was carried out by acquiring multiple images of a selected object at a specific depth and performing a photogrammetric acquisition, with an L-shaped scale bar with 40 cm sides and the employed structure (without the camera, and with the outer calibration chart exposed) placed inside the surveyed scene. In this way, it was possible to obtain depth maps (and therefore the related COD) for each of the acquired images. Additionally, at both sites, images of the colour chart were acquired at varying depths (one image every ~0.5 m) while descending from and ascending to the surface. This was possible thanks to the setup mentioned above.
The tests mentioned earlier were performed at two different sites in Cyprus. Test site 1 is the Green Bay diving site (Protaras). The test here was performed at a maximum depth of 10 m, with a photogrammetric acquisition of one of the several stone sculptures located around the diving site. Test site 2 is the Lady Thetis Shipwreck diving site (Limassol). The test here was performed at a maximum depth of 13 m, with a photogrammetric acquisition of the metal roof of the shipwreck's upper deck.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France
The eight primary colour patches from the colour chart (cyan, magenta, yellow, red, green, blue, black, white) have been selected. For each channel (red, green, and blue), and based on the change of underwater depth (test A) and of camera-to-object distance (test B), absorption diagrams have been created for the test performed in the two test sites. The intensity value reported on each diagram has been obtained by applying a median filter to a sample area for each patch and normalising the resulting value (0-65535 to 0-1 value).

Data Utilization for Colour Correction Method
Initially, multiple images at varying depths were captured and we tried to model the absorption as a function of depth. Unfortunately, this test was not successful, mainly because it was not possible to satisfactorily fit any of the tested functions and match the last image captured at the seabed. The functions tested in both experiments were polynomials of 1st, 2nd, 3rd and 4th order and exponentials of 1st and 2nd order. Hence, it was decided not to utilize any of the data collected from this test at either site. Instead, we decided to focus on the camera-to-object distance datasets and the related colour corrections.
Since UW image acquisition for 3D reconstruction is typically done at constant depth, with the camera-to-object distance being the varying quantity, we decided to focus on a colour correction method utilizing the data collected at the seafloor, since they comply with the standard UW image acquisition approach for 3D reconstruction.

Photogrammetric Workflow
For the processing, the RAW images were converted into 16-bit uncompressed TIFF images so that they could be processed while keeping the raw information recorded by the camera unaltered.
Multiple tests were carried out to select a suitable fitting function for modelling the colour absorption and the backscattering in UW images. These tests were performed on the data suited for photogrammetric 3D reconstruction. At the Protaras site, the dataset contained images of a statue captured at varying distances and angles, while at the second site in Limassol, the dataset contained images of the metal roof of the shipwreck's upper deck. The X-Rite colour chart was visible alongside the aforementioned scale bar in both datasets.
The two datasets were photogrammetrically processed using Agisoft Metashape to extract the depth maps for the image frames. The 40 cm scale bar was used to scale the 3D models correctly, so that the depth maps provide metric camera-to-object distances for each image pixel.
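The scaling step amounts to simple arithmetic, sketched below in Python. This is not Agisoft Metashape's API; the function names and the numeric values (a bar measuring 0.25 model units, a 2 by 2 depth map) are hypothetical and only illustrate how a known 40 cm length converts model units to metres.

```python
def scale_factor(true_length_m, model_length):
    """Metres per model unit, from a reference object of known length."""
    if model_length <= 0:
        raise ValueError("model_length must be positive")
    return true_length_m / model_length


def scale_depth_map(depth_map, factor):
    """Convert a depth map (list of rows, in model units) to metres."""
    return [[d * factor for d in row] for row in depth_map]


# Hypothetical values: the 0.40 m scale bar measures 0.25 units in the unscaled model.
factor = scale_factor(0.40, 0.25)
depth_model_units = [[2.0, 2.5], [3.0, 3.5]]
depth_m = scale_depth_map(depth_model_units, factor)
print(factor, depth_m)
```

Every camera-to-object distance in the depth map is multiplied by the same factor, so an error in the measured scale bar length propagates linearly to all distances.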

FUNCTION FITTING FOR COLOUR CORRECTION BASED ON THE COLLECTED DATASETS
After the photogrammetric processing, the intensity values for the eight primary colour patches of the X-Rite colour chart were extracted alongside their respective camera-to-object distances. This was done for all the images in which the colour chart was clearly visible, in order to ensure the proper acquisition of the colour intensities. For the intensities, instead of extracting the value of a single pixel in the colour cell, it was decided to extract the median value from a 20 by 20 pixel box inside the colour cell. This reduced the influence of possible noise so that it did not affect the later stages of the experiment.
In Figure 2 we see that as the camera-to-object distance increases, the red channel intensity values of the eight primary X-Rite colour patches decay almost exponentially. In contrast, the decay is smoother, as expected, for green and blue. The same behaviour is observed in the Limassol dataset (Figure 3).
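The patch sampling described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the image channel is a plain list of rows of 16-bit integers, and the function name `patch_median`, the window position and the synthetic data are hypothetical.

```python
from statistics import median


def patch_median(channel, row, col, size=20, max_value=65535):
    """Median of a size x size window whose top-left corner is (row, col).

    The result is normalised from the 16-bit range (0-65535) to 0-1.
    """
    window = [channel[r][c]
              for r in range(row, row + size)
              for c in range(col, col + size)]
    return median(window) / max_value


# Synthetic 16-bit channel: a uniform patch with one noisy outlier pixel.
channel = [[32768] * 40 for _ in range(40)]
channel[12][15] = 65535  # simulated hot pixel inside the sampled window
value = patch_median(channel, row=10, col=10)
print(round(value, 4))
```

The single outlier does not shift the result, which is the reason for preferring the median of a window over a single pixel reading.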

Absorption and Backscattering
Observing the derived graphs, it is noticeable how different the colour decay is for the various colour patches. For example, we notice that the colour intensity values of the green and blue patches for their respective channels decay with the camera-to-object distance. On the other hand, when we observe how the black patch behaves, we notice that for green and blue, the values increase. This observation led to breaking down the modelling into two parts. To properly model and correct the colour distortion in the acquired images, the proposed method introduces a model that includes both absorption and backscattering. This idea is also supported by the fact that various image formation models in the literature (Akkaynak and Treibitz, 2019, 2018; Bryson et al., 2016) account for both absorption and backscattering, with the latter being an additive element to the former.
Having that in mind, we proceeded by modelling backscattering and absorption. To do so, we decided to use the data from the two neutral patches, white and black. More specifically, the Red, Green and Blue colour values of the white and black colour patches were plotted against the camera-to-object distance (Figure 4). Using linear regression, fitting functions were introduced in order to model the absorption and backscattering. Multiple fitting functions were tested, such as polynomials of various degrees, exponentials of 1st and 2nd degree, and trigonometric functions. Based on the various tests, the backscattering is best modelled as a 2nd degree polynomial for blue and green, whereas the backscattering for red is negligible and can thus be considered 0.
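A 2nd degree polynomial fit of the black patch values against camera-to-object distance can be sketched in Python as below. This is a minimal least-squares illustration, not the authors' implementation: the function names, the sample distances and the synthetic, noise-free black-patch values are all hypothetical.

```python
def fit_quadratic(distances, values):
    """Least-squares fit of values ~ c0 + c1*d + c2*d**2 via the normal equations."""
    # Build the 3x3 system (A^T A) c = A^T y for the basis [1, d, d^2].
    power_sums = [sum(d ** k for d in distances) for k in range(5)]
    matrix = [[power_sums[i + j] for j in range(3)] for i in range(3)]
    rhs = [sum(v * d ** i for d, v in zip(distances, values)) for i in range(3)]

    # Gaussian elimination with partial pivoting on the augmented matrix.
    aug = [row + [b] for row, b in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]

    # Back substitution.
    coeffs = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        known = sum(aug[i][j] * coeffs[j] for j in range(i + 1, 3))
        coeffs[i] = (aug[i][3] - known) / aug[i][i]
    return coeffs


def backscatter(d, coeffs):
    """Evaluate the fitted backscatter polynomial at distance d (metres)."""
    c0, c1, c2 = coeffs
    return c0 + c1 * d + c2 * d * d


# Hypothetical blue-channel readings of the black patch (normalised 0-1).
cod_m = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
black_blue = [0.02 + 0.03 * d + 0.004 * d * d for d in cod_m]
coeffs = fit_quadratic(cod_m, black_blue)
print([round(c, 4) for c in coeffs])
```

For the red channel, where the text treats backscattering as negligible, the coefficients would simply be set to zero rather than fitted.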
Regarding the absorption, the best fitting function is either a 2nd degree polynomial or a 2nd degree exponential. For both datasets, 2nd degree exponential fitting was used for the absorption and 2nd degree polynomial fitting for the backscattering, since they provided the best visual results. Figure 5 shows the results of the proposed method and the results provided by the CLAHE and Lab algorithms. In Figure 5, Lab and the proposed method show the most visually pleasing results. However, the proposed method provides results only in the areas of the image where there is 3D information, in other words, where the camera-to-object distances have been obtained. In our method, after the colour correction, a white balancing based on the white patch of the colour checker is applied. Although it could be argued that Lab gives a somewhat acceptable result, it does not take any geometry-related information into consideration, so the result may differ significantly when Lab is applied to multiple images of the same dataset. CLAHE, on the other hand, did not manage to improve the colours of the scene; the only noticeable difference is a change in contrast.
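The text does not spell out the per-pixel correction formula, so the Python sketch below is an assumption rather than the authors' method. It assumes that the "2nd degree exponential" is a two-term exponential, that backscatter is additive (as in the cited image formation models), and that the correction rescales the backscatter-free signal to the fitted absorption value at zero distance. All function names and coefficient values are hypothetical.

```python
import math


def absorption(d, a, b, c, k):
    """Assumed two-term exponential model: a*exp(b*d) + c*exp(k*d)."""
    return a * math.exp(b * d) + c * math.exp(k * d)


def backscatter(d, c0, c1, c2):
    """2nd degree polynomial backscatter model."""
    return c0 + c1 * d + c2 * d * d


def correct_pixel(value, d, absorb_params, back_params):
    """Correct one normalised channel value at camera-to-object distance d.

    Pixels without depth (d is None) are returned as None, mirroring the fact
    that the method only works where 3D information exists.
    """
    if d is None:
        return None
    direct = value - backscatter(d, *back_params)  # remove additive backscatter
    gain = absorption(0.0, *absorb_params) / absorption(d, *absorb_params)
    return min(1.0, max(0.0, direct * gain))  # clip to the valid range


def white_balance(value, white_patch_value):
    """Scale a channel so that the corrected white patch maps to 1."""
    return min(1.0, value / white_patch_value)


# Hypothetical coefficients and a synthetic round trip for one channel.
ABSORB = (0.6, -0.9, 0.3, -0.1)
BACK = (0.0, 0.02, 0.003)
true_value, distance = 0.8, 2.0
observed = (true_value * absorption(distance, *ABSORB) / absorption(0.0, *ABSORB)
            + backscatter(distance, *BACK))
restored = correct_pixel(observed, distance, ABSORB, BACK)
print(round(observed, 4), round(restored, 4))
```

Because the gain grows with distance, any noise or residual backscatter in distant or dark pixels is amplified as well, which is one plausible reason for over-correction in such regions.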
The corresponding results for the Limassol dataset are presented in Figure 6 and Figure 7. Although the behaviour was expected to be similar, it appears to be very different regarding the blue colour absorption on the white patch. Instead of the colour intensity decreasing, we observe that as the camera-to-object distance (COD) changes, the blue intensity stays equal to 1. This can be explained by the sheer dominance of blue at the depth (13-15 m) where the dataset was captured.
Figure 6. For the Limassol dataset: from top to bottom, the colour value behaviour for the Red, Green and Blue channels as the camera-to-object distance increases, for the white and black colour patches respectively.
The modelling for absorption and backscattering remained the same for the Limassol dataset, meaning that backscattering was modelled as a 2nd degree polynomial and absorption as a 2nd degree exponential function. Figure 7 shows the results provided by the proposed method, the Lab and CLAHE corrections, and the original image.
Overall, the visual results of Lab and CLAHE are similar to those obtained for the Protaras image dataset. Regarding the proposed method, the colours of the scene are visually improved, but some noise is present around the poles of the frame. This noise is a by-product of the depth map of the image, since the SfM-MVS derived models include noise which is transferred into the depth maps.

RESULTS
Having tested the above on various images, the results vary depending on the images and, more specifically, on the camera-to-object distances. In particular, the colour reconstruction suffers more visually in images where the object was further away from the camera. Figure 8 and Figure 9 show the results for various images when the proposed method is applied, alongside the Lab and CLAHE colour corrections.
Based on Figures 8 and 9, the visual result differs from image to image. Those differences are primarily due to the camera-to-object distances. From what was evaluated, images taken 3-5 m away from the objects appear redder, since the applied model primarily tries to compensate for the lack of red. Since the model is distance dependent, it can overcompensate in some areas of the image. This effect appears in dark and shadowed areas of the image, where the objects appear to have a reddish tone. The results produced for the two datasets differ due to the different environment, depth, seabed characteristics, and water particles. These differences were expected; the goal was to evaluate how impactful the algorithm could be and whether the model could restore the missing colour information to a satisfactory degree.
Additionally, compared with the image enhancement algorithms applied to the images (CLAHE and Lab), the proposed model manages to rid the scene of the fog effect caused by backscattering. This is the primary benefit of algorithms utilising scene geometry over standard enhancement tools: since information about the scene geometry is available, it can be used to compensate for and remove backscattering, and thus the fog effect, in the images. Furthermore, the colour restoration of both CLAHE and Lab fails at both sites due to the lack of sufficient light in the scene. Such algorithms require a significant amount of light to be present in order to effectively restore the colour information of UW images, and this is usually provided by strobes mounted on the cameras. In this study that was not the case, since the goal was the development of an algorithm that can restore the colour of the scene where only natural light is present, as is the case at shallow depths.
As demonstrated, the results of the proposed algorithm are visually pleasing and promising, since the applied model in each case managed to restore the majority of the missing colour information of the scene. However, on closer inspection, the tones of certain colour patches are not fully restored. For example, the red patch appears darker than it is out of the water. This could be considered a weakness of the model.
To further evaluate the algorithm's performance beyond the qualitative results and the visuals of the images, various quantitative metrics were applied to several images to determine the algorithm's effectiveness. These particular images contain the colour patches, but they were not used to fit the model and could therefore be used for evaluation. The first measure of the evaluation was the fitting score $R^2$ (Eq. 1). This is a coefficient used to evaluate a model in cases where part of the dataset is used for training and the rest for testing. This fitting score is mainly used to evaluate ML algorithms but can also be used for simple regression models such as the proposed one. The coefficient's best possible score is 1, and it can also be negative (Pedregosa et al., 2011).

$$R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \qquad (1)$$

where $y_i$ is the ground truth colour value of a colour patch for a particular colour channel (as true values we used the colour intensities of the colour patches in an image captured on land before the dive), $\hat{y}_i$ is the predicted colour value after the implementation of the algorithm for that particular colour patch, and $\bar{y}$ is the mean colour value from all 24 patches for a particular colour channel.

Unfortunately, since this metric does not have a fixed range, we decided to modify it to obtain a bounded range of values. The modified $R^2$ metric (Eq. 2) does not include $\bar{y}$, which allows the coefficient to range from 0 to 1, with the former being the worst possible value and the latter the best. Another metric used for evaluation was metric D (Eq. 3), where N is the number of colour patches, which in our case was 24. This metric is produced by subtracting the minimized sum of squared residuals from 1, and its best possible value is 1. Finally, as an additional evaluation metric, we adopted the mean Euclidean distance computed over all 24 colour patches for all 3 colour channels together. For this metric, the closer the value is to 0, the better the evaluation is. The results of these quantitative metrics are shown in Table 1.

From Table 1, we observe the variation between the values of each metric as well as their consistency for specific colour channels in almost all the images. For instance, the $R^2$ metric presents the lowest values for almost all the images, but especially for images in which the colour chart is further away from the camera (IMGL1 & IMGL2). Additionally, we observe that all metrics related to the red channel are at their best value for images in which the colour chart is closest to the camera (IMGP4 & IMGL4). Overall, the colour channel that appears to be the least reliably restored, based on the metrics, is red, which was expected since it is the colour that suffers the most from UW attenuation. Furthermore, the metrics appear overall to be worse for the images of the Limassol dataset. This may indicate the impact of depth but may also simply be due to the different, and in this case worse, water conditions at the site.
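The fitting score and the mean Euclidean distance described above can be sketched in Python as follows. The coefficient of determination follows its standard definition; the modified score and metric D are not reproduced here because their exact formulas are not recoverable from the text. The function names and the sample patch values are hypothetical.

```python
import math


def r2_score(truth, predicted):
    """Coefficient of determination for one colour channel."""
    mean_truth = sum(truth) / len(truth)
    ss_res = sum((t - p) ** 2 for t, p in zip(truth, predicted))
    ss_tot = sum((t - mean_truth) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot


def mean_euclidean_distance(truth_rgb, predicted_rgb):
    """Mean RGB distance between ground-truth and restored patch colours."""
    distances = [
        math.sqrt(sum((t - p) ** 2 for t, p in zip(t_rgb, p_rgb)))
        for t_rgb, p_rgb in zip(truth_rgb, predicted_rgb)
    ]
    return sum(distances) / len(distances)


# Hypothetical single-channel values for three patches (normalised 0-1).
truth = [0.2, 0.5, 0.8]
perfect = r2_score(truth, [0.2, 0.5, 0.8])   # prediction equals ground truth
baseline = r2_score(truth, [0.5, 0.5, 0.5])  # prediction equals the mean

# Hypothetical RGB triplets for two patches.
med = mean_euclidean_distance([(0, 0, 0), (1, 1, 1)], [(0, 0, 0), (1, 1, 0)])
print(perfect, baseline, med)
```

A prediction no better than the channel mean scores 0, and anything worse scores below 0, which is why the score has no fixed lower bound.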

CONCLUSIONS
This paper presented a fast method for restoring the colour information in images captured underwater. The model utilizes the scene's 3D geometry through the creation and use of SfM-MVS generated depth maps, which are crucial for implementing the proposed methodology. The results, presented visually and quantitatively, are promising, since the method can be implemented in a straightforward manner to restore the colour information. Of course, more accurate algorithms that utilize the scene's geometry exist in the literature. However, their disadvantage is that they need additional information regarding the physical water properties, which must be measured using expensive equipment or calculated using more complicated physical models.
It must be noted that the model cannot be reused across datasets: for every new dataset, the presence of colour charts is mandatory to fit the model and evaluate it using the metrics shown in Section 5. Furthermore, different datasets are affected by different light and water conditions, which limits the reusability of the model. Additionally, the model needs 3D information in order to work: the algorithm is not a simple image enhancement, but a 3D-based image colour restoration algorithm.
The work presented in this paper only utilized a 2nd degree polynomial fitting for the backscattering and a 2nd degree exponential fitting for the absorption, as they provided the best visual results for these particular datasets. That does not mean that these are the only fitting functions that can be used; the fitting can be modified to provide the best possible results depending on the dataset.