EXTRACTING RELEVANCE FROM SAR TEMPORAL PROFILES ON A GLACIER AND AN ALPINE WATERSHED BY A DEEP AUTOENCODER

ABSTRACT: This paper proposes methods for compressing the temporal profiles of Sentinel-1 images, in order to evaluate and analyze the richness of the temporal dynamics, both on a glacier and on an alpine watershed. We propose to use unsupervised deep learning to auto-encode the temporal information into 3 dimensions, allowing the three descriptors to be used as RGB components to produce a colored composition synthesizing the information. We compare this Convolutional AutoEncoder (CAE) approach with a dimensionality reduction based on a Principal Component Analysis (PCA) of the temporal profiles. The two methods, CAE and PCA, are applied to a time series over the Kyagar glacier before and after a surge event, and to an alpine watershed, to compare the differences in dynamic evolution associated with different terrain classes with and without snow. On the one hand, on the glacier, the stacks of 10 images used are too short for the CAE to extract more than two really significant axes. On the other hand, with the longer profiles available over the alpine watershed, the CAE proves useful to improve the clustering results obtained from the decomposition.


INTRODUCTION
Temporal analysis of glacier and snow dynamics is crucial, since snow and glacier evolutions are related to natural hazards, sea level rise and changes in water resources. In this context, radar images provide valuable information which is available regardless of the weather conditions during acquisition. SAR radiometric changes of the surface can be monitored to obtain information on snow cover (Nagler et al., 2016), glacier crevasses (Leclercq et al., 2021) or the evolution of glacier outlines (Winsvold et al., 2018), for instance.
In this paper, we wish to compare approaches for fast extraction of the relevant temporal information contained in SAR image sets. A solution is to reduce the dimension of the temporal information so that it can be visualized as a color composition. Since a color is coded in a 3-dimensional space, creating a 3-component image from an N-component time series means reducing the dimension of the information to 3.
To quickly visualize changes from a SAR time series, the work described in (Colin Koeniguer and Nicolas, 2020) proposes a visualization of change areas in colors, while pixels with stable radiometry appear in grayscale. But this colored representation is more suitable for the visualization of very short events occurring at a given date because a color is associated with a single date. It is also possible to detect long events, but sufficient contrast is rarely achieved in practice on natural areas (Colin Koeniguer and Nicolas, 2020). Therefore, to visualize a temporal stack of images of natural areas, it is preferable to consider another method.
Another classical method to reduce the dimensionality of the data is the Principal Component Analysis (PCA). This method can be applied to each temporal profile associated with a pixel. The projections onto the 3 main components can then be used to code the 3 RGB channels of the image. The limitation of such a method is that the principal components are linear combinations of the initial variables, and they are necessarily perpendicular to each other. Moreover, it is a method that is very sensitive to outliers. In addition, as shown in (Di Martino et al., 2021), some variations between SAR backscattering profiles are very subtle, and are missed by the PCA decomposition. Therefore, it is relevant to explore other solutions.
* Corresponding author: elise.colin@onera.fr
Unsupervised deep learning using autoencoding networks can be used to learn how to decompose a signal over main axes. These unsupervised deep learning techniques are today affordable thanks to the increasing availability of SAR temporal profiles, in particular via the open source Sentinel-1 data of the Copernicus Program. Hence, in this paper, we consider using the autoencoding method presented in (Di Martino et al., 2021), which is a method for reducing the dimension of temporal profiles by autoencoding. We will call this method CAE for Convolutional AutoEncoder.
In a first case, we consider the Kyagar glacier, for which a surge event occurred in 2016 (Round et al., 2017). In order to assess the contribution of the visualization method, we apply it to two image stacks, one constituted during the surge event and the other after it. The comparison of the behavior of the PCA and CAE methods in this case study is the subject of Section 3.
In a second case, we consider the Guil valley in the Alps, to evaluate the contribution of the temporal dimension.

THE VISUALIZATION PROCESS AND COMPARISONS
We assume that we have a stack of SAR images, containing T coregistered images of size N = nx × ny pixels.
Each pixel i of the image contains a profile of radiometric intensities p_i = [p_i(1), ..., p_i(T)], gathered in a list l of profiles: l = {p_i, ∀i ∈ [1, N]}, where N is the number of pixels in a single image.
In the case of images containing several polarimetric channels, as is the case for the Sentinel-1 images, p_i(t) can be a vector containing the respective backscatter values of the different polarizations.
The objective is to compress the information contained in each intensity profile into a vector of dimension 3.
The first method considered is the classical PCA method. The second method considered is based on an auto-encoding neural network, which can be viewed as an extension of the PCA towards a non-linear representation. The autoencoder uses convolutional filters, but unlike CNN-type networks traditionally applied to images, here these filters operate along the temporal axis. Note that this choice implies an intrinsic consideration of causality in the network structure, since successive images of the sequence are connected to each other.
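As a minimal illustration of this last point, the NumPy sketch below applies a 1-D filter along the time axis of a stack of pixel profiles; the helper name `temporal_conv` and the toy data are ours, not the network of (Di Martino et al., 2021). Unlike a PCA projection, the output at each step depends on the order of the dates.

```python
import numpy as np

def temporal_conv(profiles, kernel):
    """Apply a 1-D filter along the time axis of a profile stack.

    profiles : (N, T) array, one backscatter profile per pixel.
    kernel   : (K,) array of temporal filter weights (K <= T).
    Returns an (N, T-K+1) array ('valid' mode): the output at step t
    only mixes dates t..t+K-1, preserving the temporal ordering that
    a PCA, invariant by permutation of the dates, would ignore.
    """
    N, T = profiles.shape
    K = kernel.size
    out = np.empty((N, T - K + 1))
    for t in range(T - K + 1):
        out[:, t] = profiles[:, t:t + K] @ kernel
    return out

# Toy example: a 3-tap difference filter highlights abrupt temporal changes.
rng = np.random.default_rng(0)
stack = rng.normal(size=(4, 10))               # 4 pixels, 10 dates
edge = temporal_conv(stack, np.array([-1.0, 0.0, 1.0]))
print(edge.shape)                              # (4, 8)
```

Shuffling the columns of `stack` changes the output of `temporal_conv`, whereas it would leave a PCA decomposition unchanged.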

Visualization by PCA: Principal Component Analysis
The first method evaluated is the PCA. It is applied to a set of T temporal variables p_i(1), p_i(2), ..., p_i(T), using the N realizations of these variables, where each pixel corresponds to one realization.
The image of these T random variables can be structured in a matrix M, with N rows and T columns.
The PCA then amounts to the diagonalization of the square matrix A = M^t M of size T × T, where ^t denotes matrix transposition. Since A is a real symmetric matrix, it is diagonalizable in a basis of orthogonal eigenvectors. We then retain the 3 eigenvectors of this matrix associated with the 3 largest eigenvalues.
The 3 values retained in each pixel i correspond then to the projection of the vector pi on these 3 eigenvectors.
In practice, it turns out that the projection on the first axis is close to the temporal average of the signal. This average corresponds to a projection on the constant vector (1, ..., 1). This simply means that no date is privileged over the others, and that the first distinction between profiles is essentially between average radiometric levels. It is then the following axes that focus on detecting non-stationary behavior.
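This PCA-based coloring can be sketched in a few lines of NumPy, following the matrix formulation above (an illustrative sketch, not the exact processing chain used for the figures):

```python
import numpy as np

def pca_rgb(stack):
    """Compress each temporal profile to 3 values usable as RGB channels.

    stack : (N, T) matrix M, one row per pixel profile, one column per date.
    As in the text, we diagonalize A = M^t M (T x T, symmetric), keep the
    eigenvectors of the 3 largest eigenvalues, project every profile onto
    them, and stretch each projection to [0, 1] for display.
    """
    A = stack.T @ stack                      # T x T symmetric matrix
    w, V = np.linalg.eigh(A)                 # eigenvalues in ascending order
    axes = V[:, ::-1][:, :3]                 # 3 dominant eigenvectors
    proj = stack @ axes                      # (N, 3) projections
    lo, hi = proj.min(axis=0), proj.max(axis=0)
    return (proj - lo) / (hi - lo)

# Toy stack of positive "backscatter" profiles: 500 pixels, 10 dates.
rng = np.random.default_rng(1)
profiles = rng.random((500, 10)) + 0.5
rgb = pca_rgb(profiles)
print(rgb.shape)                             # (500, 3)
```

On such positive backscatter-like data, the first column of the result is strongly correlated with the temporal mean of each profile, consistent with the remark above.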

Visualization by CAE: Convolutional Auto Encoder
The autoencoder solution is based on the architecture presented in (Di Martino et al., 2021). This architecture includes a temporal convolutional filter that relies on the sequential nature of the data. This sequentiality information does not appear in the PCA, since its results are invariant by permutation of the order of the vector elements. This architecture has been successfully applied to agricultural plot areas (Di Martino et al., 2021). Clustering on the embeddings generated by the autoencoder has shown very interesting properties of spatial segmentation, even though no spatial criterion is involved in the learning process. It is for this reason that we consider it in this article as a support for a colored visualization of temporal information.
Unlike PCA, the embeddings learned by this Convolutional Auto Encoder (CAE) are not ordered by importance. Moreover, they can be negative. Therefore, in order to compare them to the PCA, we stretch the dynamics of each of the PCA axes and of the different embeddings to [0, 1], then we compute the correlations between all pairs. An example of such correlation combinations is shown in Fig. 1, which corresponds to the case of a data set studied in the following section on the Kyagar glacier. The first embedding is selected as the one most correlated with the first PCA axis in absolute value; if the correlation is negative, we then change the sign of the generated embedding. The second embedding is then selected using the same scheme with respect to the second PCA axis.
For the previous example, Fig. 2 shows the correlations between axes once the selection is made. Both approaches have been applied to different image stacks: a first stack over the Kyagar glacier, then a second on an alpine watershed in France.
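The selection and sign-flipping scheme described above can be sketched as follows (a NumPy sketch with a hypothetical `match_embeddings` helper; the greedy pairing is our reading of the text, not the authors' code):

```python
import numpy as np

def match_embeddings(pca_axes, embeddings):
    """Greedily pair CAE embeddings with PCA axes.

    pca_axes   : (N, 3) PCA projections, one row per pixel.
    embeddings : (N, E) CAE embeddings (E >= 3), unordered, possibly negative.
    Each column is first stretched to [0, 1]; then, for each PCA axis in
    turn, the remaining embedding with the highest |correlation| is
    selected, and it is flipped (x -> 1 - x) when the correlation is
    negative. Returns the reordered (N, 3) embeddings and the |correlations|.
    """
    def stretch(X):
        lo, hi = X.min(axis=0), X.max(axis=0)
        return (X - lo) / (hi - lo)

    P, E = stretch(pca_axes), stretch(embeddings)
    chosen, corrs = [], []
    remaining = list(range(E.shape[1]))
    for k in range(P.shape[1]):
        r = [np.corrcoef(P[:, k], E[:, j])[0, 1] for j in remaining]
        best = int(np.argmax(np.abs(r)))
        j = remaining.pop(best)
        col = E[:, j] if r[best] >= 0 else 1.0 - E[:, j]  # sign flip in [0,1]
        chosen.append(col)
        corrs.append(abs(r[best]))
    return np.stack(chosen, axis=1), corrs

# Toy check: embeddings that are shuffled, sign-flipped copies of the axes.
rng = np.random.default_rng(2)
pca = rng.normal(size=(200, 3))
emb = np.stack([-pca[:, 2], pca[:, 0], -pca[:, 1]], axis=1)
matched, corrs = match_embeddings(pca, emb)   # matched correlations ~ 1
```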

Glacier particularities
The mountain glacier Kyagar is located in the Karakoram Mountains and covers about 100 km². It overlooks the Shaksgam valley, at the frontier between China, India, and Pakistan. It is a polythermal glacier spanning from 4750 to over 7000 m, consisting of three upper glacier tributaries 6-10 km in length which converge to form an 8 km long glacier tongue, approximately 1.5 km wide. This glacier plays a key role for the river situated downstream of the glacier. During surge periods, the glacier velocity suddenly increases, and snow and ice accumulate downstream of the glacier, leading to the damming of the river and the formation of a lake. The lake impounded behind this ice dam is known to fill and empty repeatedly, causing dramatic Glacial Lake Outburst Floods (GLOFs) (Round et al., 2017, Zhang et al., 2020, Haemmig et al., 2014). The latest surge of the glacier happened from May 2014 to March 2016, leading to a GLOF in July 2015 (Round et al., 2017, Charrier et al., 2022). Due to the difficulty of access, most observations of the glacier are made through satellite remote sensing.

Dataset preparation
Our analysis was performed on Sentinel-1 time stacks acquired between late 2014 and 2020. The Sentinel-1 images were coregistered with each other by (Round et al., 2017). The glacier surge, which occurred in 2015, opened crevasses which are characterized by a strong backscattering signal (Leclercq et al., 2021). Therefore, we have split the time stacks in two: one during and one after the surge. In order to focus on the glacier itself, we defined a mask using the glacier outlines given in the Randolph Glacier Inventory V6.0 (RGI Consortium, 2017).

Comparison between pre-surge and post-surge observation
Two separate training scenarios were tested. In the first, we computed the PCA axes and embeddings on the 2015 data set. Then we applied this data encoding to the 2018 images. We call this scenario Single, because only one single calculation of the CAE network or of the main PCA axes was performed.
In a second scenario, we recalculated the embedding or PCA axis decompositions on the images of the second period: we call this mode Dual because the calculations are performed twice: once for 2015, the second for 2018.
The visualization results obtained are shown in Fig. 3.
Globally, we note a predominance of the two complementary colors green and magenta, especially in the CAE visualization. We can verify that the colored channels corresponding to the second and third axes of the CAE are highly correlated (0.90). The second and third axes are therefore very close.
Very little variability is observed between the different scenarios of the CAE method, either between the surge and post-surge representations in the single mode, or between the post-surge representations from the single and dual approaches.
However, both types of visualization allow to observe crevasses with a colored contrast on the upper left part of the ice tongue. These crevasses formed in 2015 during the surge period, and they are a valuable indicator for a surge event (Leclercq et al., 2021).
The images obtained by the PCA contain a greater variety of hues. This result can be explained by the fact that the PCA maximizes the variance captured by each axis while constraining each principal axis to be orthogonal to the previous ones. On the contrary, the CAE does not impose such conditions.
Cross-channel correlation calculations confirm these trends: In the dual scenario, the correlation between the first CAE axis during surge and the first axis after surge is very high (0.92), higher than in the case of the first PCA axis (0.84).
For a given period, the 3 CAE axes found are all highly correlated with one another. For example, for the post-surge period, the correlation between axis 1 and axis 2 is 0.94, and the correlation between axis 1 and axis 3 is 0.90.
In summary, since the axes of the CAE are less constrained in their mathematical construction, they lead to finding axes that are more correlated with each other. On this example, they highlight the low number of degrees of freedom required to represent the temporal profiles. Self-encoding networks lead to comparable results, whether they are learned on the period during or after the surge, contrary to the PCA which exacerbates the contrasts.

Observation of the whole dynamics
When the scenario is performed on the entire image, and over a much longer acquisition period, then the conclusions become very different from the previous ones. The color compositions found for the glacier image are shown in Fig. 4.
While it is difficult to make a quantitative evaluation, it appears that the CAE-type visualization brings out as much information as the PCA, with a good spatial homogeneity of the zones, even though no spatial information was used in the learning architecture. This time, we can verify that the 3 axes of the CAE are well decorrelated, and therefore provide non-redundant information.
In conclusion on this dataset, it turns out that on very short time series, both types of visualization are sensitive to differences between the observed periods, such as glacier crevasses. However, the CAE struggles to find embeddings that are very different from each other. For longer stacks, the CAE results in a visualization with strong color contrast between the different components.
Since the PCA seeks decorrelated axes while the CAE is free from this constraint, the correlations between the different CAE embeddings could be used to build a criterion for the complexity of the temporal profiles: in the first case of short time series, only one axis seems representative of the information, while in the second case, the 3 axes exhibit more decorrelated behaviors.
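Such a complexity criterion could look like the following NumPy sketch, which counts how many axes carry non-redundant information (the 0.8 redundancy threshold and the `effective_axes` helper are arbitrary choices of ours, not values from the study):

```python
import numpy as np

def effective_axes(components, threshold=0.8):
    """Count mutually decorrelated axes among K candidate axes.

    components : (N, K) array of per-pixel values for K axes/embeddings.
    An axis is kept only if its |correlation| with every previously kept
    axis stays below `threshold`; the count of kept axes is a rough
    complexity criterion for the temporal profiles.
    """
    K = components.shape[1]
    kept = []
    for j in range(K):
        if all(abs(np.corrcoef(components[:, j], components[:, i])[0, 1])
               <= threshold for i in kept):
            kept.append(j)
    return len(kept)

rng = np.random.default_rng(3)
a = rng.normal(size=1000)
# Short-stack-like case: three axes that are nearly copies of one another.
short_like = np.stack([a,
                       0.97 * a + 0.05 * rng.normal(size=1000),
                       0.95 * a + 0.08 * rng.normal(size=1000)], axis=1)
# Long-stack-like case: three independent axes.
long_like = rng.normal(size=(1000, 3))
print(effective_axes(short_like), effective_axes(long_like))   # 1 3
```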

The Guil watershed
The second study area of this article is the Guil basin, located in the French Alps. The Guil basin is an area of about 700 km² centered around the Guil valley, which crosses the basin from east to west. We considered an area of 1024 × 1024 pixels centered around the village of Abriès, corresponding to an area of about 200 km². We consider the period between 2018-06-26 and 2019-08-20, with descending-pass data from orbit 139.
The detection of dry and wet snow in this type of basin is of major interest to EDF, the French Electricity Group. The optimization of the exploitation of the hydraulic resource for the production of electricity through the dams and the hydraulic power plants installed along the canals and the rivers, is an important issue. For this purpose, it is necessary to be able to make a good estimate of the stock of snow formed during the winter that will feed the dams during the melt, in order to improve the model for the forecast of flows.
Methods for detecting dry snow in SAR images, especially the Lievens method (Lievens et al., 2019), are strongly impacted by vegetation, since vegetation has a polarimetric signature close to that of snow at C-band.
To determine the presence of vegetation, one possible solution is to determine whether this information can be found in the temporal profile. Therefore, we investigate the principle of visualization applied to these images, to analyze the contrasts obtained, particularly in relation to the presence of vegetation. Fig. 6 shows the two visualization results obtained. In order to improve the contrast of the images, a histogram equalization has been applied to each channel.
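The per-channel histogram equalization can be sketched with NumPy (an illustrative `equalize` helper based on the empirical CDF; the actual implementation used for the figures is not specified):

```python
import numpy as np

def equalize(channel, bins=256):
    """Histogram-equalize one color channel with values in [0, 1].

    Maps each value to its normalized rank via the cumulative histogram,
    which spreads the channel's dynamics over the full [0, 1] range and
    thus increases the visual contrast of the composition.
    """
    hist, edges = np.histogram(channel, bins=bins, range=(0.0, 1.0))
    cdf = hist.cumsum().astype(float)
    cdf /= cdf[-1]                     # normalized cumulative histogram
    return np.interp(channel, edges[1:], cdf)

# Toy channel bunched near 1: after equalization it is roughly uniform.
rng = np.random.default_rng(4)
raw = rng.beta(8, 2, size=10_000)
eq = equalize(raw)
```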
Both types of visualization show a colored contrast much richer than the REACTIV composition applied to the same data set and represented in Fig. 5. The REACTIV representation shows only one green color, corresponding to the snowy areas. These are indeed the only areas whose radiometric variations are sufficiently different from a decorrelated speckle to be detected as a change. Unlike the REACTIV composition, the representations show a large number of shades. These different hues seem to correspond to different types of terrain, as can be seen in the corresponding Sentinel-2 optical image given in Fig. 7.
We can observe different hues on the different terrain elements: notably the forested areas in purple color, and the bare terrain areas in shades of yellow or green, probably depending on their snow types.
More locally, we observe better color contrast. In order to make sure that the choice of the order of the colors is not a bias of our visual interpretation, we generated all possible permutations of the order of the colored channels. These permutations, of which three examples are given in Fig. 8, do not change our observations.
In order to evaluate the potential of the CAE in a spatial segmentation approach and compare it to a segmentation performed on the PCA, we focused on the analysis of a sub-area containing 3 small lakes and various terrain features. We applied the Quickshift spatial segmentation algorithm (Vedaldi and Soatto, 2008), a relatively recent 2D image segmentation algorithm based on an approximation of kernelized mean-shift. It belongs to the family of local mode-seeking algorithms and is applied to the 5D space consisting of color information and image location.
Applied to the image colored by the CAE, the segmentation algorithm manages to delineate the largest lake well, which the same algorithm fails to do when applied to the image from the PCA-type visualization (see Fig. 9). In other areas, the results of the segmentation applied to the two visualization schemes differ, as illustrated in Fig. 10.
In parallel to the segmentation, we also applied a K-means clustering on the CAE embedding space, in 10 classes. The result of this clustering is illustrated in Figure 11. The typical intensity profiles obtained are shown in Figure 12. They illustrate the diversity of the typical profiles thus obtained. This clustering manages to separate behaviors where the relative dynamic variations are very similar but the radiometric mean levels differ, as for clusters 3 and 5, but it also separates profiles whose mean levels are very similar while presenting phenomenological differences in the associated dynamics, as for clusters 8 and 9.
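The K-means step can be sketched as follows (a minimal NumPy implementation on a toy two-group embedding space; the actual run used 10 classes on the real CAE embeddings):

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Minimal K-means, standing in for the clustering on the CAE embeddings.

    X : (N, D) array of embedding vectors. Centroids are initialized with a
    simple farthest-point heuristic, then refined by Lloyd iterations.
    Returns the (N,) label array and the (k, D) centroids.
    """
    centroids = [X[0]]
    for _ in range(k - 1):             # farthest-point initialization
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(iters):             # Lloyd iterations
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Toy 3-D embedding space with two well-separated groups of pixels.
rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0.0, 0.1, size=(50, 3)),
               rng.normal(3.0, 0.1, size=(50, 3))])
labels, _ = kmeans(X, k=2)             # recovers the two groups
```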

CONCLUSION
This paper proposed to apply deep autoencoding of Sentinel-1 temporal radiometric profiles as a basis for compressing the data to dimension 3, so that it can also be used for a colored representation. The autoencoding approach was compared to a PCA-based information compression. The colored compositions have been applied to two mountain sites: a glacier in the Karakoram having undergone a surge event, and the Guil watershed in the Alps. At this stage, only a qualitative analysis has been carried out. Both visualization schemes highlight the diversity of temporal radiometric profiles encountered on natural areas.
This analysis reveals that the obtained behaviors differ greatly depending on the size of the input data stack.
The main conclusion is that deep autoencoding can bring relevant added value compared to the PCA when the size of the stacks is sufficient, typically about 40 dates. The CAE seems to obtain better contrasts and better segmented areas corresponding to different generic profiles, which tends to confirm our previous results obtained on agricultural plots. The richness and complexity of the observed temporal profiles justify the use of deep learning algorithms to extract all the information as well as possible, and not necessarily in a linear way.
On shorter series, such as 10 dates, the CAE struggles to find three degrees of freedom. Therefore, a CAE could be used to establish a criterion to determine the number of degrees of freedom necessary to describe the diversity of temporal profiles. For example, the computation of correlations between the different components of the autoencoder output could be a valuable a posteriori tool to determine the optimal size of the data compression space.
The main difficulty in further validating the method is the lack of ground truth. In order to confirm these very preliminary results, in particular the ability of the CAE to segment vegetated areas from those covered by snow, we plan to define a scenario dedicated to this study, supported by data different from Sentinel-1.
Finally, it should be noted that our processing approach here is purely temporal and does not take into account any spatial criterion. This has two advantages: first, the complexity of the algorithm remains limited. Secondly, we are sure not to lose resolution, and the efficiency of the temporal autoencoder to segment spatial contours with precision has already been demonstrated on agricultural areas. But of course, it will be interesting in the future to consider spatiotemporal algorithms for a better exploitation of the data or a better morphological management of the areas of interest.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2022, XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France