PHOTOGRAMMETRIC MONITORING OF GRAVITATIONAL MASS MOVEMENTS IN ALPINE REGIONS BY MARKERLESS 3D MOTION CAPTURE

Gravitational mass movements represent a significant hazard potential in Alpine regions. Due to climate change and an associated increases in extreme weather events, this risk is growing. For a better predictability of such events and the monitoring of affected areas, a precise determination of the ongoing movements is necessary. In this paper, a method for monitoring of Alpine slope movements based on image sequences is presented. By means of SIFT features, corresponding key points in the images of different epochs are found. Then, using forward section, the object coordinates of the points are computed. By using these coordinates and the detected correspondences, three-dimensional motion vectors can be determined. The calculated vectors are checked for significance based on their accuracy. The vectors found have a high spatial density compared to manually marked points and are detected automatically. In order to detect even small-scale movements, they are determined with an accuracy of a few millimeters. The data basis is a sequence of images of an active landslide on the Hochvogel mountain (Alps, Germany) which were taken in 2018 and 2021. On average, the calculated motion vectors show a movement of 75 mm.


INTRODUCTION
A significant hazard potential in Alpine regions are gravitational mass movements such as landslides or rockfalls. In particular, due to the increase of extreme weather events such as heavy precipitation expected as a result of climate change, it is likely that such happenings may occur more frequently (Clague and Stead, 2012). The serious consequences of such events can be seen in the massive rockfall of Piz Cengalo, which resulted not only in extensive material damage but also in fatalities (Mergili et al., 2020). For a better prediction of the occurrence and impact of such events, continuous monitoring of potential hazard areas is necessary.
In order to observe the movement of slopes and crevices, representative points are often located in the affected area to determine the movement of the entire domain. These points are often equipped with sensors such as extensometers (Malet et al., 2002, Leinauer et al., 2020, interferometry (Delacourt et al., 2007), time domain reflectometry (Thuro et al., 2014) or GNNS sensors (Squarzoni et al., 2005). Alternatively, the points are signaled with targets, which can then be measured in with sensors like total stations (Raffl and Wunderlich, 2020). Instead of observing single points, laser scanning systems (Kasperski et al., 2010, Pesci et al., 2011, Mayr et al., 2019 or photogrammetric approaches based on Structure from Motion (SfM) and Multi-View Stereo (MVS) (Eltner et al., 2016, Warrick et al., 2016, Kromer et al., 2019 offer the possibility to observe areal changes so that larger areas can be surveyed efficiently. While comparing point clouds of different epochs, the correspondences between the points are unknown. To determine these, the minimum distances between the points of the two point * Corresponding author clouds are often used. The distance can be defined in different ways (Holst et al., 2017b). One possibility is the use of the point-plane distances (Pfeiffer et al., 2018). However, these are often sensitive to outliers and measurement noise. A common alternative is the use of Multiscale Model-to-Model Cloud (M3C2) distances (Dinkel et al., 2020, Holst et al., 2017a. Here, the distance is calculated on the basis of key points and their local normals. The key points result in a smoothing of the point cloud, which suppresses noise. The disadvantage is that these distances can only detect the movements along the local normal. Changes in other directions cannot be detected. An interesting method to solve this problem can be found at (Holst et al., 2021). To compare two epochs of point clouds, they are first converted into a 2.5D elevation model. For these rasters, KAZE feature descriptors are calculated, from which the correspondences between the epochs are derived.
An extended method is presented in this paper. The aim is the determination of 3D motion vectors from image sequences for the observation of Alpine slope movements. The vectors to be determined should meet the following requirements: • High spatial resolution compared to manually signed targets points • Automatically detectable • Sensitive for the detection of movements in the range of several millimeters to centimeters • Statistically tested for significance.
The underlying concept in the presented method is to perform a feature-based co-registration between epochs for unique key points within the images. We test and evaluate our method on the basis of terrestrial image sequences of a the Hochvogel mountain (Alps, Germany). The summit is characterized by a large crevice which divides the summit into a stable and an unstable part (Figure 1). The unstable part is in danger of collapse and there is a risk that it will fall into the valley.

METHOD
In this section, the individual steps of our procedure are described in detail. An overview of the workflow is given in Figure 2. Different epochs of overlapping image sequences are used as input data. In addition to these, ground control points (GCP) are given which allow a transformation of the data into a global coordinate system. For the three-dimensional reconstruction of the study area, the images of each epoch are registered using bundle adjustment. For this purpose, key points and their correspondences are calculated using SIFT (Scale-Invariant Feature Transform) algorithm. This results in a sparse point cloud where for every point the associated SIFT descriptors are known. Additionally, the bundle adjustment delivers the internal and external camera parameters. In a next step dense matching can be performed to obtain a more detailed point cloud of the study area. These data form the basis for the change analysis. However, before this can be carried out, the data from the different epochs must be co-registered. Two different strategies can be used for this.
One possibility is based on the already mentioned M3C2 distances (Lague et al., 2013). These are calculated between the dense point clouds of the individual epochs. Correspondence and changes are determined as follows: In a first step, the point cloud is reduced to so-called core points, which leads to a smoothing of the point cloud and a reduction of noise. In a second step, a local normal vector is determined for each core point based on the points within a defined radius around the core point. Finally, the points are projected along their normals into the other point cloud. The distances between the core points and the projected points are used as a measure of change. A more detailed description of the M3C2 distances and their use in the context of rock slope monitoring can be found in (Dinkel et al., 2020).
The second strategy uses a feature-based matching approach for the co-registration between the epochs. The point correspondences are not determined on the basis of the point clouds, instead they are computed between the images of the epochs with help of SIFT descriptors. This enables the tracking of individual points between the epochs, for which 3D motion vectors can consequently be derived. This method is in the main focus of this work. The necessary steps, especially the co-registration and the calculation of the motion vectors will be explained in more detail below.

Feature-based co-registration
The aim is to calculate point correspondences between the images of an epoch I and a second epoch J. For the detection and description of these key points, we use SIFT feature descriptors (Lowe, 1999). Beside the descriptors, the object coordinates (sparse point cloud), as well as the positions and orientations of the cameras are known from the 3D reconstruction step. One possibility for the matching would be to compare the key points and descriptors from epoch I with those of the J. This would assume that the key points that are best suitable for bundle adjustment are also best suitable for co-registration. However, this is not necessarily the case, so we use a different strategy. Each key point of epoch I is projected into the images of epoch J.
Since we assume a movement between the epochs, the projected points are only approximately at the correct position in the images of J. To find the correct position, we select those key points which lie within a certain radius around the projected points. The size of the radius is defined by the maximum expected motion. The key point at which the SIFT descriptor best matches the descriptor from epoch I is assumed to be the corresponding point. To determine the similarity between these descriptors, we use the Euclidean distance. To be more robust against mismatches and to only select unique key points, we apply a ratio test between the closest and the second closest matches. According to (Lowe, 2004), we discard all matches where the shortest distance is greater than 80% of the second smallest distance. Following those steps we get for each point of epoch I a list of image coordinates from epoch J. From these we can compute the object coordinates in the epoch J using a forward section (Section 2.2).

Determination of object coordinates using forward section
To determine the 3D coordinates of a point from a set of image coordinates, we use a forward section. According to (Förstner and Wrobel, 2016), the object coordinates X, Y, Z of a point can be calculated from the corresponding image coordinates x and y , as long as it was observed in at least two images. The relationship of image coordinates and object points is described by the collinearity equation. Additionally, the internal and external orientation parameters of the respective camera must be known. In this work these are already known by the bundle adjustment.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France To estimate X, Y, Z, we use a least-squares optimization based on the Gauss-Markov model. A detailed description of the used optimization method, as well as all necessary equations are described in (Niemeier, 2008) or (Luhmann et al., 2019), so that we provide here only a brief overview.
The observation vector contains the image coordinates of the images in which the point was observed. The unknown vector contains the object coordinates. For the stochastic model of the observations, we assume that the observations are equally accurate and uncorrelated. For the apriori variance factor σ0 we assume an accuracy of one pixel for all images. The Jacobian matrix contains the derivatives of the observation equation with respect to the unknowns, in this case the derivative of the collinearity equation according to the objects coordinates. For the optimization, approximate values for the unknown X, Y, Z are needed. These can be calculated by using the Direct Linear Transformation (DLT) and the coordinates of one image pair (Förstner and Wrobel, 2016).
We want to perform the estimation of the object coordinates separately for several points. To make sure that the orientation parameters remain the same for all points in a first step the stochastic model only contains the uncertainties of the image measurements. For a realistic statement about the uncertainties of the calculated points, however, it is necessary to consider these uncertainties of the orientation parameters. For this purpose, we introduce the parameters of the external camera parameters as fictive observations. This expands the observation vector and unknown vector by the coordinates of the projection center and the orientation angle of each camera, as well as the stochastic model of the observation and the Jacobian matrix. In contrast to the previous step, we use the posterior variance factor for the stochastic model of the image coordinates. The final uncertainties of the object coordinates result from the co variance matrix of the estimated unknowns.
We assume that we are using a calibrated camera. This means that the internal camera parameters have been previously determined in the laboratory, for example. If this is not the case, the optimization can easily be extended by these parameters.

Calculation of the motion vectors between the epochs
Let the vectors p = [pX , pY , pZ ] T and q = [qX , qY , qZ ] T be the object coordinates of the same point seen in the epochs I and J. The searched motion vector d is calculated from the difference of these points and the length D is given by the norm: The uncertainty σD can be calculated from variance propagation taking into account the uncertainty of the object coordinates. To check whether the calculated differences D are significant, we perform a Student's test (Niemeier, 2008). We check whether the calculated differences D are significantly different from zero in relation to the calculated uncertainty.

EXPERIMENTS: EXPECTED ROCK SLOPE FAILURE HOCHVOGEL
To evaluate our method we use two different image sequences of the summit of the Hochvogel mountain. The mountain is located on the border between Austria and Germany and has an elevation of 2592 m. The summit is crossed by several crevices with a width of two to six meters and a measurable depth of up to 60 m. The deeper parts of the crevice are filled with loose rock, so that only the first 10 meters are visible. The crevice divides the summit into a stable and an unstable part. The unstable part has a movement rate of about 2 cm per year. The expected rock slope failure has an estimated volume of possibly 260,000 m 3 (Leinauer et al., 2020).
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France The data used are parts of larger data sets obtained during photogrammetric measurement campaigns in 2018 and 2021. In this work we use 30 images for each epoch showing the unstable part of the summit. All image were captured by using a Sony Alpha 7RII camera with 42 megapixels and 25 mm focal length. To capture the images of the inside of the crevice and of inaccessible areas, the camera was mounted on a stick with a length of about 7 m. The camera was moved parallel to the crevice and images were taken approximately every 0.5 m, so that an overlap of approx. 80 % between the images was achieved.
For the camera and for each camera position, the parameters of the external orientation are determined by means of bundle adjustment using the Pix4D c software. For each calculated value, the corresponding accuracy is indicated. Using the same software we also compute a sparse and dense point cloud according to Figure 2. Since this is a commercial software, not all necessary partial results can be exported. For the featurebased matching between the epochs, we recalculate the SIFT descriptors of the sparse point cloud using OpenCV (Bradski, 2000).
The images used in this work were captured in such a way, that they contain parts of the stable and unstable area of the summit (Figure 3). To transform the image position into a global coordinate system, temporary targets were installed on the stable side of the summit. These are used as (GCP) during the bundle adjustment. The coordinates from GCPs were measured by using a GNSS receiver and a total station with an accuracy of 4 mm.

RESULTS AND DISCUSSION
In the following, we first discuss the results of the correspondence determination between the epochs. Then, the results of the computed motion vectors are shown. Finally we compare the motion vector with the M3C2 distances.

Correspondence recognition between the epochs
In Figure 4, the results of the correspondence recognition are shown. Since we are only interested in the movement of the unstable part, the points in the foreground and background were manually cut out. The yellow dots in Figure 4a show all key points in an image from 2021. The points are distributed over the entire side wall of the crevice. However, it can be seen that smooth rock areas have significantly more points than areas covered with debris or snow. Moreover, highly reflective surfaces barely show points. A total of 13000 potential key points were found for the image of 2021. Of these, only 1500 points were recognized in the images of 2018. In Figure 4a-b, these points are marked in red. This low number can mainly be explained by the strict testing of the uniqueness of the point, where many of the points were rejected during the matching. Furthermore, some points cannot be detected due to changes caused by erosion or snowfall.

Resulting motion vectors
The motion vectors can be calculated from the object coordinates of the points and the known point correspondences. In order to obtain a good redundancy for the calculated points and, thus, a low uncertainty, we consider only the points which could be observed in at least 10 images in each epoch. The distribution of the points shows that most of the points are distributed around this value. Assuming a movement rate of approx. 20 mm per year (s. Section 3), the result is in line with the expected motion during three years. However, some outliers with significantly higher movements can also be recognized. From a geological point of view, it can be assumed that the sidewall moves constantly in one direction. Therefore, it can be assumed that there are still a few outliers in the data.  All determined movements are tested for significance. The average accuracy of the computed motions used for testing is approx. 8 mm. The minimum and maximum accuracy achieved is about 4 mm and 19 mm. At a confidence interval of 95 %, about 8 % of the computed motions were assigned to be nonsignificant and therefore rejected.
The 3D motion vectors are shown in Figure 5 and Figure 6b.
In red are the vectors identified as significant and in blue are the rejected vectors. For illustration purposes, a point cloud of the rock surface from 2021 is also shown. When looking at the vectors, it can be seen that the motion have a constant length and direction. This is especially true for the vectors belonging to solid and large rock structures. Vectors belonging to loose boulders or debris-covered areas deviate from this direction. As also shown in the histogram (Figure 7), a few outliers can be recognized.

Comparison of M3C2 distances and motion vector
In this section we compare the results of the motion vectors with the results of the M3C2 distances. To that aim, we calculate the M3C2 distances between the dense point cloud from 2018 and 2021 (Section 2). To compare the methods we take these core points of the M3C2 distances, which have the closest distance to the points of the motion vectors. For a subset of the points the results can be seen in Figure 6a. The corresponding motion vectors are shown in Figure 6b. A clear difference can be observed in the orientation of the vectors. The majority of the vectors of the M3C2 distances point in a similar direction, but overall the result is significantly more inhomogeneous than the calculated motion vectors. Furthermore, it is noticeable that the points on the upper side of the sidewall point downwards due to their local normal direction. As expected, these points are not suitable to directly reflect the main direction of motion of the slope.
Differences also arise if you consider the length of the vectors. For this purpose, the differences of the calculated displacements for each point are given in Figure 8. It can be seen that, with the exception of some outliers, the M3C2 distances are always smaller than the length of the calculated motion vectors. In the median they deviate from each other by −0.01 m. This result was expected, considering that the M3C2 distances can only reveal the differences in the direction of their local normals. All in all, it can be stated that the calculated motion vectors seem to be better suited to describe directly the amplitude as well as the direction of the motion.

CONCLUSION AND OUTLOOK
This paper presents a method for the calculation of 3D motion vectors for the monitoring of rock slope failure from image sequences. The key points between two epochs were computed automatically, as well as their associations. By means of forward section, the object coordinates in both epochs could be calculated and, thus, the motion of the points could be determined. Finally, the length of the calculated vectors were statistically analyzed for their significance. The described method is suitable to detect movements of a few centimeters. However, there are still some outliers in the data. Furthermore, the detection of key points is strongly related to the characteristics of the surface. Only a few key points are detected for loose blocks and debris-covered areas. Future works should address this problem.