ABOUT PHOTOGRAMMETRIC UAV-MAPPING: WHICH ACCURACY FOR WHICH APPLICATION?

ABSTRACT: UAV surveys have become more and more popular over the last few years, driven by manufacturers and software suppliers who promise high accuracy at low cost. But what are the real possibilities offered by this kind of sensor? In this article, we investigate in detail the possibilities offered by photogrammetric UAV mapping solutions through numerous practical experiments and compare them to a reference high-grade LiDAR-photogrammetric acquisition. The paper first compares the aerial triangulation and dense matching accuracy of different data acquisition units (two types of camera) and processing software (one open-source and two proprietary packages). Finally, the opportunities offered by these different approaches are studied in detail on standard aerial applications such as power line detection and forest and urban area mapping, in comparison with our reference dataset.


INTRODUCTION
For half a decade now, UAV surveying has become a must-have for mapping purposes, promising high accuracy at low cost (Colomina and Molina, 2014). Beyond the attractive message of some manufacturers and software suppliers, what are the real possibilities offered by this kind of sensor?
Through numerous practical experiments, we try to draw a realistic portrait of what can be expected from photogrammetric UAV mapping solutions. The paper focuses on comparing the performance of different data acquisition units (two types of camera) and processing software (one open-source and two proprietary packages). For each configuration, aerial triangulation and dense matching accuracy are benchmarked against a high-grade LiDAR-photogrammetric acquisition. Finally, we study the opportunities offered by these different approaches on standard aerial applications such as power line, forest or urban area mapping.

Photogrammetric UAV equipment
Our choice fell on the Wingtra VTOL drone, flown with both a high-end and a consumer-grade digital camera. The Wingtra aircraft offers an endurance of about 45-55 min of flight (Figure 1).
The first camera is a full-frame Sony RX1RII with a 35 mm Zeiss lens. Image shots can be accurately synchronised via the X-sync flash signal, which is mandatory to use the available RTK/PPK L1/L2 GNSS option. The CMOS sensor offers 42 Mpix with a good dynamic range, even in low light.
The second camera is a consumer-grade Sony QX1 with a 20 mm focal length and a 20 Mpix sensor. No real synchronisation with the GNSS receiver of the drone is possible, because no output signal (such as flash X-sync) is available. The trigger time is recorded and serves as the synchronisation time to geotag the pictures. Table 1 shows the parameters of both cameras.
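Since the QX1 provides no hardware sync output, its geotags can only come from interpolating the drone's GNSS trajectory at the recorded trigger times. A minimal sketch of that interpolation, assuming a linear position model between trajectory epochs (the function name and toy trajectory are ours, not from the paper):

```python
import numpy as np

def geotag_by_trigger_time(trigger_times, traj_times, traj_xyz):
    """Interpolate a GNSS trajectory at camera trigger timestamps.

    trigger_times : (n,) trigger timestamps [s]
    traj_times    : (m,) sorted trajectory timestamps [s]
    traj_xyz      : (m, 3) trajectory positions (e.g. E, N, h in metres)
    Returns an (n, 3) array of interpolated camera positions.
    """
    return np.column_stack([
        np.interp(trigger_times, traj_times, traj_xyz[:, k]) for k in range(3)
    ])

# Toy trajectory: straight line flown at 10 m/s, logged at 5 Hz.
t = np.arange(0.0, 10.0, 0.2)
xyz = np.column_stack([10.0 * t, np.zeros_like(t), np.full_like(t, 120.0)])
pos = geotag_by_trigger_time(np.array([1.3, 4.7]), t, xyz)
```

Note that any latency between the logged trigger time and the actual exposure translates directly into a position error proportional to the flight speed, which is why the RX1RII's hardware X-sync matters.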

Reference equipment
The reference equipment is composed of high-grade airborne sensors integrated into a common assembly: a Riegl LiDAR VQ480U, a navigation-grade IXBLUE AIRINS IMU (one of the highest-end FOG gyro IMUs on the market), and an 80 Mpix PhaseOne IXAR-180 camera with a 42 mm Rodenstock lens. All sensor times are synchronised through a dual-frequency Javad Delta GNSS receiver (Figure 2).

Experimental site
In order to have a variety of surfaces to study, we chose a corridor of 2 km by 200 m width, composed of cultivated fields, deciduous and coniferous forest, an industrial area, railways, roads and a high-voltage (380 kV) power line. A set of well-distributed control and check points was placed and measured by static GNSS survey to ensure an accuracy better than 2 cm in X, Y and Z. Additional check points were deployed on some surfaces and measured by short-baseline RTK GNSS. Figure 3 shows the location of the areas of interest and the distribution of control and check points.

Data acquisition
The area was flown in December 2017, with 2 days between the LiDAR flight (helicopter) and the photogrammetric flights (UAV).
To compare all sensors over the most comparable terrain possible, it was mandatory to operate them within a very short period of time. All flights were conducted within 2 days, and no particular changes in terrain, nor in weather conditions, were observed within those days, so the datasets are fully comparable. The flight parameters for each sensor are summarized in Table 2.

COMPARISON METHOD
The purpose of this paper is first to compare two photogrammetric sensors (see Section 2.1) against a reference LiDAR sensor. Then, three software packages commonly used by the mapping community are compared on the same criteria: an open-source solution, MicMac (Galland et al., 2016), and two well-known proprietary packages, Agisoft Metashape (Jaud et al., 2016) and Pix4D Mapper (Cucci et al., 2017).
The comparison methodology has been defined according to the following parameters:
• Aerial triangulation performance: a comparison of the bundle block adjustment was performed between the three packages. As their mathematical camera models differ, internal and external camera parameters are not easy to compare, since the correlation between those parameters can be high without external direct georeferencing input. Nevertheless, tie point quality as well as GCP and check point residuals can be analysed. The method uses BINGO-F, a reference bundle block adjustment software (Kruck et al., 1996). For each tested package, we exported the raw image coordinates of tie points and GCPs, computed the block adjustment in BINGO-F, and compared the respective statistics: image-space residuals (σ0) and GCP residuals. This gives an idea of tie point quality via σ0 and of its influence on EO/IO parameters. The tie point matching was performed in all packages with full-resolution images.
• Point cloud matching and noise analysis, DTM filtering and DTM accuracy: most of the time, accuracy figures rely on GCP or check point residuals, which is far from the real mapping accuracy. The real mapping accuracy can only be observed at the final stage, on the point cloud, because point cloud accuracy reflects all possible incoming error sources: image quality, resolution, quality and distribution of ground control points, lens geometry and surface contrast. Mapping accuracy is therefore checked on the point cloud, against independent check points and reference point clouds. By independent check points, we mean points that have not been measured manually in the images. Moreover, noise is one of the components of point cloud accuracy, reflecting the ability of the point cloud to describe surface details. Here we focus on noise analysis for smooth surfaces such as paved areas or roofs. In photogrammetric terms, this is an important component, since it qualifies the performance of the camera on low-contrast surfaces such as asphalt, snow or sand. We compare the deviation between a reference planar surface and the point clouds.
• Edge and sharp element rendering: density is not the only aspect of point cloud quality. Is photogrammetric dense correlation able to provide the same sharpness as LiDAR? We focused on details such as roof edges (sharpness) and rail detection to answer that question. Electric wire and pylon detection also requires fine edge rendering, and since UAV manufacturers seem to recommend this technology for corridor mapping or power line mapping (vegetation clearance), it was interesting to compare each technology for wire and aerial object detection.
• Vegetation mapping: despite its lack of penetration into vegetation, photogrammetry is widely used to obtain canopy models. We therefore focus here on the accuracy of this type of product, in comparison with conventional LiDAR products.
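The "independent check point" comparison described in the second bullet can be sketched as follows: for each surveyed check point, gather the dense-cloud points within a small horizontal radius and compare their mean height to the surveyed height. This is our illustrative implementation, not the exact procedure of the paper (the radius and toy data are assumptions):

```python
import numpy as np

def cloud_dz_at_checkpoints(cloud, checkpoints, radius=0.25):
    """Vertical deviation of a dense cloud at independent check points.

    cloud       : (n, 3) point cloud (X, Y, Z)
    checkpoints : (k, 3) surveyed check points
    radius      : horizontal search radius [m]
    Returns dz[i] = mean cloud Z within radius of point i, minus its
    surveyed Z (NaN when no cloud point falls inside the radius).
    """
    dz = np.full(len(checkpoints), np.nan)
    for i, (x, y, z) in enumerate(checkpoints):
        d2 = (cloud[:, 0] - x) ** 2 + (cloud[:, 1] - y) ** 2
        near = cloud[d2 <= radius ** 2, 2]
        if near.size:
            dz[i] = near.mean() - z
    return dz

# Toy cloud: a flat surface reconstructed 2 cm too high over a 2 m patch.
rng = np.random.default_rng(0)
cloud = np.column_stack([rng.uniform(0, 2, 500), rng.uniform(0, 2, 500),
                         np.full(500, 100.02)])
dz = cloud_dz_at_checkpoints(cloud, np.array([[1.0, 1.0, 100.00]]))
```

Because the check point never enters the image measurements, this deviation captures the full error chain (orientation, matching, filtering), unlike GCP residuals.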

RESULTS
As presented in Section 3, we describe here the results obtained on the different datasets, in terms of (1) camera radiometric performance, (2) aerial triangulation performance, (3) point cloud matching, (4) edge and sharp element rendering, and (5) vegetation mapping.

Radiometric comparison
In order to explain the performance differences seen later in aerial triangulation (Section 4.2) and in the dense matching step (Section 4.3), it is necessary to look at the raw sensor performance, and more specifically at the radiometric performance. Indeed, since the whole photogrammetric process is based on pixel analysis, the sensor's capacity to accurately record the local shape of objects is very important. This capacity is strongly affected by the radiometric dynamic range and by the spatial bandwidth known as the Modulation Transfer Function (MTF), especially on homogeneous image areas (e.g., roads or building roofs).
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2020, 2020 XXIV ISPRS Congress (2020 edition)
Figure 3: Experimental site near Aclens (Switzerland). The area presents a varied scene with construction, fields, forest and infrastructure (road, rail, power lines).
With its higher dynamic range, the Sony RX1 sensor is able to measure smaller radiometric variations than the Sony QX1. In Figure 6, one can notice that road areas (Profiles A and B) look very homogeneous when recorded with the Sony QX1, but show more intensity variation in the Sony RX1 data. The same behaviour can be seen on the roof profile (Profile C), where the metal slat borders are much more visible (with more contrast) in the Sony RX1 data than in the Sony QX1 data. Table 3 shows the statistics of the bundle adjustment performed by BINGO-F for each set of tie points from the tested packages. As no export to the BINGO format was available for MicMac, we tested only Agisoft and Pix4D.

Tie point quality
The most significant differences are in the standard deviation of the image-space residuals, called σ0, and in the estimated precision of the tie points (derived from the cofactor matrix). We can notice that σ0 is larger for Pix4D than for Agisoft on high-quality imagery, which indicates that tie point matching is more accurate in Agisoft. Nevertheless, on the QX1, where image quality is poor, the difference is not significant, which tends to show that poor image quality fades the tie point quality difference between Pix4D and Agisoft. The analysis of GCPs does not show a significant difference in residuals. This can be explained by the fact that the GCPs are measured manually.

Ground residuals
As explained in (Höhle and Höhle, 2009), accuracy assessment of digital elevation models is not straightforward. The authors suggest that the median is a good indicator, so we decided to study this indicator on our dataset. Since outliers had been deleted by hand (mainly power line points in the "under power line" area), we also computed standard deviations, as suggested in the same paper.
Thus, in order to qualify the georeferencing performance of the aerial triangulation step, 11 control points and 8 check points well distributed over the area were used (Figure 3). Control points are introduced into the compensation process and allow us to assess the ability of the bundle adjustment to fit those data, while check points allow us to assess the accuracy independently (mainly to detect overfitting behaviour). For each set of points (control and check points), we computed the Root Mean Square Error (RMSE), as detailed in Equation 1, for the planimetric component (XY) and the vertical one (Z). Results are synthesized in Table 4.
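Equation 1 is not reproduced here, but the planimetric and vertical RMSE referred to are the standard definitions, which can be sketched as (array names are ours):

```python
import numpy as np

def rmse_xy_z(estimated, surveyed):
    """Planimetric and vertical RMSE over a set of control/check points.

    estimated, surveyed : (n, 3) coordinate arrays.
    RMSE_XY = sqrt(mean(dx^2 + dy^2)); RMSE_Z = sqrt(mean(dz^2)).
    """
    d = estimated - surveyed
    rmse_xy = float(np.sqrt(np.mean(d[:, 0] ** 2 + d[:, 1] ** 2)))
    rmse_z = float(np.sqrt(np.mean(d[:, 2] ** 2)))
    return rmse_xy, rmse_z

# Two hypothetical check points with 3-4-0 cm and 0-0-2 cm residuals.
est = np.array([[0.03, 0.04, 0.00], [0.00, 0.00, 0.02]])
ref = np.zeros((2, 3))
rxy, rz = rmse_xy_z(est, ref)
```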
First, one can notice that control and check point RMSE (both in XY and in Z) are better by nearly a factor of 2 with the Sony RX1 than with the Sony QX1, for all packages, even though the Ground Sampling Distance (GSD) varies only by a factor of 1.4 (1.9 cm for the RX1 and 2.6 cm for the QX1). This shows the importance of sensor quality, as seen in Section 4.1. The only exception is Pix4D, where the check point RMSE is better with the QX1 than with the RX1. This can be explained by the fact that this package had more difficulty aligning images in the eastern part of the area (the forest part), resulting in poor georeferencing and large measured differences.
When comparing packages, Pix4D always shows the best RMSE on control points, showing a good ability to fit those points, whereas Agisoft has the worst results (almost by a factor of two). MicMac seems to lie between the other two, closer to Pix4D with the RX1 camera and closer to Agisoft with the QX1 camera. But when comparing results on check points, the ranking between the three packages is completely reversed. On the latter, Agisoft is the only package to show similar RMSE on control and check points, while MicMac and Pix4D show large variations between control and check point RMSE. Both seem to overfit the control points, and fail to generalize beyond them.

Point cloud matching and noise analysis
The previous comparisons dealt with aerial triangulation results on specific points (control and check points) which were carefully measured in the images by an operator. In this section, we study the ability of each package to correctly map the area in 3D. This ability depends on the aerial triangulation performance, but also on the dense point matching between images and on the regularisation algorithm.
In order to perform this comparison, we selected 6 areas (noise analysis zones in Figure 3) over various surface types and configurations. Three asphalt areas were selected: an easy one near the centre of the flight strip (later referred to as Asphalt), a more challenging one near the border of the area (Edge area), and an even more challenging one under a power line (Under power line). Moreover, two field areas and a gravel one were also chosen to compare dense matching performance. For each area, a mean plane was determined from the LiDAR reference dataset. Then, the median and standard deviation of the distances from the 3D dense cloud to this average plane were computed for each camera/package combination. All results are summarized in Table 5. A more detailed statistical analysis using box plots was performed on 4 specific areas: asphalt, asphalt under power line, gravel and Field-A (Figure 7), together with a detailed view of the point cloud coloured by distance to the mean plane on the asphalt and field areas (Figure 4).
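The per-area statistics described above reduce to fitting a mean plane to the LiDAR reference and taking the median and standard deviation of the signed cloud-to-plane distances. A minimal sketch, using an SVD-based least-squares plane fit (our choice; the paper does not specify the fitting method):

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through points: returns (centroid, unit normal)."""
    c = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - c)  # last right singular vector = normal
    return c, vt[-1]

def cloud_to_plane_stats(cloud, ref_points):
    """Median and std of signed distances from a dense cloud to the mean
    plane fitted on the LiDAR reference points."""
    c, n = fit_plane(ref_points)
    d = (cloud - c) @ n
    return float(np.median(d)), float(d.std())

# Toy data: flat reference patch; photogrammetric cloud offset 1 cm above it.
rng = np.random.default_rng(42)
ref = np.column_stack([rng.random(200), rng.random(200), np.zeros(200)])
cloud = ref + np.array([0.0, 0.0, 0.01])
med, sd = cloud_to_plane_stats(cloud, ref)
```

The median then captures a systematic offset (accuracy), while the standard deviation captures the matching noise (precision), which is exactly the split used in Table 5.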
First, as expected, the median and standard deviation of the distances on the Asphalt area are better than on the Edge area, which illustrates the influence of image quality at the edge of each frame. We therefore recommend planning a sufficiently wide margin at the edges of the area to avoid such problems, e.g. by adding an additional strip centred on this edge. On these two areas, the impact of the camera is not obvious on the median distance, as the aerial triangulation step allows a good compensation of the measurements in both cases. However, we can observe that the standard deviation values are on average higher by a factor of two with the QX1 camera than with the RX1 camera, regardless of the package used.

(Figure 4 panels: Asphalt area; Field-A area)

When looking at more contrasted areas, such as field and gravel, this difference is no longer relevant. Thus, a better camera sensor (such as the Sony RX1) yields a more precise point cloud (with a lower noise level) on homogeneous surfaces (e.g., asphalt), whereas a lower-quality sensor (Sony QX1) gives similar results on heterogeneous surfaces (e.g., field or gravel).
Comparing packages in terms of accuracy (given by the median) and precision (given by the standard deviation) is not an easy task, since no general behaviour can be observed. Even so, one can notice that, in most cases, Pix4D seems to give a more accurate point cloud (lower median value) but with a higher standard deviation, whereas Agisoft has slightly lower accuracy but higher precision (lower standard deviation), as seen in Table 5. Moreover, the box plots in Figure 7 reveal much greater differences in noise levels with Pix4D than with Agisoft. Distances from points to the mean plane differ by a few centimetres in the Agisoft point cloud, which is comparable to the noise level obtained with LiDAR, whereas they vary by several tens of centimetres with Pix4D on the asphalt area. On that area, Figure 4 shows that Agisoft is the only package allowing a fine study of road surfaces (e.g., for detecting potholes). This behaviour is all the more noticeable on challenging areas, such as under the power line, where distances from points to the mean plane take values higher than one metre. In contrast, the point cloud noise levels obtained with the three tested packages on more heterogeneous areas, such as the field areas, are comparable: Figure 7, on the Field-A area, shows no significant difference, and the 3D point clouds show similar patterns (Figure 4).
However, it is important to keep in mind that this behaviour can also be influenced by the regularization method used, in particular by the chosen regularization parameter. Indeed, with a lower regularization value, more small objects such as electric wires are reconstructed (see Section 4.4), at the cost of higher noise on this type of surface. MicMac seems to suffer from aerial triangulation issues with the Sony QX1: its median values are always higher than those of Agisoft and Pix4D. Since fine-tuning the parameters is not easy with this package, better results might have been achieved with greater familiarity with it. With the other camera (Sony RX1), however, MicMac gives results similar to those of Agisoft and Pix4D.

Edge and sharp element rendering
While accurate and precise planar surface reconstruction is critical when producing a Digital Terrain Model (DTM), edge and sharp element rendering is even more important when object reconstruction and detection are required (e.g., buildings, rail tracks or electric pylons). To perform this comparison, we computed cross sections over several areas of interest and plotted each photogrammetric dense cloud together with the LiDAR reference data. These cross sections are shown in Figure 9.
The first major observation concerns the quality of the reconstruction depending on the camera. Indeed, one can notice that, for every package, the Sony RX1 provides a much more accurate and detailed 3D model of buildings and rail tracks, revealing small objects that are not well visible with the Sony QX1. Both the better radiometric data and the slightly smaller ground pixel size improve the level of detail of the 3D reconstruction. For a given sensor, one can also notice slightly different results from one package to another. With both sensors, Agisoft seems to describe the building edges best, preserving object sharpness. However, on the rail tracks with the lower-quality sensor (Sony QX1), Pix4D seems to be the only one to preserve object sharpness, whereas Agisoft heavily smoothed the surface and the MicMac surface is very noisy.
The cross section through an electric pylon (Figure 9, bottom) gives a good overview of the different behaviours of the three packages. Each package gives very similar results with both cameras, but the packages differ markedly from one another. In this case, Pix4D gives the best results: the ground looks smooth (at this scale) and it succeeded in reconstructing almost half of the pylon structure and wires. By comparison, MicMac also gives a smooth ground reconstruction but produces very few points on the pylon structure (only at the bottom part). Agisoft failed to reconstruct the pylon and, moreover, produces many noisy points on the ground near the tower foot.
Thus, no photogrammetric solution succeeded in reconstructing the whole pylon structure and electric wires. Regarding the pylon structure, the photogrammetric process cannot be considered irrelevant: the whole pylon structure, insulators and wire attachment points are clearly visible in the images (Figure 5), so that a manual reconstruction or automatic object detection (using machine learning algorithms) may give promising results. However, it seems much more difficult to obtain similar results on wires because of their very homogeneous linear structure. The GSD is probably also a source of the partial failure in reconstructing wire structures, because a lack of spatial resolution implies radiometric aliasing: if the size of an object is smaller than 3-5 pixels, aliasing destroys the details that would make reliable point matching possible.
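The 3-5 pixel rule of thumb above can be turned into a quick feasibility check for a given flight plan (the helper function is ours, purely illustrative):

```python
def min_resolvable_size(gsd_m, min_pixels=3):
    """Smallest object size [m] expected to survive aliasing, using the
    3-5 pixel rule of thumb (here with its optimistic lower bound)."""
    return min_pixels * gsd_m

# The two flights of this study: RX1 at 1.9 cm GSD, QX1 at 2.6 cm GSD.
for gsd in (0.019, 0.026):
    limit_cm = min_resolvable_size(gsd) * 100
    print(f"GSD {gsd * 100:.1f} cm -> objects under ~{limit_cm:.1f} cm alias")
```

With these GSDs, anything thinner than roughly 6-8 cm, such as a power line conductor, falls below the matchable size, consistent with the partial wire reconstructions observed here.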

Vegetation mapping
Unlike photogrammetry, LiDAR makes it possible to map the ground even under dense vegetation cover. The idea here is not to demonstrate this property again, but to focus on modelling the top of the trees, which is sometimes cited as one of the uses of photogrammetry. Indeed, in many countries LiDAR-based Digital Terrain Models are now available, and it would be interesting to use photogrammetric data to obtain a 3D model of the canopy at a lower price and with a higher temporal frequency. Figure 8 shows cross sections at three locations with different vegetation types: (1) a low vegetation (10 m high) area, (2) a forest border area, and (3) an area with a road in the middle of a high forest. The first observation is that MicMac did not succeed in reconstructing any of these areas with the parameters we used; since this package is not easy to handle, fine-tuned parameters might have produced better results. Agisoft and Pix4D obtained similar results on these areas.
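The workflow suggested above, combining an existing LiDAR bare-earth model with a photogrammetric surface model, reduces to a raster difference on a common grid. A minimal sketch (grid values are illustrative, not from the dataset):

```python
import numpy as np

def canopy_height_model(dsm, dtm):
    """Canopy height = photogrammetric DSM minus LiDAR bare-earth DTM.

    dsm, dtm : 2-D elevation arrays on a common raster grid [m].
    Negative heights (matching noise below ground) are clipped to 0.
    """
    return np.clip(dsm - dtm, 0.0, None)

dsm = np.array([[405.0, 412.5], [404.2, 403.9]])  # photogrammetric surface
dtm = np.array([[404.0, 404.0], [404.0, 404.0]])  # LiDAR bare earth
chm = canopy_height_model(dsm, dtm)
```

The 0.5-1.6 m canopy deviations reported below would propagate directly into such a height model, which is worth keeping in mind when the product feeds, e.g., vegetation clearance analyses.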
In the low vegetation area, both Agisoft and Pix4D successfully reconstructed the canopy shape. Compared to the LiDAR reference, only the summit of the left tree was not reconstructed, resulting in a difference of about 1 m in the vegetation height estimates. Pix4D is the closest to the reference data, with a 0.6 m difference with the RX1 camera, while Agisoft gives a difference of 1.2 m. With the lower-quality QX1 camera, results are very similar for both packages (1.3 m). In the forest edge area, both packages managed to reconstruct the tree trunk at the field border, but both produced a poor 3D model of the top of this tree. Pix4D is slightly better here, but still differs considerably from the LiDAR reference data. Regarding the third cross section, Agisoft and Pix4D were able to detect the canopy shape as well as the road, in spite of the high trees around it. However, one can notice large deviations from the LiDAR reference: from 0.5 m with Pix4D to 1.1 m with Agisoft with the RX1 camera, whereas both packages give a 1.6 m difference with the QX1 camera.

CONCLUSION
In this paper, the acquisitions made with two different sensors and processed with three photogrammetry packages have been compared on various criteria. The first conclusion is that a full-frame sensor (such as the Sony RX1RII used here) is a real advantage, providing more accurate data and better sharpness in the 3D reconstruction at a similar ground sampling distance. Indeed, the better radiometric performance of the camera allows better tie point matching, and thus a more accurate bundle adjustment, as well as better dense matching, providing a more detailed 3D point cloud.
Regarding the software comparison, none of the packages really stands out. MicMac gives promising results on the bundle adjustment, but fails to deal with the forest part of the area, resulting in large errors on check points and non-existent point clouds there. Agisoft and Pix4D Mapper perform better on such forest areas: Agisoft finds enough tie points to obtain an accurate bundle adjustment, with the best results on the check point comparison, while Pix4D gives more 3D points, with less deviation from the LiDAR reference on the forest canopy. However, the latter gives a very noisy 3D model on homogeneous areas (such as asphalt and gravel), whereas Agisoft gives a smoother 3D model on such areas.
Finally, this study makes it possible to specify the pros and cons of photogrammetric surveys using UAVs, showing that there are many reasonable applications for photogrammetric data, while others need LiDAR data in addition to images.

(Figure 8 panels: Low vegetation; Forest edge; Road through high forest)