INVESTIGATIONS INTO THE ACCURACY OF THE UAV SYSTEM DJI MATRICE 300 RTK WITH THE SENSORS ZENMUSE P1 AND L1 IN THE HAMBURG TEST FIELD

The development of increasingly powerful Unmanned Aerial Vehicles (UAV) is progressing continuously, so that these systems, equipped with high-resolution sensors, can be used for a wide variety of applications. With the Matrice 300 RTK, Da-Jiang Innovations Science and Technology Co. Ltd (DJI) has launched a system that can carry, among other sensors, the high-resolution camera Zenmuse P1 or the laser scanner Zenmuse L1 as a recording sensor. In order to investigate the geometric quality of these two sensors, HafenCity University Hamburg, in cooperation with LGV Hamburg, NLWKN in Norden and the German Archaeological Institute in Bonn, flew over the 3D test field in the Inselpark in Hamburg-Wilhelmsburg on 5 August 2021 with the P1 camera and the L1 laser scanner. Using the Matrice 300 RTK as carrier platform, the test field was recorded in various configurations at altitudes between 50 m and 90 m above ground. Prior to the UAV flight campaign, 44 marked ground control points (GCP) were signalised in the test field; these had already been surveyed by LGV in 2020 using geodetic measurement methods to achieve a coordinate accuracy of ±5 mm for each GCP. In this paper, the results of aerial triangulations as well as 3D point clouds generated from image data and laser scanning are compared with reference data in order to demonstrate the accuracy potential of these measurement systems.


INTRODUCTION
Unmanned aerial vehicles (UAVs) are increasingly used in various disciplines for flexible surveys of small to medium-sized areas. Equipping UAV systems with Real-Time Kinematic (RTK) GNSS increases their attractiveness for many tasks, as these sensors offer a positioning accuracy of 2-3 cm in the national coordinate system (Gerke and Przybilla, 2016; Przybilla et al., 2020; Kersten and Lindstaedt, 2022). As a consequence, the number of control points can be reduced significantly, which makes RTK-GNSS-based platforms more flexible and efficient for many applications. In recent years, UAV systems with RTK-GNSS have increasingly established themselves as workhorses and are now state of the art in UAV photogrammetry. With the DJI Matrice 300 RTK, a system is now available that has high positioning accuracy and can be equipped with a high-resolution camera or laser scanner, among other sensors. This makes it possible to record a wide variety of objects such as urban scenes, coastal zones, agricultural areas or forest areas. Results on the geometric quality of aerial triangulations for different UAV-based camera systems have already been published (Przybilla et al., 2019; Kersten et al., 2020). Gerke and Przybilla (2016) presented first results on the influence of onboard RTK-GNSS and cross-flights for a UAV system, while Przybilla et al. (2020) published first results of RTK-based UAV photogrammetry using four DJI Phantom 4 RTK systems flown in cross-flights at different altitudes over the UAV test field at the Zollern colliery in Dortmund. Further accuracy tests have been carried out by Zhao et al. (2020) and Zhao (2021).
In order to investigate the geometric accuracy potential of the two sensors P1 and L1 on board the UAV system Matrice 300 RTK, HafenCity University Hamburg, in cooperation with the State Office for Geoinformation and Surveying (LGV) Hamburg, the Lower Saxony State Office for Water Management, Coastal and Nature Conservation (NLWKN) in Norden, Germany, and the German Archaeological Institute (DAI) in Bonn, carried out flights over the 3D test field in the Inselpark of Hamburg-Wilhelmsburg on August 5th, 2021. The UAV flights were conducted in various flight configurations and at flight altitudes between 50 m and 90 m above ground. For the accuracy investigations, the image orientations and camera calibrations of the different UAV image flights were calculated by aerial triangulation using the software Agisoft Metashape. The accuracies of the aerial triangulations were analysed using different ground control and check point configurations. The accuracy potential of the laser scanner was analysed using geodetic check points and reference data (profiles and selected areas) from a terrestrial laser scanner. Additionally, the laser point clouds were compared with image-based point clouds of the P1 and with official airborne laser scanning data provided by LGV. The following questions, among others, are answered:
• What accuracies (aerial triangulation and terrain models) are achieved by the UAV flights of the two recording systems in these investigations?
• Which aerial flight configurations provide the best results compared to the reference?
• Is it possible to reduce the number of GCP for projects with correspondingly lower accuracy requirements when accurate RTK-GNSS observations are used for UAV flights?

THE UAV TEST FIELD IN WILHELMSBURG INSELPARK
In the Inselpark in Hamburg's Wilhelmsburg district, which hosted the International Garden Show in 2013, the LGV Hamburg set up a test field for UAV systems consisting of 45 ground control points (GCP) on an area of 150 m × 300 m. The GCP coordinates were determined using various geodetic measurement methods and the heights were additionally determined by levelling. The LGV specifies a coordinate accuracy of ±5 mm for each GCP coordinate. As can be seen in Figure 1, the GCP are evenly distributed over this approximately 4.5 ha area of the Inselpark. Prior to the survey on August 5th, 2021, 44 GCP were signalised on grass, asphalt and sand using target boards made of waterproof plastic with dimensions of 50 cm × 50 cm (Fig. 1, right).

Figure 1:
Ground control point distribution in the UAV test field Inselpark Hamburg-Wilhelmsburg (left) and targets on different surfaces (right): grass, asphalt, sand and stone.

THE UAV SYSTEM USED
The DJI Matrice 300 RTK (Figure 2) is a 6.3 kg quadcopter from the Chinese manufacturer DJI Technology, which can be operated at altitudes of up to 5000 m with a maximum flight time of 55 minutes. The UAV is equipped with the Automatic Dependent Surveillance-Broadcast (ADS-B) anti-collision system and can achieve a positioning accuracy of 1.0-1.5 cm + 1 ppm using RTK-GNSS. In contrast to many comparable systems, the M300 RTK does not have a fixed sensor. Instead, the platform can be equipped with various sensors such as the camera DJI Zenmuse P1 or the (airborne) laser scanner DJI Zenmuse L1. The M300 RTK is powered by two TB60 batteries. For longer missions, the batteries can be replaced one after the other after landing, without disconnecting the sensor system from the power supply.
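An accuracy figure of the form "constant + 1 ppm" is conventionally read as a constant part plus a part that grows with the distance to the RTK reference station (1 ppm corresponds to 1 mm per km). The following is a minimal sketch of that arithmetic under this conventional reading; the function name and the 10 km baseline are illustrative assumptions and are not taken from the study.

```python
def rtk_accuracy_m(constant_m, ppm, baseline_m):
    """Nominal RTK accuracy: constant part plus a baseline-dependent part.

    ppm is given in parts per million of the baseline length
    (1 ppm = 1 mm per km of distance to the reference station).
    """
    return constant_m + ppm * 1e-6 * baseline_m


# Assumed, illustrative baseline of 10 km to the reference station
for constant_m in (0.010, 0.015):
    total_m = rtk_accuracy_m(constant_m, 1.0, 10_000.0)
    print(f"{constant_m * 100:.1f} cm + 1 ppm at 10 km: {total_m * 100:.1f} cm")
```

For short baselines, as is typical when a network RTK service or a local base station is used, the ppm term contributes only a few millimetres.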

The DJI Zenmuse P1 Camera
The DJI Zenmuse P1 camera (Figure 3 left) is offered by DJI for the Matrice 300 RTK. This is a 45 megapixel (pixel size 4.4 μm) digital camera equipped with a full-frame (35.9 mm × 24 mm) CMOS sensor that can be operated with various lenses offered with different focal lengths. In the context of these investigations, a lens with a focal length of 35 mm was used, which has a field of view (FOV) of 63.5° and can take photos in an aperture range (F-Stops) of F2.8 to F16.

The DJI Zenmuse L1 Airborne Laser Scanner
In addition to the P1 camera, the Matrice 300 RTK can optionally be used with the DJI Zenmuse L1 airborne laser scanning sensor (Figure 3 right), which is the first laser scanner from DJI. This scanner, which is equipped with a LiDAR module from the manufacturer Livox, has a range of 450 m with a FOV of 70° (LIVOX 2022). In flight planning, a choice can be made between single-return and multiple-return mode. In addition, two different scanning modes are available, which DJI refers to as repetitive and non-repetitive (Fig. 4). They result in different point patterns for specific requirements or objects to be scanned and enable scanning of up to 240,000 points per second. Since the L1 sensor registers up to three returns per laser shot, the point rate can reach up to 480,000 points per second with two or three returns, for example when scanning vegetation (Singh, 2020). According to the manufacturer, the L1 sensor achieves a system accuracy of 10 cm horizontally and 5 cm vertically at a flying height of 50 m above ground. Unfortunately, it is not clear from the manufacturer's technical specification whether the system accuracy refers to positioning or to 3D point determination. The precision of the distance measurement (RMS 1σ) of the laser scanner is specified as 3 cm at a distance of 100 m (DJI 2022).

Aerial flight configurations
First, the test field was recorded in two image flights with the Matrice 300 RTK/Zenmuse P1 system (Table 1). These two flights took place at altitudes of 70 m and 90 m above ground. During the first flight, a combination of nadir and oblique images (backwards and sideways) was taken, while during the second flight at the higher altitude only nadir images were taken. For the first flight, this resulted in a Ground Sampling Distance (GSD) of 8.8 mm for the nadir images and 10.2 mm (image centre) for the oblique images taken at an angle of 60°, while the second, nadir-only flight had a GSD of 11.3 mm. For both flights, the exposure time was set to 1/1000 s, while the F-stop varied between 4 and 7.1 and the light sensitivity of the sensor between ISO 400 and 640 to obtain optimally exposed images. Subsequently, three flights over the test field were carried out with the Zenmuse L1 laser scanner (Table 2). The different scanning modes were compared and the influence of increasing the flight altitude from 50 m to 90 m was investigated. The strip overlap was set to 60% for all flights. In addition, multiple-return mode was used on all flights to investigate the ability of laser scanning to penetrate vegetation. After take-off, the laser scanner and the inertial measurement unit were calibrated in the air by a procedure implemented by the manufacturer before the actual data acquisition started. During the flight and scanning operation, the 3D point cloud was coloured in real time with the RGB values of the Zenmuse X4S camera (20 megapixels) integrated in the laser scanner and displayed on the DJI Enterprise smart remote control, which has an ultra-bright 5.5-inch 1080p display for controlling the UAV system during flight.
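The GSD values quoted for the P1 flights can be checked from the pixel size (4.4 μm), the focal length (35 mm) and the flying height. The sketch below uses the standard pinhole relation GSD = pixel size × distance / focal length. For the oblique images it assumes that the stated 60° is the camera pitch below the horizon (i.e. 30° off nadir) and that the distance to the image centre is the slant range over flat terrain; the paper does not state the angle convention, so this interpretation is an assumption that happens to reproduce the quoted 10.2 mm.

```python
import math

PIXEL_SIZE_M = 4.4e-6    # P1 pixel size given in the text (4.4 um)
FOCAL_LENGTH_M = 0.035   # 35 mm lens used in the investigations


def nadir_gsd_mm(height_m, pixel_size_m=PIXEL_SIZE_M, focal_length_m=FOCAL_LENGTH_M):
    """GSD in mm for a nadir image taken at the given height above ground."""
    return pixel_size_m * height_m / focal_length_m * 1000.0


def oblique_centre_gsd_mm(height_m, off_nadir_deg):
    """GSD in mm at the image centre of an oblique image over flat terrain.

    Assumption: the object distance is the slant range height / cos(off-nadir angle).
    """
    slant_range_m = height_m / math.cos(math.radians(off_nadir_deg))
    return nadir_gsd_mm(slant_range_m)


print(f"Flight 1, nadir, 70 m:   {nadir_gsd_mm(70.0):.1f} mm")
print(f"Flight 1, oblique, 70 m: {oblique_centre_gsd_mm(70.0, 30.0):.1f} mm")
print(f"Flight 2, nadir, 90 m:   {nadir_gsd_mm(90.0):.1f} mm")
```

The same relation can be inverted during flight planning to find the flying height required for a target GSD.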

DATA EVALUATION AND RESULTS
The recorded aerial image blocks were evaluated in the software Agisoft Metashape V1.7 using the 44 signalised GCP. The aerial triangulations of both image flight configurations were calculated with different GCP configurations in order to assess the quality of the results based on different variants, similar to Kersten et al. (2020). In Agisoft Metashape, the image point measurements were performed automatically and the GCP measurements semi-automatically. In the subsequent bundle block adjustments, the software calculated the image orientation and camera calibration parameters for each GCP version. In the next step, 3D point clouds were generated by dense image matching for the photo blocks of UAV flights 1 and 2 using the orientation parameters of the version with all 44 GCP. The data from the Zenmuse L1 laser scanner can (currently) only be processed with the DJI Terra software. The imported point clouds of the three flights were each optimised by strip adjustment and finally exported in LAS format in the UTM coordinate system (EPSG 4647) with ellipsoidal heights, just like the point clouds generated from the photos. The highest quality level was selected for the data processing. The quality of the 3D point clouds generated from the data of the five UAV flights was investigated using 44 check points (ChP) and by comparing different profiles and reference surfaces acquired with a FARO Focus 3D X330 terrestrial laser scanner. The reference data were scanned around the building visible in Figs. 8 and 10 in 29 scans (resolution 1/5 and quality 3x). When registering the scans in the FARO® SCENE software, an average point error of 4.2 mm was achieved. Comparable geometric accuracy investigations of image-based 3D point clouds have already been carried out for various UAV systems in the test field at the Zollern colliery in Dortmund (Przybilla et al., 2019).

Comparison of the Results of the Aerial Triangulation
For detailed accuracy investigations, different GCP versions with different numbers of spatially well-distributed ground control points (all GCP, 12, 5 and 1 GCP) were calculated in bundle block adjustments, whereby all GCP not used as control were treated as check points. In all bundle adjustments, the horizontal position coordinates of the exterior orientation showed an RMSE (Root Mean Square Error) in the range of 11 to 16 mm, while the deviations of the height coordinates were approx. 11 mm. The averaged standard deviations of the RTK-GNSS measurements for the image positions of both image flights were 15 mm in horizontal position and 29 mm in height. However, the individual values of the RTK-GNSS measurements per image position were introduced into the bundle adjustment as a priori standard deviations. The results for UAV image flight 1 with nadir and oblique images (2215 photos) are summarised in Figure 5. Each GCP was measured in 155 photos on average. The a priori standard deviation for each control point coordinate was set to 5 mm in each adjustment version. In the bundle adjustments without GCP and with a single control point, the deviations at the 44 and 43 check points, respectively, are 15 mm in X and 11 mm in Y, whereas the deviations in height Z are higher by a factor of 2.8, reaching up to 42 mm (right two columns in Fig. 5). Even in the adjustment with all GCP, the RMSE at the check points is 19 mm in the height coordinate, while the average deviations in X and Y are 10 mm and 5 mm, respectively. The fewer GCP are used in the adjustment, the higher the RMSE values in the height coordinate become. Given the very high redundancy provided by the observations in 2215 aerial images, a significantly better result had been expected; such a result was only achieved with image flight 2 (Fig. 6).
Causes for the large height deviations in the GCP and ChP could be the geometry of the flight configuration, the narrow FOV of the lens as well as the recording procedure with the pivoting of the camera on the lever arm (gimbal) and the associated change in focusing for oblique images compared to nadir images, which thus also influences the camera calibration. DJI defines the vector of the lever arm from the GNSS antenna centre to the projection centre of the camera, which should have only minor correction effects on the result. The reprojection error, a geometric error corresponding to the distance in the image between a projected and a measured image point, was 0.4 pixels for image flight 1 and 0.3 pixels for image flight 2. The image point measurement accuracy of the signalised GCP was determined to be 0.2 pixels for both image blocks.
The results of UAV flight 2, which includes only 408 nadir images, are summarised in Figure 6. Each GCP was measured in 23 photos on average. As an a priori standard deviation, 5 mm was chosen for all three coordinates of the GCP in the adjustments, which corresponds to the accuracy achieved by the geodetic GCP determination. This assumption was confirmed by the adjustment using all GCP (Fig. 6 left column). Even with a decreasing number of control points, the deviations (RMSE) at the check points remain at 10 mm or better. It can also be seen that using only a single GCP stabilises the result of the adjustment in the position and height of the check points (right columns in Fig. 6). From this it is concluded that, despite the accurate RTK-GNSS measurements of the image positions during the flight, at least one GCP should be placed in the object area to achieve an acceptable aerial triangulation result, especially in height. The importance and influence of ground control points for aerial triangulation, especially for flights without RTK-GNSS, is shown by Lindstaedt and Kersten (2018) for various projects.
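The check-point deviations discussed in this section are RMSE values per coordinate axis. As a minimal sketch of how such values are obtained from adjusted and reference coordinates, the following uses the standard RMSE definition; it is not Metashape's internal implementation, and the coordinates are made-up values for illustration only.

```python
import math


def rmse_per_axis(estimated, reference):
    """RMSE in X, Y and Z between estimated and reference check-point coordinates."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sums = [0.0, 0.0, 0.0]
    for est, ref in zip(estimated, reference):
        for axis in range(3):
            squared_sums[axis] += (est[axis] - ref[axis]) ** 2
    return tuple(math.sqrt(s / len(estimated)) for s in squared_sums)


# Made-up local coordinates in metres, for illustration only
reference = [(10.000, 20.000, 5.000), (50.000, 80.000, 6.000)]
estimated = [(10.010, 19.995, 5.030), (49.990, 80.005, 5.970)]

rmse_x, rmse_y, rmse_z = rmse_per_axis(estimated, reference)
print(f"RMSE X/Y/Z: {rmse_x * 1000:.0f} / {rmse_y * 1000:.0f} / {rmse_z * 1000:.0f} mm")
```

Evaluating the RMSE only over points that were not used as control, as done in the GCP versions above, keeps the check independent of the adjustment.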

Point-based comparison
For point-by-point comparisons, the shortest distance (in vertical direction) between the check points (ChP) and the dense point cloud is calculated. Due to the high point density (see Tab. 4) and the flat target signs, it is assumed that the Z coordinate around the centre of the target sign is constant. The distribution of GCP for the study area is shown in Fig. 1. The dense point clouds were created in Metashape with the resolution "medium" from the image data of flights 1 and 2, while the point clouds of flights 3-5 were acquired directly by the laser scanner and processed in the DJI Terra software. The results show that the flight with the combination of nadir and oblique images has a systematic height offset of 39 mm, which is consistent with the aerial triangulation results, where the deviations (RMSE) at the check points are in the same range. This result is also documented in Figure 7 (left) by the red colouring of the check points. In contrast, only small local systematic effects are visible in Fig. 7 (right), which result in only small deviations at the check points. The smallest deviations at the check points were achieved with the nadir images (flight 2), where the deviations range from a maximum negative value of -29 mm to a maximum positive value of 13 mm, i.e. a span of 42 mm (Tab. 3). The three data sets of the laser scanner reach the same level of accuracy as each other, which differs only slightly from the good result of image flight 2.
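The point-based comparison can be sketched as follows. This is not the algorithm of the software used in the study: as a simplifying assumption, the cloud height at a check point is taken as the median Z of all cloud points within a small horizontal radius (justified by the flat target signs), and the radius of 10 cm is a hypothetical choice. The summary reproduces the quantities discussed above (maximum negative and positive deviation, span, mean and standard deviation).

```python
import statistics


def vertical_deviation(check_point, cloud, radius_m=0.10):
    """Cloud height minus check-point height, using cloud points near the target centre.

    Returns None if no cloud point lies within the horizontal radius.
    """
    x0, y0, z0 = check_point
    heights = [z for x, y, z in cloud
               if (x - x0) ** 2 + (y - y0) ** 2 <= radius_m ** 2]
    if not heights:
        return None
    return statistics.median(heights) - z0


def summarise(deviations):
    """Minimum, maximum, span, mean and standard deviation of the deviations."""
    values = [d for d in deviations if d is not None]
    return {
        "min": min(values),
        "max": max(values),
        "span": max(values) - min(values),
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
    }


# The extreme values reported for flight 2 (in mm) give the stated span
print(summarise([-29, 13])["span"])
```

For larger clouds, a spatial index (for example a k-d tree) would replace the linear search over all cloud points.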

Line-based comparison
In the line-by-line comparisons between profiles from the point clouds of the five UAV flights and the reference data, object areas with height differences were selected in the study area scanned with the terrestrial scanner (Fig. 8), such as stairs (profiles 1-3) and a house façade with roof structure (profile 4). The quality of the point clouds was analysed visually using profiles 2 (stairs) and 4 (house wall) as examples (Fig. 9). In the visual comparison between the generated profiles and the reference data of the terrestrial scanner, the measurement noise in the point clouds of the L1 laser scanner can be seen on the one hand and the quite good reproduction of the stairs in the point clouds of the UAV image flights on the other hand (Fig. 9 left).
Profiles 1 and 3 show results very similar to those of profile 2. As expected, the point cloud of image flight 1 showed a very good fit to the house wall below the roof overhang in profile 4 due to the oblique images (Fig. 9 right), while the other point clouds are smoothed in the area of the roof overhang. Profile 4 in particular demonstrates the advantage of oblique images when vertical structures are to be measured in dense point clouds. For the comparison of the profiles, airborne laser scanning data from 2020 were also used, which were acquired on behalf of the LGV Hamburg using a RIEGL VQ-780II laser scanner, resulting in a point spacing of approx. 10 cm. In this data set, the stairs are also slightly smoothed, but due to the small number of points and presumably good filtering including smoothing, measurement noise is not obviously visible.

Area-based comparisons
For the areal comparisons with the available reference data, the different point clouds from the five UAV flights in three selected test areas were analysed. The test areas for the areal 3D comparisons are shown in Fig. 10. The selected areas represent surfaces with varying surface structures: Area 1 (paving stones, concrete and sand), Area 2 (smooth paving stones) and Area 3 (wood, sand and lawn). For test areas 1 and 2, point clouds from terrestrial laser scanning with the FARO Focus 3D X330 are available as reference data (Figs. 11 and 12), while for area 3, comparisons were only made between the point cloud from image flight 2 (nadir images), as the best data set of the image-based point clouds, and the three different point clouds of laser scanning (Fig. 13). In addition, a comparison was also made with the airborne laser scanning data from the Riegl scanner (Fig. 14).

Figure 10.
Overview of test areas in the Wilhelmsburg Inselpark (outlined in red): Area 1 (paving stones, concrete and sand), Area 2 (smooth paving stones) and Area 3 (wood, sand and lawn).
Tables 4 and 5 summarise the deviations (in Z) between the 3D point clouds of all five flights and the TLS reference data for areas 1 and 2, which were calculated in CloudCompare, as were the previous comparisons.
The following results can be summarised:
• Flight 4 with the laser scanner L1 has the lowest number of points per m² due to its flight altitude of 90 m above ground and, together with flight 5, the highest maximum deviations or the largest span (the sum of the absolute values of the maximum negative and positive deviations).
• Flight 2 with the Zenmuse P1 camera has the best results in terms of maximum deviation, span, average deviation and standard deviation. However, the number of points per m² for both areas is lower than for the other flights, also due to the flight altitude. Only flight 4 with laser scanner L1 flown at 90 m above ground has a lower number of points per m².
• The differences between the two laser scanner flights 3 and 5 are very small, so that one can conclude from these results that the two scan modes, repetitive and non-repetitive, make no difference in the available data sets.
• The image-based 3D point clouds of flights 1 and 2 provide better results than the point clouds of the flights with the laser scanner. With the combination of nadir and oblique images and the significantly higher number of photos, the highest point density per m² is achieved.
• Especially in area 2 with the smooth paving stones, the image-based point clouds achieve significantly better results than those of the laser scanner.
• With standard deviations of 5 mm to 40 mm from the reference, good results were achieved for the different generated point clouds (P1 and L1) in the point-by-point and area-by-area comparisons.
Figs. 11-14 visualise the colour-coded deviations of the 3D comparisons calculated in CloudCompare between the test data set of the respective 3D point cloud and the reference or comparative data. The colour-coded scale shows the deviations in the range of ±2.5 cm in green, while the positive maximum of +25 cm is shown in red and the negative minimum of -25 cm in blue. The colour-coded visualisation of the deviations makes it easier to recognise systematic effects in the result.
In the left-hand graphs of Figures 11 and 12, systematic deviations (yellow colouring) from the TLS reference data can be seen in the point cloud generated from the photos of flight 1 for test areas 1 and 2. In contrast, for the point clouds of flight 2, as already visible in profile 2 (Fig. 9 left), deviations can only be seen at the edges of the stairs. The deviations at the edges of the stairs are somewhat more pronounced in the point cloud of flight 3 with the laser scanner (see centre in Fig. 11 right). In the surface of the test area, the differences from the reference data are somewhat larger, whereby effects from the strip adjustment are probably also visible here. Fig. 12 shows an example of the measurement noise of the sensor for flight 5 (L1) with a slight systematic effect in height (yellow colouring).
Since no reference data were available for test area 3, comparisons were only made between the point cloud of flight 2 (nadir images), as the best data set of the image-based point clouds, and the three different point clouds from the L1 laser scanner (Fig. 13). The colour representation of the deviations between the point cloud of flight 2 and the laser scanner point clouds also shows slight systematic effects in height (yellow colouring in Fig. 13 left, reddish colouring in the left part of Fig. 13 centre and blue colouring in the left part of Fig. 13 right). Overall, the height differences between the point clouds are within the specified accuracy range of the Zenmuse L1 sensor (see section "The DJI Zenmuse L1 Airborne Laser Scanner"). For a visual comparison of the UAV-based point clouds, point clouds acquired by airborne laser scanning (ALS) with the RIEGL VQ-780II laser scanner could also be used. The data were provided by LGV Hamburg from an ALS survey in March 2020. These ALS data cannot serve as a reference due to the low point density of 23 points per m² and the presumably poorer height accuracy, but they reveal systematic effects in the UAV-based point clouds. Fig. 14 visualises the results of the 3D comparisons. Here it is again clear that the point cloud of flight 1 is systematically too high overall, while the point clouds of flight 2 and of the flights with the L1 fit together surprisingly well. The remaining differences are due, among other things, to the different recording dates, the vegetation growth and the different accuracy ranges.
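The colour scale used in Figs. 11-14 can be expressed as a simple classification of signed height deviations. The sketch below is illustrative only: it encodes the three values stated in the text (green within ±2.5 cm, saturation at +25 cm in red and at -25 cm in blue), while the labels for the intermediate range are hypothetical because the exact colour ramp between these values is not specified.

```python
GREEN_LIMIT_M = 0.025   # deviations within +/-2.5 cm are shown in green
SCALE_LIMIT_M = 0.25    # scale saturates at +25 cm (red) and -25 cm (blue)


def classify_deviation(dz_m):
    """Return a coarse colour label for a signed height deviation in metres."""
    if abs(dz_m) <= GREEN_LIMIT_M:
        return "green"
    if dz_m >= SCALE_LIMIT_M:
        return "red (saturated)"
    if dz_m <= -SCALE_LIMIT_M:
        return "blue (saturated)"
    # Intermediate colours are not specified in the text; labels are hypothetical
    return "towards red" if dz_m > 0 else "towards blue"


def share_within_green(deviations_m):
    """Fraction of deviations that fall inside the green band."""
    inside = sum(1 for d in deviations_m if abs(d) <= GREEN_LIMIT_M)
    return inside / len(deviations_m)


print(classify_deviation(0.039))                 # e.g. a 39 mm height offset
print(share_within_green([0.0, 0.02, 0.05, -0.3]))
```

The share of points inside the green band is one simple way to turn the visual impression of such a comparison into a number.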

CONCLUSION AND OUTLOOK
Figure 11.
Comparison of point clouds to TLS (reference) for UAV flights 1, 2 and 3 on test area 1.

This paper summarises the first results of the accuracy investigations of the UAV system DJI Matrice 300 RTK with the sensors Zenmuse P1 and L1 in the Hamburg test field Inselpark. Flight planning and control were very easy using the DJI Pilot app, which is very user-friendly and allows automated flights. Compared to the DJI Phantom 4 Pro, the flight time is twice as long due to the two batteries on the aircraft platform. A system shutdown is not necessary when changing the batteries because both batteries can be changed one after the other. Since the power supply remains switched on, the parameters of the interior orientation of the camera presumably also remain stable.
The results of the aerial triangulations show that for UAV projects with somewhat lower accuracy requirements at check points (XYZ = 3-5 cm), e.g. topographic applications, it is possible to compute the bundle block adjustment even without GCP coordinates, since the standard deviations of the exterior orientation parameters XYZ can nowadays reach 1-2 cm in XY and 2-3 cm in height Z with RTK-GNSS measurements. For reasons of reliability, at least one but preferably five GCP should be used, placed at the corners and in the centre of the object space. For the results of aerial triangulation, an accuracy of one GSD was expected, but this was only achieved in flight 2 when the photo block was oriented using at least five GCP. The aerial triangulation of the nadir images (flight 2) achieved significantly better overall results at the check points than flight 1 with the combination of nadir and oblique images, where the height component showed deviations of up to 42 mm across the bundle block adjustments. This combination of images during the flight (nadir, backwards, sideways) provides very good coverage of the terrain surface, but the jerky movements of the camera and the ongoing refocusing of the lens due to the changing shooting perspectives probably lead to an unstable camera geometry. However, this assumption still has to be verified with the image data by splitting the image configuration of flight 1 into three blocks (nadir images, oblique images backwards and oblique images sideways) so that three separate camera calibrations can be calculated. The examinations of the 3D point clouds showed a clear result: flight 2 with nadir images produced the best results in comparison with the other flights, while the image data of flight 1 showed a systematic height shift at the check points, in the profiles and also in the area-by-area comparison with reference data, which had not been expected.
The three point clouds of the Zenmuse L1 laser scanner showed very similar results, which are even slightly better than the accuracy specifications of the manufacturer. No significant difference in the quality of the point clouds could be found between the two scanning modes in the present study. Investigations into the performance of the laser scanner for the detection of vegetation such as trees and bushes have not yet been carried out with these data sets.