EVALUATION OF 3D UAS FLIGHT PATH PLANNING ALGORITHMS

The application of image-based methods in inspections and monitoring has increased significantly over recent years. This is especially the case for the inspection of large structures that are not easily accessible for human inspectors. Here, unmanned aircraft systems (UAS) can support by generating high-quality images that contain valuable information about the structure's condition. To guarantee high quality and completeness of the acquired data, inspection missions are planned in advance by computing a flight path for the UAS that covers the entire structure with the required quality. Many approaches exist that aim to solve this planning task. Nevertheless, each publication on this matter mostly stands on its own, working with its own criteria and without comparison to other approaches. Therefore, it is currently not possible to compare different approaches and select the most suitable one for a specific scenario. To solve this problem, this work proposes an evaluation pipeline that applies well-defined quality criteria to flight paths for close-range image-based inspections. These criteria are limited to fundamental aspects, so that paths created for diverse scenarios with diverse criteria can still be evaluated on common ground. As experiments show, this pipeline allows the comparison of different approaches, objectifying the performance and working towards a common understanding of the current state of the art. Finally, the Bauhaus Path Planning Challenge is presented, inviting submissions to a comparison based on this pipeline to collaborate on an objective ranking, available under https://uni-weimar.de/pathplanning .


INTRODUCTION
The application of unmanned aircraft systems (UAS) in various scientific and industrial settings has increased significantly in recent years (Jeong et al., 2020), mainly due to increased performance, reliability, and availability (Grubesic and Nelson, 2020). Equipped with modern sensors, such as cameras or LiDAR scanners, UAS are used especially in monitoring and inspection tasks, for example in forestry (Yrttimaa et al., 2020), agriculture (Santos et al., 2020), environmental sciences (Tmušić et al., 2020), and civil engineering. In civil engineering, many applications in the maintenance of infrastructure have been developed that improve existing processes and allow for new value generation by leveraging this new technology. An overview is given in (Brilakis and Haas, 2019). The usefulness of UAS was shown, for example, for the inspection of pipelines in combination with virtual reality (VR) tools (Liu et al., 2019), residential building roofs (Silveira et al., 2021), or infrastructure inspection (Morgenthal et al., 2019).
While different sensors are used on unmanned aerial vehicles (UAV), this work focuses on RGB cameras, as they are an accessible and widely used payload. Nevertheless, most considerations also apply to other sensors, respecting their configurations and possibilities. To process the generated data, different procedures can be employed. Apart from manually viewing generated images to support human analysis, a very common and powerful tool is the Structure-from-Motion (SfM) pipeline (Schönberger and Frahm, 2016), used to compute 3D reconstructions from the images. This allows for analyses of the geometry of structures, changes to the geometry, and measurements of different quantities. Images and 3D models can further be evaluated using modern Deep Learning (DL) algorithms that are able to analyze images faster and more accurately than humans. They can detect different defects on the structures, for example cracks (Valença and Júlio, 2018, Benz et al., 2019).
Each deployment of UAS has the objective of answering specific questions about a structure that require certain data with adequate quality and resolution. To be able to produce this data, strong constraints have to be imposed on the acquisition of the images and especially on the flight path of the UAV. To achieve a certain resolution, a specific constant distance between the object and the camera has to be maintained. To fully cover a structure with the images, it is necessary to take images from specific viewpoints, that is, specific positions of the UAV with specific orientations of the camera that take the scene geometry into account. To allow for an accurate and stable 3D reconstruction using SfM, the orientation of the images has to be considered, such that there is sufficient overlap and a suitable relative orientation of adjacent images.
With these requirements, it becomes clear that the data acquisition has to be planned carefully, as without a suitable flight path the questions relevant for the UAS deployment cannot be answered.
Two general approaches for designing suitable flight paths are commonly used: either preparing the route in advance based on available information, for example existing 3D models, aerial images, or construction plans, or creating the route "on the fly". The latter requires either a skilled pilot with good spatial awareness to cover the entire structure without gaps or very powerful autonomous controls that solve the planning in real time while considering the geometry and the task-specific requirements. For large structures like bridges and very strict quality requirements, human pilots are generally not able to steer the UAS on a path that achieves the desired results, especially when the geometry is complex and images from a very close distance are required. While autonomous control promises to remedy this, no such solutions exist to the authors' knowledge, and autonomous flight is not allowed in many countries, especially not around critical infrastructure. Further, methods that only operate on real-time data cannot work towards a global optimum, as the entire scene is only known after a successful mission. Truly optimal routes can therefore only be computed based on complete scene knowledge. For this reason, this work focuses on pre-planned flight paths that have been computed based on available rough 3D models, either from the building's design stage or from previous inspections. This means that the resulting evaluation is mostly applicable to this specific setting of complete information and is not transferable to planning in unseen, unknown environments that need to be explored.
Many researchers and practitioners have devised strategies to compute such flight paths, often with special requirements and limitations that are specific to their use case. This work introduces an open path planning challenge in Section 2, inviting contributions to establish a common state of the art, founded on the evaluation metrics described in Section 4. To give an overview of existing approaches, Section 3 organizes previous works based on their considered constraints and requirements, while highlighting the difficulty of comparing them based on available information. This section further introduces works that assess the quality of the data created from UAS images and some foundations for the analyses proposed in this work. In Section 4, a benchmarking system and evaluation pipeline are introduced that use fundamental criteria, which can be applied to all flight path generation approaches and build a common ground for analysis and ranking. In Section 5, the proposed pipeline is applied to a test scene and two flight paths to evaluate the performance and suitability of the chosen criteria, highlighting the benefit of computing localized quality measures. Finally, Section 6 summarizes the results of this work and reinforces the call for participation in the path planning challenge.

BAUHAUS UAS PATH PLANNING CHALLENGE
Several researchers, for example (Zhang et al., 2020) or (Roberts et al., 2017), have mentioned in their works that no established method exists to compare the results of different flight path planning approaches. This is confirmed by the literature study in Section 3, which shows that existing analysis procedures only evaluate their own contribution based on a specific scenario and only sometimes compare it to different approaches, using expensive simulations in specific settings, for example in (Roberts et al., 2017). Furthermore, these comparisons set the authors' own contribution, a carefully crafted method, against very simple and often superficial implementations of other ideas that will not perform at the same level.
The Bauhaus UAS Path Planning Challenge for close-range image-based inspection invites contributions from researchers and practitioners for the evaluation, comparison, and ranking of flight paths for specific inspection tasks, inspired by common benchmarks like the Middlebury Stereo Vision Benchmark (Scharstein et al., 2002). These tasks are defined in the challenge through the resolution and accuracy requirements as described in this work for specific 3D scenes. Submissions will be evaluated automatically using the benchmark described in Section 4 and ranked by the final score S from Equation 12.
The scenes used in the challenge cover a variety of different scenarios to test the submissions in different settings. The first scenario is the bridge pier presented in this work, shown in Figure 3a, as a very simple geometry but with strict resolution and accuracy requirements. The second scene is a model of a school building with its surroundings, shown in Figure 1a, obtained via SfM from UAS images. It has a more complex geometry but less strict requirements, since the scenario is not about submillimeter effects but larger phenomena. The third model is a synthetic scene of a house modeled to have special geometrical features like slanted and curved surfaces, different levels, and an underpass, shown in Figure 1b. This more challenging geometry tests the adaptability of the algorithms to strong geometric features.
To improve availability, reproducibility, transparency, and openness, all data and information for the challenge will be available on the corresponding website https://uni-weimar.de/pathplanning. This encompasses detailed descriptions of the scenarios, the 3D models of the scenes, the scores of previous submissions, and the source code used for the evaluation, so the results can be verified by everyone. To appropriately present all submissions, each will have a dedicated page where the authors can provide a detailed description and reference the corresponding publications, source codes, and additional materials if publicly available.
The goal of the challenge is to compare existing approaches to UAS flight path planning on the same scenes with the same quality parameters, so that a scientifically founded state of the art can be established. This benchmark also aims at providing important information for users when deciding which algorithm can be used for their specific setting. Finally, it is an invitation to start a conversation about how to measure the quality and success of a flight path, based on the measures proposed in this paper. While this work deliberately uses only very basic and fundamental aspects for the evaluation, it can be fruitful to include more complex aspects in the evaluation and to expand the scenes to a more diverse set with different requirements in the future.

EXISTING APPROACHES
With the rise of UAS applications for monitoring and inspection purposes, researchers as well as practitioners have identified and addressed the need for carefully selected viewpoints for a successful inspection in various ways. While this is the case for any measurement setup using images (Luhmann et al., 2019), mounting the camera on a moving UAS introduces new challenges and constraints, for example legal regulations imposed by many countries, for instance the European Union (EU, 2019). To address these constraints and requirements, many approaches have been developed in recent years to plan UAS flight paths that are suitable for the different use cases in inspection and monitoring, an overview of which was compiled in (Bolourian and Hammad, 2020). While most approaches are part of the same cosmos of applying UAS in the monitoring and inspection of structures, the underlying constraints and considerations differ widely between the different publications, as shown in (Almadhoun et al., 2019).
A number of contributions stay with classical aerial images from a constant height looking in nadir direction. In (Majeed and Lee, 2019), the authors plan inspections in urban settings, where they avoid obstacles and optimize the ground coverage with a minimal number of images using a sweep pattern. While the results for their use case are promising, the flight path cannot capture real 3D scenes, as vertical planes like facades cannot be sufficiently covered by nadir imagery. Other approaches such as (Peng and Isler, 2019), (Zhang et al., 2021) or (Bolourian and Hammad, 2020) do not limit the viewpoint positions to a simple plane above the scene with nadir views, but to other geometric primitives like spheres, 3D polygons, 3D boxes, or adaptive rectangles. Even though this allows for better adaption to the real scene geometry, it does not work well for more complex or concave geometries and does not give special consideration to geometric features like edges. Therefore, other approaches have been developed that take the complete geometry of the structure into account and compute complex free-form paths. In (Besada et al., 2018) a path planning procedure for tower-like structures is introduced that produces suitable routes for the inspection with thermal images by fitting multiple cylinder shapes with fixed viewpoint configurations into the scene, though photogrammetric constraints are not considered. Similarly, in (Zhang et al., 2020) a planning procedure based on full coverage is proposed that optimizes the precision of the reconstruction, by iteratively adding viewpoints to achieve complete coverage and then high precision. In (Koch et al., 2019) and (Roberts et al., 2017), two powerful methods are proposed that select optimal viewpoints on a grid of candidates, achieving full coverage and high quality while considering additional constraints like semantic annotations or visual quality. 
In those works, different heuristics for the selection of the viewpoints are applied that form a submodular reward function for the addition of more viewpoints. Other approaches use completely free placement of the viewpoints by computing viewpoint candidates for each triangle of a polygonal mesh of the object and reducing them to an optimal route, for example via clustering with regard to the viewing angle (Hoppe et al., 2012) or via greedy selection of the viewpoints that provide the most new coverage, until complete coverage is achieved (Debus and Rodehorst, 2020). The two approaches used in Section 5 have been proposed by (Morgenthal et al., 2019). They place fronto-parallel viewpoints around the structure, based on either intersecting the scene with parallel planes and using the intersections as viewpoint positions or sampling a grid of viewpoints around the structure, considering photogrammetric requirements for a good 3D reconstruction, such as image overlap, limited rotation, minimum basewidth, and constant resolution.
In addition to computing suitable flight paths, many researchers have analyzed the quality of the data that can be achieved using UAS. Generally, these analyses are based on performing the full analysis pipeline and comparing the results to a known ground truth. In (Saponaro et al., 2019), (Roberts et al., 2017) and (Cwiakala, 2019) the authors test reconstruction performance on real images from UAS missions, while in (Hoppe et al., 2012), (Peng and Isler, 2019) and (Koch et al., 2019) simulated images are used. In all these cases, a 3D reconstruction with SfM is computed and certain points are compared to their known true position.
The evaluation pipeline proposed in this work builds on established methods from the fields of computer vision and scientific computing. Algorithms for projective geometry and image analysis are described in (Förstner and Wrobel, 2016), (Luhmann et al., 2019) and (Heuel, 2004) and provide the computational foundations for this work. Implementations of these algorithms exist for many software tools such as CloudCompare (Girardeau-Montaut et al., 2005) and programming environments, for example in (Kovesi, 2020) for the Julia programming language (Bezanson et al., 2017). A modern technique in computing is automatic differentiation (AD), a method to automatically determine derivatives of computer programs. It is very commonly used in machine learning (Baydin et al., 2018), but also in many other fields of scientific computing (Rackauckas et al., 2020), and implemented in efficient software libraries (Revels et al., 2016).

PROPOSED BENCHMARK
Evaluating the performance of flight paths for photogrammetric 3D reconstruction is important for the application of UAS and for trust in the results. From a scientific perspective, comparability of ideas and implementations is important to establish current practices, a shared state of the art, and finally progress. As outlined in Section 3, the principal task in UAS flight path planning is the same across all existing contributions: computing efficient flight paths that cover the entire structure with constant resolution and accuracy and thereby enable answering specific questions about the structure, as defined in (Koch et al., 2019). However, the specific constraints can differ widely between contributions, for example semantic constraints on the scene, specific hardware configurations, or the restriction to classical aerial images. Finding a common measure that is able to usefully assess all different scenarios requires extracting general criteria that are expressive, conclusive, and concern only the core task. Therefore, the evaluation cannot include very complex measures, as they require non-fundamental considerations.
Two fundamental use cases can be defined for image-based inspection of structures: the detection of certain effects in the images and the reconstruction of a 3D model using SfM. From this, two criteria can be derived that characterize the quality of flight paths, together with a common optimization target, the length of the flight path or, respectively, the number of images. The required object resolution $d_{obj}$ is a measure for the smallest effect on the surface that is to be visible in the images. It directly defines the optimal distance between the camera and the object. In traditional aerial photogrammetry, the resolution is also known as ground sampling distance (GSD), whereas this work uses the more general term resolution. Computing a 3D reconstruction from images using SfM is not without errors, as measurements in images are not without errors and these errors can accumulate depending on the geometry of the image bundle. An important requirement for a successful inspection is therefore the admissible accuracy $e^*$, a measure for the admissible variance in the positions of the reconstructed 3D points. With these two requirements, resolution and accuracy, defined, the requirements for the flight paths are clear.
Another very important and commonly applied criterion is the complete coverage of the structure of interest. In the proposed benchmark however, this criterion is not used, as it is implicitly contained in the other two criteria. The structure has to be completely covered for the resolution to be achieved at all points and for all points to be reconstructed with the required accuracy. Further, these constraints do not make any additional assumptions, but focus only on photogrammetric and image analysis aspects. Accordingly, some aspects that can be relevant for specific settings are not considered, for example the influences of wind, motion blur, semantic constraints or flight properties and orientation limitations of the UAS. Especially the on-site lighting conditions and surface properties cannot be considered in an abstract benchmark. Therefore, these aspects have to be considered outside of this work during the mission to achieve optimal results. This makes the benchmark applicable regardless of the used UAV (fixed wing or multirotor), as long as it is able to reach the computed viewpoints with sufficient precision.
This work proposes methods to measure these criteria from a computed flight plan, a simple camera model with constant parameters for the entire mission, and a 3D model of the scene to be inspected. This model of the scene is to be considered a rough representation of the correct geometry in the sense that it is sufficient for navigating the scene without collisions and capturing all parts of interest, while not containing the information that is of interest for the inspection due to being not detailed and accurate enough. It could for example be a CAD model, which abstracts the geometry to a certain degree, or reconstructed from aerial images, not containing details of the scene. The content of the scene, whether it is a single structure like a tower or a building or a larger area, for example a city quarter or a large bridge, should have no effect on the method, so that it is applicable in all scenarios. While a polygonal mesh representation of the scene is used for many purposes, for some computations a set of n surface points P is computed and used as a representation of the scene, also called object points. These points are sampled randomly on the surface of the mesh, kept the same for all evaluations to increase comparability, and projected into the images of the flight path to simulate feature points. In the context of this work, a flight path consists of an ordered set of viewpoints C, camera positions and orientations that the UAS will follow in order. Finally, a combined score is proposed that can be used in ranking different approaches, while the partial measures can support the selection of an approach for a specific use case.
The measurement of the criteria requires the pairwise visibility relations of all viewpoints and all object points, computed by projecting the object points into the images and checking for occlusions using raycasting. The resulting image bundle contains the information normally obtained by SfM without having to compute the expensive SfM pipeline. Running SfM would require either real images or simulated images. Real images limit the applicability of the method through the enormous effort required and make it unusable as a tool for predicting achievable qualities. Simulated images only add computational effort and would introduce the influence of rendering, feature extraction, and SfM into the analyses. The visibilities are represented as:

$$V_{ij} = \begin{cases} 1 & \text{if } p_j \text{ is visible from } c_i \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

where $V_{ij}$ = visibility relation between viewpoint $c_i$ and object point $p_j$, $c_i \in C$ = set of $m$ viewpoints in the flight path, $p_j \in P$ = set of $n$ 3D points on the object surface
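The projection part of this visibility test can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline's implementation: it assumes an ideal pinhole camera looking along its +z axis with the principal point in the image centre, and it omits the occlusion check via raycasting entirely. The function names (`to_camera_frame`, `projects_into_image`, `visibility_matrix`) and the parameter `focal_px` (focal length in pixels) are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Sequence[Sequence[float]]  # row-major 3x3 rotation, world -> camera


def to_camera_frame(point: Vec3, cam_pos: Vec3, rot: Mat3) -> Vec3:
    """Express a world point in the camera frame: R * (p - c)."""
    d = [point[k] - cam_pos[k] for k in range(3)]
    return tuple(sum(rot[r][k] * d[k] for k in range(3)) for r in range(3))


def projects_into_image(p_cam: Vec3, focal_px: float,
                        width_px: int, height_px: int) -> bool:
    """True if the point lies in front of the camera and inside the image.

    Assumption: pinhole model, principal point at the image centre.
    Occlusion is NOT tested here.
    """
    x, y, z = p_cam
    if z <= 0.0:  # point is behind the camera
        return False
    u = focal_px * x / z + width_px / 2.0
    v = focal_px * y / z + height_px / 2.0
    return 0.0 <= u < width_px and 0.0 <= v < height_px


def visibility_matrix(cameras: Sequence[Tuple[Vec3, Mat3]],
                      points: Sequence[Vec3], focal_px: float,
                      width_px: int, height_px: int) -> List[List[int]]:
    """Build an m x n matrix of 0/1 entries, one row per viewpoint.

    An entry is 1 if the point projects into that viewpoint's image
    (occlusion ignored in this sketch).
    """
    return [
        [int(projects_into_image(to_camera_frame(p, c, rot),
                                 focal_px, width_px, height_px))
         for p in points]
        for (c, rot) in cameras
    ]
```

A full implementation would additionally cast a ray from each viewpoint to each projected point and set the entry to 0 if the ray hits the mesh before reaching the point.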

Resolution on the Surface
In order to detect effects of interest in the images, a certain resolution on the surface of the scene is required. With a defined camera model, this directly translates into a maximum distance between camera and object that is admissible to achieve this resolution, shown in Equation 2. To minimize the number of images required to cover the entire structure, the distance should be as close to this optimum $d^*$ as possible.
Since precisely achieving this computed distance is practically not feasible and to deal with slanted surfaces, a range $\Delta d$ around it is defined instead as the accepted distance to reach the required resolution. The range places a near limit $d_{near} = d^* - \Delta d$ and a far limit $d_{far} = d^* + \Delta d$ on the accepted distance to include the depth of field of the camera, as shown in Figure 2a. The resolution requirement $\delta_j$ for an object point $p_j$ is computed via the distances between the point and all viewpoints from which the point is visible, implicitly requiring each point to be visible from at least one viewpoint:

$$\delta_j = \begin{cases} 1 & \text{if } \exists\, c_i \in C : V_{ij} = 1 \,\wedge\, d_{near} \le d_{ij} \le d_{far} \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

where $d_{ij}$ = distance between viewpoint $c_i$ and $p_j$

The global fulfillment of the object resolution requirement is computed as the proportion of object points for which the resolution requirement is satisfied:

$$\delta = \frac{1}{n} \sum_{j=1}^{n} \delta_j \tag{4}$$
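The local and global resolution measures described above can be sketched as follows. This is a hedged illustration, assuming the visibility relations and the camera-to-point distances are already available as m × n nested lists; the names `resolution_fulfillment`, `V`, `D`, `d_star`, and `delta_d` (half-width of the accepted range around the target distance) are hypothetical.

```python
from typing import List, Sequence, Tuple


def resolution_fulfillment(V: Sequence[Sequence[int]],
                           D: Sequence[Sequence[float]],
                           d_star: float,
                           delta_d: float) -> Tuple[List[int], float]:
    """Return per-point fulfillment flags and their global proportion.

    V[i][j] is 1 if point j is visible from viewpoint i, else 0.
    D[i][j] is the distance between viewpoint i and point j.
    A point fulfills the requirement if at least one viewpoint sees it
    from within [d_star - delta_d, d_star + delta_d].
    """
    d_near, d_far = d_star - delta_d, d_star + delta_d
    m = len(V)
    n = len(V[0]) if m else 0
    local = [
        int(any(V[i][j] and d_near <= D[i][j] <= d_far for i in range(m)))
        for j in range(n)
    ]
    overall = sum(local) / n if n else 0.0
    return local, overall
```

A point that is never visible automatically counts as unfulfilled, which is how the measure implicitly contains the coverage criterion.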

Accuracy of the 3D Reconstruction
While the object resolution measure can be computed by evaluating single visibility relations, the accuracy requirement involves the relative orientations of adjacent images, as a narrow baseline between two images can lead to a glancing intersection for the object point triangulation, schematically shown in Figure 2b. This criterion concerns the expected accuracy of the 3D reconstruction from SfM using the computed flight path. To quantify this, measurement errors are propagated through the triangulation, the covariance of the triangulated 3D position is determined, and a principal component analysis (PCA) is performed to find the largest variance of the triangulation. The linear triangulation computes the position of a 3D point from corresponding measurements in multiple images. Here, a view of an object point is only used in the triangulation if the projection lies in the image, the view is not occluded, and the distance between camera and point is close to the optimal distance $d^*$, similar to Equation 3. This provides a nearly constant resolution of the resulting 3D model over the entire structure. It also implicitly requires sufficient coverage of the structure, such that each object point is contained in at least three images for the triangulation.
First, the Jacobian matrix $J_j$ containing all partial derivatives of the triangulation with regard to the image measurements is computed using automatic differentiation. For the triangulation of point $p_j$ on the scene, $J_j$ is a $3 \times 2c$ matrix, where $c$ is the number of views that contain the object point: $2c$ image coordinates are the input of the triangulation and 3 coordinates are the result.
To propagate the uncertainty through the non-linear triangulation, $J_j$ is multiplied with the covariance matrix $\Sigma_I$ of the input to obtain the covariance matrix of the output $\Sigma_{P,j}$ according to (Ochoa and Belongie, 2006):

$$\Sigma_{P,j} = J_j \, \Sigma_I \, J_j^\top \tag{5}$$

where $\Sigma_I$ = covariance matrix of the image points, $\Sigma_{P,j}$ = covariance matrix of the triangulated point $p_j$

With the assumption that the measurements in the images are within 1 px of the true position and uncorrelated, the covariance matrix for the input $\Sigma_I$ is set to the identity matrix. Performing the PCA, the largest eigenvalue $e_{max,j}$ of the output covariance matrix is determined, quantifying the largest variance of the output and serving as a measure for the achievable accuracy.
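The local fulfillment of the accuracy criterion $\Phi_j$ for each point $p_j \in P$ is computed by comparing $e_{max,j}$ to the admissible variance $e^*$:

$$\Phi_j = \begin{cases} 1 & \text{if } e_{max,j} \le e^* \\ 0 & \text{otherwise} \end{cases} \tag{6}$$

The global fulfillment of the accuracy requirement is computed as the proportion of object points for which the accuracy requirement is satisfied:

$$\Phi = \frac{1}{n} \sum_{j=1}^{n} \Phi_j \tag{7}$$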
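The propagation and thresholding steps can be sketched in a few lines. This is a minimal sketch under stated assumptions: the Jacobian of the triangulation is taken as given (the text obtains it via automatic differentiation, which is not reproduced here), the input covariance is the identity as stated above, and points without a usable triangulation are passed as `None` and counted as unfulfilled. All function names are hypothetical; the largest eigenvalue is computed with the standard closed-form (trigonometric) method for symmetric 3 × 3 matrices rather than a general PCA routine.

```python
import math
from typing import List, Optional, Sequence, Tuple

Matrix = List[List[float]]


def propagate_covariance(J: Sequence[Sequence[float]]) -> Matrix:
    """Output covariance J * I * J^T for a 3 x 2c Jacobian (identity input)."""
    return [[sum(a * b for a, b in zip(row_r, row_s)) for row_s in J]
            for row_r in J]


def _det3(a: Sequence[Sequence[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def largest_eigenvalue(s: Sequence[Sequence[float]]) -> float:
    """Largest eigenvalue of a symmetric 3x3 matrix (trigonometric method)."""
    p1 = s[0][1] ** 2 + s[0][2] ** 2 + s[1][2] ** 2
    if p1 == 0.0:  # matrix is already diagonal
        return max(s[0][0], s[1][1], s[2][2])
    q = (s[0][0] + s[1][1] + s[2][2]) / 3.0
    p2 = sum((s[k][k] - q) ** 2 for k in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(s[r][c] - (q if r == c else 0.0)) / p for c in range(3)]
         for r in range(3)]
    r_half = max(-1.0, min(1.0, _det3(b) / 2.0))  # clamp rounding errors
    phi = math.acos(r_half) / 3.0
    return q + 2.0 * p * math.cos(phi)


def accuracy_fulfillment(jacobians: Sequence[Optional[Sequence[Sequence[float]]]],
                         e_star: float) -> Tuple[List[int], float]:
    """Per-point flags (largest variance <= e_star) and their proportion."""
    local = []
    for J in jacobians:
        if J is None:  # assumption: too few usable views, counts as failed
            local.append(0)
        else:
            local.append(int(largest_eigenvalue(propagate_covariance(J)) <= e_star))
    overall = sum(local) / len(local) if local else 0.0
    return local, overall
```

The closed-form eigenvalue keeps the sketch free of external dependencies; with a numerical library available, a symmetric eigenvalue solver would serve the same purpose.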

Path Length and Image Count
The previously described measures are externally defined quality criteria that have to be fulfilled for the flight path to be usable for the specific purpose and can be considered as boundary conditions for the path planning problem. For efficient inspections, the objective is to minimize the mission duration by minimizing the number of required images and the distance the UAS has to travel along the flight path. Accordingly, these two aspects can be combined into a cost function for the path planning problem, which aims to minimize this cost. As the relative cost of the two components (what is more expensive, a longer flight or more images?) is not easily determined, the simplest combination, the sum, is used to compute the path cost $L$:

$$L = \sum_{i=2}^{m} \lVert c_i - c_{i-1} \rVert + m \tag{8}$$

where $\lVert c_i - c_{i-1} \rVert$ = Euclidean distance between two successive viewpoints, $m = |C|$ = number of images in the path

The flight length does not consider the specific flight properties of a UAS such as minimum curve radius, downstream forces, line of sight for pilot intervention, or intermediate stops to recharge the batteries, as these are use case dependent and cannot be incorporated objectively.
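The path cost described above, the summed Euclidean distance between successive viewpoints plus the number of images, can be computed in a few lines. The sketch below assumes viewpoint positions are given as coordinate tuples in the order they are flown; the function name `path_cost` is hypothetical, and camera orientations are ignored because they do not enter the cost.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def path_cost(viewpoints: Sequence[Vec3]) -> float:
    """Flight length between successive viewpoints plus the image count."""
    flight_length = sum(math.dist(a, b)
                        for a, b in zip(viewpoints, viewpoints[1:]))
    return flight_length + len(viewpoints)


# Three viewpoints: segments of length 5 and 12, plus 3 images.
example = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
print(path_cost(example))
```

Note that the sum mixes a length with a count, so the numeric value depends on the length unit used for the viewpoint coordinates.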

Combined Score
A benchmarking system aims to make different approaches comparable along objectively measurable criteria, for which this work has proposed the described measures. Ideally, the object resolution $\delta$ and accuracy $\Phi$ measures are used as boundary conditions, violations of which invalidate a computed flight path, and the only comparison measure is the path cost $L$, where lower values are better. However, to the knowledge of the authors, no existing approach is reliably able to satisfy the quality criteria, so no admissible solutions would exist under this consideration. To remedy this, the boundary conditions are included in the cost function similar to a Lagrangian relaxation, such that flight paths that do not fulfill the quality requirements obtain a reduced score. The quality $\Psi$ of the flight path is computed as the product of the two single quality measures:

$$\Psi = \delta \cdot \Phi \tag{9}$$

This multiplication punishes flight paths that do not come close to fulfilling the requirements, providing zero reward if one of the constraints is not satisfied at all. The quality measure $\Psi$ is combined with the path cost $L$ and measures from the scene to compute the score of the path. The path cost $L$ is included as the divisor, so shorter paths receive a higher reward. To normalize the score $\hat{S}$ for different models, the surface area of the model of the scene is used as a factor, since it is roughly proportional to the path length and the number of images: doubling the size of the model requires double the number of images. The target distance between camera and object $d^*$ is used to normalize the score for different inspection scenarios, as the path length and the number of images are roughly inversely proportional to its square: doubling the distance quadruples the area covered by one image:

$$\hat{S} = \frac{\Psi \cdot A}{L \cdot {d^*}^2} \tag{10}$$

where $\Psi$ = quality factor of the flight path, $L$ = cost of the flight path, $A$ = surface area of the 3D model of the scene

As the path cost $L$ goes into the denominator of the score calculation, very short paths result in badly defined behavior of the function. To compensate for this, short paths of fewer than 10 viewpoints are excluded from the evaluation, as those paths can only be suitable for very small and special scenes that do not warrant the complex considerations applied here.
With no optimal solutions for the path planning problem being known, an optimal value for the score $\hat{S}$ is also not known and may only be the result of continuous improvement of the planning approaches. To provide an upper bound for the achievable score, a minimum length path for a very simplified setting can be computed. Assuming that both resolution and accuracy are perfectly achieved, the score only depends on the path cost $L$. For these considerations, the model of the scene is simplified to a planar strip with the width exactly covered by one image from the optimal distance $d^*$ in landscape orientation and the height such that the total surface area of the model is maintained. As three-view visibility is required for all object points, as described in Section 4.2, the minimal number of required images can be computed as:

$$m_{min} = \frac{3A}{r_x \, r_y \, d_{obj}^2} \tag{11}$$

where $r_x$ and $r_y$ are the horizontal and vertical resolution of the images and their product with the squared object resolution $d_{obj}$ is the area covered by one image. The length $dist_{min}$ of the flight path is the height of the planar strip, as it has to be traversed once. With these assumptions, a minimal path cost $L_{min} = m_{min} + dist_{min}$ can be computed and used for an upper bound of the achievable score $S_{max}$, as in Equation 10. This bound can be used to normalize a score from the range $[0, S_{max}]$ to the range $[0, 1]$, although the strong simplifications applied for computing the upper bound make it impossible to reach the maximum score. The final score is calculated as:

$$S = \frac{\hat{S}}{S_{max}} \tag{12}$$
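The upper bound can be worked through numerically for the bridge pier scenario of Section 5 (surface area 456 m², 7360 × 4912 px images, object resolution 0.2 mm/px). The sketch below is an illustration under assumptions, not the reference implementation: it takes the minimal image count as three times the surface area divided by the area covered by one image, without rounding, and it assumes that the unnormalized score and its upper bound share the same scene-dependent factor, so that the normalized score reduces to the quality factor times the ratio of minimal to actual path cost. The function names are hypothetical.

```python
def minimal_path_cost(area_m2: float, r_x: int, r_y: int, d_obj_m: float,
                      views_per_point: int = 3) -> float:
    """Minimal path cost for the planar-strip simplification.

    Image count: views_per_point * area / area covered by one image.
    Flight length: height of a strip whose width equals one image width
    (landscape orientation) and whose area equals the model's surface area.
    """
    image_area = r_x * r_y * d_obj_m ** 2   # m^2 covered by one image
    m_min = views_per_point * area_m2 / image_area
    strip_width = r_x * d_obj_m             # m, image width on the surface
    dist_min = area_m2 / strip_width        # m, strip height
    return m_min + dist_min


def normalized_score(psi: float, path_cost: float, min_cost: float) -> float:
    """Score in [0, 1], assuming the scene factors cancel in the ratio."""
    return psi * min_cost / path_cost


# Bridge pier scenario: A = 456 m^2, 7360 x 4912 px, 0.2 mm/px.
L_min = minimal_path_cost(456.0, 7360, 4912, 0.0002)
print(round(L_min, 2))
```

Under these assumptions the bound consists of roughly 946 images and a strip height of about 310 m; a path that fulfills both quality criteria completely but is twice as costly as the bound would score 0.5.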

APPLICATION OF THE BENCHMARK
To validate the benchmarking system proposed in Section 4, it is applied to two flight paths computed for a simple scene of one bridge pier, shown in Figure 3a. It is 30 m high, with a rectangular cross section, slightly slanted sides, and a surface area of 456 m². For the point-wise evaluation of the measures, a set of 45000 points is randomly sampled on the surface of the mesh and used for the evaluation of both methods. The camera used for the inspection has the following geometric properties:
• Resolution: 7360 × 4912 px (36 MP)
• Sensor size: 35.8 × 23.9 mm (full frame)
• Focal length: 55 mm
The application scenario used for the evaluation is acquiring images for the detection of small cracks on the surface and locating them on a 3D reconstruction of the object. To achieve this, the quality requirements are set to a resolution d_obj = 0.2 mm/px, which results in a target distance d* = 2.621 m according to Equation 2 and a distance range d of 10% = 0.26 m, and an admissible accuracy e* = 1 mm.
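The point-wise evaluation relies on points sampled randomly on the mesh surface. The text does not state which sampling procedure was used; a common choice is area-weighted triangle sampling, sketched below with NumPy. The function name `sample_surface_points` and the vertex/face array layout are assumptions made for illustration only.

```python
import numpy as np


def sample_surface_points(vertices, faces, n_points, seed=None):
    """Draw points uniformly at random on a triangle mesh surface.

    vertices: (V, 3) float array, faces: (F, 3) integer index array.
    Returns the sampled points (n_points, 3) and the total mesh area.
    """
    rng = np.random.default_rng(seed)
    tri = vertices[faces]  # (F, 3, 3): corner coordinates per triangle

    # Triangle areas from the cross product of two edge vectors.
    cross = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
    areas = 0.5 * np.linalg.norm(cross, axis=1)
    total_area = areas.sum()

    # Pick triangles with probability proportional to their area, so the
    # point density is uniform over the whole surface.
    idx = rng.choice(len(faces), size=n_points, p=areas / total_area)

    # Uniform barycentric coordinates (square-root parameterization).
    r1 = np.sqrt(rng.random(n_points))
    r2 = rng.random(n_points)
    w0, w1, w2 = 1.0 - r1, r1 * (1.0 - r2), r1 * r2

    t = tri[idx]
    points = (w0[:, None] * t[:, 0]
              + w1[:, None] * t[:, 1]
              + w2[:, None] * t[:, 2])
    return points, total_area
```

Calling this once with a fixed seed and reusing the resulting points for every evaluated path keeps the comparison between methods on identical samples, as done for the two paths in this section.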
Two flight paths are computed using two different yet simple approaches from Morgenthal et al. (2019). They have been used in different settings with good results and can serve as a baseline. For the first path, the mesh is intersected with horizontal slicing planes and viewpoint rings are computed around the intersection shape. For demonstration purposes, one viewpoint ring in the upper third of the pier and one at the bottom are manually removed from the solution to show the effect of missing coverage, resulting in a flight path with 2313 viewpoints, as shown in Figure 3b. The second approach places viewpoints on a regular grid around the object, selects those that are roughly at the required distance d* from the surface, and moves them to exactly that distance, looking directly towards the object surface. This route consists of 2556 viewpoints and is shown in Figure 3c.
Both paths are analyzed using the benchmark described in Section 4 and evaluated on all criteria. The results for the two path planning methods are summarized in Table 1. The values for the path length L show that the higher number of images for the second route also results in a higher length cost. Nevertheless, this higher number of images also results in a significantly higher quality score Ψ, especially for the accuracy Φ. Accordingly, the total score S is higher for the second method, showing that it produces an overall better result. As the criteria are also evaluated for the sampled object points on the surface, the individual values can be used to validate the results. Figure 4 shows visualizations of the results for selected cases. Figure 4a shows that the resolution for the grid method is satisfied for the entire structure, except for some points at the very top of the pier, reflecting the 99% fulfillment for that criterion. Figures 4b and 4c show the effect of the removed rings of viewpoints for the slice method, where the accuracy is decreased, resulting in an accuracy score of around 78%. This effect does not occur in the grid method, from which no viewpoints were removed.
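The percentages quoted above can be read as the share of sampled surface points that meet a requirement. This reading is an assumption, as the exact definition belongs to Section 4 and is not repeated here. Under that assumption, a per-criterion value and the list of failing points used for visualization could be obtained as in the following sketch; the function name `fulfillment_ratio` and the example values are hypothetical.

```python
import numpy as np


def fulfillment_ratio(values, threshold, smaller_is_better=True):
    """Share of sampled points meeting a requirement, plus failing indices.

    values: per-point measurements (e.g. estimated accuracy error in mm).
    Points without a measurement can be passed as NaN; any comparison with
    NaN is False, so uncovered points count as not fulfilled.
    """
    values = np.asarray(values, dtype=float)
    if smaller_is_better:
        ok = values <= threshold
    else:
        ok = values >= threshold
    return float(ok.mean()), np.flatnonzero(~ok)


if __name__ == "__main__":
    # Hypothetical per-point accuracy errors in mm, admissible error 1 mm.
    errors_mm = [0.5, 0.9, 1.2, np.nan]
    ratio, failing = fulfillment_ratio(errors_mm, threshold=1.0)
    print(ratio, failing.tolist())
```

Treating uncovered points as failures is one way to measure coverage implicitly, so that a path with missing viewpoints is penalized through the same ratio rather than through a separate criterion.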
Overall, this evaluation shows that the chosen criteria are able to identify errors and difficulties in the computed flight paths, and that the visualization helps to locate those issues, providing valuable feedback during the design and implementation of new approaches. At the same time, the resulting final score is suitable for comparing the quality of flight paths and can therefore serve as a ranking method, even though the optimal solution is not known.

CONCLUSIONS AND CALL FOR PARTICIPATION
This work proposes a benchmarking and ranking system to evaluate pre-computed flight paths for close-range image-based UAS inspections. After reviewing the current state of the art, recent contributions to the problem of flight path planning, and existing evaluation methods in Section 3, a set of fundamental and simple criteria was identified in Section 4 that can be used for the evaluation. Acknowledging the variety in the application scenarios for which flight paths are computed, these criteria are elementary and can therefore be applied to all use cases by only considering core requirements.
The chosen quality criteria concern the object resolution and accuracy that can be achieved with a flight path and the implicitly measured coverage of the entire scene. A measurement procedure is proposed for each criterion, together with an equation for computing a combined quality score, which measures the satisfaction level of the quality requirements. The quality score is combined with the length of the route and the number of images into a final benchmark score that assesses the performance of the computed path. The proposed benchmark is evaluated on a simple model and two computed paths in Section 5, where the benchmark scores show that significant differences between the paths can be detected and thus support the intuitive visual assessment of the paths.
The proposed performance assessment procedure forms the basis for the Bauhaus UAS Path Planning Challenge for close-range image-based inspection. Researchers and practitioners are invited to contribute their solutions for the UAS flight path planning problem in predefined scenarios, in order to identify valuable contributions and methods and to establish an objective state of the art. This open challenge is available under https://uni-weimar.de/pathplanning, where all information is provided, including the detailed scenario descriptions, the source code of the evaluation pipeline, and a ranking of previous submissions. Finally, this challenge, the resulting ranking of approaches, and the experience with the underlying evaluation are an invitation for the community to contribute further improvements and adjustments to the benchmark and to establish it as the measure for future contributions.
After gathering experience with the currently proposed benchmark, it can be expanded to include more complex measures and aspects that determine finer differences once a baseline performance has been established. This extension can add more detail to the proposed measures, such as the incidence angle of the viewpoints on the surface, the overlap of adjacent images, or the photogrammetric network design. It can also introduce new criteria such as no-fly zones, semantic aspects of the scene, or the reliability of the UAV in reaching the correct positions. This allows the future expansion of the benchmarking scenarios to additional scenes that capture those aspects, creating a useful and established reference for the problem of UAS flight path planning.