QUALITY ASPECTS OF TRUE ORTHOPHOTO IN URBAN AREAS

Orthophotos are one of the most popular photogrammetric products and have been a leading source of up-to-date 2D data of urban areas for years. In the last few years, with innovations in Dense Image Matching, Digital Surface Models created with dense image matching have started to be used as the height source during orthorectification. Recently this true orthophoto production workflow was adopted as a production standard in many countries. The aim of the presented research was to evaluate recent developments in automatic true orthophoto generation for urban areas and to define the factors which have the main influence on the quality of the final product. The results showed that, besides image overlap, the main factors with a direct influence on the resulting true orthophoto are the occurrence of shadows and vegetation (trees). One outcome of the research is that the quantitative methods developed for the quality evaluation of Digital Surface Models and point clouds are not directly transferable to the quality evaluation of true orthophotos.


INTRODUCTION
Orthophotos created from aerial nadir images have been a leading source of up-to-date 2D data of urban areas for years. They are also one of the most popular photogrammetric products created or ordered by National Mapping & Cadastral Agencies (NMCAs) or local authorities, and are commonly updated from year to year. However, in urban areas the commonly created "classical" orthophotos (where orthorectification is done with a DTM, Digital Terrain Model) suffer from the well-known problem of building lean and the occlusions caused by this effect. Therefore, since the mid-1990s (Ahmar & Ecker, 1996) a lot of research effort has gone into developing a method for fully automatic generation of true orthophotos. The proposed methods used various data sources for creating Digital Surface Models (DSM), such as lidar data, a DTM combined with building outlines, and 3D building models, and some of them have been adopted by commercial software. For years, however, the proposed solutions did not achieve widespread popularity and conventional orthophotos (created with a DTM) remained more popular.
In the last few years, with innovations in DIM, Dense Image Matching (Hirschmuller, 2008; Haala, 2014; Zhang et al., 2017), and the rapid growth of lidar point density, the situation has started to change. DSMs created with Dense Image Matching have started to be used as the height source during orthorectification of the same images which were used for image matching. This type of data was incorporated into the ISPRS benchmark on urban object detection and 3D building reconstruction; however, in the original results, of the 27 different urban object detection methods submitted only 4 were based solely on images, that is, used only a true orthophoto and a DSM generated from these images (Rottensteiner et al., 2014). In the following years the same dataset achieved much wider popularity when it was used in the ISPRS Semantic Labelling Contest (2D), where 140 different results were submitted. The popularity of this type of data can be expected to increase in the future due to its temporal and technological homogeneity as well as its geometric consistency, which is an advantage for machine learning applications.
With the development and rising popularity of DIM as a source of DSMs, new problems and errors connected to the radiometric properties of an image (such as shadows) or the geometric configuration of the matched images were addressed (Haala, 2013). In a true orthophoto production workflow based only on DIM and images, all image matching errors can be expected to propagate through the DSM to the final product. Recent research on true orthophotos by Gharibi & Habib (2018) noted this problem especially on the edges of buildings, where artifacts degrading the quality of the true orthophoto (the so-called sawtooth effect) are clearly visible. However, that paper focuses more on occlusion detection techniques than on further investigation of the quality issues of true orthophotos based on dense image matching.
Finally, the true orthophoto and its quality have also become an object of interest for NMCAs and local authorities. At the moment, the AdV (2019) publication is probably the most comprehensive overview of true orthophoto (TOP) quality problems.
The aim of the presented research was to evaluate recent developments in the area of automatic true orthophoto generation for urban areas and to define factors which have the main influence on the quality of the final product.

TEST FIELD AND DATA
For the presented research the central part of Warsaw (the capital of Poland) was used as the test area. The City of Warsaw Municipality has been ordering aerial orthophotos for years. Due to the development of the real-estate market (of which the most spectacular manifestation is the continuous growth of the number of skyscrapers in the downtown area), a few years ago some additional image acquisition conditions were added to the expected parameters of the ordered orthophoto. Since 2018 the central part of the city has to be acquired with higher overlap between images (80%) and between strips (80%), while for the other parts of the city 60/60% overlap is expected.
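As a rough illustration of what these overlap figures mean for image redundancy, the sketch below counts how many images see a single ground point. It is an idealized model assumed for this example, not a procedure described for the flight planning: the terrain is flat, footprints are regular, and block edges and occlusions are ignored. The function names `coverage_range` and `images_per_point` are hypothetical.

```python
def coverage_range(overlap_pct):
    """Min and max number of consecutive footprints covering a ground point.

    overlap_pct is an integer percentage (e.g. 80 for 80% overlap).
    Integer arithmetic is used to avoid floating-point rounding issues.
    """
    if not 0 <= overlap_pct < 100:
        raise ValueError("overlap must be in the range 0..99 percent")
    base = 100 - overlap_pct          # exposure spacing as % of the footprint size
    low = 100 // base                 # floor(footprint / spacing)
    high = -(-100 // base)            # ceil(footprint / spacing)
    return low, high


def images_per_point(forward_pct, side_pct):
    """Min and max number of images covering a point in a regular block."""
    f_low, f_high = coverage_range(forward_pct)   # images within one strip
    s_low, s_high = coverage_range(side_pct)      # neighbouring strips
    return f_low * s_low, f_high * s_high


if __name__ == "__main__":
    for forward, side in [(80, 80), (80, 60), (60, 60)]:
        print(f"{forward}/{side}:", images_per_point(forward, side))
```

Under these assumptions a block flown with 80/80 overlap yields 25 images per ground point, while 60% overlap between strips means that a point is seen from either two or three strips, depending on its position across the strip.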
For two consecutive years, 2018 and 2019, images were taken by two different companies, but with the same photogrammetric flight parameters (GSD of 8 cm) and, by coincidence, the same camera model, the Leica DMC III. Because both companies used the same model of large-format photogrammetric camera to achieve the same GSD and cover exactly the same area with nadir images, the flight plans for the central part of the city were exactly the same and the images were taken in almost the same places (Fig. 1), which provided a perfect opportunity for a year-to-year reproducibility comparison. Both datasets were acquired at the beginning of April, during the leaf-off period, which NMCAs and local authorities consider favourable for image acquisition and orthophoto production in cities. However, because vegetation is one of the factors which influence the reliability of Dense Image Matching, some additional experiments were planned with another nadir image dataset taken in full leaf-on condition. In 2017 the aerial images for Warsaw were taken in May. Although the camera used and the planned GSD were exactly the same (Tab. 1) as during the acquisitions in the two following years, the datasets are not identical and cannot be directly compared, because the image overlap between strips is much lower (55%) and the flight direction was different.
The exterior orientation parameters for each dataset were estimated independently during an aerotriangulation (bundle adjustment), which was performed with the Trimble Inpho or the Z/I ImageStation software. The aerotriangulations were performed in the national coordinate system (EPSG:2178, ETRS89 / Poland CS2000 zone 7) with precise GNSS/INS observations and dozens of Ground Control Points. Each orientation passed an independent quality check on check points in order to confirm that the achieved accuracy was at the sub-pixel level.
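A sub-pixel check of this kind can be sketched as a comparison of check-point residuals against the GSD. The text does not say which statistic was used; the root mean square error (RMSE) per axis is assumed here as a common choice, the residual values are invented purely for illustration, and `rmse` and `is_subpixel` are hypothetical helper names.

```python
import math

GSD_M = 0.08  # ground sampling distance of the images, from the text (8 cm)


def rmse(residuals):
    """Root mean square error of a sequence of residuals (metres)."""
    if not residuals:
        raise ValueError("at least one residual is required")
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def is_subpixel(residuals_by_axis, gsd_m=GSD_M):
    """True if the RMSE on every axis is below one pixel (one GSD)."""
    return all(rmse(values) < gsd_m for values in residuals_by_axis.values())


# Invented check-point residuals (adjusted minus surveyed coordinates), metres.
example = {
    "x": [0.03, -0.04, 0.02],
    "y": [-0.02, 0.05, 0.01],
    "z": [0.06, -0.05, 0.04],
}

if __name__ == "__main__":
    for axis, values in example.items():
        print(axis, round(rmse(values), 3), "m")
    print("sub-pixel:", is_subpixel(example))
```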

METHODOLOGY
The quality evaluation of DIM products and their comparison with ALS data have been a common subject of research published in recent years; however, the majority of previous works focused on the evaluation of DSMs or point clouds, which are intermediate steps in true orthophoto generation.
Zhang et al. (2018) presented a complex method for evaluating dense image matching quality, based on the detection of planar patches in the point cloud. In order to find the most reliable parts of the DIM point cloud, they remove from further evaluation patches with noise or an insufficient number of points (data gaps), and those placed in shadow or on grass (vegetated areas). The areas which are favourable for image matching are also generated entirely correctly on the true orthophoto (Fig. 2). For the evaluation of an orthophoto created with DIM we propose to reverse this list and to focus the evaluation on those areas which could be problematic and lead to errors in the point cloud resulting from image matching. Finally, six test areas (Fig. 3) were selected for the evaluation of true orthophoto quality in different scenarios. Three of them, in the northern part of the test field, focus on vegetated areas:
• the first (1st) consists of several residential buildings of a few floors in loose formation, surrounded by trees;
• the second (2nd) covers part of a city park, almost fully covered with trees;
• the third (3rd) is placed mostly on a slope and is covered with two gardens and, beneath the slope, a park area with many trees.
The remaining test areas are built-up areas:
• the first of them (4th) covers the most central part of the city downtown, with a few skyscrapers (over 200 m high) surrounded by other buildings;
• the last two (5th and 6th) are two city blocks densely built with typical 19th century residential buildings (several floors and inner courtyards).
Quantitative evaluation of the final true orthophoto is a hard task, because this type of product is designed to be viewed and interpreted by the human eye, and the observer's impressions cannot be easily quantified.
The proposed methodology begins with an evaluation of the intermediate data, the DSM (resulting from DIM, with a resolution equal to the desired orthophoto resolution and the GSD of the images used), which is later used during the final orthorectification. For the quality evaluation, the DSM metainformation layers (DSM Quality Layers) produced by the nFrames SURE software were used.
Figure 3. Location of test areas (1 to 6, marked with red outlines) within the test field.
One of the most important factors in error propagation from the DSM to the orthophoto should be the completeness of the DSM used. To evaluate this factor we use the DSM Cell Point Count layer, which stores the number of original 3D points of the DIM point cloud per single DSM raster cell. The second important factor should be the size of the areas without any valid 3D measurements, which can be interpreted as the interpolation distance: the distance between a DSM cell without any valid 3D points and the nearest pixel with measurements, which is stored in the DSM Distance Mask layer (as the Euclidean distance in pixels). The third factor which should have a direct influence on DSM and orthophoto quality is the roughness of the point cloud used, which is connected to the image matching accuracy. This information is stored in the DSM Cell Standard Deviation metainformation raster as the standard deviation, in the Z direction, of the 3D points in each DSM cell.
In the block with lower overlap between strips, two cases of image coverage occur (Fig. 4): the first with 10 images (5 per strip) in the area of a single overlap between strips, and the second with 15 images (5 per strip) in the area of a double overlap between strips. Within the block with 80/80 overlaps the number of overlapping images is almost homogeneous, and every place on the ground (without considering occlusions) should be visible on all 25 images (5 images from each of 5 strips). The test area locations were planned according to these areas of overlap.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B5-2020, 2020 XXIV ISPRS Congress (2020 edition)
Experiments were performed with the nFrames SURE software (Rothermel et al., 2012; nFrames, 2020), in the version available as of 20th February 2020 (4.0.2).
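To make the point-count and standard-deviation layers described above more concrete, the following sketch derives comparable per-cell values from a list of 3D points. It is not the SURE implementation: the grid origin at the upper-left corner, the use of the population standard deviation, and the function name `cell_statistics` are all assumptions made for this illustration.

```python
import math
from collections import defaultdict


def cell_statistics(points, origin_x, origin_y, cell_size):
    """Per-cell point count and standard deviation of Z.

    points    -- iterable of (x, y, z) tuples in map units
    origin_x  -- X of the upper-left raster corner (assumed convention)
    origin_y  -- Y of the upper-left raster corner (assumed convention)
    cell_size -- raster cell size in map units (e.g. 0.08 for an 8 cm GSD)

    Returns {(row, col): (count, z_std)} for cells holding at least one point;
    cells missing from the result correspond to no-data cells.
    """
    heights = defaultdict(list)
    for x, y, z in points:
        col = math.floor((x - origin_x) / cell_size)
        row = math.floor((origin_y - y) / cell_size)  # rows grow southwards
        heights[(row, col)].append(z)

    stats = {}
    for cell, zs in heights.items():
        mean = sum(zs) / len(zs)
        variance = sum((z - mean) ** 2 for z in zs) / len(zs)  # population form
        stats[cell] = (len(zs), math.sqrt(variance))
    return stats


if __name__ == "__main__":
    # Three invented points: two fall into cell (0, 0), one into cell (0, 1).
    demo = [(0.01, -0.01, 10.0), (0.02, -0.02, 12.0), (0.09, -0.01, 5.0)]
    for cell, (count, z_std) in sorted(cell_statistics(demo, 0.0, 0.0, 0.08).items()):
        print(cell, count, z_std)
```

A high count with a large Z spread in a single cell would typically point to a height discontinuity or noisy matching rather than a reliable surface, which is why the two layers are read together.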
Processing used the Aerial Nadir settings scenario, which is optimized for a typical photogrammetric flight with large-format cameras and grids with constant overlaps, with the ultra processing quality level, which implies processing the images at the original scale (pyramid level 0), and without any advanced modification of internal parameters. In order to estimate the percentage of resulting pixels without any valid 3D measurement and the size of the data gaps, the DSM Distance Mask layers were reclassified into four classes:
• areas without interpolation, where the DSM Distance Mask value is equal to 0;
• pixels whose values were interpolated from a distance of 3 pixels or less; the upper bound of three pixels corresponds to the geometric accuracy of an orthophoto generated with a DTM, so it was assumed that an interpolation distance of 3 pixels or less should not cause major errors in the resulting true orthophoto;
• pixels whose values were interpolated from a distance above 3 pixels and up to 7 pixels; 7 pixels corresponds to an interpolation distance of 0.56 m at the 8 cm GSD;
• pixels whose values were interpolated from a distance above 7 pixels.
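The four-class reclassification can be sketched as follows. The class boundaries (0, up to 3, up to 7, above 7 pixels) and the 8 cm GSD come from the text; the brute-force `distance_mask` helper is only a stand-in for the layer exported by the software, and all function names are hypothetical.

```python
import math

GSD_M = 0.08  # from the text: 8 cm ground sampling distance


def distance_mask(valid):
    """Euclidean distance in pixels from each cell to the nearest valid cell.

    valid -- 2D list of booleans, True where the cell holds 3D measurements.
    Brute force, intended for small illustrative grids only.
    """
    valid_cells = [(r, c) for r, row in enumerate(valid)
                   for c, ok in enumerate(row) if ok]
    return [[0.0 if ok else min((math.hypot(r - vr, c - vc)
                                 for vr, vc in valid_cells), default=math.inf)
             for c, ok in enumerate(row)]
            for r, row in enumerate(valid)]


def classify(distance_px):
    """Map an interpolation distance in pixels to one of the four classes."""
    if distance_px == 0:
        return 0          # measured cell, no interpolation
    if distance_px <= 3:
        return 1          # within the assumed tolerance of 3 pixels
    if distance_px <= 7:
        return 2          # above 3 and up to 7 pixels
    return 3              # above 7 pixels


def class_shares(mask):
    """Percentage of cells falling into each of the four classes."""
    counts = [0, 0, 0, 0]
    for row in mask:
        for distance_px in row:
            counts[classify(distance_px)] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("empty mask")
    return [100.0 * n / total for n in counts]


if __name__ == "__main__":
    # One measured cell followed by a gap of nine cells in a single row.
    mask = distance_mask([[True] + [False] * 9])
    print("shares [%]:", class_shares(mask))
    print("7 px in metres:", round(7 * GSD_M, 2))
```

Expressing the result as class shares makes test areas of different size directly comparable.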
Besides the quantitative comparison of the resulting DSMs, a qualitative evaluation of the true orthophotos was performed by an observer. During the qualitative evaluation, the spatial distribution of the quantitative indicators was compared with the places where visible errors occurred on the true orthophoto.

RESULTS
The results obtained (Tab. 3) for the test areas with dense vegetation, mostly trees (1, 2 and 3), show some general trends which were expected. A higher number of overlapping images results in a higher point density and fewer areas without any valid measurement. On the other hand, higher overlap is connected with a higher standard deviation. Comparing year to year, the quality of the results in 2019 is slightly worse than in 2018, which could be connected with different light conditions during the flight (image acquisition). The images obtained in leaf-on condition (2017) provide a DSM with more areas without any valid 3D measurement.
Comparing the results obtained on the 1st test area, which is built up with residential buildings (a few floors high) surrounded by trees, with the results from the 3rd test area (covered mostly with gardens, park areas and only a few buildings), it is visible that the co-occurrence of buildings and trees causes more no-data areas than the mere presence of areas covered with vegetation. The type of land cover affects the results more than the vegetation season (leaf-on or leaf-off period). This influence is clearly visible when comparing the results from the leaf-off seasons (2018 and 2019 datasets) with constant overlaps of 25 images (lower part of Tab. 3). The 2nd test area, which consists of part of a city park almost fully covered with trees, also has the smallest number of no-data areas.
The test areas covered with buildings (4, 5 and 6) provided results (Tab. 4) with similar patterns regarding the influence of overlap (higher point density and higher standard deviation with a higher number of overlapping images) and the quality comparison between the 2018 and 2019 datasets (the results from 2018 are slightly better than those from 2019). Comparing the results obtained on the 4th test area, the downtown (with a few skyscrapers), with the results from the 5th and 6th test areas, with densely built residential buildings, it is visible that the results from the 5th test area have clearly more no-data areas than the other two. This could be explained by the slightly different building density in the 5th and 6th test areas and the higher proportion of narrow courtyards in the 5th test area (Fig. 6).
The qualitative evaluation of the results showed that the influence of particular factors is more complex than might be expected from the quantitative analysis alone. Firstly, the influence of vegetation (especially trees) appears to be more complex than a simple comparison of the number of no-data pixels between the leaf-on and leaf-off periods. The spatial distribution (Fig. 5) of no-data areas and errors varies between vegetation periods. Most of the tree crowns reconstructed from the leaf-on images appear to be correct and without blunders, but errors and no-data areas severely affect the areas around trees, and this issue emerges especially when the trees are close to building facades. In the case of leaf-off images, the errors and their spatial distribution are different. No-data areas occur mostly within the tree crowns (Fig. 5 d, f), which is probably a consequence of image matching failure.
The influence of this problem decreases as the overlap between images increases, but it is still visible in the dataset with 80/80 overlaps. Furthermore, it is clearly visible that this issue is more likely to occur when the trees are located in shadows cast by buildings. This corresponds with the results of the quantitative analysis, where a higher percentage of no-data areas was registered on the 1st test area (buildings surrounded by trees) than on the 3rd test area (gardens and city park area).
The presence of no-data pixels in the built-up test areas is more consistent with our assumptions. On the 5th test area (Fig. 6) they appear mostly in the inner courtyards within the city block. In the more complex scene of the 4th test area (Fig. 7), no-data areas are visible in the same type of places but also in narrow streets or passages between high buildings, close to the building edges. These results can easily be connected with occlusions and explained by insufficient image coverage in this type of place.
Figure 8. Part of the 4th test area. Example of a large no-data area between buildings where DSM interpolation did not affect the final true orthophoto quality.
Another type of place where no-data areas tend to occur is shadows, which is clearly visible when comparing the results for the 4th test area created from the datasets with lower overlaps (80/60) for two different years, 2018 and 2019 (Fig. 7). Moreover, as noted during the evaluation of the results for the 1st test area, no-data is more likely to occur in shadowed areas where leaf-off vegetation is present. During the qualitative analysis of the results, one of the most commonly encountered problems was artifacts on the building edges (Fig. 9 a).
They are more frequent on the shadowed parts of buildings and are often connected to a no-data area. Unfortunately, exceptions to both of these rules are common. Furthermore, the larger artifacts of this type are often connected with a locally higher point density and a significant increase of the standard deviation value for most of them (which are smaller but still significant, as in Fig. 9). This makes it hard to find any quantitative indicators derived from the DSM metainformation layers or even directly from the 3D point cloud.

CONCLUSION
The year-to-year comparisons of true orthophotos produced with different overlap scenarios gave stable results in terms of product quality, with similar types of errors occurring. The minor differences could be explained by different weather (sun) conditions during image acquisition.
The results showed that for a downtown city area with dense buildings, skyscrapers and inner courtyards, Dense Image Matching on nadir images is insufficient to obtain a Digital Surface Model without any no-data areas, even in the case of images with high overlaps of 80/80. The results could probably be improved by including an additional dataset, an Airborne Laser Scanning point cloud, during processing.
No-data areas occur more frequently in the shadowed parts of the terrain, probably due to the lower radiometric quality of the images. One of the factors that aggravates this problem is the occurrence of leaf-off vegetation in the shadowed areas: the crowns of leaf-off trees tend to cause blunders and errors, especially when they are located in the shadows of buildings. On the other hand, the crowns of leaf-on trees are reconstructed properly in most cases, but they cause errors in the surrounding areas (because of occlusion). In the presented research, the standard sources of matching errors, such as water areas or glass roofs and facades of buildings, were omitted.
With recent developments in Dense Image Matching it is possible to create true orthophotos automatically; however, the resulting products are not free from errors and artifacts. While a lot of effort has been devoted in recent years to developing quantitative methods for comparing Digital Surface Models or point clouds, quality indicators from these intermediate products are not directly transferable to the quality evaluation of the final product, the true orthophoto. On the other hand, in complex urban scenes (Fig. 10) even a manual quality check is difficult to perform. With the rapidly growing popularity of true orthophotos, additional research is needed in this area.