3D Content Generation using Hybrid Aerial Sensor Data

In aerial data acquisition a new era started with the introduction of the first real hybrid sensor systems, such as the Leica CityMapper-2. Hybrid in this context means the combination of an (oblique) camera system with a topographic LiDAR in one integrated aerial mapping system. By combining these complementary sub-systems, the weaknesses of one can be compensated by the other data source. An example is the mapping of low-light urban canyons, where image-based systems mostly produce unreliable results. For a LiDAR sensor the geometrical reconstruction of these areas is straightforward and leads to accurate results. The paper gives a detailed overview of the development and technical characteristics of hybrid sensor systems. The process of data acquisition is discussed and strategies for hybrid urban mapping are proposed. A hybrid sensor alone is just one part of the whole procedure to generate 3D content. As important as the sensor itself is the workflow to generate the products. Here again a hybrid approach, with the processing of all datasets in one environment, is discussed. Special attention is paid to the hybrid orientation of the data and the integrated generation of base and enhanced products. The paper is rounded off by a discussion of the advantage of LiDAR data for 3D Mesh generation in urban modelling.


INTRODUCTION
The need for a wide variety of up-to-date data is becoming more and more important. Especially in the context of smart cities, 3D data is one of the main sources for all kinds of planning processes. As a consequence, a one-time 3D model of a city is no longer sufficient. On the contrary, there is a strong need for faster updates of the data and a greater variety of data products. In this context elevation becomes highly important, meaning not only terrain elevation above sea level, but also the height of objects above the ground, like building or tree heights. Additionally, the data should be suitable for all kinds of analysis, so additional semantic information that comes with or can be derived from the data itself will be one of the success criteria. Acquiring the required data with traditional sensor systems would need a large number of flying hours, which is often difficult due to airspace regulations, weather conditions or availability of the right equipment, as well as expensive and (due to CO2 emissions) ecologically questionable. By combining single sensors such as an image sensor and a LiDAR into hybrid systems for simultaneous capture of all required information, the results become more reliable and the cost and environmental impact are reduced.

History of hybrid Sensors
For a very long time, aerial data acquisition to produce geospatial content, such as ortho images, elevation models or base mapping, was done purely with aerial camera systems. With the development of laser technology, together with the introduction of global navigation systems and high-performance IMUs, the first airborne laser scanner systems were introduced in the mid-1990s. This led to the formation of two camps around airborne data acquisition: image-based data acquisition on the one side and LiDAR-focused acquisition on the other. Each claimed to be superior to the other. Viewed objectively, it was always obvious that both were right and wrong at the same time; in other words, it was like comparing apples with pears.
Both systems have their pros and cons. From image data, for example, it is only possible to map what can be seen, and for the generation of 3D content "seen" means that each object needs to be visible in at least two images taken from different positions. LiDAR, on the other hand, generates only 3D points without any relation between the single points, and without additional image data these points are often hard to classify or interpret (Lemmens, 2020).
The differences between the two groups also led to typical products derived from the sensor data: from image data, usually large orthophoto mosaics are generated, and from LiDAR data, typically DTM/DSM data is derived. Both have their strengths, which in combination help to produce a superior data product.
Considering the pros of image-based and LiDAR systems, it becomes clear that they are complementary, which consequently led to the first hybrid systems. In the early days this meant that LiDAR systems had an additional nadir camera as a piggyback sensor, or that in aircraft with two holes a LiDAR and an aerial camera were used simultaneously. Then, with the help of large efforts in miniaturization, the first integrated hybrid sensor, the Leica CityMapper, was released in 2016 by Leica Geosystems. It combined three types of sensor systems, a nadir camera system, an oblique camera system and a topographic LiDAR system, in one pod. In this setup the sub-systems are combined in a way that they have adapted fields of view and performance parameters on the one side and make use of the same integrated GNSS/IMU system on the other side. This first integrated hybrid sensor was just the start of this new hybrid era (Toschi et al., 2019). Regardless of the differences between the different hybrid systems, they all have one feature in common: they are all combinations of first-class sub-systems, so that none of them is inferior to the other.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2021, XXIV ISPRS Congress (2021

Hybrid Sensor Characteristics
Summarizing the above, a real hybrid sensor needs to fulfil the following:
• all sub-systems are integrated in one platform (one pod)
• all sub-systems use the same GNSS/IMU system
Hybrid sensor technology can, and most probably will, be used not only for urban mapping, but also for large-area aerial data acquisition in the near future. However, this means that the setup of the overall system needs to be tuned towards one or the other application. For urban mapping the focus is on 3D city mapping with products like city models, 3D Mesh models, TrueOrtho, and DSM/DTM point clouds. This means that obscured areas (due to building lean or narrow roads) must be avoided. For this purpose, camera systems with a large field of view (FOV) (more than about 30 deg) are not suitable or produce a lot of redundant data. For city modelling it is usually also necessary to use an oblique viewing angle (between 35 deg and 45 deg) to capture facades, which are directly needed for the texturing of the building or 3D Mesh models. For large-area mapping the focus is usually different, and one of the key success criteria is a larger field of view of the camera and LiDAR system for efficient mapping. Building lean is less important for these applications and can, if necessary, be minimized with additional flight lines in dedicated areas. The oblique views are also less important in this case. A proposed configuration for urban and large-area mapping can be found in Table 1.

Hybrid sensor for urban mapping
Following the criteria above, the Leica CityMapper-2 is the only real hybrid sensor system for aerial data acquisition on the market. Consequently, this paper focuses on the characteristics and performance of the Leica CityMapper-2 as an example of a hybrid sensor for urban mapping.
The CityMapper-2 is designed specifically to fulfil the needs of urban mapping as described above. The sensor system consists of an integrated nadir and oblique camera system with a Hyperion 2+ LiDAR in a single pod (Figure 1). The system is supplemented by an integrated GNSS/IMU system and the storage capacity required to deal with the large data volumes recorded during a typical mission.
In urban mapping projects one usually has to deal with areas with strongly varying lighting conditions. This places very high demands on the dynamic range and low-light performance of the cameras. To meet these demands, a camera system with forward motion compensation (FMC) is recommended in order to be able to fly at high speed even under low lighting conditions, as shown in Figure 2. The principle of oblique viewing angles for the modelling of vertical surfaces is widely used in 3D city modelling. Combining the advantage of an oblique viewing angle with an active LiDAR system results in a LiDAR system with a conical scan pattern as shown in Figure

HYBRID DATA ACQUISITION AND ORIENTATION
As described in the previous sections, each single component of a hybrid sensor system has several disadvantages or weaknesses on its own. By combining these complementary sub-systems into an integrated sensor, most of the less advantageous characteristics can be eliminated (see Table 2 and Figure 5). As a result, the variety of data products is larger, and the quality and reliability of the resulting data is higher. One good example is the mapping of narrow streets with the adjacent buildings in an urban environment. In this section the planning and acquisition process for a hybrid aerial urban mapping mission is described in more detail. Following the data acquisition, the orientation of the data into a homogeneous reference frame is key for all later product generation from the hybrid datasets. Here we investigate different approaches for hybrid aerial sensor orientation.

Hybrid data acquisition for urban mapping
In narrow urban streets, an image-only approach cannot extract enough ground points in the required quality, because a point is often not visible in more than one image and, due to shadows, the contrast is not sufficient for image matching. Here LiDAR measurements are a big advantage, as the sensor only needs to "view" the reflection from the ground from a single point overhead. As it is an active system, shadows have no negative influence either. Another example is the mapping of areas covered by vegetation. An image-based approach is only able to map what is visible from above, so the ground is often not visible, particularly when one point on the forest floor has to be seen from multiple locations in the air. A LiDAR system overcomes this, as parts of a laser pulse (i.e., of the "footprint" of the pulse on the canopy) are able to pass through openings within the footprint to the lower vegetation levels, so that reflections from the ground and the canopy surface (and often also from the objects in between) are recorded.
On the other hand, a LiDAR-only system would have several deficits. For the generation of textured 3D models, image information is mandatory and can only be added from an image sensor. In this case, the oblique sensor is a great advantage, as it adds the views onto the facades. One last important point to mention is the spacing between single points. From image sensors the number of points per m² is much higher, with a point spacing usually in the range of the GSD, whereas for the LiDAR sensor the point spacing is larger (i.e., fewer points/m²).
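The relation between GSD, point spacing and point density mentioned above can be illustrated with a small calculation. The following is a minimal sketch, assuming that dense image matching yields at most one 3D point per pixel and that LiDAR points are distributed roughly uniformly; the function names and the example values (5 cm GSD, 25 points/m²) are illustrative assumptions and are not taken from the paper.

```python
import math


def image_points_per_m2(gsd_m: float) -> float:
    """Upper bound for the photogrammetric point density.

    Assumes (for illustration) one matched 3D point per image pixel,
    so the point spacing on the ground equals the GSD.
    """
    if gsd_m <= 0:
        raise ValueError("GSD must be positive")
    return 1.0 / (gsd_m * gsd_m)


def mean_point_spacing_m(points_per_m2: float) -> float:
    """Mean point spacing for a roughly uniform point distribution."""
    if points_per_m2 <= 0:
        raise ValueError("point density must be positive")
    return math.sqrt(1.0 / points_per_m2)


if __name__ == "__main__":
    # Hypothetical example values, not from the paper.
    gsd = 0.05            # 5 cm ground sampling distance
    lidar_density = 25.0  # LiDAR points per square metre

    print(f"image matching: up to {image_points_per_m2(gsd):.0f} points/m²")
    print(f"LiDAR spacing:  {mean_point_spacing_m(lidar_density):.2f} m")
```

Under these assumptions the photogrammetric point cloud is more than an order of magnitude denser than the LiDAR point cloud, which is why the two sources complement each other in the fine details.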
Taking the above into account, the logical next step is hybrid data acquisition. Using a hybrid sensor system brings the advantages of both types of systems together. If there are no compromises in the quality of the single sub-systems, there will also be no negative side effects from using a real hybrid system. For further reading, Mandlburger et al. (2017) elaborate on some of these aspects in more detail.
Finally, there is value in the fact that the hybrid sensor collects all data simultaneously, minimising temporal effects, such as vehicles in the LiDAR dataset that are not in the image data and vice versa. In addition to the sensor-specific aspects, the flight planning and acquisition configuration is another important success factor for hybrid aerial data acquisition. To better understand this, it is important to first have a closer look at the products typically requested in urban mapping projects:

Image Data LiDAR Data
For the various products it is important that the initial data already match. For the DSM, LiDAR points with a high elevation accuracy are complemented by the photogrammetric point cloud for the fine details. For this it is essential that the overlap is chosen in a way that all areas are visible in at least two images. Therefore, a suitable forward overlap is necessary. For the TrueOrtho the DSM needs to have straight and sharp (building) edges, and for the nadir images it needs to be ensured that no areas are occluded due to building lean effects. Hence, for the TrueOrtho the side overlap needs to be chosen in relation to the FOV of the camera and the maximum building height and distance in the project area, in a way that everything is visible in at least one image (see Figure 6). The basis for the 3D Mesh is a combined dense point cloud containing both photogrammetric and LiDAR points for the geometrical modelling, and the image data for texturing. For the LiDAR setup and the nadir images there are no additional requirements; only for the oblique images it is important that the facades are fully represented in the image data with an appropriate GSD. For DTM generation there are no additional requirements. For the LOD models, full coverage of the facades is mandatory if textured models are required.

Figure 6: Building Lean vs. Side-Overlap, preferred configuration for TrueOrtho production
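The dependency between side overlap, camera FOV and building geometry described above can be made concrete with a simplified calculation. The following is a minimal sketch, not the planning rule used by the authors. It assumes flat ground, a street of width w running parallel to the flight lines between vertical walls of height h, a nadir camera at height H above ground, and the requirement that every ground point in the street is visible from at least one flight line. Under these assumptions a ground point at distance x from one wall can be seen from a cross-track window of camera positions of width H·x/h + H·(w − x)/h = H·w/h, so the flight line spacing must not exceed H·w/h. All function and parameter names are illustrative.

```python
import math


def max_line_spacing_m(flying_height_m: float,
                       street_width_m: float,
                       building_height_m: float) -> float:
    """Largest flight line spacing that keeps the street floor visible.

    Simplified geometry: every ground point between two walls of equal
    height is visible from a cross-track window of width H * w / h.
    """
    return flying_height_m * street_width_m / building_height_m


def swath_width_m(flying_height_m: float, fov_deg: float) -> float:
    """Cross-track ground coverage of a nadir camera over flat terrain."""
    return 2.0 * flying_height_m * math.tan(math.radians(fov_deg) / 2.0)


def min_side_overlap(fov_deg: float,
                     street_width_m: float,
                     building_height_m: float) -> float:
    """Minimum side overlap (0..1) under the assumptions stated above.

    The flying height cancels out: overlap = 1 - (w / h) / (2 * tan(FOV / 2)).
    """
    ratio = (street_width_m / building_height_m) / (
        2.0 * math.tan(math.radians(fov_deg) / 2.0))
    return max(0.0, 1.0 - ratio)


if __name__ == "__main__":
    # Hypothetical project values: 10 m wide streets, 30 m high buildings.
    for fov in (30.0, 60.0):
        overlap = min_side_overlap(fov, street_width_m=10.0,
                                   building_height_m=30.0)
        print(f"FOV {fov:.0f} deg: side overlap of at least {overlap:.0%}")
```

In this simplified model a wider FOV requires a higher side overlap for the same street geometry, which is consistent with the statement that large-FOV cameras produce a lot of redundant data in urban mapping.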

Hybrid sensor orientation
To be able to use hybrid sensor data to the full, it is mandatory to have not only a hybrid sensor, but also a fully hybrid workflow as shown in Figure 7. Looking at the integrated hybrid workflow, the "Pre-Process" and "Product Generation" parts primarily simplify processing, whereas the "Adjustment" part is essential for good quality and high accuracy of all derived products.
Historically, the adjustment of the image data and the adjustment of the LiDAR data are independent processes. The image data are adjusted as part of a bundle block adjustment. The adjustment of the LiDAR data is done by a strip adjustment of the single flight strips into a LiDAR block. The LiDAR block then can be transformed into the correct frame using reference structures or planes.
With a hybrid sensor system, both sub-systems use the same flight trajectory for the orientation. Starting from this, there are different levels of integrating the two complementary sources, image and LiDAR, into a hybrid and consistent solution (see Figure 8). The easiest but weakest way to apply a consistent set of orientations is the use of direct georeferencing for both datasets. This leads to datasets with a stable relative orientation, but an absolute orientation using reference points or structures, and thus a datum check, is missing. To introduce an absolute datum into the orientation and make the results more accurate in terms of absolute orientation, the use of GCPs in the image block adjustment is a suitable way. Here the global shift parameters from the image adjustment are applied as corrections to the LiDAR datum to obtain a common absolute reference frame. The disadvantage is that tilts or local effects are not considered.
To overcome this, an integrated block and strip adjustment with object space tie points is considered. This leads to a consistent solution between the two datasets but does not consider effects caused by calibration deficits. To fix this weakness as well, a full hybrid orientation that models an adjusted common trajectory for all sub-systems is proposed. More details and good overviews on hybrid orientation are presented in Haala et al. (2020), Toschi et al. (2018) or Glira (2018).
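The second integration level described above, where global shift parameters are transferred to the LiDAR data, can be illustrated as follows. This is a minimal sketch under simplifying assumptions: the global shift is estimated here as the mean 3D difference between reference GCP coordinates and the coordinates of the same points in the directly georeferenced data, which stands in for the shift parameters that a real image block adjustment would deliver. The function names and coordinates are hypothetical.

```python
from statistics import fmean
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def global_shift(reference: Sequence[Point],
                 measured: Sequence[Point]) -> Point:
    """Mean 3D offset (reference minus measured) over all control points."""
    if not reference or len(reference) != len(measured):
        raise ValueError("need the same non-zero number of points")
    diffs = [tuple(r - m for r, m in zip(ref, mea))
             for ref, mea in zip(reference, measured)]
    dx, dy, dz = (fmean(axis) for axis in zip(*diffs))
    return (dx, dy, dz)


def apply_shift(points: Sequence[Point], shift: Point) -> List[Point]:
    """Apply one constant shift to every LiDAR point.

    A constant shift cannot model tilts or local deformations, which is
    exactly the limitation of this integration level.
    """
    return [(p[0] + shift[0], p[1] + shift[1], p[2] + shift[2])
            for p in points]


if __name__ == "__main__":
    # Hypothetical GCP coordinates (E, N, H) in metres.
    gcp_reference = [(10.0, 20.0, 5.0), (30.0, 40.0, 6.0)]
    gcp_measured = [(9.9, 20.2, 4.7), (29.9, 40.2, 5.7)]

    shift = global_shift(gcp_reference, gcp_measured)
    lidar_points = [(15.0, 25.0, 5.2), (16.0, 26.0, 9.8)]
    print(shift)
    print(apply_shift(lidar_points, shift))
```

The higher integration levels (integrated block and strip adjustment, full hybrid orientation with a common adjusted trajectory) replace this single constant shift with parameters estimated jointly for both datasets and are beyond the scope of such a sketch.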

Standard Products from hybrid data
A hybrid sensor system like the Leica CityMapper-2, with an oblique and nadir image system together with a LiDAR scanner, generates nadir and oblique images with every exposure and records LiDAR points continuously. With this, a wide variety of (standard) products (see Figure 9) can already be produced without any additional effort. This is possible as all data is recorded simultaneously with consistent orientations. It is important to mention that each sub-system makes its own special contribution within the product range. Table 3 and Figure 10 highlight the contribution of the single datasets to the final products. They show that the nadir images contribute most to Ortho generation and the texturing of roofs and mostly horizontal surfaces. In addition, the nadir images are of high importance for the detailed modelling of the surface for TrueOrtho generation, as they help to model building edges in a detailed and straight way. The oblique images make an important contribution to the texturing of vertical structures and the modelling of facades. Finally, the LiDAR data directly delivers a 3D point cloud that can be turned into a DSM and, after classification, into a DTM. In a 3D building modelling workflow, the LiDAR data is best suited to reconstruct the shape of the building roof types.
Due to the time synchronisation of the single datasets, they are also perfectly suited for land use classification based on artificial intelligence. The combination of image data with a normalized DSM yields 3D land use data as shown in Figure 11; here trees are transferred directly into single 3D objects. A comparison of the 3D land use result with the 3D mesh can be seen in Figure 12, both automatically generated by Melown Technologies with their Vadstena 3D Reality-capture System.
Table 3: Contribution of dataset to final product, ++ strong, + minor, - no contribution
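The normalized DSM used for the 3D land use classification is the difference between the surface model and the terrain model, i.e. the height of objects above the ground. The following is a minimal sketch of this computation, assuming that DSM and DTM are available as co-registered grids of equal size; the plain nested-list representation, the clamping of small negative differences to zero and the function name are illustrative choices, not part of the workflow described in the paper.

```python
from typing import List, Optional

Grid = List[List[Optional[float]]]


def normalized_dsm(dsm: Grid, dtm: Grid) -> Grid:
    """Object height above ground: nDSM = DSM - DTM, cell by cell.

    Cells that are None (no data) in either grid stay None. Small
    negative differences, which can result from noise, are clamped
    to zero (an assumption of this sketch).
    """
    if len(dsm) != len(dtm):
        raise ValueError("DSM and DTM must have the same number of rows")
    result: Grid = []
    for dsm_row, dtm_row in zip(dsm, dtm):
        if len(dsm_row) != len(dtm_row):
            raise ValueError("DSM and DTM rows must have the same length")
        row: List[Optional[float]] = []
        for surface, terrain in zip(dsm_row, dtm_row):
            if surface is None or terrain is None:
                row.append(None)
            else:
                row.append(max(0.0, surface - terrain))
        result.append(row)
    return result


if __name__ == "__main__":
    # Hypothetical 2 x 2 elevation grids in metres above sea level.
    dsm = [[105.0, 112.5], [101.0, None]]
    dtm = [[100.0, 100.5], [101.2, 100.0]]
    print(normalized_dsm(dsm, dtm))
```

Because image and LiDAR data are captured at the same time, the heights in such a grid refer to the same objects that appear in the imagery, which is what makes the combination suitable as classifier input.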

Advantage of LiDAR data for 3D Mesh generation
Some of the advantages of using LiDAR data in addition to image data for urban mapping have already been stated. Looking at the generation of 3D Mesh models, the use of LiDAR data helps to solve a number of issues that occur when working with images only, even if flown with a very high overlap. The main reason for this lies in the photogrammetric image matching approach itself. To generate a 3D point, at least two images with adequate local texture representing the same point on the ground are needed. As this is not always the case, some areas are not represented, or not represented well enough, in the photogrammetric point cloud:
• Vegetation: the ground under vegetation cannot be modelled in an image-only approach. LiDAR adds ground points and allows the surface (tree crown) as well as trunk and ground to be represented.
• Shadow areas: in shadow areas or other areas with low lighting conditions it is difficult for matching algorithms to find corresponding points. This often leads to mismatches and thus considerable noise in the resulting point cloud. LiDAR as an active system does not need any sunlight and needs only one measurement to generate a point in the shadow area with high accuracy.
• Occlusions & Canyons: narrow roads often make it impossible to get two images representing the same point on the ground in order to generate a 3D point with photogrammetric means. With LiDAR it is only necessary to bring one pulse to the ground to measure a 3D point with high accuracy.
• Homogeneous surfaces: due to the low variation in the grey values it is often difficult to get proper matches on the ground with photogrammetric means. This also applies to repetitive structures and often leads to mismatches and noise in the resulting point cloud. Again, LiDAR is not affected by this and can easily generate measurements on these surfaces.
Some examples illustrating the differences between an image-only and an image + LiDAR approach are shown in Figure 14 and Figure 15. Figure 14 shows the advantage in narrow roads (1), backyards (2) and for the modelling of facades (3). Figure 15 illustrates the advantage of LiDAR for homogeneous surfaces (in shadows) and at the transition between road and buildings. In general, it can be shown that LiDAR adds an additional level of robustness to 3D mesh modelling.

New and enhanced products from hybrid data
It was shown in the previous sections that hybrid data has several advantages compared to dedicated image or LiDAR sensors. The classical data products can be generated more easily, more robustly or with higher accuracy. In addition, many new products can be developed. A selection of the possibilities is presented in Table 4.

Product | Benefits of Hybrid Data
Vegetation cover analysis | 5 bands of spectral data, better tree height measurement
Transmission line vegetation clearance | Easier detection of vegetation interference. Detection of transmission lines in LiDAR data
Urban forestry | 5 bands of spectral data and better tree-height measurement due to LiDAR data
Change detection | Fast refresh rate, 5+ "bands" facilitates 3rd-party change detection analysis
Tree danger map - Smart Monitoring | Easier detection of trees close to critical infrastructure (Figure 16)
Robust classification | All data acquired at same time, fewer temporal changes, perfectly suits AI
Simplified contracting process | Multiple data in same flight (RGB, NIR, orthos, obliques, LiDAR)
Table 4: Samples of new and enhanced products from hybrid data
Figure 16: Dashboard for Smart Monitoring of trees close to critical infrastructure - (c) Hexagon Geospatial

SUMMARY AND CONCLUSIONS
In this paper the technical specifications of a hybrid sensor system were presented. It was shown that by combining a passive image system with an active LiDAR system in one pod, the weaknesses of one part are compensated by the strengths of the other and vice versa. The workflow from data acquisition, through the orientation of the image and LiDAR data, to the generation of a great variety of products was illustrated. In the last section the advantage of the additional LiDAR data for the 3D modelling of urban areas was presented and new or enhanced product ideas were proposed.
The simultaneous capture of image and LiDAR data leads to consistent data products and opens the door to a wide range of new data products. The flight planning can be optimized and the overlap between flight strips can be reduced; with this, the flying time and thus the environmental impact can be reduced. Customers will get more consistent data from the same flying hours. The workflow for hybrid data, from raw data to final products, also tends to be optimized and is moving towards hybrid solutions as well. This makes the processing much more harmonized for the operator, and investment in different software packages will no longer be necessary. The most critical point, the hybrid orientation, is already solved on different levels of complexity and integration and so ensures that the hybrid data will produce consistent data products.
From a product perspective, the sub-systems of a hybrid sensor all deliver their own contribution to the final deliverables. With the LiDAR as an addition to the classical oblique systems for urban mapping, the most critical areas, like narrow road canyons, shadow areas or vegetation, can be handled better. The hybrid solutions show more reliable and accurate results here.