CONFIGURATION AND SIMULATION TOOL FOR 360-DEGREE STEREO CAMERA RIG

ABSTRACT: The demand for capturing outdoor and indoor scenes is rising with the digitalization trend in the construction industry. Mobile mapping is an efficient solution for capturing these environments. Image-based systems with 360° panoramic coverage allow rapid data acquisition and can be made easily accessible when hosted in a cloud-based 3D geoinformation service. Designing such a 360° stereo camera system is challenging, since multiple parameters such as focal length and stereo base length, as well as environmental restrictions such as narrow corridors, influence each other. This paper therefore presents a toolset that helps to configure and evaluate such a panoramic stereo camera rig. The first tool determines the distance from which 360° stereo coverage is achieved for a given parametrization of the rig. The second tool captures images with the parametrized camera rig in different virtual indoor and outdoor scenes. The last tool stitches the captured images together, taking into account the intrinsic and extrinsic parameters from the configuration tool. This toolset radically simplifies the evaluation process of a 360° stereo camera configuration and decreases the number of physical mobile mapping system (MMS) prototypes.


INTRODUCTION
In recent years, image-based mobile mapping has evolved into a highly efficient and accurate mapping technology for collecting 3D data, which can be provided through a cloud-based web service. Novak (1991) and Schwarz et al. (1993) designed the first stereovision-based mobile mapping systems (MMS). Ongoing improvements in positioning and imaging sensors, algorithms and computing technologies have enabled very powerful stereovision mobile mapping approaches. Investigations in mobile mapping at the Institute of Geomatics (IGEO), University of Applied Sciences and Arts Northwestern Switzerland (FHNW), started in 2009 with the development of an image-based MMS, which has evolved into a multi-stereo camera system (Burkhard et al., 2012). This system generation is already used extensively for large-scale road and rail infrastructure management. There is a large demand to expand the multi-stereo camera configuration to full 360° panorama coverage while maintaining the measurement capabilities at the same accuracy.
Existing 360° panoramic MMS equipped with a single panorama camera, e.g. Heuvel et al. (2006), use co-registered LiDAR data for depth map generation. However, the scan resolution limits the density of the depth map. Therefore, the depth map should be generated from stereo vision, since generating dense depth maps from a stereo system ensures the spatial and temporal coherence of the radiometric and depth data of the 3D imagery (Nebiker et al., 2015). Existing systems equipped with stereoscopic panoramic cameras have either a vertical (Earthmine, 2014) or a horizontal arrangement (Blaser et al., 2017) to enable full coverage of complex environments.
With the trend in architecture and construction towards digital building design and construction progress control, the need for accurate and rapid 3D mapping of indoor scenes is rising. Existing stereoscopic panoramic camera systems such as those described above cannot easily be adapted to indoor mobile mapping applications. The BIMAGE Backpack (Blaser et al., 2018) is one of the few indoor MMS that uses a Structure from Motion (SfM) method for generating 3D information. SfM-based systems with a virtual stereo base cannot reach the high accuracy of a system with a calibrated physical base. Holdener et al. (2017) proposed a multi-head stereo panoramic camera arrangement with five physical stereo bases. With a fisheye stereo system with a base of 60 cm and low-cost sensors, they achieved maximum deviations of 3 cm at distances of up to 8 m. Nevertheless, the accuracy of depth maps for indoor mobile mapping could be increased further with an optimized system configuration. Optimizing this configuration is challenging, however, since multiple parameters are involved and different requirements have to be met:
- The system should provide full 360-degree depth information.
- A panorama free of stitching errors shall be generated from the images.
- Depth information should be available from a certain minimal distance.
The involved parameters are, for example, the camera field of view (FOV), the camera resolution, the base length of the physical stereo base, and the minimal distance from which depth information is available. Since the system comprises multiple cameras, there is no single projection center, which inevitably leads to stitching errors when all images are composed into a single panoramic image. Whether these stitching errors are visible depends on the scene and on the base length of every stereo pair. Based on the system by Holdener et al. (2017), different configurations with different parametrizations of the camera rig needed to be evaluated. Lian et al. (2018) proposed an Image Systems Simulation for 360° Camera Rigs. Their software supports designing and evaluating 360° camera rigs, but does not support a physical stereo configuration.
To easily design, test and validate different configuration ideas, we developed an evaluation toolset that allows users to define different system configurations and to evaluate them. The toolset consists of a stereo coverage tool, which lets users determine the distance from which depth information is available, and a camera rig configuration tool, which can be used to configure a stereo system, place it in virtual 3D scenes, and capture images. Finally, the captured images can be composed with a panorama tool, and the resulting panoramic image can be evaluated.
In Section 2, each part of our evaluation toolset is described. Section 3 shows how the tools can be used in an indoor and an outdoor use case.

EVALUATION TOOLSET
Based on the multi-view stereo camera system proposed by Holdener et al. (2017), a virtual camera rig has been created. This rig consists of five stereo pairs covering 360 degrees horizontally and one stereo pair facing upwards (Figure 1). The horizontal part of the rig forms a polygon with five corners. On every corner, two cameras are positioned. To save space, Holdener et al. (2017) proposed forming stereo pairs not from neighboring cameras but from cameras on next-but-one corners. Therefore, the edges of the polygon are not the stereo bases, and the length of the stereo base is slightly shorter than the diameter of the rig. The length of the stereo base is also the maximum extent of the whole rig.
Since the horizontally aligned cameras cannot cover the space above the rig, there is an additional stereo base facing upwards. The length of the upwards-facing stereo base can differ from that of the horizontal stereo bases.
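The horizontal rig geometry described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a regular pentagon (the text only states a polygon with five corners), and the function name `pentagon_rig` is hypothetical. The corners are placed so that next-but-one corners are exactly one stereo base apart.

```python
import math

def pentagon_rig(base_length):
    """Corner positions and stereo pairs of a regular five-corner rig.

    Assumption: the polygon is regular. Stereo pairs connect each corner
    with the next-but-one corner, so the base is a pentagon diagonal.
    """
    # A diagonal of a regular pentagon spans a central angle of 144 degrees,
    # so diagonal = 2 * R * sin(72 deg), where R is the circumradius.
    radius = base_length / (2.0 * math.sin(math.radians(72.0)))
    corners = [
        (radius * math.cos(math.radians(72.0 * i)),
         radius * math.sin(math.radians(72.0 * i)))
        for i in range(5)
    ]
    pairs = [(i, (i + 2) % 5) for i in range(5)]
    return radius, corners, pairs

radius, corners, pairs = pentagon_rig(0.60)  # 60 cm stereo base
edge = math.dist(corners[0], corners[1])
base = math.dist(corners[0], corners[2])
print(f"circumradius {radius:.3f} m, edge {edge:.3f} m, base {base:.3f} m")
```

Under this assumption, a 60 cm stereo base corresponds to a circumscribed diameter of about 63 cm, which is consistent with the base being slightly shorter than the rig diameter.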
We developed a toolset consisting of three tools. The first tool, stereo coverage, allows configuring the system depending on the distance from which overlapping depth information is available. The second tool enables configuring a stereo camera rig based on the parameters from the first tool, placing it in an indoor or outdoor scene, and capturing images. The last tool is used to stitch the captured images into a panorama image.
With the aid of the developed toolset, users can virtually evaluate different rig configurations and check the influence of different parametrizations. The first and the second tool have been developed using the game engine Unity. In the following sections, the different parts of the evaluation toolset are presented.

Stereo coverage
The first tool, stereo coverage, determines the distance from which depth information is available and whether depth information can be calculated in every direction. This tool simplifies the process of finding the right parametrization for full 360-degree depth coverage. Figure 2 shows the UI of the depth coverage tool, in which the parameters are set. The FOV of each stereo pair, which represents the availability of depth information, is projected from the centre of a sphere onto its surface. If the radius of the sphere is large enough, the FOVs of all stereo pairs overlap. This indicates that full 360-degree depth information is available from this distance onwards. If the FOVs do not overlap, either the FOVs of the cameras are too narrow, the stereo base is too long, or the radius of the sphere is too small. With this tool, a suitable parametrization can be found iteratively. Once a parametrization is found, the configuration can be used in the camera configuration tool with test scenes to acquire images and then to calculate a panorama image.
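The relation between FOV, stereo base and coverage distance can be approximated analytically. The following Python sketch is an illustration under simplifying assumptions that do not come from the paper: parallel pinhole cameras, and every stereo base centred on the rig centre, so that each of the five pairs has to cover a 72° sector. The function names are hypothetical, and the Unity tool itself works by projecting the FOVs onto a sphere rather than by these formulas.

```python
import math

def stereo_overlap_start(base_length, fov_deg):
    """Depth at which the FOVs of two parallel cameras start to overlap."""
    return base_length / (2.0 * math.tan(math.radians(fov_deg) / 2.0))

def full_coverage_radius(base_length, fov_deg, n_pairs=5):
    """Smallest sphere radius at which neighbouring stereo wedges meet.

    Simplifying assumptions (not from the paper): parallel pinhole cameras,
    and every stereo base is centred on the rig centre, so each pair must
    cover a sector of 360 / n_pairs degrees.
    """
    half_fov = math.radians(fov_deg) / 2.0
    half_sector = math.pi / n_pairs
    if half_fov <= half_sector:
        return math.inf  # FOV too narrow: the wedges never close the circle
    # The stereo wedge edge is x = z * tan(half_fov) - base_length / 2.
    # It crosses the sector boundary x = z * tan(half_sector) at depth z.
    depth = base_length / (2.0 * (math.tan(half_fov) - math.tan(half_sector)))
    return depth / math.cos(half_sector)

for base in (0.2, 0.4, 0.6):
    print(base, stereo_overlap_start(base, 90.0), full_coverage_radius(base, 90.0))
```

The sketch reproduces the three failure cases named above: a narrow FOV or a long base pushes the coverage radius outwards, and a sphere radius below the returned value leaves gaps between the stereo FOVs.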

Camera rig configuration
In a second step, the camera rig can be configured based on the parameters from the stereo coverage tool. This is done with a simple UI (Figure 3), in which multiple parameters can be set. The first three parameters control the horizontal camera rig.
The camera rig can be rotated around the vertical axis, its height above ground can be set, and the length of all horizontal stereo bases can be adjusted. The next three parameters control the upwards-facing stereo base. Its base length can differ from that of the horizontal stereo bases. The upwards-facing base can have a vertical offset with respect to the horizontal rig and can be tilted. The FOV of all cameras is adjusted by changing the horizontal FOV; the vertical FOV is calculated from the aspect ratio, which is defined by setting the sensor width and height.
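For a pinhole camera, the dependency between horizontal FOV, aspect ratio and vertical FOV can be written down directly. The sketch below is illustrative only; the function names and the example sensor size are assumptions and are not taken from the configuration tool.

```python
import math

def vertical_fov_deg(horizontal_fov_deg, sensor_width, sensor_height):
    """Vertical FOV of a pinhole camera from its horizontal FOV and aspect ratio."""
    half_h = math.radians(horizontal_fov_deg) / 2.0
    half_v = math.atan(math.tan(half_h) * sensor_height / sensor_width)
    return math.degrees(2.0 * half_v)

def focal_length(horizontal_fov_deg, sensor_width):
    """Focal length in the same unit as sensor_width (pinhole model)."""
    return sensor_width / (2.0 * math.tan(math.radians(horizontal_fov_deg) / 2.0))

# Hypothetical sensor of 36 mm x 24 mm with a 90 degree horizontal FOV.
print(vertical_fov_deg(90.0, 36.0, 24.0), focal_length(90.0, 36.0))
```

Note that the vertical FOV does not scale linearly with the aspect ratio; the ratio applies to the tangents of the half angles.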
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W13, 2019. ISPRS Geospatial Week 2019, 10-14 June 2019, Enschede, The Netherlands.
With a dropdown menu, different environment scenes can be chosen (see 2.2.1).
Figure 3. Camera rig with configuration panel (right)
Once the rig is positioned at the desired location and parametrized with the desired settings, 12 high-resolution images, one from each camera, are acquired. These images are exported together with the intrinsic and extrinsic camera parameters. The intrinsic parameters include the pixel size, the sensor dimensions and the focal length. The extrinsic parameters contain the relative position and orientation of every camera. In addition, a log file is created that records the absolute position of the camera rig, the FOV and the stereo base length. With this information, a panorama image can be calculated from the individual images.

Test Scenes:
To evaluate the rig in different environments, the configuration tool includes four indoor scenes and two outdoor scenes. One indoor and one outdoor scene are reconstructed from real-world data; the others are simulated scenes (Figure 4). The simulated scenes are open-source scenes from online 3D model repositories.
Both reconstructed and simulated scenes have their benefits. The advantage of a reconstructed scene is that the dimensions of the rig can be adapted to a real-world scene. There, the user can verify whether the camera rig fits through specific doors or narrow corridors.
In addition, users can draw on their local knowledge of a real scene and place the camera rig at a specific location in order to capture images and evaluate the parametrization of the rig at that location. The main advantage of simulated scenes is that they have strong edges. These edges are visible in the captured images, so stitching errors are more obvious after the panorama generation. In addition, simulated scenes have uniform lighting. Typical room dimensions are important for all scenes; a camera rig configured for an atypical room size is of little use. Navigation in the scenes is not completely free: every scene has predefined locations to which the rig can be navigated (Figure 5). The predefined locations are marked with blue spheres. The locations are predefined so that the captured images and the resulting panoramas are reproducible and therefore comparable. Once a sphere is selected, the viewer and the rig are moved to its position. The UI allows configuring the rig and capturing images at every location.
Figure 5. Camera rig with a blue sphere in the background, indicating a possible location for the camera rig

Panorama stitching
The final part of the evaluation toolset is the panorama generation tool, which is implemented in Python. The tool stitches the individual images into a panorama image, taking into account the intrinsic and extrinsic camera parameters. Only the left camera images of the stereo pairs are used for the panorama generation. Two parameters can be set: the size of the resulting panorama and the radius of the sphere onto which the individual images are projected. First, a look-up table (LUT) is calculated for every image; then all images are resampled with the LUT; finally, the images are stitched together. The sphere radius has a great impact on the stitching errors in the panorama image. Eliminating all stitching errors is not possible, since the rig cameras do not share a common projection center. The stitching errors are also expected to be more apparent with this configuration than in a panorama stitched from a multi-head camera such as the Ladybug5, because the cameras have a lateral offset of half the stereo base length. Figure 6 shows the individual camera images and the stitched panorama. Here, the camera rig is configured so that the upwards-facing camera points toward the zenith; with this parametrization, its image does not overlap with those of the horizontal cameras.
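The LUT step can be illustrated with a small, dependency-free Python sketch. It is not the authors' code: the equirectangular panorama layout, the axis conventions, the pinhole model and all function and parameter names are assumptions made for this example. For every panorama pixel, the corresponding point on the projection sphere is transformed into the camera frame and projected into the image.

```python
import math

def pano_to_camera_pixel(u, v, pano_size, radius, rotation, centre, focal_px, principal):
    """Map panorama pixel (u, v) to a pixel in one camera image, or None.

    Assumed conventions (illustrative): equirectangular panorama, rig frame
    with x right, y up, z forward; `rotation` is a 3x3 rig-to-camera matrix,
    `centre` the camera position in the rig frame; pinhole camera looking
    along its +z axis.
    """
    width, height = pano_size
    lon = u / width * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - v / height * math.pi
    # Point on the projection sphere, expressed in the rig frame.
    point = (radius * math.cos(lat) * math.sin(lon),
             radius * math.sin(lat),
             radius * math.cos(lat) * math.cos(lon))
    # Transform into the camera frame: p_cam = R * (p_rig - C).
    diff = [p - c for p, c in zip(point, centre)]
    x, y, z = (sum(a * d for a, d in zip(axis, diff)) for axis in rotation)
    if z <= 0.0:
        return None  # sphere point lies behind the camera
    col = principal[0] + focal_px * x / z
    row = principal[1] - focal_px * y / z  # image rows grow downwards
    return col, row

def build_lut(pano_size, image_size, **camera):
    """LUT: for every panorama pixel, the source pixel in the image or None."""
    lut = []
    for v in range(pano_size[1]):
        lut_row = []
        for u in range(pano_size[0]):
            hit = pano_to_camera_pixel(u, v, pano_size, **camera)
            inside = (hit is not None
                      and 0 <= hit[0] < image_size[0]
                      and 0 <= hit[1] < image_size[1])
            lut_row.append(hit if inside else None)
        lut.append(lut_row)
    return lut
```

Because the camera centre enters the transformation, the same panorama pixel maps to a different image pixel when the sphere radius changes, which is why the radius influences the stitching errors. For full-size images, an array library would normally replace the nested loops.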
Figure 6. Individual camera images (top row, middle row) and the resulting panorama with sphere radius 20 m

RESULTS
As mentioned before, this toolset enables evaluating the configuration of a 360-degree stereo camera rig. In the following sections, one case for an indoor mobile mapping application and one case for an outdoor application are described.

Indoor
For indoor mobile mapping applications, it is important that stereo coverage is already available at short range. In addition, the camera rig should be small enough to fit through doors. With these restrictions, a configuration has to be found with the stereo coverage tool. With the resulting parameters, images can be taken using the configuration tool. Once the images have been taken, they can be stitched together with the panorama generation tool. Generating panorama images from these images is challenging, and stitching errors are highly visible. As Figure 8 shows, a larger radius of the projection sphere and a shorter stereo base result in better panorama images.
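The influence of the camera offset and the sphere radius on stitching errors can be estimated with a simple 2D model. The Python sketch below is an assumption-based illustration, not the evaluation used in the paper: a camera sits beside the rig centre (for example by half the stereo base), observes a point straight ahead of the rig centre, and the pixel is drawn where its viewing ray intersects the projection sphere. The function name is hypothetical.

```python
import math

def seam_error_deg(camera_offset, object_distance, sphere_radius):
    """Angular misplacement of a scene point in the panorama, in degrees.

    Illustrative 2D model (an assumption, not the paper's method): the camera
    sits `camera_offset` metres beside the rig centre, the scene point lies
    straight ahead of the rig centre at `object_distance`, and the pixel that
    sees it is drawn where its viewing ray hits the projection sphere.
    Requires sphere_radius > camera_offset.
    """
    # Ray from the camera (offset, 0) through the point (0, object_distance).
    dx, dz = -camera_offset, object_distance
    # Solve |(offset, 0) + t * (dx, dz)| = sphere_radius for t > 0.
    a = dx * dx + dz * dz
    b = 2.0 * camera_offset * dx
    c = camera_offset ** 2 - sphere_radius ** 2
    t = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    hit_x = camera_offset + t * dx
    hit_z = t * dz
    # The true direction of the point is 0 degrees (straight ahead).
    return abs(math.degrees(math.atan2(hit_x, hit_z)))

for offset in (0.1, 0.2):  # half of a 20 cm and a 40 cm stereo base
    print(offset, seam_error_deg(offset, 3.0, 4.0), seam_error_deg(offset, 3.0, 20.0))
```

In this model the misplacement grows with the camera offset and vanishes only when the sphere radius equals the actual object distance, so the most suitable radius depends on the typical distances in the scene.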

Outdoor
For outdoor mobile mapping applications, the availability of short-range depth information is less crucial than for indoor applications. The minimum distance for depth availability is around 3 meters. Therefore, the camera rig can be configured differently. As in the indoor case, the parametrization of the camera rig needs to be adapted depending on the base length or the minimal distance for depth information. The length of the stereo base can also vary depending on the size of the mobile mapping platform. A shorter base length as in Figure 9 (left) would be appropriate for a quad-mounted MMS, whereas a camera rig with longer base lengths would be suitable for a mobile mapping car (Figure 9, right). As Figure 10 shows, stitching errors still exist even when the images are projected onto a sphere with a radius of 20 meters. The shorter the stereo base, the less visible the stitching errors are. A shorter stereo base, however, reduces the accuracy of the depth information; this tool therefore helps to find the best trade-off between visually clean panorama images and accurate 3D measurements.
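The accuracy side of this trade-off can be estimated with the standard depth precision formula for the stereo normal case, in which the depth error grows with the square of the distance and shrinks with the base length and the focal length. The sketch below is illustrative; the focal length of 1000 px and the matching precision of 1 px are hypothetical values, not parameters from the paper.

```python
def depth_precision(distance, base_length, focal_px, disparity_sigma_px=1.0):
    """Standard deviation of stereo depth for the normal case, in metres.

    sigma_z = z^2 / (f * B) * sigma_d, with the focal length f and the
    disparity precision sigma_d both given in pixels.
    """
    return distance ** 2 / (focal_px * base_length) * disparity_sigma_px

# Hypothetical camera with a focal length of 1000 px, object at 10 m.
for base in (0.2, 0.4, 0.6, 0.8):
    print(base, depth_precision(10.0, base, 1000.0))
```

Halving the stereo base doubles the expected depth error at a given distance. A wider FOV on the same sensor shortens the focal length in pixels and has a similar effect.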

CONCLUSION AND OUTLOOK
In this paper, we presented three tools that radically simplify the evaluation of a 360-degree stereo camera configuration. The tools are suitable for both indoor and outdoor mobile mapping systems. Further, they simplify and systematize the previously iterative evaluation process. Thus, the number of physical prototypes is expected to decrease drastically.
A variety of approaches to configuring a stereo panorama rig is supported. If the sensor, and therefore the FOV, is already given, the dimensions of the rig can be parametrized accordingly. Conversely, the rig can be parametrized according to the required minimum distance for depth information.
The toolset has already proved to be an important aid for sensor evaluation in a recent railroad-mapping project.
In the future, the tools will support additional camera models. Supporting fisheye cameras would open up additional fields of application. Merging the tools could further simplify the evaluation process. Another improvement would be an a priori accuracy visualization of the depth map directly in the configuration tool. With this visualization, the impact of altering the stereo base on the accuracy would be directly visible.

Figure 1. 360-degree camera rig: concept with horizontal field of view (bright red) and stereo field of view (dark red) (left), camera rig with horizontal and upwards-facing stereo base (right)
Figure 2. Depth coverage tool with the UI (right); here the FOVs are not overlapping

Figure 4. Indoor scenes (top row, middle row) and outdoor scenes (bottom row); the top left and bottom right scenes are real scenes from dense matching/laser scanning
Figure 7 shows that closer stereo coverage is only possible if either the base length decreases or the FOV increases. Both a shorter stereo base and a larger FOV result in a less accurate depth calculation from the stereo pairs.

Figure 8. Panorama from camera rig with stereo base length of 40 cm and projected sphere radius of 4 m (top) and 20 cm and projected sphere radius of 20 m (bottom)

Figure 10. Panorama from camera rig with stereo base length of 60 cm and projected sphere radius of 20 m (top) and 80 cm and projected sphere radius of 20 m (bottom)