CALIBRATION OF A MULTI-SENSOR WHEELED ROBOT FOR THE 3D MAPPING OF UNDERGROUND MINING TUNNELS

Mobile robotic systems show great potential for automating a vast selection of tasks in everyday life and in industrial facilities. The most important fields of application are those that limit human exposure to harmful conditions and greatly increase work safety, among other cost-driven factors. One of these fields is the mining industry. While in surface mining several tasks are already being automated, e.g., using drones, deploying such innovations at a market-ready level in GNSS-denied underground environments is significantly more challenging, partly because of the difficulty of acquiring the spatial data needed, e.g., for navigation. This paper presents a system calibration procedure (6D pose estimation) for a multi-sensor mobile robot developed for inspecting underground mining tunnels and infrastructures. We introduce the sensor setup for the acquisition of spatially related data (images, laser scans) and propose a multi-step calibration workflow that brings the different perception devices, such as RGB cameras, thermal cameras and LiDARs, into a coherent reference frame. The quality of the subsequent calibration stages is investigated based on visual results and derived statistical measures. The proposed procedure allowed the relative pose estimation of all sensors without specialized setups and targets, utilizing only natural urban scenes and a standard checkerboard pattern.


INTRODUCTION
Underground environments, such as mines, are normally precarious and dangerous places. Underground mining tunnels require regular inspections to ensure environment mapping, the safety of personnel, the smooth running of infrastructures (e.g., conveyor belts, ventilation, etc.) and the detection of anomalies (Dabek et al., 2022). Therefore, using a wheeled robotic solution equipped with multiple sensors and based on Simultaneous Localization And Mapping (SLAM) algorithms to inspect mining tunnels is a safer and more reliable solution (Chakravorty, 2019), and different best practices are available in the literature (Ghosh et al., 2017; Jacobson et al., 2020; Szrek et al., 2021; Zhou et al., 2021; Duarte et al., 2022). The need for reliable and precise 3D surveying data and maps of underground spaces, which are often unsafe and dangerous for humans, has motivated the development of a multi-sensor wheeled robotic system suited to these domains. This paper describes the robotic system developed within the activities of the EIT Raw Materials AMICOS project for the autonomous exploration of underground mining environments to support mapping processes. The robot carries six sensors (plus lights) whose relative orientation must be determined (6DOF pose estimation) in order to properly fuse all acquired data.

Mobile robots for inspection and 3D mapping in mining
Underground mining environments, whether in operation or abandoned, pose numerous objective challenges for autonomous vehicles: low-light conditions, dust, absence of GNSS signal, small obstacles, uneven ground, etc. Initial efforts to tackle these challenges produced large and bulky platforms that were nevertheless already able to autonomously navigate and explore long corridors of subterranean environments, mainly with a single LiDAR scanner onboard (Ferguson et al., 2003; Baker et al., 2004; Huber and Vandapel, 2006; Silver et al., 2006). More recently, also boosted by the developments and improvements in SLAM methods, the trend has moved towards the integration of multiple sensors. Vidas et al. (2013) integrated a thermal camera and a 3D LiDAR to create a 3D spatial thermographic map of the surveyed area. Neumann et al. (2014) presented a multi-sensor exploration vehicle for mapping underground mining sites composed of radar, cameras, multiple LiDARs and an IMU sensor. Kim and Choi (2021) proposed an autonomous driving robot to perform 3D mapping of mining tunnels based on two 2D LiDARs placed horizontally and vertically. A solid-state LiDAR (SSL) for mining mapping was presented by Wei et al. (2021). Mobile robots equipped with various sensors are widely used for inspecting and mapping mining scenarios and infrastructures. Anomalies in the operation of machines were searched for with LiDAR and thermal (IR) images (Dabek et al., 2022), while Skoczylas et al. (2021) merged LiDAR and acoustic data to monitor belt conveyors used in a mineral processing plant. Robotics-based 3D documentation of the mining environment for virtual reality applications was presented by Grehl et al. (2015). While many automatic procedures for monitoring infrastructure in the surface mining industry are commercially available, especially using drones, autonomous mobile robotics applications in underground mines are so far limited (Shahmoradi, 2020).
In the scope of the AMICOS project, one of the developed solutions tackles the issue of automating the monitoring and diagnostics of belt conveyor infrastructure in underground mines. The data acquired during the tests of the wheeled mobile robot developed for this use case has proven useful for different aspects of autonomous robot operation in mining conditions, e.g., detecting humans in the working environment (Szrek et al., 2021) and repeatedly following the path, as well as for automating the monitoring of conveyor belt elements such as idlers (Dąbek et al., 2022) and the belt (Trybała et al., 2020). While the analysis of data from a single sensor can already yield useful results, precise calibration of the whole sensor suite would allow the data from a multitude of sensors to be fused and, as a result, open the way for novel analysis methods. A multi-sensor robot developed for this case is described further in Section 3.

System calibration
For single sensors, RGB camera calibration is a well-known procedure in the photogrammetric and computer vision communities (Remondino and Fraser, 2006). For thermal cameras, the essence of the calibration process is to obtain hot reference points with known dimensions located on a flat surface. Liu et al. (2018) tested the heating of an ordinary checkerboard printed on paper and achieved good results. Another proposed method was a grating made of nickel-chromium heat-resisting wires, in which the crossing points of the wires formed the reference points for the camera calibration. LiDAR calibration is usually addressed from a geometric or radiometric point of view (Kaasalainen et al., 2009; Habib et al., 2011; Kashani et al., 2015; Kersten and Lindstaedt, 2022). There are many approaches for the 6D pose estimation of pairs of sensors (Debattisti et al., 2013; Guidel et al., 2017; Beltran et al., 2022). Velas et al. (2014) proposed a calibration algorithm for a Velodyne laser scanner and a camera using a 3D marker that is visible to both the camera and the LiDAR. Markers are based on simple shapes, such as squares or circles, and on the detection of their edges. After their identification in the images and point cloud, feature correspondences are found and the calibration process is performed. Huang and Grizzle (2020) presented a method of object position estimation for integrating camera and LiDAR images. Thanks to the use of additional targets, they significantly reduced the projection error (up to 50%). Chen et al. (2020) described a different approach for extrinsic calibration between a camera and a 3D LiDAR based on camera images acquired with an infrared filter. In this case, specially prepared 2D and 3D points were used to calculate the geometric parameters. The experiments were carried out with a Velodyne VLP-16 sensor.
Despite the rich body of research on calibrating single sensors or pairs of different devices, only a few established methods are available for computing the relative poses between many sensors working in the same system. Synchronization is generally an issue, as USB devices are known to introduce large, unknown time offsets (Olson, 2010), and sequential approaches are usually preferred (Oliveira et al., 2022). Some works tackle the problem by considering sensor measurement uncertainties, thereby allowing sensors with very different error characteristics to be used side by side in the calibration (Pradeep et al., 2014). A planar target for the simultaneous calibration of cameras, LiDARs and radar sensors was presented in Domhof et al. (2019), whereas Glira et al. (2022) and Jiao et al. (2021) presented online target-free calibration methods for multi-sensor systems.

PROPOSED MULTI-SENSOR WHEELED ROBOT
The realized multi-sensor wheeled robot is composed of RGB and thermal cameras, an RGB-D sensor and two LiDAR scanners (Table 1 and Figure 1). The sensors are connected to a companion computer via USB and Ethernet ports. Where required, additional converters are used to convert the signal to the appropriate standard. Due to the limited number of connectors, one of the devices is connected via an external LAN card using the USB port. A PC running Ubuntu Linux and ROS (Robot Operating System) collects the data. ROS is a popular meta-operating system used in robotics. It allows the acquisition of sensory data, sends control commands to the robot, and carries out specific actions. ROS programs mostly run using a Publisher-Subscriber communication system. Each sensor has its own independent software module responsible for reading the data and publishing it in a ROS-supported format using a data stream called a topic. The published data can be read and processed by a Subscriber module. Data visualization is done via Rviz, installed with ROS. The connection diagram for the sensory system is shown in Figure 2. Cameras ST 1 and ST 2 are monochrome Basler cameras and operate in stereo mode. Data synchronization takes place at the recording computer level. The RGB-D Intel RealSense D455 provides depth images, especially in low lighting conditions. It has a wide field of view and a range of up to 6 m with small external dimensions. The next sensors, the RGB and IR cameras, work as a pair. The RGB camera lens has been selected to make its field of view similar to that of the IR camera. The IR camera sends the recorded image in analogue form, so it was necessary to use an analogue-to-USB signal converter (AV-USB). Two types of LiDARs are also included and used for mapping purposes. The first is a Velodyne VLP-16 with 16 measurement lines that form layers and a scanning angle of 360 degrees.
The Velodyne was vertically mounted on a rotating module based on a Dynamixel servo motor to increase the resolution of the acquired data. The second LiDAR is a Livox Horizon, which has a smaller angular range but produces a denser point cloud.
The robot is remotely controlled with a radio-based steering panel (Figure 3). Thanks to the wireless connection, it is also possible to remotely observe the visualized sensory data on a standard Android-based tablet. The data from the LiDAR and optical sensors are supplemented by an additional inertial sensor, an NGIMU, connected to the USB port. An independent battery powers the sensory system. The DC/DC converters provide the appropriate voltages to power the computer and the sensors requiring an external power supply. The sensor system is mounted on a column transported by a wheeled mobile robot with a skid steering driving system.

Overall process
Since the measurement system consists of several types of sensors (LiDARs, cameras and inertial units), different means of calibration must be utilized to obtain the full characteristic of the system. A multi-step calibration process is proposed to calculate the internal parameters of all cameras and relative transformations between reference frames associated with each sensor (6D relative poses). The procedure is outlined in Figure 4. First, the intrinsic parameters of each camera are determined based on the tie points detected on a standard checkerboard. The RGB-D camera is selected as the main sensor, linking coordinate systems of cameras, thermal camera and LiDARs.
In the next step, simultaneously acquired images of a checkerboard from the RGB-D, stereo and RGB cameras are used to estimate the relative transformation between those sensors. A similar procedure is performed with RGB-D and thermal camera data but using a heated checkerboard. Due to the size limitations of the heated bed for the pattern, it could not be used for calculating the extrinsic parameters of the whole camera system (a reduction in the calibration accuracy was noted). For the computation of the internal and external camera parameters, the open-source ROS library Kalibr (https://github.com/ethz-asl/kalibr) was utilized with appropriate code modifications.

Intrinsic camera calibration
For every camera in the system, the interior parameters were independently computed using the method presented by Zhang (2000). A pinhole camera model with radial and tangential (decentering) lens distortions (k1, k2 and P1, P2) is used for all optical sensors, whereas the skewness factor was assumed to be insignificant, as the influence of this parameter on the mobile mapping process is normally negligible (Heikkila and Silven, 1997). To correctly compute the internal parameters of the cameras, images were captured with sufficient camera rotations, varying camera-object distances and coverage of the entire sensor. The method detects a set of coplanar tie points of a checkerboard and estimates the four pinhole intrinsic parameters (the focal lengths and the principal point coordinates) with a closed-form solution. Next, the distortion coefficients are calculated using a least-squares estimation. In the last step, all parameters are optimized through reprojection error minimization, performed with a non-linear Levenberg-Marquardt algorithm. Finally, the quality of the calibration is evaluated by analyzing the standard deviations of the obtained parameters and the residual plots.
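The camera model described above can be illustrated with a minimal sketch of a pinhole projection with two radial and two tangential distortion coefficients, together with the RMS reprojection error that the final optimization minimizes. The symbol names `fx`, `fy`, `cx`, `cy` and the function names are assumptions made for this illustration, the distortion terms follow the common Brown-Conrady convention, and the code is not taken from Kalibr.

```python
import numpy as np

def project_radtan(points_cam, fx, fy, cx, cy, k1, k2, p1, p2):
    """Project 3D points (N, 3), given in the camera frame, to pixels (N, 2).

    Assumed convention: pinhole model without skew, two radial (k1, k2)
    and two tangential (p1, p2) distortion coefficients.
    """
    pts = np.asarray(points_cam, dtype=float)
    # Normalized image coordinates
    x = pts[:, 0] / pts[:, 2]
    y = pts[:, 1] / pts[:, 2]
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 * r2
    # Radial scaling plus tangential (decentering) terms
    x_d = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_d = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return np.column_stack((fx * x_d + cx, fy * y_d + cy))

def rms_reprojection_error(observed_px, projected_px):
    """Root-mean-square distance between observed and projected corners."""
    residuals = np.asarray(observed_px, dtype=float) - np.asarray(projected_px, dtype=float)
    return float(np.sqrt(np.mean(np.sum(residuals ** 2, axis=1))))
```

In a calibration run, `observed_px` would hold the detected checkerboard corners and `projected_px` the corners projected with the current parameter estimate.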

Multi camera bundle adjustment
For all cameras operating in the range of visible light, a bundle adjustment to derive the relative camera poses was performed using a set of images of a standard checkerboard pattern consisting of black and white fields (similar to the intrinsic calibration). This time, however, the data was captured simultaneously by all cameras in ROS, with image timestamps assigned according to the ROS master node clock. During the procedure, the cameras form a 'camera chain', i.e. images acquired with different sensors within a time tolerance limit are linked in a chain of stereo pairs. In those pairs, tile corners of the checkerboard are extracted, and their homographies are used within a Levenberg-Marquardt bundle adjustment. Contrary to the method originally implemented in Kalibr, which tries to optimize the internal and external camera parameters simultaneously, we fix the values of the camera intrinsic parameters (computed as described in Section 4.2) to reduce the number of unknown variables in the optimization problem, thus increasing the reliability and stability of the solution.
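The linking of frames from two cameras within a time tolerance can be sketched as follows. This is only an illustration of the idea, not the matching logic implemented in Kalibr; the function name `pair_by_timestamp` and the greedy one-to-one matching strategy are assumptions.

```python
from bisect import bisect_left

def pair_by_timestamp(stamps_a, stamps_b, tolerance):
    """Pair frames of two cameras whose timestamps differ by at most `tolerance`.

    Both timestamp lists (in seconds) must be sorted in ascending order.
    Returns a list of index pairs (i, j); each frame of camera b is used once.
    """
    pairs = []
    used_b = set()
    for i, t in enumerate(stamps_a):
        k = bisect_left(stamps_b, t)
        # Only the neighbours on either side of t can be the closest frame
        candidates = [j for j in (k - 1, k)
                      if 0 <= j < len(stamps_b) and j not in used_b]
        if not candidates:
            continue
        j = min(candidates, key=lambda idx: abs(stamps_b[idx] - t))
        if abs(stamps_b[j] - t) <= tolerance:
            pairs.append((i, j))
            used_b.add(j)
    return pairs
```

Frames without a partner inside the tolerance are simply skipped, which is one reason why the number of usable observations can differ between camera pairs.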

LiDAR-camera alignment
The pose of the LiDAR sensors in the main sensor reference frame is determined using the method proposed in Yuan et al. (2021). The method is targetless and uses linear features present in the surveyed scene, which can be identified both in an RGB/IR image and in the LiDAR point cloud. For point cloud processing, we extract edges not from geometric discontinuities of the captured points but by finding plane intersections. This helps to limit the measurement noise caused by laser beam divergence, characteristic of point cloud edges (i.e., the so-called bleeding points). Those points are then iteratively matched to edges extracted in the images with the Canny algorithm (Canny, 1986) by projective transformations. Since the convergence of this method relies heavily on the initial guess for the camera-LiDAR alignment, a rough calibration can be performed beforehand. This part comprises a grid search over a broader range of possible rotation and translation values and maximization of the number of found edge point correspondences. The results also include the estimated uncertainty of the final rotation and translation values of the camera-LiDAR relative transformation.
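The rough calibration step can be pictured with the sketch below: candidate rotation and translation offsets around an initial guess are enumerated, LiDAR edge points are projected into the image, and the candidate with the most points landing on image edge pixels is kept. It is a simplified illustration rather than the implementation of Yuan et al. (2021); the binary `edge_mask` (for example the output of an edge detector), the distortion-free intrinsic matrix `K` and all function names are assumptions.

```python
import itertools
import numpy as np

def rotation_from_euler(roll, pitch, yaw):
    """Rotation matrix Rz(yaw) @ Ry(pitch) @ Rx(roll), angles in radians."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    return rz @ ry @ rx

def count_edge_hits(edge_points_lidar, edge_mask, K, R, t):
    """Count LiDAR edge points that project onto edge pixels of the image."""
    pts_cam = edge_points_lidar @ R.T + t
    pts_cam = pts_cam[pts_cam[:, 2] > 0]  # keep points in front of the camera
    uvw = pts_cam @ K.T
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    h, w = edge_mask.shape
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    return int(np.count_nonzero(edge_mask[v[inside], u[inside]]))

def rough_grid_search(edge_points_lidar, edge_mask, K, R0, t0,
                      angle_offsets, shift_offsets):
    """Return (score, R, t) of the best candidate around the guess (R0, t0)."""
    best = (-1, R0, t0)
    for angles in itertools.product(angle_offsets, repeat=3):
        R = rotation_from_euler(*angles) @ R0
        for shift in itertools.product(shift_offsets, repeat=3):
            t = t0 + np.array(shift)
            score = count_edge_hits(edge_points_lidar, edge_mask, K, R, t)
            if score > best[0]:
                best = (score, R, t)
    return best
```

The number of candidates grows with the sixth power of the grid size, so such a search is only practical with coarse steps, after which an iterative refinement takes over.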

LiDAR-LiDAR alignment
The relative pose between the two scanning sensors serves to align their two reference frames. Thanks to the rotating module of the Velodyne mounted on the robot, it is possible to acquire a dense point cloud (i.e., without empty areas between scan lines) with the robot staying in place. Consequently, the Velodyne and Livox scanners can independently reconstruct a given scene. Using two clouds of the same scene, an iterative closest point algorithm (ICP; Segal et al., 2009) is used to find the LiDARs' extrinsic parameters: the point cloud acquired with the Livox is matched to the cloud acquired with the rotating Velodyne sensor, which acts as the reference dataset. Then, distances between the Livox point cloud and local surface models of the reference data are computed and visualized.
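The principle of ICP-based alignment can be sketched with a basic point-to-point variant. Note that this is a deliberately simpler technique than the algorithm of Segal et al. (2009) cited above, it uses a brute-force nearest-neighbour search that only suits small clouds, and its residuals are point-to-nearest-point distances rather than distances to local surface models. All function names are assumptions.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src_i + t approximates dst_i (SVD-based)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection in the estimated rotation
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, dst_c - R @ src_c

def nearest_neighbour_distances(points, reference):
    """Squared distances and indices of the closest reference point (brute force)."""
    d2 = ((points[:, None, :] - reference[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)
    return d2[np.arange(len(points)), idx], idx

def icp_point_to_point(source, target, iterations=30):
    """Align `source` to `target`; returns R, t and the final residual distances."""
    R_total, t_total = np.eye(3), np.zeros(3)
    moved = np.array(source, dtype=float)
    for _ in range(iterations):
        _, idx = nearest_neighbour_distances(moved, target)
        R, t = best_rigid_transform(moved, target[idx])
        moved = moved @ R.T + t
        # Accumulate the incremental update into the total transformation
        R_total, t_total = R @ R_total, R @ t_total + t
    d2, _ = nearest_neighbour_distances(moved, target)
    return R_total, t_total, np.sqrt(d2)
```

Summary statistics of the returned residuals, e.g. `np.median(residuals)` and `np.percentile(residuals, 95)`, correspond to the kind of median and 95% distance measures used to judge the alignment.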

Thermal camera
The thermal FLIR camera calibration was performed with the same method described in Section 4.2. However, the calibration setup needed to be modified due to the different features visible in thermal images. A special, heated version of the checkerboard pattern is utilized to calibrate the thermal camera (Figure 5). A 3D printer was used to provide constant heating to the partly insulated checkerboard so that the corners did not become blurry in the thermal images. Nevertheless, due to the heated pattern size limitation, it was impossible to perform a multi-camera bundle adjustment with the thermal camera. Because of that, the relative pose of the thermal camera in the robot's main reference frame was estimated utilizing the Livox LiDAR sensor. The relative pose estimation between the LiDAR and the thermal camera is carried out using data acquired in an outdoor scene on a sunny fall day. Due to the weather conditions, the thermal images clearly showed multiple sharp edges of manufactured structures (building walls, lamp posts, benches). A multitude of objects at varying distances from the robot allowed a reliable estimation of the FLIR camera pose.
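Expressing the thermal camera pose in the main reference frame through the LiDAR amounts to chaining homogeneous transformations. The sketch below assumes the convention that `T_a_b` maps point coordinates from frame b into frame a; the helper names and the numerical values in the usage note are placeholders for illustration, not calibration results.

```python
import numpy as np

def make_transform(R, t):
    """Build a 4x4 homogeneous transformation from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def invert_transform(T):
    """Invert a rigid transformation without a general matrix inverse."""
    R, t = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv

def chain(*transforms):
    """Compose transformations from left to right: chain(A, B) = A @ B."""
    result = np.eye(4)
    for T in transforms:
        result = result @ T
    return result
```

With a LiDAR pose in the main frame `T_main_lidar` and a thermal camera pose in the LiDAR frame `T_lidar_thermal`, the thermal camera pose in the main frame would follow as `chain(T_main_lidar, T_lidar_thermal)`. Because the two estimates are chained, their errors accumulate in the composed pose.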

RESULTS
Approximately 50 frames with a checkerboard clearly visible and successfully identified by the automated corner extraction procedure were acquired and used for the intrinsic calibration of each camera. Only images containing the full checkerboard pattern were processed. Coverage of the field of view and reprojection error distribution were analysed after each calibration run (Figure 6). A separate data acquisition of the checkerboard pattern was carried out to obtain the optical sensors' relative poses using multi-camera bundle adjustment. Images from 6 cameras were captured simultaneously. In the dataset, captured at a lower framerate of 5 Hz, the full pattern was correctly identified in up to 245 frames. However, because the fields of view of the cameras do not fully overlap, the number of frames per camera pair used in the bundle adjustment varied. The actual number of observations per camera pair is shown in the graph in Figure 7, where the sensors are denoted according to their order in all other tables (cam0: RealSense RGB, cam1: RealSense IR left, and so on). Although the uncertainties of the bundle adjustment results were lower than 1 mm, due to the chained nature of the algorithm implemented in Kalibr, the Authors decided to perform another calibration run on the same dataset, changing the order of camera-pair chain formation. The RealSense RGB camera was again chosen as the main sensor (cam0), but the ordering of the other sensors was shuffled. The resulting differences between those two runs of multi-camera calibration are shown in Table 7.
Table 7: Matrix of distances between sensor poses estimated in two multi-camera calibration runs.
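One plausible way to compare two calibration runs is to check how much the distances between sensor positions change from one run to the other. The sketch below follows this idea; it is an assumed illustration and not necessarily how the values in Table 7 were computed. The function names and the input layout (one row of x, y, z per sensor, expressed in the cam0 frame) are assumptions.

```python
import numpy as np

def pairwise_distances(positions):
    """Matrix of Euclidean distances between all sensor positions (N, 3)."""
    diff = positions[:, None, :] - positions[None, :, :]
    return np.linalg.norm(diff, axis=2)

def baseline_differences(positions_run_a, positions_run_b):
    """Absolute change of every inter-sensor distance between two runs."""
    a = np.asarray(positions_run_a, dtype=float)
    b = np.asarray(positions_run_b, dtype=float)
    return np.abs(pairwise_distances(a) - pairwise_distances(b))
```

The resulting matrix is symmetric with a zero diagonal, and large entries point to the camera pairs whose relative pose depends most on the chain ordering.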
For LiDAR-to-LiDAR calibration, two point clouds were simultaneously captured in an indoor university corridor by the Livox and Velodyne scanners. The standard deviation of the ICP-based matching was 13.6 mm, which is lower than 1σ of both sensors' ranging accuracy. The Livox point cloud was limited to the common field of view of both LiDARs and coloured by the cloud to local surface model distances. The median of the distances was 3.4 mm, and 95% of the distances were smaller than 16 mm. Outlying values were obtained mostly for points close to the edges and on the more reflective surfaces. The visualization of the results is presented in Figure 8. The relative transformations between the Livox and the camera reference frames were determined from two outdoor data acquisitions. Because different linear and planar features are distinct in RGB and thermal imagery, using the same scene for both calibrations would have reduced the accuracy, so two separate scenes were used. The dataset used for co-registering the LiDAR to the RealSense RGB camera was captured at a building entrance, where walls, window recesses and the pavement form clearly identifiable planes and edges. The extracted and matched features are shown in Figure 9, whereas the result of point cloud colourization after the co-registration procedure is shown in Figure 10. The uncertainty of the computed extrinsic parameters after the co-registration procedure was 0.023° for the rotation matrix and 3.14 mm for the translation matrix. Another outdoor scene, which included more contrasting features in the thermal images and more depth variability, was used to calibrate the thermal camera. Similarly to the former LiDAR-camera calibration, the features identified in the thermal image and the point cloud are shown together with their correspondences in Figure 11. The point cloud coloured with a thermal image using the resulting relative sensor transformation is presented in Figure 12.
The uncertainties of the calibration were even lower than in the RGB camera calibration: 0.014° for the rotation matrix and 1.17 mm for the translation matrix.

CONCLUSIONS
In this article, a mobile mapping robot developed in the scope of EIT Raw Materials AMICOS has been presented. Its possible fields of application for the underground mining industry have been outlined. The related works presented on mobile mapping solutions, especially in harsh industrial environments, show the high potential of practical deployment of such a system for automated 3D mapping and its integration with AI-based data processing and analysis for inspection and monitoring purposes. However, this implies the need to develop hybrid sensor systems able to acquire and fuse diverse types of data simultaneously.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLVIII-2/W2-2022 Optical 3D Metrology (O3DM), 15-16 December 2022, Würzburg, Germany
Figure 11: Thermal camera-LiDAR relative pose calibration: detected lines in the thermal image (blue), in the point cloud (red) and correspondences between points sampled from them (green).
Figure 12: LiDAR point cloud coloured with the thermal camera data after the calibration process.
An approach to calibrating a robotic system with multiple mapping sensors has been showcased. Due to the multitude of devices able to acquire data representing spatially distributed information, a multi-step, hybrid calibration workflow has been developed based on open-source tools and algorithms. As our application shows, various calibration setups, targets and scenes needed to be utilized because different sensors require different types of features that can be easily and unambiguously identified in their data. The development of algorithms able to simultaneously calibrate hybrid sensor suites, like the one presented in this paper, while still maintaining the accessibility of the equipment needed (e.g., a checkerboard target), would be welcomed by the growing robotics community. Such a solution would also tackle another important problem of the presented multi-step procedure: the accuracy assessment of the calibration results. In our case, multiple optimization problems are solved independently, which may produce incoherent uncertainty estimates of the calibration parameters. In future works, the Authors plan to seek a solution to this issue.