PHOTOGRAMMETRIC 3D MEASUREMENTS IN FORESTS WITH ACTION CAMERAS

This paper presents a study of the application of action cameras, such as the GoPro, to the extraction of 3D models in a forest environment. These cameras are very small and can be used by any operator, even during other forestry work, to collect imagery for photogrammetric post-processing. The method uses the Structure from Motion approach to obtain the 3D model. The model is georeferenced by means of the camera positions, which are recorded by the camera's navigation-grade GPS receiver. The aim of the study is to assess the geometric accuracy of this methodology, especially in relative terms, in the evaluation of object dimensions. Several tests were carried out both under open sky and under trees, where conditions for GPS positioning are worse. The results pointed to a scale accuracy of around 3%, which may be acceptable for many applications. Accuracy can be improved by using a scale bar of known length placed in the scene. The method is intended to be used in tree measurements for forest inventory or other outdoor measuring.


INTRODUCTION
Photogrammetry provides methods for three-dimensional measurement in a large diversity of conditions, from both aerial and terrestrial images. Its application in the forest environment is mainly based on aerial images, but there has been recent interest in the use of terrestrial images to generate 3D point clouds in forests as a measurement tool for forest inventory (Piermattei et al., 2019; Mulverhill et al., 2019). Many of the recent developments in terrestrial laser scanning can be of great interest in forestry (Liang et al., 2016). Fixed systems on tripods have the limitation of reduced area coverage. Mobile systems, carried in a backpack and including an IMU for precise positioning in obstructed environments, are an excellent solution. However, prices of such systems are high and probably not accessible to many users interested in forest inventory.
Photogrammetric systems are in general lighter, easier to transport and much more accessible. In particular, action cameras, commonly used in sports and outdoor activities, are well suited to rough environmental conditions. The fact that many now have GPS units makes them very interesting for terrestrial photogrammetry in a forest environment. This paper explores this possibility, making use of an action camera and structure from motion (SfM) processing.

METHODOLOGY
The proposed method aims at obtaining dense clouds and 3D models, using the Structure from Motion (SfM) approach, with images acquired by an action camera in a forest environment.
The main purpose is to use the 3D model to measure dimensions of trees and other objects of interest within forestry inventory or general forest studies. The camera system to be used and the methodology are described in the following sections of the paper.

Description of the camera
A GoPro Hero 8 was used in this study. It can be mounted on top of a helmet and activated remotely by the operator with a smartphone. The operator controls the direction of the images simply by moving their head, looking at the objects of interest. The intention is that in the field the operator only acquires image data, which will be processed later in the office. At most, a reference tape measurement of some object may be made for quality assessment or eventual scale calibration (Mokroš et al., 2018).
The camera acquires video or discrete images. It includes a navigation-grade GPS receiver, with a positional accuracy of a few meters. Discrete images acquired in JPEG format are tagged, in the EXIF metadata, with WGS84 geographic coordinates and altitude above the ellipsoid. The positioning solution is an integration of GPS and attitude sensors, so even in a relatively dense forest environment most of the images will have a position. However, accuracy tends to decrease under dense tree cover.
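As an illustration of how such position tags can be handled, the sketch below converts a coordinate stored as EXIF degree, minute and second values into signed decimal degrees. The function name and the sample values are hypothetical; reading the tags from a JPEG file would require an EXIF library, which is not shown here.

```python
from fractions import Fraction

def dms_to_decimal(dms, ref):
    """Convert EXIF-style (degrees, minutes, seconds) to signed decimal degrees.

    dms: three numbers (EXIF stores them as rationals).
    ref: hemisphere letter, 'N', 'S', 'E' or 'W'.
    """
    degrees, minutes, seconds = (Fraction(v) for v in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative by convention.
    if ref in ("S", "W"):
        value = -value
    return float(value)

# Hypothetical tag values, for illustration only.
lat = dms_to_decimal((41, 6, Fraction(2718, 100)), "N")
lon = dms_to_decimal((8, 35, Fraction(1830, 100)), "W")
print(round(lat, 6), round(lon, 6))
```

Geographic coordinates obtained this way still have to be projected to a metric system before they can be compared with distances in the model.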
Action cameras are known for having a large radial distortion, which greatly increases the field of view. For photogrammetric use it is preferable to use an alternative image mode, called "linear", which corrects the main component of the radial distortion. Figure 1 shows an image of a scene acquired in standard mode and in linear mode. A global correction model is applied to images in linear mode. Each individual camera unit will probably have some residual distortion, so camera self-calibration has to be taken into account in the processing. Analysing the images and their metadata, it can be found that the images have, in linear mode, the following characteristics:
Image size: 3000 lines by 4000 columns
Sensor pixel size: 1.73 µm
Focal length: 3 mm
The corresponding focal distance in pixel units is 1733 pixels, but in the self-calibration it was always adjusted to values around 1905 pixels, which corresponds to an increase of 10% with respect to the starting value. Self-calibration is analysed and described below.
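The relation between these values can be checked with a few lines of arithmetic. The sketch below assumes the pixel size is given in micrometres and treats the camera as an ideal pinhole; the function names are hypothetical. The nominal value it produces, about 1734 pixels, agrees with the figure quoted above to within rounding.

```python
import math

PIXEL_SIZE_UM = 1.73          # sensor pixel size, assumed to be in micrometres
FOCAL_LENGTH_MM = 3.0         # nominal focal length from the metadata
IMAGE_WIDTH_PX = 4000         # columns in linear mode
CALIBRATED_FOCAL_PX = 1905.0  # typical value after self-calibration

def focal_in_pixels(focal_mm, pixel_um):
    """Focal length expressed in pixel units."""
    return focal_mm * 1000.0 / pixel_um

def horizontal_fov_deg(focal_px, width_px):
    """Horizontal field of view of an ideal pinhole camera, in degrees."""
    return math.degrees(2.0 * math.atan(width_px / (2.0 * focal_px)))

nominal = focal_in_pixels(FOCAL_LENGTH_MM, PIXEL_SIZE_UM)
increase = CALIBRATED_FOCAL_PX / nominal - 1.0
fov = horizontal_fov_deg(CALIBRATED_FOCAL_PX, IMAGE_WIDTH_PX)
print(f"nominal focal length: {nominal:.0f} px")
print(f"increase after self-calibration: {increase:.1%}")
print(f"horizontal field of view (calibrated): {fov:.1f} deg")
```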
Another concern in terms of interior orientation is the effect of the rolling shutter. Although it is more noticeable in video, distortions can also occur in still images if the camera moves fast. An operator will in general move at a relatively low speed when walking, but this effect was also analysed. Common SfM software packages such as Pix4D Mapper and Agisoft Metashape include models for evaluating and correcting the rolling shutter effect (Vautherin, 2016). Tests of camera calibration were done with the rolling shutter correction turned on and turned off.
Images used in the tests described in this paper were acquired in time lapse mode, with a time interval of 0.5 seconds. At the speed at which an operator moves in the field, that image rate provides enough overlap for 3D data extraction. Images are recorded in JPEG format and tagged with the coordinates given by the GPS receiver. Alternatively, it would also be possible to acquire videos, at a much higher frame rate, and extract discrete frames to process in the same way (Liu et al., 2018). GPS positions are integrated in the MP4 video format and can be extracted to tag all the frames (Gonçalves and Pinhal, 2018). The advantage of acquiring discrete images is the higher resolution.

Description of the method
Images are acquired by the operator in an area covered by trees, with the camera pointing approximately in the horizontal direction, and following a path such that a set of trees will be covered from different directions. Tree trunks will show up in the images in such a way that consecutive images share common objects, without very large changes of perspective. The operator should avoid passing very close to shrubs or other obstacles that produce very strong changes between consecutive images, since that may cause problems for the image matching. From a set of images of trees, it should be possible to obtain a 3D model by SfM.
The main interest of the system is not the absolute geolocation accuracy of the 3D model but the accuracy of relative measurements such as tree height or diameter. To start, we may consider a set of images without any known geolocation. The image alignment in the SfM processing starts by identifying tie points in order to do a free bundle adjustment in an arbitrary coordinate system (the model system). This results in an initial sparse point cloud in the model system. This is similar to the concept of traditional relative orientation of a stereo pair of aerial photographs. A point with a position vector, u, in the model coordinate system is transformed into its georeferenced position vector, X, by a 3D conformal transformation:
$$X = s\,M\,u + T_X$$
where M is a rotation matrix, T_X is a translation vector and s is a scale factor. If positions of camera projection centres are known, and are not collinear, they act as control points for absolute orientation of the model, using this equation. The point cloud will become georeferenced, but the accuracy of those positions will be similar to the accuracy of the camera GPS receiver, which is of a few meters.
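To make the role of the scale factor concrete, the sketch below estimates a conformal transformation from pairs of camera centres by least squares. It is a planimetric (2D) simplification in which points are treated as complex numbers, so scale and rotation are carried by a single complex coefficient. It is not the procedure implemented in any SfM package, and the function name and the synthetic data are hypothetical.

```python
import cmath

def fit_similarity_2d(model_xy, geo_xy):
    """Least-squares 2D similarity transform X = c*u + t.

    Points are treated as complex numbers, so c = s*exp(i*theta)
    carries both the scale s and the rotation theta.
    """
    n = len(model_xy)
    u = [complex(x, y) for x, y in model_xy]
    X = [complex(x, y) for x, y in geo_xy]
    u_mean = sum(u) / n
    X_mean = sum(X) / n
    # Centre both sets so the translation drops out of the estimate.
    num = sum((ui - u_mean).conjugate() * (Xi - X_mean) for ui, Xi in zip(u, X))
    den = sum(abs(ui - u_mean) ** 2 for ui in u)
    c = num / den
    t = X_mean - c * u_mean
    return abs(c), cmath.phase(c), (t.real, t.imag)

# Synthetic check: camera centres in model units, mapped with a known
# scale of 2.5, a rotation of 30 degrees and a shift of (100, 200).
model = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 2.0)]
true_c = cmath.rect(2.5, cmath.pi / 6)
geo = []
for x, y in model:
    p = true_c * complex(x, y) + complex(100.0, 200.0)
    geo.append((p.real, p.imag))

scale, rotation, shift = fit_similarity_2d(model, geo)
print(scale, rotation, shift)
```

With error-free positions the known parameters are recovered; with GPS positions the estimated scale inherits part of the positioning error, which is the concern addressed in the tests.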
The main concern in the system to be implemented is the accuracy of the scale factor, since it is critical for the ability to measure object dimensions (Liang et al., 2014; Huang et al., 2018; Piermattei et al., 2019). A standard procedure in terrestrial photogrammetry would be to measure one or more calibration distances. In the present case the idea is to depend, as much as possible, only on the GPS camera positions, avoiding calibration field measurements. There is a risk that, if the accuracy of the projection centre GPS positions is low and they are few in number, the scale factor may be very different from reality, resulting in low accuracy of the relative measurements. As the number of images increases, the compensation of errors is expected to result in a more accurate scale factor. A first test was done with a small number of images, around an individual tree. Figure 2(a) shows a planimetric plot of this first case, in an area of 7 by 7 meters around a tree, where 31 images were used, with the positions given by the camera GPS receiver. Errors of a few meters in different directions cause a clear deformation of the surveyed path. After the image alignment, the positions become regularized, as can be seen in figure 2(b), revealing the actual shape of the path followed.
Looking at the orientation of the tree in the georeferenced model, we can see that it is tilted with respect to the Z axis of the reference system, while in reality it is vertical. That is explained by the height errors of the initial GPS coordinates. The plots of figure 2 show only planimetry, but the recorded camera heights varied by 4 meters, whereas the ground is flat and the camera was at a constant height. These errors in the camera projection centres introduced a tilt in the georeferenced model. However, for the main purpose of the system the accuracy of the model location or orientation is not critical.
The main point is to avoid a wrong scale, which would introduce a bias in the measurements to be done. The system will be tested with a much larger number of images, so that camera random errors in different directions tend to compensate.
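The expected compensation effect can be illustrated with a small simulation. The sketch below is a simplification under loudly stated assumptions: cameras lie on a circular path of 20 m radius, GPS errors are independent Gaussian with a standard deviation of 3 m, and the scale is estimated by a planimetric least-squares fit. Real GPS errors are correlated in time, so the simulated errors are optimistic; only the trend with the number of images is meaningful. All names and values are hypothetical.

```python
import math
import random

def scale_from_positions(model, geo):
    """Least-squares 2D similarity scale between two lists of complex points."""
    n = len(model)
    um = sum(model) / n
    xm = sum(geo) / n
    num = sum((u - um).conjugate() * (x - xm) for u, x in zip(model, geo))
    den = sum(abs(u - um) ** 2 for u in model)
    return abs(num / den)

def rms_scale_error(n_images, radius_m, sigma_m, trials=200, seed=0):
    """RMS relative scale error for cameras on a circle with noisy positions."""
    rng = random.Random(seed)
    # True camera centres, evenly spaced on a circular path (true scale = 1).
    path = [complex(radius_m * math.cos(2 * math.pi * k / n_images),
                    radius_m * math.sin(2 * math.pi * k / n_images))
            for k in range(n_images)]
    total = 0.0
    for _ in range(trials):
        # Simulated GPS positions: true centres plus independent noise.
        noisy = [p + complex(rng.gauss(0, sigma_m), rng.gauss(0, sigma_m))
                 for p in path]
        total += (scale_from_positions(path, noisy) - 1.0) ** 2
    return math.sqrt(total / trials)

for n in (31, 125, 500):
    print(n, round(rms_scale_error(n, radius_m=20.0, sigma_m=3.0), 4))
```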

TESTS OF THE SYSTEM
Two sets of tests were done at the Astronomical Observatory of the University of Porto, one for calibration purposes, in an open area, and a second set under tree coverage, in a situation closer to the intended real operating conditions of the method.

Tests in area without obstructions
The first set of tests was carried out around the building of a telescope, which can be seen in figure 4. The area has no obstacles for GPS positioning, the building has many details for image matching, and some points are marked, which were measured with a total station. The main intention of these tests was to assess the camera calibration with different options and to assess the scale errors. The camera was taken around the building at a slow speed, collecting approximately 250 images. The first operation was to process these images in Agisoft Metashape, doing the image alignment with control points (marks on the building) and with camera self-calibration, which resulted in what we called "LabC" parameters. The average RMS of reprojection errors was 1.1 pixels. The parameters were, in the mathematical formulation used by Agisoft Metashape (Agisoft, 2022), the following:
Focal distance: 1904.5 pixels
Principal point (cx, cy): (10.7, 38.9) pixels
Radial (k1, k2, k3): (-0.00282, 0.00324, -0.00103)
Decentering (p1, p2): (0.00184, 0.00425)
Then a total of 5 sets (A1 to A5), like the first one, were collected and processed without control points, with camera self-calibration and in two versions: without rolling shutter correction and then with rolling shutter correction. In all cases the camera parameters were very similar, the only difference being that with rolling shutter correction the principal point position varied more. The average RMS of reprojection errors was always between 1.1 and 1.3 pixels. The main comparison of performance was made by assessing calibrated distances defined by the marked points. Figure 5 shows two of them. There is another marked distance of 15 m at the back of the building. Within each set, the results of the 3 combinations were very similar. It becomes evident that the effect of rolling shutter correction is not relevant. There is also no special advantage in using a lab calibration.
In real operation of the system it will be reasonable to always do the camera self-calibration, with no rolling shutter correction (RSC).
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2022, XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France
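To give a sense of the size of the residual distortion described by the "LabC" parameters, the sketch below projects a point with a Brown-type model. The formulation is an assumption based on the frame camera model described in the Metashape user manual (principal point offsets measured from the image centre, radial and decentering terms applied to normalised coordinates) and should be checked against the manual before any real use. The function name and the test points are hypothetical.

```python
def project(point_cam, f, cx, cy, k, p, width, height):
    """Project a 3D point in camera coordinates to pixel coordinates.

    Assumed Brown-type model: radial terms k = (k1, k2, k3),
    decentering terms p = (p1, p2), principal point offsets cx, cy
    measured from the image centre.
    """
    X, Y, Z = point_cam
    x, y = X / Z, Y / Z
    r2 = x * x + y * y
    radial = 1.0 + k[0] * r2 + k[1] * r2 ** 2 + k[2] * r2 ** 3
    xd = x * radial + p[0] * (r2 + 2 * x * x) + 2 * p[1] * x * y
    yd = y * radial + p[1] * (r2 + 2 * y * y) + 2 * p[0] * x * y
    u = width * 0.5 + cx + xd * f
    v = height * 0.5 + cy + yd * f
    return u, v

# "LabC" parameters reported in the text, linear-mode image size.
LABC = dict(f=1904.5, cx=10.7, cy=38.9,
            k=(-0.00282, 0.00324, -0.00103), p=(0.00184, 0.00425),
            width=4000, height=3000)

# A point on the optical axis lands on the principal point.
on_axis = project((0.0, 0.0, 10.0), **LABC)
# A point 5 m to the side at 10 m depth, with and without distortion terms.
with_dist = project((5.0, 0.0, 10.0), **LABC)
no_dist = project((5.0, 0.0, 10.0), **{**LABC, "k": (0, 0, 0), "p": (0, 0)})
shift_px = with_dist[0] - no_dist[0]
print(on_axis, shift_px)
```

Under these assumptions the distortion terms move the off-axis point by only a few pixels, which is consistent with the linear mode having already removed most of the lens distortion.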
The scale errors were small in some cases, around 1%, but rather large in others, around 5%, which may be acceptable in some situations but not in others. To have more certainty about the accuracy, a simple calibration bar may be useful. When one was used, the relative error became very small in all the tests, well below 1%. However, this test was done in very good conditions, with very well defined scale bars and very accurately known distances. A real scenario in a tree environment will probably give lower accuracy. That was the reason for the second set of tests.

Tests under trees
Four tests were done under trees, in an area of the Observatory with evergreen oaks. These trees, although not composing a dense forest, are broad-leaved and cause a significant obstruction of GPS signal. They are the main trees of southern Portugal, and so this situation represents what can be found in many places where the system could be exploited.
Images were acquired in an area enclosed by a rectangle of 40 by 60 meters, along a path of approximately 300 meters, with a total of between 450 and 500 images in each set. A levelling staff with a length of 4 meters was placed on the ground to be used as a scale bar. Figure 6 shows one of the photos, where the staff can be seen. There were also some wooden buildings, which were used to assess the scale of the model. The four tests were identified as B1 to B4. Camera positions were recorded for all of the images, except in the case of B4, in which only around half of the images were tagged with coordinates. This is something that operators should be aware of in the field, since the camera indicates whether the GPS is working correctly. Figure 7 shows an aerial image of the area, with the camera horizontal GPS coordinates, which may look irregular because of the GPS accuracy. The image alignment is done, with camera self-calibration, in order to obtain the complete orientation of the images. The points on the levelling staff and on the building walls are identified in the images where they are best seen.
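The figures above imply a mean spacing between images and a walking speed, which can be checked with simple arithmetic. The sketch below assumes the 0.5 second time-lapse interval described earlier and continuous walking at constant speed; the function name is hypothetical.

```python
PATH_LENGTH_M = 300.0  # approximate path length
INTERVAL_S = 0.5       # time-lapse interval between images

def spacing_and_speed(path_m, n_images, interval_s):
    """Mean distance between consecutive images and the implied walking speed."""
    spacing = path_m / (n_images - 1)
    return spacing, spacing / interval_s

for n in (450, 500):
    spacing, speed = spacing_and_speed(PATH_LENGTH_M, n, INTERVAL_S)
    print(f"{n} images: {spacing:.2f} m between images, {speed:.2f} m/s")
```

The resulting spacing of roughly 0.6 to 0.7 m corresponds to a normal walking pace of a little over 1 m/s.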
Camera GPS positions and the adjusted ones after the SfM processing were plotted in order to assess the changes. Figure 8 represents the two sets of camera positions: the original ones in red and the new ones in blue. Differences are of a few meters, normally less than 3 m. A similar analysis was done for the elevation, which was represented as a function of distance travelled. Figure 9 contains that graph, with distance and elevation in meters.

Figure 9. Original GPS camera elevation (red), and corrected ones after SfM (blue). Distance (x) and elevation (y) in meters.
Position changes occur in different directions, so it may be expected that there will be some compensation effect in the model scale.
Scale bars are created and the distance in the 3D model, d_model, is calculated. The relative error, R, is calculated with the known reference distance, d_reference, in the following way:
$$R = \frac{d_{model} - d_{reference}}{d_{reference}}$$
If several verification distances are considered in a set of images, the value presented is the average of the relative errors. In a second step the levelling staff, with 4 meters, was used as a calibration scale bar, and the average error in the other reference distances was calculated. There was an improvement in the model scale accuracy when one scale bar was introduced, although not as significant as in the previous tests (A1 to A5). Figure 10 represents these data in a bar chart. Set B4 had a very large error before calibration. A possible explanation is that this was the case where only around half of the images had GPS positions. Images could be aligned and all ended up with coordinates, but the scale determined had poor accuracy. Using a calibration scale bar, the model scale can be obtained with errors of between 1 and 2%. The system can be considered a reasonably good measuring tool for forests: 2% means an error of 2 cm in a distance of 1 meter, which is probably acceptable for many measurements in outdoor conditions.
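The two steps described above, assessing the relative error and then rescaling with the levelling staff, can be sketched as follows. The distances are hypothetical numbers chosen for illustration and are not results from the tests. Averaging the magnitudes of the relative errors, rather than the signed values, is an assumption.

```python
def relative_error(d_model, d_reference):
    """Signed relative error of a model distance against its reference."""
    return (d_model - d_reference) / d_reference

def mean_abs_relative_error(pairs):
    """Average magnitude of the relative error over (model, reference) pairs."""
    return sum(abs(relative_error(m, r)) for m, r in pairs) / len(pairs)

def rescale_with_bar(distances, bar_model, bar_true):
    """Rescale model distances so the scale bar takes its known length."""
    factor = bar_true / bar_model
    return [d * factor for d in distances]

# Hypothetical numbers for illustration only (not results from the tests).
references = [2.50, 3.20, 6.00]  # known distances, in meters
measured = [2.58, 3.29, 6.19]    # same distances in the GPS-scaled model
staff_in_model = 4.12            # 4 m levelling staff as measured in the model

before = mean_abs_relative_error(list(zip(measured, references)))
corrected = rescale_with_bar(measured, staff_in_model, 4.0)
after = mean_abs_relative_error(list(zip(corrected, references)))
print(f"before: {before:.1%}, after: {after:.1%}")
```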
Even when depending only on GPS, the error is relatively small. More tests will be done, over larger areas and with more images, to assess the performance of the system and establish rules for its best use.

CONCLUSIONS
The system developed for measuring dimensions of trees and other objects by terrestrial photogrammetry is very simple and cheap. It makes use of a GoPro camera, which is simple to operate and can be used in rough environments. The camera GPS unit provides positions even under trees, which helps to obtain good georeferencing for most of the data.
The photogrammetric processing is done with Agisoft Metashape. It could be concluded that, on a regular basis, processing can be done with camera self-calibration and without rolling shutter correction.
The system provides point clouds and 3D models with an acceptable accuracy, which can be improved using scale bars. A scale bar of 1 or 2 meters can be easily transported and placed in the scene.