Co-registration of multitemporal UAV image datasets for monitoring applications: A new approach

In recent years we have witnessed a rapid development of UAVs (Unmanned Aerial Vehicles), especially for image collection. One of their advantages is the possibility to perform high-resolution, repeated flights at low cost to detect changes over time. Thus, dynamic scenes can be monitored by acquiring image blocks in different epochs in a flexible way. However, most UAVs are not able to provide accurate direct geo-referencing information, so image blocks from different epochs still need to be co-registered to detect changes efficiently. This task is mostly completed using GCPs (Ground Control Points), although this approach is time consuming because manual intervention is needed. This paper investigates new techniques to automate the co-registration of image blocks without the use of GCPs, relying only on an image based co-registration (IBCR) approach. The image alignment is initially performed on a reference (anchor) epoch, and the registration of the following (slave) epochs is performed by including some (anchor) images from the reference epoch with fixed external orientation parameters. This constrains the Bundle Block Adjustment of the slave epoch to be consistent with the reference one. The study involved the use of 10 multi-temporal image blocks over a large building construction


INTRODUCTION
Monitoring studies and change detection applications often need the co-registration of datasets acquired at different times. This is still an issue in archaeological, disaster management and construction scenarios, and there is a need to generate highly accurate information in a flexible and easy way at a reasonable cost. Construction projects in particular need to be periodically monitored and controlled efficiently to meet planned targets (El-Omari & Moselhi, 2011). The information generated by surveying the changes on the construction site can serve as feedback for the contractor and financial investors to check how and when the development is and was progressing. This information can also be used to detect changes (Matikainen et al., 2004, Champion, 2007). In the past, terrestrial and classical aerial photogrammetry methods have been used in the construction industry (Memon et al., 2004), but also in disaster monitoring (Gerke and Kerle, 2011, Murtiyoso et al., 2014), urban development, documentation of archaeological sites (Chiabrando et al., 2011), agriculture and natural resources management (Aicardi et al., 2016). However, these methods have their limitations. For example, with classical aerial photogrammetry it is difficult and costly to detect changes that take place over small areas (like a building construction site), because it is cumbersome and impractical to have a conventional flight, for example, once a day. On the other hand, terrestrial photogrammetry methods are time consuming and dangerous to carry out on a construction site where there is a lot of heavy machinery movement and, sometimes, it is impossible to capture data in inaccessible areas. Different approaches have also been adopted by the scientific community, such as airborne laser scanning to analyse the changes of building footprints (Rutzinger et al., 2010) or object-based analyses and GIS tools (Durieux et al., 2008).
In this regard, Unmanned Aerial Vehicles (UAVs) can be very powerful systems since they can operate at lower heights and capture information at different viewing angles (Unger et al., 2014). Rapid developments in UAV hardware and software technologies have made a great impact in many geo-spatial application fields. Photogrammetry and remote sensing are among the disciplines that have profited from the UAV technological advancement race (Everaerts, 2008; Eisenbeiß, 2009; Cook, 2011; Chiabrando et al., 2013). This race has been driven by the need for relatively cheap and easy information acquisition and processing, which is a basic necessity for carrying out high quality research and development projects at minimum cost. From a photogrammetric point of view, UAV data for multitemporal analyses have been widely investigated (Gülch, 2011, Rosnell et al., 2011, Vallet et al., 2012). The user has flexibility not only in terms of flight parameters: in principle, the same area can be flown as often as desired, as long as the weather is favourable, i.e. a very high temporal resolution can be realized easily.
In contrast to heavy weight platforms, most UAVs cannot carry high-quality location and attitude sensors. Although some first airplanes with at least RTK-based GNSS are available (Gerke and Przybilla, 2016), at least a local Global Navigation Satellite System (GNSS) receiving station or some correction network infrastructure still needs to be in place. Furthermore, these systems are today very expensive in comparison to commonly used UAVs. For this reason, indirect sensor orientation, i.e. the incorporation of ground control points (GCPs), seems necessary in order to achieve an accurate and precise co-registration between the single multi-temporal image blocks (epochs). However, the physical acquisition of GCPs in the field is time consuming and costly. Even when the GCPs were collected previously and are readily available, incorporating them into the co-registration process requires manual input from the user, which is time consuming, monotonous and therefore prone to gross errors. For this reason, a methodology for the automated registration of multi-temporal UAV blocks (with or without GCPs) would be very useful to speed up the process and allow the point cloud generation.
The overall objective of the presented work is to investigate a technique to automate the co-registration of two or more multitemporal UAV image blocks (epochs) using external orientation parameters from anchor images selected from static (unchanged) areas of a chosen reference dataset. This makes it possible to bypass GCP collection and processing and to obtain a completely automatic registration process. In the second section a case study is described; the methodology is presented in the third section, while in the fourth section the results of the approach are evaluated in terms of registration accuracy; finally some conclusions are reported. The developed approach provides a general solution to the registration of multi-temporal UAV images. In this sense, the construction site represents just an appropriate test to assess the effectiveness of the presented methodology. It needs to be mentioned, however, that the presented co-registration method only solves for the relative transformation between the epochs: without GCPs, the absolute localisation within the mapping datum is not known accurately. For this reason, if an accurate positioning in the mapping frame is desired, some GCPs in the reference epoch will still be needed to adjust the block accordingly.

CASE STUDY
The data was captured during the construction of EPFL's SwissTech Convention Center. The study site is situated in Lausanne, Switzerland (Figure 1), where several multi-temporal image datasets were acquired over the construction area during a period of two years. The dataset was provided by the Pix4D company. Flights were specifically performed to obtain temporal coverage of the construction and to monitor the development of the building for its final documentation. Furthermore, the multi-temporal coverage also made it possible to perform change detection analyses and to gain rapid knowledge of the work progress. The development of automatic procedures for change analyses needs to start from consistently georeferenced datasets.
For the aerial surveys, a very lightweight UAV of less than 500 g was used to capture the images. The system is called eBee and was designed and produced by Sensefly. It is a fixed wing UAV with a consumer grade GNSS, an altitude sensor, a radio transmitter and an autopilot circuit board. It has a maximum payload of 125 g and can fly for about 30 minutes in low wind speed conditions (i.e. less than 20 km/h). The images were geo-tagged using the on-board GNSS at the time of exposure during the flight campaign, and the location information was stored in the image Exif (Exchangeable image file format) metadata, referred to below as the Exif file. Each collection of images (epoch) was taken on a single day. Ten epochs have been considered in this study, spanning the entire construction period. Between 70 and 160 images were captured in each epoch.
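Exif stores each GPS coordinate as three rational numbers (degrees, minutes, seconds) plus a hemisphere reference letter, so the geo-tags have to be converted to signed decimal degrees before they can serve as initial camera positions. A minimal Python sketch of that conversion follows; the function name and the sample coordinate are illustrative assumptions, not values taken from the dataset.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert an Exif GPS (degrees, minutes, seconds) triple to decimal degrees.

    `dms` holds three rationals as (numerator, denominator) pairs, which is how
    Exif encodes GPSLatitude and GPSLongitude. `ref` is the matching
    GPSLatitudeRef or GPSLongitudeRef letter ("N", "S", "E" or "W").
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):
        value = -value  # southern and western hemispheres are negative
    return float(value)


# Hypothetical geo-tag near Lausanne (assumed values, not from the dataset).
lat = dms_to_decimal(((46, 1), (31, 1), (2400, 100)), "N")
lon = dms_to_decimal(((6, 1), (33, 1), (3600, 100)), "E")
print(f"lat={lat:.6f} lon={lon:.6f}")
```

The geographic coordinates would still need to be projected into a metric mapping frame before being compared with check point coordinates.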

Method Overview
In order to detect temporal changes taking place in object space, images captured at different times need to be spatially aligned (Sheng et al, 2008). Image registration in modern photogrammetric approaches (Behling et al., 2014, Zitová et al., 2003) integrates computer vision techniques for automated processing workflows and often involves the so-called Structure from Motion (SfM) (Westoby et al., 2012). SfM consists of a few steps: feature extraction and matching, the concatenation of the images and their final refinement in a BBA (Bundle Block Adjustment). In the case of multi-temporal datasets, it is preferable to select only the images that allow the alignment, that is, images that include stable areas around the construction site. The aim of the work is to evaluate whether it is possible to perform the image alignment without introducing external Ground Control Points, by including anchor images in the bundle block of the other epochs. Three main approaches to multi-temporal image block co-registration were used: 1) Geo-tag only, which uses only the GPS information in the Exif file; 2) Reference GCP-based co-registration (RGCP), which uses conventional GCPs to orient the block; as explained in the next section, this approach was introduced for the validation of the RIBC results; 3) Reference Image Block Co-registration (RIBC), which aligns the images starting from the EOPs (External Orientation Parameters) of anchor images.
Epoch 2 was acquired during the excavation phase and was used as the reference for the alignment of the other blocks, because it shares many characteristics with the other epochs, whose images were captured in almost the same season. Epoch 1, instead, was acquired when the vegetation was very brown, and it differs from the other data. The reference epoch was processed in Pix4D at half of the image resolution, using the GPS data as initial external orientation parameters in the BBA (Figure 2). Anchor images were then manually selected from this epoch, and their optimized internal (IOPs) and external (EOPs) orientation parameters were fixed and used as reference for the BBA of the other data. Marked GCPs and CPs (Check Points) from reference Epoch 2 were also manually extracted, to be used in the RGCP procedure and to test the accuracy of the RIBC results. As a first step, these points were selected manually to avoid introducing errors in the evaluation of the methodology. In the future they will be selected with an automated approach.

Figure 1. Location of the study test

Reference GCP-based Co-registration (RGCP)
RGCP is well known in conventional photogrammetry: GCPs are used to georeference the already relatively oriented images in a reference system. In this case, the GCPs were not surveyed with GNSS instruments on the ground; they were extracted from the reference Epoch 2 after BBA and then used to co-register the input image blocks from Epochs 1 and 3 to 10 (Figure 3).

Figure 3. Workflow of the RGCP process
This approach was included in our work as a reference to check the performance of the RIBC.
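The georeferencing idea behind RGCP can be illustrated with a similarity transformation estimated from corresponding points. The sketch below is a simplified 2D stand-in and not the procedure used in Pix4D, where the GCPs enter the BBA directly: it estimates scale, rotation and shift by least squares from point pairs, writing each point as a complex number. The function names and all coordinates are hypothetical.

```python
import cmath
import math


def fit_similarity_2d(source, target):
    """Least-squares 2D similarity mapping `source` points onto `target` points.

    Each point (x, y) is treated as the complex number z = x + iy, so the
    transformation reads w = a * z + b, where `a` carries scale and rotation
    and `b` carries the shift.
    """
    z = [complex(x, y) for x, y in source]
    w = [complex(x, y) for x, y in target]
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    zc = [p - z_mean for p in z]
    wc = [q - w_mean for q in w]
    a = sum(p.conjugate() * q for p, q in zip(zc, wc)) / sum(abs(p) ** 2 for p in zc)
    b = w_mean - a * z_mean
    return a, b


def apply_similarity_2d(a, b, points):
    """Apply w = a * z + b to a list of (x, y) points."""
    mapped = [a * complex(x, y) + b for x, y in points]
    return [(p.real, p.imag) for p in mapped]


# Synthetic GCPs (assumed values): coordinates in a relatively oriented input
# block, and the same points in the reference frame, generated here from a
# known transformation so that the recovered parameters can be checked.
true_a = cmath.rect(1.02, math.radians(3.0))
true_b = complex(2.5, -1.8)
block_gcps = [(0.0, 0.0), (50.0, 5.0), (45.0, 60.0), (-5.0, 55.0)]
reference_gcps = apply_similarity_2d(true_a, true_b, block_gcps)

a, b = fit_similarity_2d(block_gcps, reference_gcps)
print("scale:", abs(a), "rotation [deg]:", math.degrees(cmath.phase(a)), "shift:", b)
```

With real GCPs the residuals after applying the fitted transformation would not vanish, and their RMSE would indicate how well the block fits the reference.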

Reference image block co-registration (RIBC)
In comparison to the common GCP approach, the proposed one is based on the use of images. As a first step, these images were manually selected from reference Epoch 2, considering the stable area around the building construction site. The RIBC involved the following procedures (Figure 4):
1. Image blocks from the input epochs (first aligned with the Exif file GPS data) and the selected anchor images were merged into one block.
2. The saved EOPs from Epoch 2 were then added only to the corresponding reference images in the block, thus giving them a higher weight. In fact, they were introduced with a very high accuracy (1 mm); this ensured that the input epoch is oriented based on the reference Epoch 2 EOPs.
3. The camera interior orientation (IO) is also treated separately. While the cameras used for the reference Epoch 2 were introduced with the adjusted IO parameters, which were kept fixed, the camera parameters of the input epoch were adjusted (self-calibration). Because of the high correlation between IO and EO parameters, it is important to also leave the reference epoch camera at its original calibration status.
4. The Bundle Block Adjustment was then performed between reference and input images, starting from the EOPs of the anchor images.
It needs to be noted that, because no ground survey was available, block deformation and remaining systematic errors in the reference Epoch 2 remain undetected. In addition, those errors are propagated into the other input epochs. However, for this relatively small image block, and because of the quite large height variations within the scene, we assume that block deformation effects are not significant.
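The effect of introducing the anchor EOPs with a 1 mm accuracy can be illustrated with a toy weighted least-squares adjustment. The sketch below is not a bundle adjustment: it is a hypothetical 1D strip of five camera positions, in which relative baselines stand in for tie-point observations, geo-tags are loose priors carrying a common 3 m bias, and two anchor cameras receive tight 1 mm priors. All numbers and function names are assumptions chosen for illustration.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def adjust(n_unknowns, observations):
    """Weighted least squares via normal equations.

    Each observation is (coefficients, value, sigma), where `coefficients`
    maps an unknown index to its coefficient and the weight is 1 / sigma**2.
    """
    normal = [[0.0] * n_unknowns for _ in range(n_unknowns)]
    rhs = [0.0] * n_unknowns
    for coeffs, value, sigma in observations:
        weight = 1.0 / sigma ** 2
        for i, ci in coeffs.items():
            rhs[i] += weight * ci * value
            for j, cj in coeffs.items():
                normal[i][j] += weight * ci * cj
    return solve_linear(normal, rhs)


# Hypothetical strip [m]: cameras 0 and 4 are anchor images from the reference
# epoch, cameras 1 to 3 belong to the input epoch (true positions 10, 20, 30).
baselines = [({i: -1.0, i + 1: 1.0}, 10.0, 0.02) for i in range(4)]
geotags = [({i: 1.0}, 10.0 * i + 3.0, 3.0) for i in (1, 2, 3)]  # biased by +3 m
anchors = [({0: 1.0}, 0.0, 0.001), ({4: 1.0}, 40.0, 0.001)]  # 1 mm priors

geotag_only = adjust(5, baselines + geotags)
with_anchors = adjust(5, baselines + geotags + anchors)
print("geo-tag only:", [round(v, 3) for v in geotag_only])
print("with anchors:", [round(v, 3) for v in with_anchors])
```

Without the anchor priors the whole strip inherits the geo-tag bias; with them the input cameras are pulled into the reference frame, because the tight priors dominate the loose geo-tag observations.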

Data processing
Image processing was done in the Pix4Dmapper software, a program for the automatic processing of images, including image alignment, point cloud and DSM production (Pix4Dmapper, 2016). The processing is composed of three main steps: initial processing (image alignment/calibration), point cloud densification and DSM/orthophoto production. Reference images were selected from Epoch 2 around the construction site zone (Figure 5), where no major changes of features were taking place. Any image from the reference block whose capture area encroached into the construction zone by more than 40% was dropped as a reference, as this could lead to obvious matching failures.
Figure 5. Reference images projection centres distribution and check points around the construction zone.
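The 40% rule for dropping reference images could be automated along the following lines. This is a minimal sketch under strong assumptions: image footprints and the construction zone are approximated by axis-aligned rectangles in ground coordinates, and the names and coordinates are hypothetical. In the study itself the selection was done manually.

```python
def overlap_fraction(footprint, zone):
    """Fraction of an image footprint's area that falls inside the zone.

    Both arguments are axis-aligned rectangles (xmin, ymin, xmax, ymax).
    Real footprints are general quadrilaterals, so this is a simplification.
    """
    fx0, fy0, fx1, fy1 = footprint
    zx0, zy0, zx1, zy1 = zone
    width = max(0.0, min(fx1, zx1) - max(fx0, zx0))
    height = max(0.0, min(fy1, zy1) - max(fy0, zy0))
    area = (fx1 - fx0) * (fy1 - fy0)
    return (width * height) / area


def select_anchor_candidates(footprints, zone, max_fraction=0.40):
    """Keep images whose footprint overlaps the zone by at most `max_fraction`."""
    return [
        name
        for name, footprint in footprints.items()
        if overlap_fraction(footprint, zone) <= max_fraction
    ]


# Hypothetical ground footprints [m] and construction zone.
construction_zone = (100.0, 100.0, 200.0, 200.0)
footprints = {
    "img_a": (0.0, 0.0, 80.0, 60.0),       # no overlap with the zone
    "img_b": (60.0, 80.0, 140.0, 140.0),   # about one third inside the zone
    "img_c": (90.0, 110.0, 170.0, 170.0),  # mostly inside the zone
}
selected = select_anchor_candidates(footprints, construction_zone)
print(selected)
```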
As described by Zitová & Flusser (2003), the images (anchor and input) should have features which are distinct, spread all over the image and efficiently detectable in both images. Figure 6 shows an example of an anchor image from Epoch 2 and the corresponding images in other epochs. As a first step, the reference images were selected manually; to reduce manual operations, an automatic procedure for image selection will also be developed in the future.
The reference epoch was processed separately for image alignment, using half of the image resolution, to first validate the methodology. This provided starting orientation parameters for the reference epoch and made it possible to extract some Ground Control Points that were used as Check Points for accuracy evaluation. The distribution of the Check Points is schematically represented in Figure 5.

Accuracy Evaluation
Check points and ground control points were extracted from reference Epoch 2 as manual tie points (MTPs). The MTPs were extracted from features that are also visible in the images of the other epochs. As a result, the points were marked on stable building corners and road features (e.g. manholes). These points were later used either as check points (CPs) or GCPs for root mean square error (RMSE) calculations to independently evaluate the accuracy of the BBA of the whole project area.
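The RMSE used for this evaluation can be computed per coordinate axis from the discrepancies between check point coordinates in the reference epoch and in an input epoch. The sketch below shows the calculation; the function names and the check point coordinates are made up for illustration and are not results from the study.

```python
import math


def rmse_per_axis(reference, estimated):
    """RMSE of the X, Y and Z discrepancies between matching check points.

    `reference` and `estimated` are equally long lists of (x, y, z) tuples,
    listed in the same point order.
    """
    n = len(reference)
    squared_sums = [0.0, 0.0, 0.0]
    for ref, est in zip(reference, estimated):
        for axis in range(3):
            squared_sums[axis] += (est[axis] - ref[axis]) ** 2
    return tuple(math.sqrt(s / n) for s in squared_sums)


def horizontal_rmse(rmse_x, rmse_y):
    """Combine the X and Y RMSE into a single horizontal value."""
    return math.hypot(rmse_x, rmse_y)


# Hypothetical check points [m]: reference epoch vs. a co-registered input epoch.
reference_cps = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
input_cps = [(0.03, 0.04, 0.10), (-0.03, 0.04, -0.10)]

rmse_x, rmse_y, rmse_z = rmse_per_axis(reference_cps, input_cps)
print(rmse_x, rmse_y, rmse_z, horizontal_rmse(rmse_x, rmse_y))
```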
The influence of the following variables on the positional accuracy at the image alignment phase was also analysed:
• accuracy evaluation of the input epoch images registered using only the image geo-tag, with 7 CPs for discrepancy evaluation: the input epochs were 3, 5 and 10. This approach was also used as a yardstick. The block accuracy was expected to be very low, since no reference was used but only the image geo-tags for the bundle block adjustment. The check points from the reference epoch gave the relative block accuracy;
• accuracy evaluation of all input epochs for the RGCP approach: 12 GCPs and 7 CPs were used for comparison with the RIBC approach;
• effect of the distribution of images around the study site, tested for three configurations:
  o configuration 1: even distribution of 18 reference images; all epochs were tested using this approach;
  o configuration 2: even distribution of 37 images; only Epochs 3, 5 and 10 were tested;
  o configuration 3: uneven distribution of 10 images; only Epochs 3, 5 and 10 were tested.
The last two configurations were analysed in just three epochs to test the block behaviour.

Image "Geo-tag only"
As a first test, the processing and registration of the blocks was performed considering only the GPS/GNSS data recorded during the flight in the Exif file, using half of the image resolution. The UAV has a low-cost receiver able to record a real-time positioning solution with an accuracy of some metres (2-5 m), so a registration result in the same range of accuracy is expected. Seven check points were then used for accuracy evaluation. Table 1 shows that the discrepancies for this approach are large compared to the accuracy commonly obtained with a photogrammetric approach. This was expected: no reference from Epoch 2 was used for image registration, and the accuracy of the GNSS geo-tags varies over time due to several factors, such as atmospheric conditions and the number of satellites in view at that particular time. As a result, differences due to systematic GNSS position errors are always expected between images of the same area captured with the same GNSS receiver at different times. These results stress the need for GCPs on the ground, or for a co-registration of epochs that also uses images and not only GPS/GNSS data.

Reference Ground Control Point-based Coregistration (RGCP)
Table 2 and Table 3 show the RMSE calculated between the reference epoch and the input epochs (processed with half of the image resolution) using reference GCPs and CPs. The results show that the discrepancies in the Z coordinate are relatively high compared to the X and Y discrepancies in all the epochs, although they are acceptable given that the ground sampling distance (GSD) varies between 0.04 and 0.05 m. The horizontal accuracy (RMSEx and RMSEy) at GCPs and CPs is very close to 1*GSD and the vertical one is about 2*GSD. This is acceptable and comparable to classical photogrammetric results.

Reference Image Based Co-registration (RIBC): even and uneven image distribution of Epoch 2
Table 4 and Table 5 show the RMSE of the X, Y and Z coordinates for Epochs 1 and 3-10 for RIBC with 18 evenly distributed images (processed with half of the image resolution). A total of 7 evenly distributed reference check points, manually selected from Epoch 2, were used (Figure 5).
The RMSE shows a tendency for the Z-coordinate discrepancy to be higher than those of the X and Y coordinates. Furthermore, the RMSE for all epochs is within the range of 1*GSD (the average GSD was 0.045 m) for the horizontal component and 2-3*GSD for the vertical component. These values are fully acceptable with respect to the expected photogrammetric accuracy (1*GSD for the horizontal component and 2*GSD for the vertical one). Moreover, the obtained results are comparable to those derived from the RGCP approach.
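Expressing an error as a multiple of the GSD is a simple division, sketched below with the average GSD of 0.045 m and the mean errors reported in this paper for the comparison of the RGCP and RIBC approaches. The variable and function names are illustrative assumptions.

```python
AVERAGE_GSD_M = 0.045  # average GSD reported in the text [m]

# Mean errors [m] reported for the comparison of the two approaches.
mean_errors_m = {
    "RGCP": {"horizontal": 0.035, "vertical": 0.100},
    "RIBC": {"horizontal": 0.038, "vertical": 0.116},
}


def in_gsd_units(error_m, gsd_m=AVERAGE_GSD_M):
    """Express a metric error as a multiple of the ground sampling distance."""
    return error_m / gsd_m


gsd_multiples = {
    approach: {component: in_gsd_units(value) for component, value in errors.items()}
    for approach, errors in mean_errors_m.items()
}

for approach, components in gsd_multiples.items():
    for component, multiple in components.items():
        print(f"{approach} {component}: {multiple:.2f} x GSD")
```

Under these figures both approaches stay below 1*GSD horizontally and between 2*GSD and 3*GSD vertically.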
Epochs 3, 5 and 10 were also tested with 10 unevenly distributed images and the results are shown in Table 6.
Generally, with an uneven distribution of images, the RMSE is much higher than in the case of well distributed anchor images.
To further test the effect of increasing the number of reference images on accuracy, 37 evenly distributed anchor images were processed for Epochs 3, 5 and 10 as shown in Table 7.
Table 7. The RMSE for 37 evenly distributed images for RIBC.
The results are within 1-2*GSD, and closer scrutiny shows that, compared to the 18 evenly distributed images, there is an improvement in the final accuracy (especially in the Z component). The development of an automatic procedure for image selection could speed up the process and also make it easy to improve the accuracy by selecting more images in a fully automatic way.
Figure 7 and Figure 8 show that in Epochs 3 and 8 there were more problems in image matching between anchor and input images. In fact, the images of these epochs were captured in different weather conditions from those of Epoch 2, with a different vegetation colour and different lighting conditions.

Discussion
The developed algorithm of co-registration of multi-temporal datasets has shown promising results.
The most important part of this work was to develop a procedure that does not need GCPs. Thus, this investigation looked at the use of anchor images selected from a set of images taken on a particular day.
A comparison of the RMSE (for CPs) for 'geo-tag only', RGCP and RIBC (even and uneven distribution) shows that RIBC and RGCP have smaller registration errors, and that the RIBC block accuracy was comparable to that of RGCP. Increasing the number of reference images from 18 to 37 can also improve the accuracy, especially in the Z component.
The UAV-based multi-temporal images were captured in different seasons. The effect of vegetation cover and illumination differences between seasons could have affected image matching. For instance, Epoch 8, in contrast to Epoch 2, was acquired in winter, with many parts covered by snow and no leaves on the trees. A similar variation can be seen in Epoch 3, which has brown and yellow vegetation. As a result, discrepancies for images from Epochs 3 and 8, captured during the winter season, were higher, with mismatches due to scene changes. On the other hand, RGCP is based on ground control points which were added manually, and therefore it is not influenced by this kind of error.
According to the RIBC co-registration errors shown in Table 4 to Table 6 for even and uneven image distributions, it can be deduced that the distribution of images around the study site is very important. When images are not evenly spread around the area of interest, the BBA is less able to constrain the input epoch according to the anchor images. For this reason, anchor images should be evenly spread across the whole block. The Z discrepancy was significantly higher than the X and Y ones in all the results. Any errors in the external orientation parameters of the reference images could have been propagated to the input images.

CONCLUSION
In this study, a new automatic UAV-based image co-registration technique is proposed. It is centred on the matching of corresponding common features and on the original geo-tags of reference and input images, in order to accurately and robustly co-register multi-temporal UAV images for monitoring changes on a construction zone. The main strength of this technique is that it does not require GCPs, which are time consuming both in field collection and in the processing steps involved.
It was shown that the RIBC approach can produce a co-registration accuracy comparable to the reference GCP-based approach for multi-temporal UAV-based image datasets, provided that the images used as reference are well distributed over the area with static features. The RIBC technique can thus be adopted mostly in areas with distinct static features, such as building corners, road intersections, lamp poles and other easily detectable features. The methodology could be extended from construction site monitoring to building damage assessment after catastrophic events, fire damage, flooding and destruction due to war. It can also be used in other applications, such as archaeological studies, as long as the surroundings of the study area are static over time and contain distinct features throughout the study. If we assume that at least the reference image has been georeferenced, the reached accuracy is acceptable to produce projects at a 1:200 scale for building or archaeological purposes.
Season and time of day should be considered when acquiring images for this technique. It is recommended that all multitemporal images be acquired in the same season and at the same time of day to minimise image matching errors or failures caused by environmental factors. The RIBC technique is not software dependent. It can be implemented in any photogrammetric software which processes UAV-based image datasets and allows individual weights to be given to the reference image EO parameters. While this study has developed this specific approach, more tests need to be carried out independently to incrementally improve the technique and make it more efficient, especially on different case studies. Moreover, the use of this approach for change detection analyses must be investigated. An automatic methodology can also be developed as future work to limit manual intervention, which would further facilitate the use of the procedure. The availability of an automatic procedure for image selection would also be very useful if problems occur with the reference epoch: it would then be simple to select new images from another epoch more similar to the input one. For this reason, future work will address the selection of the reference epoch and the appropriate number of images to further improve the accuracy.

Figure 4. Workflow of the RIBC process
Under RIBC, three different configurations were also tested to analyse the influence of the image distribution (section 4.2). Using the marked CPs from reference Epoch 2, the discrepancies between the coordinates from the reference epoch block and the input block were determined, and the RMSE (Root Mean Square Error) was calculated.

Figure 6. Example of a reference image selected in Epoch 2
Comparing the RIBC and RGCP approaches (Figure 7 and Figure 8), the mean errors in the two components are:
• RGCP:
  o horizontal: 0.035 m
  o vertical: 0.100 m
• RIBC:
  o horizontal: 0.038 m
  o vertical: 0.116 m
The results are fully comparable, and they could be further improved by using the full resolution of the images in the BBA.

Table 1. CP discrepancies for epoch image alignment using only the image geo-tag from the on-board consumer grade GNSS.