OPTIMIZING LOW-COST UAV AERIAL IMAGE MOSAICING FOR CROP GROWTH MONITORING

High spatial resolution images acquired with drones can provide useful information to farmers for devising suitable management practices and increasing crop yield. Data collected as individual frames or images have to be mosaiced using a pattern recognition and matching process. Most flight missions collect hundreds of photos with high front and side overlap in order to generate a mosaic without data gaps or distortion. These frames are aligned using the location information associated with each image, and the same features are identified in multiple frames to generate the mosaic. It is common to use all or most of the images in this process, which requires substantial computing resources. Uploading and processing hundreds of images can take from several hours to days. Many farmers and crop consultants in developing countries may not have the necessary resources to upload hundreds of images. This study assessed the optimal number of images required to generate an image mosaic for a crop field without any data gaps or distortion. Images were collected at two different heights and directions. First, the mosaic was generated using all (100%) frames, followed by subsets containing 90% through 50% of the images. The results will help us plan the settings of future flight missions so that the optimal number of images required for generating an image mosaic is acquired.

Drones can collect imagery data in the visible and infrared regions of the spectrum. Sensors that collect data in the infrared regions are relatively more expensive than those that collect data in the visible spectrum. Similarly, fixed-wing drones are relatively more expensive than several quadcopter (four-rotor) models. Hexa- and octa-copter models, which are required to carry the heavier sensors used for collecting detailed data, are also expensive. In addition to the platforms and sensors, specialized software is required to process the collected imagery data on the ground. Such software is available on both desktop and online platforms, where users upload the individual images acquired for an area of interest and generate a mosaic, i.e., a single image that covers the entire study area. These mosaics show variations in growth and health within a crop field. Image registration is the basis for combining two or more images (Ait-Aoudia et al., 2012). Registration techniques find distinct key points or feature vectors in one image and then find the same feature points in other images in order to align the photos. When many photos are acquired, or when they are acquired at very high spatial resolution, large computing resources are required to create the image mosaic (Carrio et al., 2017). For very large imagery data volumes, cloud-based processing solutions can be rented for remote computation. Depending on the computing resources, processing can take from a few hours to days.
Farmers in most developing countries might not have access to recent drone and sensor technologies, or to facilities for processing aerial images into mosaics of their fields. Recently introduced low-end (lower cost) quadcopter drones have high quality sensors that are capable of acquiring high spatial resolution images and videos. If mosaics of acceptable quality can be generated using these low-cost drones, more farmers can afford to invest in drones. Seifert et al. (2019) reported that images acquired at low flight altitudes with higher image overlap resulted in the most reconstruction detail. That study reported the effect of drone height and image overlap when reconstructing forest images.
The primary objective of this study was to assess whether a commonly available low-cost drone can be used to acquire aerial images and generate a good quality mosaic for a crop field. The second objective was to determine the optimal number of images required to generate a mosaic without distortion and gaps (potholes). The third objective was to evaluate the effect of drone height during image acquisition on generating the mosaic. If acceptable mosaics can be generated with an optimal number of images, processing time will be reduced and the need to invest in high-end computing systems will be eliminated.

Image mosaic
A panorama is the best-known example of the image mosaicking technique (Szeliski et al., 1994). Two or more images of the study area, taken at slightly different times, are combined to form a single image in which the entire target area is visible. The mosaic is constructed by aligning the images in order. Image registration techniques extract feature points from the images; each feature point is represented by a descriptor vector that describes or locates the local key point in the image. These techniques are used in processing satellite and UAV imagery (Ait-Aoudia et al., 2012), in biomedical applications (Shao et al., 2011), and in many other applications.
Algorithms such as the scale-invariant feature transform (SIFT) and speeded-up robust features (SURF) are used for feature point extraction (Lowe, 1999). Other optimized methods such as BRIEF (Binary Robust Independent Elementary Features) (Calonder et al., 2011) and ORB (Oriented FAST and Rotated BRIEF) (Rublee et al., 2012) are used in other applications. These algorithms find the key points in an image and compute a descriptor vector for each, which helps to focus on the most extreme feature points. Key points are found by blurring the image with different levels of Gaussian blur and stacking the results. Adjacent blurred images in the stack are then subtracted to find the key points. Next, the descriptor vector is computed from the local neighbourhood of each key point so that its surroundings are captured. Using the descriptor vectors from different images, the images are aligned with transformations such as projective, similarity, or affine functions. Finally, the overlapping regions of two images are stitched (Szeliski et al., 2006) based on the pixel intensity values (i.e., in the gradient domain) (Levin et al., 2004). Agisoft™ uses SIFT to identify key points (Bert, 2018). For the entire study, the recommended default values were used: key point limit = 40,000 and tie point limit = 4,000. The key points determine the feature points in the 2D image, and the tie points are used to compute the 3D position of each feature.
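The blur, stack, and subtract steps described above can be illustrated with a minimal sketch. It is a simplification under stated assumptions and not the Agisoft™ or SIFT implementation: it works on a one-dimensional signal rather than a 2D image, omits octaves and descriptor computation, and the function names and sigma values are hypothetical choices made for illustration.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalised 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, sigma):
    """Convolve the signal with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    blurred = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += weight * signal[j]
        blurred.append(acc)
    return blurred


def difference_of_gaussians(signal, sigmas):
    """Blur at each sigma, stack the results, and subtract adjacent levels."""
    stack = [blur(signal, s) for s in sigmas]
    return [[wide - narrow for narrow, wide in zip(stack[n], stack[n + 1])]
            for n in range(len(stack) - 1)]


def scale_space_extrema(dogs):
    """Return (position, level) pairs that are strict extrema among their
    neighbours in position and in the adjacent difference levels."""
    found = []
    for level in range(1, len(dogs) - 1):
        for i in range(1, len(dogs[level]) - 1):
            value = dogs[level][i]
            neighbours = [dogs[lv][j]
                          for lv in (level - 1, level, level + 1)
                          for j in (i - 1, i, i + 1)
                          if (lv, j) != (level, i)]
            if value > max(neighbours) or value < min(neighbours):
                found.append((i, level))
    return found


# Hypothetical test signal: a single bright sample on a dark background.
signal = [0.0] * 41
signal[20] = 1.0
dogs = difference_of_gaussians(signal, sigmas=[1.0, 1.6, 2.56, 4.1])
candidates = scale_space_extrema(dogs)
```

In a full 2D implementation the same comparison is made against neighbours in both image directions and in the adjacent scales, and the surviving candidates are the key points for which descriptors are computed.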

Data collection
A DJI Spark™ drone was used to capture aerial images of a brinjal (Solanum melongena) field located in Vedasandur, Tamil Nadu (India). Aerial images were acquired from two different altitudes: 15 and 13 meters. Flight heights were chosen based on the safe-to-fly height, in order to avoid obstacles such as trees planted along the field boundary, utility poles, and other elevated objects (Sajithvariyar et al., 2019). The study area is in a safe-to-fly zone.
Hammer App™, an open-source software available for the iOS platform, can be used to plan a flight mission over the study area. It enables users to input boundary points and other manual waypoints for the field, along with additional settings such as altitude and front and side overlap, based on the available flight time. Settings used in this study are summarized in Table 1. The front overlap was set at the maximum (90%) as it does not affect the flight time. The side overlap settings were adjusted based on the flight time and battery longevity. Flight direction was set at -70 degrees to align the flight path in a straight pattern rather than a zig-zag pattern, as shown in Figure 1.
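The relationship between altitude, overlap, and the spacing of photos can be sketched with simple geometry. The snippet below is illustrative only: it assumes a nadir-pointing camera over flat ground, the field-of-view value is a hypothetical placeholder rather than a DJI Spark™ or Hammer App™ specification, and the function names are assumptions.

```python
import math


def ground_footprint(altitude_m, fov_deg):
    """Ground distance covered by one photo along one image axis."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)


def exposure_spacing(footprint_m, overlap):
    """Distance between consecutive photos (or flight lines) for a given
    overlap expressed as a fraction between 0 and 1."""
    return footprint_m * (1.0 - overlap)


ASSUMED_FOV_DEG = 60.0  # hypothetical along-track field of view

for altitude in (13.0, 15.0):
    footprint = ground_footprint(altitude, ASSUMED_FOV_DEG)
    spacing = exposure_spacing(footprint, overlap=0.90)
    print(f"{altitude:.0f} m: footprint {footprint:.1f} m, "
          f"photo every {spacing:.2f} m at 90% front overlap")
```

Under these assumptions a lower altitude gives a smaller footprint, so more photos are needed to cover the same field at the same overlap.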

Figure 1.
Flight mission plan for the study area set for an altitude of 15 m in Hammer App™.
Next, the white balance setting in the DJI GO4 app was adjusted depending on the flight conditions. In this study, both missions were flown under sunny conditions (Figure 2).

Reference panels
Black and white reference panels were placed throughout the field (Figure 3) in order to determine whether the minimum and maximum values change during the mosaic creation process. The methods described by Jeong et al. (2018) were adopted for placing the reference panels. High-end sensors use internal or external methods to calibrate their measurements; such methods are not available for low-end sensors. The minimum (black panel) and maximum (white panel) values in the mosaics generated under different settings provide insight into the precision of the pixel values in the mosaic.
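A minimal sketch of how the panel values could be summarized is given below. It assumes that the pixels falling on each panel have already been extracted from a mosaic as (red, green, blue) tuples with 8-bit values; the function name and the sample numbers are hypothetical and are not measurements from this study.

```python
def average_panel_extreme(panels, pick):
    """Apply `pick` (min or max) to each band of each panel, then average
    the per-panel results band by band."""
    bands = ("red", "green", "blue")
    summary = {}
    for index, name in enumerate(bands):
        per_panel = [pick(pixel[index] for pixel in panel) for panel in panels]
        summary[name] = sum(per_panel) / len(per_panel)
    return summary


# Hypothetical pixel samples for two black and two white panels.
black_panels = [[(12, 10, 20), (14, 11, 22)],
                [(16, 12, 24), (18, 15, 30)]]
white_panels = [[(250, 252, 251), (254, 255, 253)],
                [(252, 251, 255), (251, 250, 254)]]

black_minimum = average_panel_extreme(black_panels, min)
white_maximum = average_panel_extreme(white_panels, max)
```

Repeating this for each mosaic gives one minimum and one maximum value per band per subset, which can then be compared across subsets.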

Mosaic generation
Aerial images collected from both missions (15 m and 13 m) were mosaiced under different settings. The image mosaics were generated with Agisoft™ software installed on a Windows 10 computer with a 9th-generation i7 processor, 8 GB RAM, and a 1660 Ti graphics card with 6 GB VRAM. The process flow for creating the mosaic is shown in Figure 4. The first step was to import all photos that were collected at each flight height.

Figure 4.
Workflow for building mosaic in Agisoft™ using aerial images acquired at 13 and 15 m above the study area.
The camera calibration steps described by Agisoft (2011) were applied. Next, the images were aligned, and the missing data were filled in by the meshing process to generate the mosaic. The time taken to generate each mosaic was recorded. The final mosaic products were saved as TIFF files.
The flow diagram shown in Figure 4 was followed in the present work. A mosaic was generated for each height; successive mosaics were created by removing individual images in 10% steps down to 40% or even less, until potholes and data loss were noticed. The time taken for image alignment and mesh building remained the same at every step, because the images were removed to reduce the overlap only after the alignment and mesh had been built. The change in time is therefore due only to the mosaic building process. The time needed to generate the output (mosaic) at each step was recorded. The following settings were used in Agisoft™'s Reduce Overlap tool: capture distance = 40 m; image overlap = High; max images = the number of images to be removed at each percentage step, which was tabulated for this study. The quality of each mosaic was visually assessed, and the average minimum (black panel) and maximum (white panel) reflectance values were computed for the mosaics generated with different percentages of the images acquired for the study area.
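The image counts for each subset follow directly from the percentages. The sketch below is an illustration and not the Reduce Overlap tool itself; the function name is hypothetical, and rounding to the nearest whole image is an assumption that reproduces the counts reported for the 13 m mission (n = 145 at 60% and n = 96 at 40% of 241 images).

```python
def subset_plan(total_images, percents):
    """Map each retained percentage to (images kept, images removed)."""
    plan = {}
    for percent in percents:
        kept = round(total_images * percent / 100)
        plan[percent] = (kept, total_images - kept)
    return plan


steps = (100, 80, 60, 40, 20)
plan_13m = subset_plan(241, steps)  # 241 images acquired at 13 m
plan_15m = subset_plan(319, steps)  # 319 images acquired at 15 m
```

Each entry gives the number of images retained for the mosaic and the number that would have to be removed from the full set to reach it.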

Time required to generate mosaics
The processing time recorded for each mosaic is a combination of two steps: a) alignment of images and mesh building, and b) the mosaic building process. Since the first step was completed with 100% of the images at both elevations (Tables 3 and 4), the processing time for the first step remains the same for all settings. Differences in time are due to the mosaic building step.
Table 4. Time taken to generate mosaics with different percentages of aerial images acquired at a flight height of 15 m. All 319 aerial images were used for aligning and mesh building prior to generating the mosaic.
Removing photos reduced the processing time, but the reduction was not linear, and in a few instances the time taken remained the same (Tables 3 and 4). Images from both datasets, acquired at 13 and 15 meters, aligned without any noticeable problems. For aerial images acquired at 13 m, tie point identification was 94% at each removal step. For 15 m, however, tie point identification varied from 66% to 97%.
Based on visual inspection, the mosaic generated with 100% of the images (n = 241) acquired at 13 meters appeared to be of high quality (Figure 5). No distortion, loss, or potholes were noticed in mosaics generated with as few as 60% of the images (n = 145).
Loss of information was noticed when 40% of the images (n = 96) were used, and there were more gaps in the mosaic generated with 20% of the images. The mosaic generated with 100% of the images (n = 319) acquired at 15 m appeared to be of high quality (Figure 6). When 20% of the images were eliminated, minor loss was noticed in the south-eastern corner of the field. Removing an additional 20% of the images at each step resulted in increasing loss of information. When only 40% of the images were used, several gaps appeared along the field boundary.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIV-M-3-2021, ASPRS 2021 Annual Conference, 29 March-2 April 2021

Changes in the reference panel values
The minimum (black panel) and maximum (white panel) values obtained for all reference panels in each mosaic generated with 100%, 80%, 60%, 40% and 20% of the aerial images acquired at 13 m and 15 m are presented in Figures 7 and 8, respectively. The minimum (black panel) values recorded from the mosaics generated with aerial photos acquired at 15 meters were also above zero in all three (RGB) bands (Figure 8, above). Minimum values in the blue band (blue dots) were higher than those in the red (red dots) and green (green dots) bands. The maximum (white panel) values varied between 251 and 255 for all the RGB bands (Figure 8, below). Unlike the minimum values, the maximum values did not show any distinct pattern across the three (RGB) bands.
Minimum and maximum values in the mosaics generated with different numbers of aerial images acquired at 13 and 15 meters showed random variations rather than increasing or decreasing patterns. In other words, the number of photos used for generating the mosaic did not affect the minimum or maximum values. It is also evident that the minimum values were well above zero.
Figure 8. Minimum, black panel (above) and maximum, white panel (below) values measured in the blue, green, and red bands (as colored dots) from the mosaics generated with different percentages of aerial images acquired at 15 meters.
Based on these results, we can conclude that the elevation at which the images were acquired influences the quality of the mosaic. Mosaics could be generated with relatively fewer images (20%) acquired at 13 meters, indicating that the flight height plays an important role. These findings concur with those reported by Seifert et al. (2019).

CONCLUSIONS
Low-cost drones can be used to generate image mosaics for monitoring crop growth. In this study, a widely available drone was used, but we hypothesize that similar low-cost drones can achieve similar results.
As the number of photos used for generating the mosaic decreased, the time required to generate the mosaic was slightly reduced. When working with a larger number of images, considerable time can be saved. Nevertheless, we recommend that users collect the maximum number of images for their study area that the drone's flight time allows. This ensures that no information is lost when mosaics are later generated with fewer images in order to reduce processing time and computing resources. We recommend setting the front overlap as high as 90%, since this does not change the flight time, and setting the side overlap to at least 50%. Results obtained in this study indicate that it is possible to generate mosaics without gaps and distortions using 80% of the images.
Relatively fewer images acquired from 13 meters can be used to generate mosaics without gaps and distortions. In contrast, more images acquired from 15 meters were required to generate acceptable mosaics.