HONEY CROP ESTIMATION FROM SPACE: DETECTION OF LARGE FLOWERING EVENTS IN WESTERN AUSTRALIAN FORESTS

Recent studies have shown that in the spectral space there is often a better spectral separation between leaves and flowers and even between flowers of different species than between leaves of different species. In this study we assess the ability of satellite remotely sensed data to detect the flowering of Red Gum trees (Corymbia calophylla) in Western Australia, the state’s largest annual honey crop. Spectroradiometer measurements of flowers, leaves and groundcover from Red Gum forests were subjected to ANOVA analysis, which showed that flowers are spectrally different to their environment for 92% of the wavelengths between 350 nm and 1800 nm. A more detailed assessment, using the JM Distance calculation, showed that the spectra can be reliably separated using 10% of the wavelengths, with peak separation between 518 nm and 557 nm. To assess the ability of satellite-borne sensors to detect the presence of flowers, the spectroradiometer data were convolved with satellite instruments’ response curves to create synthetic remotely sensed datasets on which JM Distance analysis was performed. MODIS blue bands achieved a median JM Distance of greater than 1.9 and therefore should be able to detect the presence of flowers from the environment. Further assessment showed that the shortest wavelength bands for MODIS, VIIRS and Sentinel 3 all occur where the flower spectra have lower reflectance than their natural background. A sensitivity analysis of percentage flower cover for a pixel showed that the highest sensitivity was obtained by dividing the band closest to 520 nm by the shortest wavelength band for data from these three sources. The MODIS band 10/band 8 metric was tested for its ability to detect flowers in real-world data using 15 years of qualitative honey harvest data from one apiary site as a proxy for flower density. 
This test was successful as, while there was some overlap between good, moderate and poor years, the poor years could be separated from the other years with nearly 80% accuracy.


INTRODUCTION
Remote sensing classification relies on the spectral response of different objects of interest having key spectral regions of difference, and those differences being measurable within the detection limits of the system. While remote sensing is often used to map vegetation types, its accuracy can be limited in this application due to the overlap of spectra between different species (Price, 1994). Indeed, the variation in spectra between individual leaves on a single tree can be greater than the difference in spectra between species in some cases (Cochrane, 2000).
Recent studies have shown that in the spectral space there is often a better spectral separation between leaves and flowers (Gross and Heumann, 2014) and even between flowers of different species than between leaves of different species (Shrestha et al., 2013).
Even with the limited spectral dimensionality of a standard 3-band Digital Single Lens Reflex (DSLR) camera, it can be possible to clearly differentiate between flowers and leaves of some species (Campbell and Fearns, 2018). However, the spatial resolution required for effective spectral separation is not typically available except via Unmanned Aerial Vehicle (UAV) platforms. As a consequence, low spectral resolution sensing is not considered suitable for large spatial scale or high temporal frequency mapping.
In this paper, we explore the potential for the use of imagery with a higher spectral resolution but lower spatial resolution than 3-band DSLR data for the detection of flowering Corymbia calophylla (Red Gum) trees in Western Australia. Honeybees foraging on this species produce the state's largest annual honey crop (Painter, 2010) and can produce honey with some of the highest antibacterial activity of any honey variety in Australia (Irish et al., 2011). As the species occurs across a large portion of the South-West Floristic Province (see Figure 1), the ability to remotely detect where Red Gum trees are flowering may help apiarists to better manage their seasonal apiary movements and thereby increase production of higher-value honey.
A total of 325 reflectance spectra of leaves, groundcover and flowers were acquired on clear, sunny days in September 2015 (spring) and February 2016 (summer, the peak flowering period) between 10 am and 2 pm. White reference measurements (Spectralon) were collected at least every 10 minutes (as per the ASD manual), and 10 records were taken and averaged per measurement. A minimum of 5 measurements were collected per target, with the spectroradiometer optic fibre held by hand 5-10 cm from vegetation targets and at 1 m height above ground targets. Typically 2 or 3 different targets were measured on individual trees (e.g. multiple different leaf or flower clusters). Measurements of groundcover spectra were also taken, which included leaf litter, low vegetation (< 0.5 m tall), gravel and sealed roads.
A total of 2,151 spectral bands were acquired at wavelengths between 350 nm and 2,500 nm. Spectral data at wavelengths greater than 1,800 nm were deleted to remove atmospheric water absorption features. The spectral resolution was also reduced from the 1 nm bandwidth exported by the ASD software to the actual spectral resolution of the sensors, that is, 3 nm from 350 nm to 1,000 nm and 10 nm from 1,000 nm to 2,500 nm. Spectra were also corrected for steps in sensor brightness calibration (Dallon, 2003).
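The truncation and resolution reduction described above could be sketched as follows. This is a minimal illustration only: it assumes 1 nm input sampling and simple block averaging, neither of which is confirmed by the text as the method actually used, and the function name `truncate_and_resample` is hypothetical.

```python
from statistics import mean


def truncate_and_resample(wavelengths_nm, reflectance, max_nm=1800,
                          split_nm=1000, fine_step=3, coarse_step=10):
    """Drop samples above max_nm, then block-average the 1 nm samples.

    Assumption (not from the paper): bins are fine_step samples wide when
    they start below split_nm and coarse_step samples wide otherwise.
    """
    kept = [(w, r) for w, r in zip(wavelengths_nm, reflectance) if w <= max_nm]
    out_w, out_r = [], []
    i = 0
    while i < len(kept):
        step = fine_step if kept[i][0] < split_nm else coarse_step
        block = kept[i:i + step]
        out_w.append(mean(w for w, _ in block))  # bin centre wavelength
        out_r.append(mean(r for _, r in block))  # bin mean reflectance
        i += step
    return out_w, out_r
```

Called on a 350 nm to 2,500 nm spectrum at 1 nm sampling, this returns 3 nm bins up to 1,000 nm and 10 nm bins from there to 1,800 nm.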
While some studies have normalised individual reflectance spectra to reduce the impact of differences in brightness from target orientation and illumination (Feilhauer et al., 2010), this was not done for this study. Work on multi-band DSLR data has shown better discrimination of flowers in raw reflectance data (Campbell and Fearns, 2018), because overall pixel intensity can separate near-white pixels from grey or near-black pixels that have the same normalised ratio of red, green and blue reflectance. Accordingly, additional care was taken to only record spectra of targets in full, direct sunlight.

SPECTRAL SEPARATION OF CORYMBIA CALOPHYLLA FLOWERS
Previous studies have used a range of methods to assess correlation between spectral data and vegetation species and/or assemblages. Often a staged approach is used, in which hyperspectral bands are discriminated using progressively more complex algorithms, and this has often been successful. The primary reason for this approach is that applying the simpler but less quantitative algorithm first significantly reduces the dimensionality of the dataset, and so reduces the volume of data passed to the more complex and time-consuming, but more quantitative, algorithms (Vaiphasa et al., 2005).
A review of eight different studies that included band selection from spectroscopy data showed that the most commonly used approach is an analysis of variance (ANOVA) to reduce dimensionality, followed by calculation of the Jeffries-Matusita (JM) Distance for the bands with the highest mean separation from the ANOVA results. This approach has been used for applications such as plant species discrimination from leaf reflectance for broadleaf trees, papyrus, mangroves and grasses, as well as fungal effects on soybeans (see Table 1 for a summary of the methods used). As the spectral separation between flowers and leaves was expected to be greater than in many of these applications, particularly spectral separation between leaves of different species (Gross and Heumann, 2014), a more complex approach to quantify subtle spectral differences was not warranted.
For this staged approach, the spectra were first divided into three groups: Flowers (F), Leaves (L) and Ground (G). The range and mean of these groups are shown in Figure 3.
Figure 3. Range and median of spectra for the three groups
The first stage of separation analysis, an ANOVA assessment for dimensionality reduction, was performed on the three groups of spectra using a custom Python script, testing the null hypothesis that there was no statistically significant difference between the means of each pair of groups (results in Figure 4). Based on a p-value of less than 0.05 to reject this hypothesis, reflectances were significantly statistically different for:
- 94% of the Flower vs Ground wavelengths
- 98% of the Flower vs Leaf wavelengths
- 92% of the Flower vs Ground AND Leaf wavelengths
To further reduce spectral dimensionality and assess the ability of different groups of bands to reliably differentiate between flowers and other objects, the Jeffries-Matusita (JM) Distance was calculated for a series of band groups (Schmidt and Skidmore, 2003). This started with all bands with a p-value less than 0.05, followed by groups of progressively higher mean separation (ANOVA F-value, the blue line in Figure 4). That is, after the JM Distance was calculated on all data with a p-value < 0.05, it was calculated on the highest 50% of the F-values, the highest 25% of the F-values, and so on until two bands remained (0.5% of the F-values).
The progressive increase in mean separation was done to evaluate the effectiveness of higher or lower data dimensionality versus higher or lower data quality (i.e. whether fewer bands of higher separation are better than more bands, and therefore more data points, of lower separation). As Figure 4 shows, the mean separation is generally higher in the visible bands, particularly in the region of 500 nm to 550 nm.
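The band ranking and progressive selection described above might look like the sketch below. It is not the authors' script: the helper names `anova_f` and `top_fraction_by_f` are hypothetical, only the F-statistic is computed (a p-value would need the F distribution, for example via `scipy.stats.f_oneway`), and the floor of two bands reflects the stopping point mentioned in the text.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F-statistic for one band; groups is a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    # Mean square between groups divided by mean square within groups.
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def top_fraction_by_f(f_values, fraction):
    """Indices of the bands in the top `fraction` of F-values (at least two)."""
    n_keep = max(2, round(len(f_values) * fraction))
    ranked = sorted(range(len(f_values)), key=lambda i: f_values[i], reverse=True)
    return sorted(ranked[:n_keep])
```

Calling `top_fraction_by_f` with fractions of 0.5, 0.25 and so on reproduces the idea of progressively narrowing the band set before each JM Distance calculation.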
Graphs of JM distance for each class pair (flower-ground and flower-leaves) are shown in Figure 5. There is a general increase in JM Distance (and therefore spectral separation) as the spectral bands are restricted to those with increasing F-value from the ANOVA analysis. Using the median values as a reliable estimate of separability and a filter of median JM Distance > 1.9 for accurate classification (Vaiphasa et al., 2005), the top 10% of spectral bands from the ANOVA results should be able to achieve reliable classification results between flowers and other objects (28 bands at wavelengths of 476-566 nm).
Using the top 14 spectral bands, or fewer, means that over 75% of the data points are clearly separated (wavelengths 518-557 nm).
Note that this finding only applies to the detection of flowers versus leaves or ground in Red Gum forests, for spectral data of equivalent spectral and spatial resolution to this dataset.
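For readers unfamiliar with the JM Distance, a minimal single-band sketch is given below. It assumes each class is Gaussian in that band and uses the common form JM = 2(1 - exp(-B)), where B is the Bhattacharyya distance, so that values approach 2 for fully separable classes. The study applied the measure to groups of bands, which would need the multivariate form with covariance matrices; the function name `jm_distance` is hypothetical.

```python
import math
from statistics import mean, variance


def jm_distance(a, b):
    """JM Distance between two samples of a single band (Gaussian assumption)."""
    m1, m2 = mean(a), mean(b)
    v1, v2 = variance(a), variance(b)
    # Univariate Bhattacharyya distance: a mean-difference term plus a
    # variance-mismatch term.
    bhattacharyya = ((m1 - m2) ** 2) / (4 * (v1 + v2)) \
        + 0.5 * math.log((v1 + v2) / (2 * math.sqrt(v1 * v2)))
    return 2 * (1 - math.exp(-bhattacharyya))
```

Under this form, the threshold of 1.9 used in the text corresponds to classes whose distributions barely overlap.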

MULTISPECTRAL SEPARATION ASSESSMENT
While a proven spectral discrimination ability from hyperspectral data is a useful finding, the ability to detect flowering plants typically relies on the temporal variability of the area of interest to track phenological changes (Blomstedt, 2014). Given the cost of acquiring hyperspectral images, acquiring repeat hyperspectral datasets at regular intervals over a season is unlikely to be a practical solution.
As a result, the raw spectra acquired with the FieldSpecPro were convolved with the spectral response functions of a range of different multispectral satellite-borne sensors to produce synthetic pixels of the hypothetical measured reflectance of flowers, leaves and ground classes. The convolution was applied to each of the field-measured spectra, creating 325 measurements for each band of each satellite sensor. Graphs of the spectral bands for each sensor overlain on the median spectra for leaves, flowers and ground are provided in Figure 6. Note that, as per the previous section of this paper, the peak separation is between 518 nm and 557 nm from the JM Distances calculated from the original spectroradiometer data.
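The convolution step can be illustrated as a response-weighted average of the field spectrum. The sketch below is an assumption-laden stand-in: real work would use each sensor's published relative spectral response curve, whereas here a hypothetical Gaussian response (`gaussian_srf`, defined by a centre and full width at half maximum) is used so that the example is self-contained.

```python
import math


def gaussian_srf(wavelengths_nm, centre_nm, fwhm_nm):
    """Hypothetical Gaussian spectral response sampled on the wavelength grid."""
    sigma = fwhm_nm / (2 * math.sqrt(2 * math.log(2)))
    return [math.exp(-0.5 * ((w - centre_nm) / sigma) ** 2) for w in wavelengths_nm]


def band_reflectance(reflectance, srf):
    """Response-weighted mean reflectance for one synthetic band.

    Assumes reflectance and srf share the same uniformly spaced wavelength grid.
    """
    return sum(r * s for r, s in zip(reflectance, srf)) / sum(srf)
```

Applying `band_reflectance` to every field spectrum and every band of a sensor yields one synthetic multispectral measurement per spectrum, which is the form of dataset the JM Distance analysis was then run on.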
The JM Distances of the synthetic satellite reflectances were calculated, using progressively greater mean separation to select the bands (similar to the assessment of the original spectroradiometer JM distances from the ANOVA results).
The results from this process are shown in Table 2. With the highest separation from the ASD spectroradiometer data being achieved across a bandwidth of only 39 nm, this result is consistent with Figure 4 and Figure 5: while the majority of the wavelengths between 350 nm and 1800 nm are statistically different based on the ANOVA analysis, only a small portion of the wavelengths are able to reliably separate the target classes.
Satellites with too broad a bandwidth are not able to adequately resolve this key spectral zone, nor are satellites with higher spectral resolution that is not attuned to the 518-557 nm wavelengths.
Table 2 shows that only the MODIS sensor has a median JM distance greater than 1.9 and thus has sufficient spectral resolution to reliably distinguish between the flower, leaves and ground classes in the spectral space. Two bands from MODIS achieved this degree of separation: Band 4 (538-568 nm) and Band 11 (519-540 nm). Both of these bands are within the portion of the spectrum that achieved a median JM Distance of 2 from the ASD spectroradiometer data (476-566 nm).
Figure 6. Spectral bands for satellite sensors overlain on median reflectance spectra from ASD spectroradiometer measurements.

MINIMUM FLOWER COVERAGE FOR DETECTION
While JM Distance analysis has indicated that the MODIS sensor is capable of spectrally separating Red Gum flowers from background reflectance, the very high spatial resolution of the data in this analysis (field of view typically less than 50 cm²) means that the flowers made up almost the entire field of view of the spectroradiometer. With the optimum MODIS bands having a spatial resolution of 500 m (Band 4) or 1,000 m (Band 10), it is considered improbable that a single MODIS pixel would consist entirely of Red Gum flowers.
In order to estimate the percentage flower cover required for flowers to be detected, the percentage of flower coverage needed to change the reflectance by 1 standard deviation (SD) was calculated for backgrounds ranging from 0 to 100% leaf cover. This was done for the MODIS bands with the highest JM Distance (bands 3, 4, 10, and 12), as well as for several derived spectral products. These derived products included calculated NDVI and EVI products, as well as other combinations of MODIS visible spectral bands to determine the effectiveness of a 'visual band intensity' metric. This is in a similar vein to the work completed by Sulik and Long (2016), who developed a Normalised Difference Yellowness Index (NDYI) to better predict canola yields based on the coverage of yellow canola flowers, as it was found that NDVI decreased as flower cover increased.
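The sensitivity calculation for a single band can be sketched under a linear mixing assumption, in which the pixel reflectance is an area-weighted sum of flower, leaf and ground reflectance. The text does not state the mixing model or exactly how the SD was defined, so both the model and the function names below are illustrative assumptions.

```python
def background_reflectance(leaf_fraction, leaf_refl, ground_refl):
    """Linear mix of leaf and ground reflectance for a flower-free pixel."""
    return leaf_fraction * leaf_refl + (1 - leaf_fraction) * ground_refl


def min_flower_cover(flower_refl, background_refl, background_sd):
    """Smallest flower fraction that shifts a mixed pixel by 1 SD.

    With mixed = f * flower + (1 - f) * background, the shift from the
    background is f * (flower - background), so f = SD / |flower - background|.
    Returns None when even 100% flower cover cannot produce a 1 SD shift.
    """
    contrast = abs(flower_refl - background_refl)
    if contrast == 0:
        return None
    fraction = background_sd / contrast
    return fraction if fraction <= 1 else None
```

Sweeping `leaf_fraction` from 0 to 1 and calling `min_flower_cover` at each step gives a curve of the kind summarised for each band.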
The results from this process are presented graphically in Figure 7 for the MODIS bands and Figure 8 for the MODIS derived indices. The minimum, maximum and mean flower coverages required for 1 SD variation for all background scenarios are provided in Table 3.
For a minimum detection limit cutoff of 1 standard deviation (i.e. the intensity of the pixel would be at least 1 SD different to the background median value), the percent flower coverage limit is between 20 and 30% for the visible MODIS spectral bands tested here, all of which had a median JM distance of greater than 1.9 between flowers and other classes. The best performing band was Band 4, with a 24.9% mean flower coverage. The vegetation indices required a minimum of 50-60% leaf cover for the background for flowers to create a 1 SD variation (NDVI required as much as 69% flower cover at 50% background leaf cover and EVI required 84% flower cover at 60% leaf background cover). These minimum coverage percentages decrease with increasing flower coverage, which is not surprising due to the larger decrease in chlorophyll content with increasing flower coverage for higher leaf coverage. This correlates with the research by Sulik and Long (2015), who found the same result in canola, with NDVI decreasing as the yellow flowers became a dominant influence on the spectra.
The best results are from indices created by dividing the visible band data with the highest JM Distance (bands 3 or 4 in the 500 m spatial resolution data and bands 10 or 11 in the 1,000 m data) by Band 8 (ultraviolet). As flowers have predominantly higher reflectance in the visible bands and lower reflectance in the UV and near-UV bands (see detail of spectra in Figure 9), this process highlights both differences.
As a result, these calculated indices reach the 1 SD criterion at a mean of 1.4% flower coverage for Band 3 (500 m resolution data) and 1.7% for Band 10 (1,000 m resolution data). The response was also consistent regardless of leaf versus ground present in the background for the synthetic pixel, with Band 3 requiring no more than 1.9% flower cover and Band 10 no more than 2.1%. This is a significant improvement over using individual visible band data, visible band combinations or vegetation indices for flower detection.
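Why a band ratio responds at much lower flower cover than a single band can be seen in a small numerical sketch. It again assumes linear mixing within each band, and all reflectance values in the usage note are made up for illustration rather than taken from the measured spectra; the function names are hypothetical.

```python
def mixed(fraction, flower, background):
    """Linearly mixed reflectance for a given flower fraction."""
    return fraction * flower + (1 - fraction) * background


def ratio_index(fraction, flower_vis, flower_uv, bg_vis, bg_uv):
    """Visible band divided by the short-wavelength band for a mixed pixel."""
    return mixed(fraction, flower_vis, bg_vis) / mixed(fraction, flower_uv, bg_uv)


def min_cover_for_ratio(flower_vis, flower_uv, bg_vis, bg_uv, ratio_sd, step=0.001):
    """Smallest flower fraction (to the nearest step) shifting the ratio by 1 SD."""
    baseline = ratio_index(0.0, flower_vis, flower_uv, bg_vis, bg_uv)
    for i in range(1, round(1 / step) + 1):
        fraction = i * step
        value = ratio_index(fraction, flower_vis, flower_uv, bg_vis, bg_uv)
        if abs(value - baseline) >= ratio_sd:
            return fraction
    return None
```

With illustrative values of 0.30 (flower, visible), 0.05 (flower, short wavelength) and 0.10 for the background in both bands, the numerator rises while the denominator falls as flower cover increases, so the ratio moves in one direction from both effects at once.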
Figure 9. Flower, leaf and ground spectra from 350-700 nm
A similar process of assessing flower coverage detection was performed on synthetic datasets for other satellite-borne sensors (Figure 10 and Table 4). The results for individual sensor bands correlated with the results of the JM distance analysis, showing that the bands most sensitive to flower detection for all sensors are less sensitive than the MODIS bands with higher JM distances. This correlates with the JM distance analysis of the synthetic data summarised in Table 2.
Where satellites had UV and near-UV bands (Band 1 for both Sentinel 3 and VIIRS), the same indices were calculated as for the MODIS data (visible band divided by the UV or near-UV band). The results, shown in Figure 11 and Table 4, show greater sensitivity to flower detection at low flower coverages than visual band data alone, producing detection limits more sensitive to flower coverage than when the same process was applied to MODIS data. The highest sensitivity was from Sentinel 3 (Band 6 / Band 1), with a mean flower coverage of just 1.1% required to make a difference of 1 SD to the synthetic pixel reflectance.
The improvement relative to MODIS is likely due to the specific bandwidth of the UV to near-UV bands: for example, Band 8 in MODIS ranges from 405 nm to 420 nm, while the equivalent Sentinel 3 band ranges from 350 nm to 405 nm. These slightly different bands mean that the Sentinel 3 band is located entirely in the low reflectance spectral region for flowers (< 410 nm), whereas the MODIS band includes a portion of the spectra where the flower reflectance is starting to increase (see Figure 9). As a result, the UV band reflectance of flowers from Sentinel 3 and VIIRS is generally lower than for MODIS, thus having a larger impact on the band ratio.
To assess the ability of satellite data to detect flowers from real-world rather than synthetic data, the flower coverage as inferred from MODIS data was compared to qualitative honey harvest data from 2003 to 2017 for an apiary site near Mundaring (near Perth, Western Australia). This site was also one of the locations where the spectroradiometer data were collected (Figure 2). The vegetation consists of Red Gum forest for more than 4 km in each direction from the apiary site.
Honey harvest data were in the form of the times when hives were placed and removed from the site (corresponding to the main flowering event for the season) and a rating of the year by the apiarist as poor/failed, moderate or good. While not a detailed quantitative dataset, it does give a reliable indication of honey flow. The harvest data are summarised in Table 5, with the week of the month that the hives were placed on and removed from the site shown by the width of the bar for the year.
It was suggested by the apiarist that poor to failed harvests were due to either hot, dry or wet summers. This claim is substantiated by a comparison of the harvest data in Table 5 with mean maximum temperature and rainfall over January and February from the Bureau of Meteorology, which are shown in Figure 12. The data were from the closest meteorological stations to the apiary site, which were Mundaring Weir for rainfall (6.7 km north) and Bickley for temperature (12.5 km west).
The good to moderate years are clearly bounded by years of less than 40 mm but more than 0 mm of rainfall and a mean maximum temperature of less than 32 degrees Celsius. There is only one failed year within these bounds (0.6 mm rainfall, mean maximum temperature 31.4 degrees Celsius).
Daily MODIS data were processed for a 3 x 3 pixel region at the apiary site (9 pixels, or a 3 x 3 km area). The median value of the most sensitive metric of consistent pixel size (Band 10/Band 8) was calculated for February each year. For 9 pixels and 15 years of data, this created 135 datapoints for comparison. Figure 13 shows the annual MODIS average band ratio from 2003 to 2017 with the honey harvest data represented by colours indicating the poor/failed, moderate and good years.
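One reading of the processing above is a per-pixel median of the daily Band 10 / Band 8 ratio across February, giving 9 values per year. The sketch below follows that reading. The input layout and the skipping of missing (for example cloudy) observations are assumptions, not details given in the text, and the function name is hypothetical.

```python
from statistics import median


def february_pixel_medians(daily_scenes):
    """Per-pixel median Band 10 / Band 8 ratio over one February.

    daily_scenes: list of days, each a list with one (band10, band8) tuple
    per pixel, or None where a pixel has no usable observation that day.
    """
    n_pixels = len(daily_scenes[0])
    medians = []
    for p in range(n_pixels):
        ratios = [day[p][0] / day[p][1]
                  for day in daily_scenes
                  if day[p] is not None and day[p][1] > 0]
        medians.append(median(ratios) if ratios else None)
    return medians
```

Running this once per year for the 9 pixels would produce the 135 values (9 pixels by 15 years) used in the comparison with the harvest ratings.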
Using ANOVA to test the null hypothesis that there is no difference in the means between the good, moderate and poor years, it was found that there was a statistically significant difference between good vs poor and moderate vs poor years (p < 0.0002 in both cases). However, there is not a statistically significant difference between the means of good vs moderate years (p = 0.0657). The data ranges are shown in Figure 14, where the difference in the medians of the groups is clear. The ability of the MODIS data to generate a classification tool was also tested, by considering minimum cutoff values for good and moderate vs poor years and calculating how accurately these cutoffs separated the data into the correct classifications.
The results show that a cutoff for the band 10/8 ratio of between 1.34 and 1.35 gives the best result, with a classification accuracy of 78%. A lower cutoff than this results in classification accuracy of the poor years decreasing. Above this cutoff, the accuracy of prediction of the good and moderate years decreases. While limited in spatial extent, this analysis does show that the quality of honey flow can be assessed using MODIS data and classified on a site-specific basis.
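The cutoff test can be sketched as a simple threshold sweep. The direction of the rule (a ratio below the cutoff predicts a poor year) follows the description above; the label strings and function names are hypothetical, and the data in any call would be the per-pixel February medians paired with the apiarist's rating for that year.

```python
def cutoff_accuracy(ratios, labels, cutoff):
    """Fraction of samples classified correctly by a single cutoff.

    A ratio at or above the cutoff predicts a good or moderate year;
    a ratio below it predicts a poor year.
    """
    correct = sum((r >= cutoff) == (label != "poor")
                  for r, label in zip(ratios, labels))
    return correct / len(ratios)


def best_cutoff(ratios, labels, candidates):
    """Candidate cutoff with the highest overall classification accuracy."""
    return max(candidates, key=lambda c: cutoff_accuracy(ratios, labels, c))
```

Lowering the cutoff moves poor-year samples into the good or moderate class, while raising it does the opposite, which is the trade-off described in the text.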

CONCLUSIONS
Based on an ANOVA analysis of field spectroradiometer data we can conclude that Red Gum flower reflectance spectra are significantly statistically different from the spectra of the surrounding environment for most wavelengths between 350 nm and 1800 nm. The JM Distance assessment suggests flowers are separable in hyperspectral data for 10% of wavelengths. The highest degree of separation is between 518 nm and 557 nm.
However, this difference does not translate well to bands of most satellites. Using the JM Distance assessment on synthetic satellite data generated from the field spectroradiometer data, only the MODIS data achieved a JM Distance of greater than 1.9 (using bands 3, 4, 10 or 11).
Further analysis showed that this spectral separation could be improved by dividing these bands by Band 8, which is on the blue/ultraviolet boundary and where flowers have a lower reflectance than their surrounding environment. A simple sensitivity assessment showed that as little as 1.0% flower coverage in a pixel may increase the band 10/8 ratio value by more than 1 SD of the background.
The Sentinel 3 and VIIRS synthetic satellite data also perform well with this approach, as their shortest wavelength bands are at slightly shorter wavelengths than the MODIS Band 8 and therefore increase the effect of dividing by this band.
The MODIS Band 10/Band 8 metric was tested for its ability to detect flowers in real-world data using 15 years of qualitative honey harvest data as a proxy for the presence of flowering Red Gum trees. This test was successful as, while there was some overlap between good, moderate and poor years, the poor years could be separated from the other years with nearly 80% accuracy.

Figure 4. ANOVA assessment results. Numbers less than 0.05 represent a statistically significant separation. Areas shaded in grey show where the p-value is less than 0.05 between flowers and both ground and leaves. Blue line is the F-value.

Figure 5. Number of spectral bands versus JM Distance values

Figure 7. Percentage cover of flowers required to change the reflectance by 1 standard deviation for different background ratios of ground and leaves for MODIS bands

Figure 10. Percentage cover of flowers required to change the reflectance by 1 standard deviation for different background ratios of ground and leaves for other sensors and satellites

Figure 11. Percentage cover of flowers required to change the reflectance by 1 standard deviation for different ratios of ground and leaves for Sentinel 3 and VIIRS derived indices

Figure 12. Annual summer weather conditions versus honey harvest quality
MODIS data were chosen for the correlation with honey harvest data as they have a longer temporal range than Sentinel 3 and VIIRS (since 2002 for MODIS, since 2016 for Sentinel 3 and since 2011 for VIIRS) and better temporal resolution than VIIRS (1-2 days for MODIS versus 16 days for VIIRS).

Figure 13. Median February MODIS and honey harvest data

Table 2. Median JM Distance results for common multispectral satellite sensors

Table 3. Minimum, maximum and mean flower coverage required for 1 SD change in reflectance for MODIS synthetic pixels

Table 4. Minimum, maximum and mean flower coverage required for 1 SD change in reflectance for non-MODIS synthetic pixels

Table 5. Honey harvest data from Mundaring, Western Australia