Spatially Constrained Geospatial Data Clustering for Multilayer Sensor-Based Measurements
- 1Department of Bioresource Engineering, McGill University, Macdonald Campus, Ste-Anne-de-Bellevue, Canada
- 2Department of Plant Science, McGill University, Macdonald Campus, Ste-Anne-de-Bellevue, Canada
- 3Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, USA
Keywords: Precision agriculture; proximal soil sensing; geospatial data clustering; management zones
Abstract. One of the most popular approaches to processing high-density proximal soil sensing data is to aggregate similar measurements into groups that represent unique field conditions. An innovative constraint-based spatial clustering algorithm has been developed for this purpose. The algorithm seeks to minimize the mean squared error during the interactive grouping of spatially adjacent measurements that are similar to each other and different from the other parts of the field. Following the successful implementation of a single-soil-property scenario, this research extended the algorithm to accommodate multiple layers of soil properties representing the same area under investigation. Six agricultural fields across Nebraska, USA, were chosen to illustrate the algorithm's performance. The three data layers considered were field elevation and apparent soil electrical conductivity for the deep and shallow layers of the soil profile. The algorithm was implemented in MATLAB R2013b. Prior to interactive grouping, geographic coordinates were projected and erroneous data were filtered out. Additional pre-processing included bringing each data layer to a common 20 × 20 m raster to facilitate multilayer computations. Interactive grouping starts with a search for a new “nest” to initiate the first group of measurements that are most different from the rest of the field. This group is grown using a neighbourhood search approach. Once growing the group fails to reduce the overall mean squared error, the algorithm seeks a new “nest”, which will grow into another group. This process continues until there is no benefit from separating out an additional part of the field. In the six-field trial, each case produced a reasonable number of groups that corresponded to agronomic knowledge of the fields. The unique features of this approach are the spatial continuity of each group and the capability to process multiple data layers. Further development will involve comparison with a more traditional k-means clustering approach and agronomic model calibration using targeted soil sampling.
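The nest search and neighbourhood growth described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation, and several details are assumptions made only for illustration: the function names (`total_sse`, `frontier`, `grow_groups`), the rule that a nest is the ungrouped cell farthest from the mean of the remaining field, the 4-neighbour adjacency, and the stopping parameters `min_gain` and `max_groups`. The input is assumed to be a rasterized array of shape (rows, cols, layers) whose layers have already been brought to comparable scales.

```python
import numpy as np

def total_sse(data, labels):
    """Sum of squared deviations of each cell from its group mean, over all layers."""
    sse = 0.0
    for g in np.unique(labels):
        cells = data[labels == g]                  # shape (n_cells, n_layers)
        sse += ((cells - cells.mean(axis=0)) ** 2).sum()
    return sse

def frontier(labels, group):
    """Ungrouped cells (label 0) that share an edge with `group` (4-neighbour rule)."""
    mask = labels == group
    near = np.zeros_like(mask)
    near[1:, :] |= mask[:-1, :]
    near[:-1, :] |= mask[1:, :]
    near[:, 1:] |= mask[:, :-1]
    near[:, :-1] |= mask[:, 1:]
    return np.argwhere(near & (labels == 0))

def grow_groups(data, min_gain=0.05, max_groups=10):
    """Greedy nest-and-grow grouping; label 0 is the not-yet-separated rest of the field.

    min_gain and max_groups are hypothetical stopping controls, not from the source.
    """
    labels = np.zeros(data.shape[:2], dtype=int)
    base = total_sse(data, labels)                 # error with the field as one group
    current = base
    for group in range(1, max_groups + 1):
        rest = labels == 0
        if rest.sum() < 2:
            break
        # Assumed nest rule: the ungrouped cell most different from the rest of the field.
        dev = ((data - data[rest].mean(axis=0)) ** 2).sum(axis=2)
        dev[~rest] = -np.inf
        nest = np.unravel_index(np.argmax(dev), dev.shape)
        trial = labels.copy()
        trial[nest] = group
        trial_sse = total_sse(data, trial)
        # Grow the group one adjacent cell at a time while the overall error drops.
        while True:
            best_sse, best_cell = trial_sse, None
            for r, c in frontier(trial, group):
                trial[r, c] = group
                s = total_sse(data, trial)
                trial[r, c] = 0
                if s < best_sse:
                    best_sse, best_cell = s, (r, c)
            if best_cell is None:
                break
            trial[best_cell] = group
            trial_sse = best_sse
        # Keep the new group only if it removed a meaningful share of the initial error.
        if current - trial_sse < min_gain * base:
            break
        labels, current = trial, trial_sse
    # Second value: squared error per raster cell, summed over layers.
    return labels, current / labels.size
```

Because every cell added to a group must touch that group, each group stays spatially contiguous by construction, which is the property the abstract highlights. This brute-force version recomputes the error for every candidate cell and is practical only for small rasters; a real implementation would likely update group sums incrementally.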