A CONCEPTUAL MODEL FOR CONVERTING OPENSTREETMAP CONTRIBUTION TO GEOSPATIAL MACHINE LEARNING TRAINING DATA
Keywords: Volunteered geographical information, GeoAI, data quality, OpenStreetMap, machine learning, crowdsourcing
Abstract. Over the past decade, Volunteered Geographical Information (VGI), in particular OpenStreetMap (OSM), has helped to fill substantial data gaps in base maps, especially in the Global South, and has thus become a promising source of massive, free training data with rich and detailed semantic information for geospatial artificial intelligence (GeoAI) applications. Although numerous studies have explored the potential of generating training data from OSM, a systematic approach to harvesting OSM contributions as quality-aware training data for different GeoAI tasks is still missing. To fill this research gap, we propose a conceptual model consisting of three major components: historical OSM and external datasets, quality indicators, and GeoAI models. As a proof of concept, we validate the conceptual model with an example task of detecting buildings missing from OSM in Mozambique, in which the impacts of different error sources in the training data (e.g., completeness, alignment, rotation) are compared and investigated quantitatively. The lessons learned in this paper shed light on incorporating OSM data quality aspects into the development of more explainable GeoAI models.
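The three error sources named in the abstract can be illustrated with a minimal sketch. It is not the procedure used in the paper, which the abstract does not specify. It assumes building footprints are given as lists of (x, y) vertices in image pixel coordinates, and all function names and parameters below are hypothetical.

```python
import math
import random
from typing import List, Tuple

Polygon = List[Tuple[float, float]]


def vertex_centroid(poly: Polygon) -> Tuple[float, float]:
    """Mean of the vertices (a simple stand-in for the area centroid)."""
    n = len(poly)
    return (sum(x for x, _ in poly) / n, sum(y for _, y in poly) / n)


def reduce_completeness(polys: List[Polygon], keep_ratio: float,
                        rng: random.Random) -> List[Polygon]:
    """Completeness error: keep only a random fraction of the footprints."""
    k = round(len(polys) * keep_ratio)
    return rng.sample(polys, k)


def shift(poly: Polygon, dx: float, dy: float) -> Polygon:
    """Alignment error: translate a footprint by a fixed pixel offset."""
    return [(x + dx, y + dy) for x, y in poly]


def rotate(poly: Polygon, degrees: float) -> Polygon:
    """Rotation error: rotate a footprint about its vertex centroid.

    The angle follows the standard 2D rotation matrix; in image coordinates
    with the y axis pointing down, a positive angle appears clockwise.
    """
    cx, cy = vertex_centroid(poly)
    t = math.radians(degrees)
    cos_t, sin_t = math.cos(t), math.sin(t)
    return [
        (cx + (x - cx) * cos_t - (y - cy) * sin_t,
         cy + (x - cx) * sin_t + (y - cy) * cos_t)
        for x, y in poly
    ]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the perturbation is repeatable
    square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
    labels = [shift(square, 5.0 * i, 0.0) for i in range(10)]
    # Keep 70% of the labels, then offset and rotate each remaining one.
    noisy = [rotate(shift(p, 1.5, -1.0), 10.0)
             for p in reduce_completeness(labels, 0.7, rng)]
    print(len(noisy), noisy[0])
```

Training one model per perturbed label set and comparing the resulting detection accuracy is one way to quantify how sensitive a model is to each error source.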