LOCATION DISCOVERY OF VECTOR DATA UPDATE DRIVEN BY MAP IMAGE TILES FOR NATIONAL GEO-INFORMATION SERVICE PLATFORM

With the popularization of geographic information data applications, new requirements are put forward for the rapid update of vector data. Updating vector data as a whole is expensive and time-consuming. Therefore, we need technical means that intelligently sense changes of geographic entities and actively monitor changes in vector data. In this paper, the building layer is selected to locate the vector data that need to be updated, because buildings are closely related to human activity and have become active geographic entities as urbanization proceeds. We first train a model using single building layer tiles and image tiles. Then, based on the trained model, the locations where the building layer tiles are inconsistent with the image tiles are found in the area to be detected. We set different thresholds for different situations to find the positions to be updated in the vector data. After manual verification, the overall accuracy of the proposed method is 89%. This paper provides new insights into the update discovery of vector data. In addition, if the boundary accuracy of the extracted buildings is further improved, the extracted building results can be directly applied to the fusion update of vector data.


INTRODUCTION
With economic and social development, the demand for geographic information data continues to grow in application fields such as earth science, environmental protection, natural resource management, and urban and regional planning. The release of the National Geo-Information Service Platform "Tianditu" aims to promote the sharing and efficient utilization of national geographic information resources (Zhang et al., 2021), improve the capability and level of public geographic information services, and better meet the needs of national informatization construction. "Tianditu" provides the public with high-resolution images, terrain and vector data that can be browsed over the Internet in the form of tiles. Vector data usually need to be collected and edited manually, which gives them high precision and strong reliability. With the popularization of geographic information data applications, new requirements are put forward for the efficiency of vector data update: changes must be discovered and applied rapidly. However, the acquisition of vector data is a difficult, expensive and time-consuming process (Yilmaz and Caniberk, 2018), so vector data often lag behind remote sensing images in timeliness. As a result, changes to some geographic entities are not reflected in the vector data in time.
Most areas in the vector data are usually unchanged. Thus, in order to avoid updating the vector data as a whole, we need technical methods that intelligently perceive changes of geographic entities and actively monitor changes in vector data. Based on the change discovery information, vector data can be updated in a targeted way at specific locations, which saves update cost. According to the data source, change discovery technology can be divided into image-based change discovery (Zhao et al., 2020; Hou et al., 2014), point cloud-based change discovery (Czerniawski et al., 2021; Fuse and Yokozawa, 2017), and Internet-based change discovery. Although some geographic entity change information (such as attribute changes and ownership changes) can be found through the Internet, mapping this change information to space still requires technologies such as place name and address matching.
With the abundance of image resources and the development of change detection methods, image-based change discovery has received more attention. For example, Yoon et al. (2003) utilized the change vector analysis approach to detect areas changed by flooding. Liu et al. (2018) proposed a deep convolutional coupling network based on optical and radar data for change detection. Im et al. (2008) investigated five change detection methods to determine how new contextual features can improve change classification results and whether object-based methods can improve change classification compared to pixel-by-pixel analysis.
Vector data include multiple layers such as roads, water systems, buildings, and green spaces. Among them, buildings are artificial features closely related to human beings, and with the continuous advancement of industrialization and the acceleration of urbanization they are gradually becoming active geographic entities. The re-planning and construction of roads, the planning of green space and the changes of water systems are closely related to the changes of buildings. Therefore, we identify changes in buildings to find the areas where the vector data need to be updated.
Obtaining the spatial position and state change information of buildings from images is an important direction of remote sensing applications. To realize intelligent, accurate and rapid building recognition and change detection, many scholars have carried out research, which can be divided into traditional methods and methods based on deep learning. Traditional methods mainly identify buildings through discriminative features and construct corresponding feature sets to separate buildings from the background. For example, Huang and Zhang (2011) proposed the classic Morphological Building Index by analyzing the shape, orientation, brightness and contrast characteristics of buildings in the image, which achieved good recognition results on medium- and low-resolution satellite images. To improve robustness, some scholars introduced machine learning classification algorithms combined with hand-designed feature sets to identify buildings (Avudaiammal et al., 2020; Du et al., 2015). Owing to the powerful nonlinear expression ability of deep learning, semantic segmentation based on deep learning provides a powerful tool for building recognition. Vakalopoulou et al. (2015) combined convolutional neural networks (CNN), a support vector machine and a Markov random field to realize building recognition using QuickBird high-resolution satellite imagery. Some scholars combine CNN and instance segmentation to obtain instance segmentation results of buildings from high-resolution remote sensing images. The fully convolutional network (FCN; Long et al., 2015) replaces the fully connected output layers of a CNN with convolutional layers, so that it can receive input of any size and output prediction results in an end-to-end, pixel-to-pixel manner. FCN is one of the most effective neural networks for pixel-based semantic segmentation, and is also used for building recognition (Shrestha and Vanneschi, 2018).
Map image tiles are generated by tiling image data. Because they have a uniform width and height, they are well suited to deep learning algorithms. Based on the above analysis, this paper proposes a method for finding the locations to be updated in vector data based on map image tiles. The building layer in the vector data is taken as the research object. First, an FCN is trained on the image and label information obtained from "Tianditu" map tiles. Then, the trained FCN is used to extract buildings in the area to be detected. Finally, the positions to be updated in the vector data are found by calculating the coincidence rate between the extracted buildings and the building layer tiles.

ARCHITECTURE
The left part of Figure 1 shows the operational flow of finding the vector data location that needs to be updated and the right part of Figure 1 shows the result diagram of some steps in the process. It mainly includes three parts: the generation of building layer tile data, the extraction of buildings based on image tiles, and the discovery of areas to be updated in vector data.

Figure 1
Operational flow about location discovery of vector data that needs to be updated.

Generation of single building layer tiles
The building layer serves both as the label data for training the FCN in the second step (Section 2.2) and as the data to be checked in the third step (Section 2.3). FCN training requires an accurate correspondence between labels and image tiles. Therefore, single building layer tiles need to be obtained and processed so that they match the image tiles in size and position.
There are two methods for obtaining building layer tiles. The first method is to slice an existing building layer. Firstly, through the coordinate information of the building layer, we obtain the row and column numbers of the tiles covered by the building data. Then, the bounding coordinate range of each tile is calculated, and the vector data within the range is taken out and rasterized. Finally, as shown in Figure 2(b, c), we get the building layer tiles corresponding to the building layer (the selected part) of Figure 2(a).
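The tile indexing in the slicing step can be sketched as follows. This is a minimal illustration that assumes the common Web Mercator XYZ tiling scheme, which the paper does not state; the function names are hypothetical, and the rasterization of the clipped vector data is left out.

```python
import math

def lonlat_to_tile(lon, lat, zoom):
    """Return (column, row) of the XYZ tile containing a lon/lat point."""
    n = 2 ** zoom
    col = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    row = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the outer edge stay inside the tile grid.
    return min(max(col, 0), n - 1), min(max(row, 0), n - 1)

def tile_bounds(col, row, zoom):
    """Return (west, south, east, north) of a tile in degrees."""
    n = 2 ** zoom
    west = col / n * 360.0 - 180.0
    east = (col + 1) / n * 360.0 - 180.0
    north = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))
    south = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * (row + 1) / n))))
    return west, south, east, north

def tiles_covering(west, south, east, north, zoom):
    """List (column, row) of every tile intersecting a bounding box."""
    col_min, row_min = lonlat_to_tile(west, north, zoom)  # top-left corner
    col_max, row_max = lonlat_to_tile(east, south, zoom)  # bottom-right corner
    return [(c, r)
            for r in range(row_min, row_max + 1)
            for c in range(col_min, col_max + 1)]

# Illustrative extent only; not the study area used in the paper.
tiles = tiles_covering(118.70, 32.00, 118.72, 32.02, zoom=17)
```

For each returned tile, `tile_bounds` gives the coordinate range within which the vector data would be clipped and rasterized.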

Figure 2
Slice an existing building layer.
client-side for the user to view (please refer to the Mapbox official website for details). Building layer tiles can be extracted from raster tiles by pixel value and feature shape. Figure 3(a, c) shows the raster tiles and Figure 3(b, d) shows the single building layer tiles. Figure 3(b, d) is just a schematic diagram. The first method is used in this paper.

Figure 3
Extract single building layer tiles from raster tiles.

Extraction of buildings based on image tiles
The fully convolutional network is very suitable for dense prediction and semantic segmentation (Long et al., 2015), and building extraction from image tiles can be regarded as a dense pixel classification problem.
FCN replaces the fully connected layers of a CNN with convolutional layers, adopts up-sampling techniques such as spatial interpolation and deconvolution to restore the image scale, and finally makes pixel-by-pixel predictions to realize end-to-end segmentation. Compared with CNN, FCN has the following advantages: 1. Convolutional layers are used instead of fully connected layers, so that the network can receive inputs of any size; 2. The feature maps are up-sampled to the size of the original image by transposed convolution; 3. Using a skip architecture (Figure 4), semantic information from deep, coarse layers is combined with appearance information from shallow, fine layers to produce accurate and detailed segmentation.

Figure 4
Skip architecture combines coarse, high layer information with fine, low layer information (Long et al., 2015).
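The fusion performed by the skip architecture can be illustrated with a toy example. This sketch is a simplification and not the network used in the paper: it replaces the learned transposed convolution with nearest-neighbour up-sampling, works on single-channel score maps stored as nested lists, and uses hypothetical function names.

```python
def upsample2x(score):
    """Nearest-neighbour 2x up-sampling of a 2-D score map (list of rows)."""
    out = []
    for row in score:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out

def fuse_skip(coarse, fine):
    """Up-sample the coarse map to the fine map's size and add the two."""
    up = upsample2x(coarse)
    if len(up) != len(fine) or len(up[0]) != len(fine[0]):
        raise ValueError("fine map must be exactly twice the coarse map size")
    return [[u + f for u, f in zip(up_row, fine_row)]
            for up_row, fine_row in zip(up, fine)]

# A 1 x 2 score map from a deep layer (strong semantics, low resolution)
coarse = [[1.0, -1.0]]
# A 2 x 4 score map from a shallower layer (finer spatial detail)
fine = [[0.2, -0.1, 0.0, 0.3],
        [0.1, 0.4, -0.2, -0.3]]
fused = fuse_skip(coarse, fine)
```

The fused map keeps the sign pattern of the coarse prediction while the fine map adjusts individual pixels, which is the intuition behind combining coarse and fine layers.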
Before training, we transformed and augmented the training data by cropping, flipping, rotating, and normalizing it with mean and standard deviation. Through multiple iterations of training, we obtain a trained model and then apply it to building extraction in the area to be detected. Finally, we obtain the building extraction results of the area to be detected and store them in tiles corresponding to the row and column numbers.
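A point worth making explicit is that geometric transforms must be applied identically to an image tile and its label tile, whereas normalization applies to the image only. The following sketch shows this under simplifying assumptions: tiles are single-channel nested lists, the crop is square, rotations are multiples of 90 degrees, and all function names and parameters are hypothetical rather than taken from the paper.

```python
import random

def hflip(tile):
    """Mirror a tile left to right."""
    return [row[::-1] for row in tile]

def rot90(tile):
    """Rotate a tile 90 degrees clockwise."""
    return [list(col) for col in zip(*tile[::-1])]

def crop(tile, top, left, size):
    """Cut a size x size window whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in tile[top:top + size]]

def normalize(image, mean, std):
    """Standardize pixel values with a given mean and standard deviation."""
    return [[(v - mean) / std for v in row] for row in image]

def augment_pair(image, label, crop_size, mean, std, rng):
    """Apply the same random crop, flip and rotation to image and label."""
    top = rng.randint(0, len(image) - crop_size)
    left = rng.randint(0, len(image[0]) - crop_size)
    image = crop(image, top, left, crop_size)
    label = crop(label, top, left, crop_size)
    if rng.random() < 0.5:
        image, label = hflip(image), hflip(label)
    for _ in range(rng.randint(0, 3)):
        image, label = rot90(image), rot90(label)
    # Only the image is normalized; label values stay 0 or 1.
    return normalize(image, mean, std), label

rng = random.Random(0)
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
label = [[int(v) % 2 for v in row] for row in image]
aug_image, aug_label = augment_pair(image, label, 2, mean=7.5, std=4.6, rng=rng)
```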
It should be noted that we use the building layer tiles obtained from the "Tianditu" web map as the training labels. The reason is that only a few areas in the vector data do not match the image (these are the areas where the vector data need to be updated). In addition, using the building layer tiles does not require preprocessing such as geometric correction and building plotting, which effectively reduces the workload and makes it possible to obtain a large number of training samples.

Location Discovery of Vector Data Update
Similar to the first step (Section 2.1), we need to convert the raster tiles of the area to be detected into the building layer tiles. Then, the building layer tiles are compared with the extracted building results in the second step (Section 2.2) to judge whether they are consistent.
Change detection methods are mainly divided into pixel-based and object-based methods (Hussain et al., 2013). Compared with pixel-based methods, object-based change detection methods use multi-scale segmented objects as processing units, which can effectively improve the integrity of detection results. However, the scales of buildings acquired under different imaging conditions are often difficult to keep consistent, and the impact of inconsistent segmentation scales on detection accuracy remains an open problem. Therefore, this paper uses a pixel-based change detection method to determine whether the vector data need to be updated.
The comparison method is to calculate the coincidence rate of the building layer tiles and extracted building results. When the coincidence rate is less than the set threshold value, it is judged that the difference between the two tiles (Figure 1d, 1e) is relatively large. This shows that the vector data is inconsistent with the image data, and the vector data of the tile coverage area needs to be updated.
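The paper does not give a formula for the coincidence rate. One plausible reading, consistent with the later remark that tiles dominated by non-building pixels have a very high rate, is the fraction of pixels on which the two binary tiles agree. The sketch below implements that assumed definition; the function name is hypothetical.

```python
def coincidence_rate(layer_tile, extracted_tile):
    """Fraction of pixels on which two binary masks agree (1 = building)."""
    if len(layer_tile) != len(extracted_tile):
        raise ValueError("tiles must have the same number of rows")
    total = 0
    same = 0
    for layer_row, extracted_row in zip(layer_tile, extracted_tile):
        if len(layer_row) != len(extracted_row):
            raise ValueError("tiles must have the same number of columns")
        for a, b in zip(layer_row, extracted_row):
            total += 1
            same += (a == b)  # background agreement counts as well
    if total == 0:
        raise ValueError("tiles must not be empty")
    return same / total

layer = [[1, 1, 0, 0],
         [1, 1, 0, 0]]
extracted = [[1, 1, 0, 0],
             [0, 0, 0, 0]]
rate = coincidence_rate(layer, extracted)  # 6 of 8 pixels agree: 0.75
```

Because agreement on background pixels is counted, a tile with very few buildings scores high even when those buildings are missed, which is why a single fixed threshold is not sufficient.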

DATA
We evaluate the effectiveness of the method for finding locations to be updated in vector data in parts of Nanjing, Jiangsu Province, China. Four areas near Nanjing and one area near Xuzhou were selected to train the FCN model, and one area near Nanjing was selected for vector data update location discovery (Figure 5). The training area covers a total area of about 60 km² and 13,100 buildings (Table 1).

Name      Number of Buildings
Train1    1731
Train2    2256
Train3    1653
Train4    5497
Train5    1960
Predict   1441

Table 1. The number of buildings involved in training and validation.

Figure 5
Four areas near Nanjing and one area near Xuzhou were selected for training, and one area near Nanjing was selected for vector data update location discovery.
This paper uses 3079 tiles to train the FCN model and divides them into three parts: training, validation and evaluation. The proportions of the three parts are 80%, 10% and 10% respectively, and each part contains image tiles and label tiles (Figure 6). In addition, because foreground and background pixels are unevenly distributed, we calculate a weight for each class before model training. After training is completed, the model is used to detect the locations in the area to be detected where the vector data need to be updated.

Figure 6
Image data and label data for training the model.

RESULTS AND DISCUSSION
First, we use the trained model to extract the buildings in the area to be detected. Then, we check the consistency between the extracted buildings and the building layer tiles obtained from the raster tiles. Finally, the specific locations where the vector data need to be updated are marked in a point shapefile.

Figure 7
The buildings were extracted by the trained model.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France
The area to be detected contains a total of 256 tiles and about 1500 buildings. The extracted building results are shown in Figure 7. It can be seen that the extracted buildings cover the buildings in the image data well. There are two main problems with the extracted results: 1) The results of building recognition are still rough; in particular, the localization accuracy of building edges is not high enough (Yuan et al., 2020; Chen et al., 2017) (Figure 8, red box). 2) Shadows occlude some building objects in the image and distort their spectrum and texture, which increases the difficulty of identification and change detection (Jin et al., 2020) (Figure 8, yellow box). Although some methods have made progress on these issues, their robustness still needs further examination. This paper mainly uses the extracted building results and building layer tiles for consistency detection, so boundary blur and shadow issues are not its focus. A difference method is used to judge whether the extracted building results are consistent with the building layer tiles. After the difference value is obtained, a threshold must be set to judge whether the vector data at this position need to be updated. The threshold setting affects the robustness of the method. Therefore, this paper adjusts the threshold by analyzing different situations to make threshold screening more effective.
The threshold setting distinguishes three cases: in the first, the extracted building results and the building layer tiles contain a small number of buildings (Figure 9(a)); in the second, both contain a large number of buildings (Figure 9(b)); in the third, both contain a moderate number of buildings (Figure 9(c)).
In the first case, most of them are non-building areas, so the pixel coincidence rate is very high. When the building coverage rate of building layer tiles or extracted building results is less than 0.03, the threshold of coincidence rate is set to 0.9. In the second case, due to the inaccurate identification of building boundaries, it is easy to cause a low coincidence rate. Therefore, when the building coverage rate of both is greater than 0.3, the threshold of coincidence rate is set to 0.6. When the conditions of the first two cases are not met, the threshold is set to 0.7.
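The three-case rule above can be written as a small decision function. The numeric values come from the text; the function names are hypothetical, and the coverage rate is assumed to be the fraction of building pixels in a binary tile.

```python
def coverage(mask):
    """Fraction of building pixels (value 1) in a binary tile."""
    pixels = [value for row in mask for value in row]
    return sum(pixels) / len(pixels)

def coincidence_threshold(layer_cov, extracted_cov):
    """Select the coincidence-rate threshold from the two coverage rates."""
    if layer_cov < 0.03 or extracted_cov < 0.03:
        return 0.9  # case 1: almost no buildings in at least one tile
    if layer_cov > 0.3 and extracted_cov > 0.3:
        return 0.6  # case 2: both tiles are densely built
    return 0.7      # case 3: everything else

def needs_update(rate, layer_cov, extracted_cov):
    """Flag a tile when its coincidence rate falls below the threshold."""
    return rate < coincidence_threshold(layer_cov, extracted_cov)

# The same coincidence rate leads to different decisions in different cases.
dense = needs_update(0.65, layer_cov=0.4, extracted_cov=0.5)     # threshold 0.6
moderate = needs_update(0.65, layer_cov=0.1, extracted_cov=0.2)  # threshold 0.7
```

A rate of 0.65 is accepted for densely built tiles, where imprecise boundaries lower the agreement, but flags a moderately built tile for update.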
After filtering with the set thresholds, the vector data at 43 tile positions are found to need updating. Manual verification revealed some omissions in the detection, and the overall accuracy is 89%. The main reason for the omissions is that the threshold for the third case may be too low, so that situations such as the one shown in Figure 9(d) are missed. In addition, when a building is relatively tall, there is a deviation between the extracted building position and the building position in the building layer tiles (Figure 9(e)).

Figure 9
Comparison of building layer tiles and extracted building results.

CONCLUSION
This paper proposes a method for finding the locations that need to be updated in vector data based on map image tiles. The main work of the paper is as follows. First, building layer tiles corresponding to the image tiles were generated. Second, the FCN model was trained on building layer tiles and image tiles, and the buildings in the area to be detected were extracted using the trained model. Third, by comparing the extracted results with the building layer tiles in the area to be detected, the locations of the vector data that need to be updated were found.
We used 256 tiles to verify the method and found that the vector data of the area covered by 43 tiles need to be updated. Manual verification showed that some omissions occurred in the detection, and the overall accuracy was 89%. The omissions are mainly caused by the threshold setting of the coincidence rate and the boundary accuracy of the extraction results.
In future research, we need to use more data to verify the robustness of the method proposed in this paper and explore the adaptive setting of the threshold. Furthermore, we need to focus on improving the boundary accuracy of the extracted buildings so that it can be directly applied to the fusion update of vector data.