Research on broken road connection method after road extraction from high-resolution remote sensing image

To address the disconnections that appear after road classification of remote sensing images, this paper proposes an optimization method for connecting broken roads that takes spatial connectivity into account. First, the method extracts the road skeleton from the binarized image obtained by road extraction, uses the eight-neighborhood detection algorithm to find the road breakpoints, and removes isolated points on the road edge by mathematical morphology filtering. Second, the K-means clustering algorithm is used to group the road breakpoints, and invalid breakpoints are eliminated. Then, the breakpoints of each category are fitted with a polynomial curve, and the mathematical expression of each fitted curve is recorded. Finally, the coordinate sequence between the breakpoints of each category is calculated from the corresponding fitted polynomial, and the corresponding pixels are filled to the width of the road to realize automatic detection and connection. Images obtained by road extraction with the U-Net network are used to test the method. The results show that the proposed method connects well the broken roads caused by road or building shadows. In particular, a single broken road retains high shape integrity after repair. The proposed method has reference value for the classification and repair of linear objects such as roads, power grids and tracks.

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-3/W10, 2020 International Conference on Geomatics in the Big Data Era (ICGBD), 15–17 November 2019, Guilin, Guangxi, China This contribution has been peer-reviewed. https://doi.org/10.5194/isprs-archives-XLII-3-W10-387-2020 | © Authors 2020. CC BY 4.0 License.


INTRODUCTION
Roads, as an important basic feature in geographic information, are a key target of extraction from remote sensing images. Traditional techniques for road extraction include knowledge-based, classification-based, morphology-based and dynamic programming methods (Kahraman et al., 2018). The extraction result relies heavily on the image quantity and the effectiveness of the algorithm. In particular, when a road is covered by trees or buildings, the extracted road will be disconnected, which affects the shape integrity of the road network. A broken road also cannot be applied directly to spatial decision analysis.
The most important feature of a road as transportation infrastructure is spatial connectivity, which is the basis for ensuring that extracted roads are usable. Therefore, it is necessary to connect broken roads. Early research mainly used mathematical morphology to deal with broken roads. Li et al. (2005) first binarized the road image, used the morphological opening operation to remove noise points from the binarized image, and used erosion and morphological reconstruction to extract the main road. Finally, the road extraction result was improved with the morphological closing operation and thinning, and the road centerline was obtained. However, the dilation and thinning algorithms of mathematical morphology can only bridge breaks over a small distance; when the breakpoints are far apart, the road shape is deformed. To avoid the defects of mathematical morphology in processing linear elements with long-distance breakpoints, Arrighi et al. (2008) used mathematical morphology to process contour lines on a binary image. Their algorithm uses a propagation function to detect the two extremities and then uses skeletonization with anchor points to thin the contour lines. Finally, a combination of the Euclidean distances between extremities and the differences between their directions is used to join the disconnected lines. Samet & Hancer (2012) proposed a semi-automatic approach to reconstruct broken contour lines based on their local and geometric properties, which increases accuracy and performance. By improving the method proposed by Samet, Gao et al. (2015) improved the accuracy and integrity of the connection of disconnected lines. Pradhan et al. (2013) proposed a knowledge-based method for connecting broken lines, but it relies mainly on the information of various elements in the topographic map, so it cannot be extended to connecting roads extracted from remote sensing images, where this information is not available.
In this paper, a new approach for reconnecting broken roads is proposed, based on the spatial connectivity of the road network. Because roads extracted from remote sensing images are smooth, the disconnected road image can be used without excessive preprocessing. First, a thinning operation is used to extract the road skeleton, the eight-neighborhood tracking algorithm is used to detect road breakpoints, and mathematical morphology is applied to remove isolated short road lines or points. Second, the K-means clustering algorithm is used to cluster the scattered road breakpoints into point sets of different breakpoint categories, and invalid breakpoints are removed. Then, each breakpoint set is fitted with a polynomial curve, and the mathematical expression of each fitted curve is recorded. Finally, the coordinate sequence between the breakpoints of each category is calculated from the corresponding fitted polynomial, and the corresponding pixels are filled to the width of the road. This method realizes automatic detection and connection at the breaks in the road and effectively ensures the integrity of the road morphology.
To sum up, this paper repairs broken roads in order to solve the disconnection problem caused by the lack of local road characteristics. The method not only enhances the shape integrity of the extracted roads but also improves the precision of road extraction, and the resulting images can be directly applied in spatial decision support and analysis.

Road thinning
The purpose of road thinning is to transform a road into a simple structure in which only a one-pixel-wide road is kept while the geometric shape of the original road is preserved as far as possible. Road thinning greatly reduces the amount of pixel data and shortens the running time of the breakpoint connection algorithm. We refer to the K3M algorithm (Tabedzki et al., 2016) for road thinning. The specific process is as follows:
(1) mark the road boundary in the binarized road extraction image;
(2) delete each boundary point that is adjacent to 3 non-zero points (i.e. non-background points);
(3) delete each boundary point that has 3 or 4 non-zero adjacent points;
(4) delete each boundary point that has 3, 4 or 5 non-zero adjacent points;
(5) delete each boundary point that has 3, 4, 5 or 6 non-zero adjacent points;
(6) delete each boundary point that has 3, 4, 5, 6 or 7 non-zero adjacent points;
(7) cancel the marking of the remaining road boundary points. If any point was deleted in steps (2) to (6) during this iteration, return to step (1); otherwise stop.
These steps are repeated until a single-pixel-wide road skeleton is obtained.
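The iterative "mark the boundary, delete, repeat until nothing changes" idea can be illustrated with a short sketch. Note that this is not K3M: to keep the example small it swaps in the classic Zhang-Suen thinning rules, and the function name `zhang_suen_thin` is hypothetical. It assumes a binary image stored as a list of rows of 0/1 values with a one-pixel background border.

```python
def zhang_suen_thin(img):
    """Thin a 0/1 image to a one-pixel-wide skeleton (Zhang-Suen rules).

    Assumption: the outermost rows and columns are background, so they
    are never visited. The input is not modified.
    """
    img = [row[:] for row in img]
    height, width = len(img), len(img[0])

    def neighbours(y, x):
        # Eight neighbours, clockwise starting from the pixel above.
        return [img[y - 1][x], img[y - 1][x + 1], img[y][x + 1],
                img[y + 1][x + 1], img[y + 1][x], img[y + 1][x - 1],
                img[y][x - 1], img[y - 1][x - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            for y in range(1, height - 1):
                for x in range(1, width - 1):
                    if img[y][x] != 1:
                        continue
                    p = neighbours(y, x)
                    count = sum(p)
                    # Number of 0 -> 1 transitions around the pixel.
                    transitions = sum(
                        p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8)
                    )
                    if step == 0:
                        cond = (p[0] * p[2] * p[4] == 0
                                and p[2] * p[4] * p[6] == 0)
                    else:
                        cond = (p[0] * p[2] * p[6] == 0
                                and p[0] * p[4] * p[6] == 0)
                    if 2 <= count <= 6 and transitions == 1 and cond:
                        to_delete.append((y, x))
            # Delete in parallel, after the whole pass has been evaluated.
            for y, x in to_delete:
                img[y][x] = 0
            changed = changed or bool(to_delete)
    return img


# A 3-pixel-thick horizontal bar surrounded by background.
bar = [[0] * 9] + [[0] + [1] * 7 + [0] for _ in range(3)] + [[0] * 9]
for row in zhang_suen_thin(bar):
    print("".join(str(v) for v in row))
```

Like the procedure in the text, the loop stops only when a full iteration deletes nothing, which is what guarantees a stable skeleton.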
A characteristic of the road extracted through the U-Net network is that the proportion of broken road in the whole road length is small and does not affect the overall trend of the road. Owing to the characteristics of U-Net road extraction, using the U-Net extraction results as experimental data not only highlights the necessity of connecting the disconnections but also reduces the complexity of data preprocessing.
Broken roads and noise points are mainly caused by the road in the original image being covered by other objects, or by the phenomenon of "the same spectrum for different objects". Analyzing the types of breakpoints helps in constructing the connection algorithm. With reference to the contour classification types (Xin et al., 2006), this paper divides breakpoints into three categories according to the results of breakpoint matching:
(1) Non-matching breakpoint. These breakpoints cannot find matching points in the image, such as breakpoints caused by noise points, or a single road breakpoint at an intersection.
(2) Unique matching breakpoint. These breakpoints can find a unique matching point, such as road breakpoints caused by occlusion.
(3) Multi-value matching breakpoint. These breakpoints can find multiple matching points, such as breakpoints caused by intersections. The three types of breakpoints are shown in Figure 1.
In road thinning, the neighborhood weight is used to quickly determine the basic attributes of each pixel in the neighborhood. The eight points in the neighborhood of a pixel are encoded in binary, and different values correspond to different states of the neighborhood (Saeed et al., 2010). The bit-value matrix and the neighborhood weight are given in formula (1), where w(x, y) is the neighborhood weight of pixel (x, y) and img(x, y) is the binary value of pixel (x, y).
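As an illustration of the neighborhood weight, the sketch below encodes the eight neighbors of a pixel into a single value. It is not the authors' code: the bit-value matrix `BIT_VALUES` is the one commonly given for K3M and is an assumption here, the function name `neighborhood_weight` is hypothetical, and pixels outside the image are assumed to be background.

```python
# Assumed K3M-style bit values for the eight neighbours; the centre is 0.
BIT_VALUES = [
    [128, 1, 2],
    [64, 0, 4],
    [32, 16, 8],
]


def neighborhood_weight(img, x, y):
    """Return w(x, y): the sum of bit values over non-zero neighbours.

    img is a list of rows of 0/1 values, indexed as img[y][x].
    Pixels outside the image are treated as background (0).
    """
    height, width = len(img), len(img[0])
    weight = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width and img[ny][nx]:
                weight += BIT_VALUES[dy + 1][dx + 1]
    return weight


vertical_line = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
# Neighbours above (bit value 1) and below (bit value 16) are set.
print(neighborhood_weight(vertical_line, 1, 1))  # prints 17
```

Because each neighbor contributes a distinct power of two, every one of the 256 neighborhood configurations maps to a unique weight, which is what makes a lookup table on w(x, y) possible.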

Breakpoints detection
Breakpoint detection is the key step in broken road repair. In this paper, the eight-neighborhood detection algorithm (Hu et al., 2018) is selected to detect road breakpoints. The algorithm determines the type of the target pixel from the pixel values in its eight adjacent directions. In practice, the breakpoints are searched for in the road skeleton obtained by road thinning. The detection rule is as follows. Let the candidate road pixel be p1 and its eight neighboring pixels be p2 to p9, where p1 to p9 denote pixel values, 0 is the background value and 1 is the road pixel value. Then:
(1) when p2+p3+p4+p5+p6+p7+p8+p9 is equal to 1, p1 is a breakpoint.
The road skeleton image is traversed pixel by pixel. Whenever a road pixel satisfies rule (1), a road breakpoint has been found, and it is marked with the breakpoint marker symbol.
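Rule (1) translates almost directly into code. The following is a minimal sketch under stated assumptions: the skeleton is a list of rows of 0/1 values, pixels outside the image count as background, and the function name `find_breakpoints` is hypothetical.

```python
def find_breakpoints(skeleton):
    """Return the (x, y) positions of skeleton pixels with exactly one
    road pixel among their eight neighbours (rule (1) in the text)."""
    height, width = len(skeleton), len(skeleton[0])
    breakpoints = []
    for y in range(height):
        for x in range(width):
            if skeleton[y][x] != 1:
                continue  # only road pixels can be breakpoints
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dx == 0 and dy == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        total += skeleton[ny][nx]
            if total == 1:
                breakpoints.append((x, y))
    return breakpoints


# Two collinear skeleton segments separated by a one-pixel gap.
skeleton = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
print(find_breakpoints(skeleton))  # [(1, 1), (3, 1), (5, 1), (7, 1)]
```

Note that the rule flags every segment end, including the outer ends at (1, 1) and (7, 1); separating those from the ends that face a gap is left to the later clustering and elimination steps.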
The experimental data in this paper come from the results of road extraction by the U-Net network. The U-Net network can be trained well with a small amount of sample data, and the edges of the extracted target objects are smooth and clear (He et al., 2019). However, the deconvolution in the decoding layers and the cropping and splicing operations during prediction reduce the spatial resolution of the image, and some target features may not be extracted at all. The road extracted through the U-Net network has the following characteristics: (1) The road has smooth edges that are more in line with real road conditions; (2) The extraction result contains less noise. Compared with traditional supervised classification, convolutional neural networks have a larger receptive field and thus stronger road recognition capability.

Mathematical morphological filtering
Mathematical morphology filtering is a nonlinear method. In image processing it is mainly used for noise reduction and enhancement, and it is parallel and fast (Richard, 1995). Although the experimental data are smooth, there are still some outlier noise points in the global receptive field. Figure 2 shows the types of noise points. Morphological filtering removes most of the small spurs on the road line and therefore reduces the number of noise points. In this paper, the hit-or-miss transform of mathematical morphology is used (Xin et al., 2006). It is usually denoted A ⊛ B, and its standard expression is as follows:

$$A \circledast B = (A \ominus B_1) \cap (A^c \ominus B_2)$$

where A is the binary image, A^c is its complement, ⊖ denotes erosion, B1 ∩ B2 = ∅, B = B1 ∪ B2, B1 is the target object of interest (in this paper, the outlier noise point in the road) and B2 is the background feature.
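A hit-or-miss filter of this kind can be sketched as follows. The structuring elements are assumptions chosen for illustration: B1 is the single center pixel and B2 is its eight neighbors, so the transform picks out isolated road pixels, which are then removed. The paper's actual structuring elements for spurs are not given, and the names `hit_or_miss` and `remove_isolated_points` are hypothetical.

```python
def hit_or_miss(img, b1, b2):
    """Return a 0/1 mask of the positions where every offset in b1 lands
    on foreground and every offset in b2 lands on background.

    b1 and b2 are lists of (dx, dy) offsets and must not overlap.
    Pixels outside the image are treated as background.
    """
    height, width = len(img), len(img[0])

    def value(x, y):
        return img[y][x] if 0 <= y < height and 0 <= x < width else 0

    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            hit = all(value(x + dx, y + dy) == 1 for dx, dy in b1)
            miss = all(value(x + dx, y + dy) == 0 for dx, dy in b2)
            mask[y][x] = 1 if hit and miss else 0
    return mask


# Assumed structuring elements: centre pixel vs. its eight neighbours.
B1 = [(0, 0)]
B2 = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dx, dy) != (0, 0)]


def remove_isolated_points(img):
    """Clear every foreground pixel matched by the hit-or-miss mask."""
    mask = hit_or_miss(img, B1, B2)
    return [[0 if m else p for p, m in zip(row, mask_row)]
            for row, mask_row in zip(img, mask)]


noisy = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
]
print(remove_isolated_points(noisy))
```

Other noise shapes, such as short spurs, would need their own B1/B2 pairs; the same `hit_or_miss` function can be reused with different offset lists.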

Clustering of breakpoints based on K-means
K-means Algorithm: The K-means algorithm is an unsupervised clustering algorithm that groups target objects into categories according to their degree of similarity (Kuntal et al., 2019). It is solved iteratively. In this paper, K road breakpoints are first randomly selected as the cluster centers, and the distances from the remaining road breakpoints to the selected cluster centers are calculated. Following the principle that the shortest distance means the greatest similarity, each remaining breakpoint is assigned to its nearest cluster center. The iteration continues until no road breakpoint changes its cluster, or until the preset number of iterations is reached, at which point clustering finishes (Krishna & Murty, 1999). Expressed as a formula, K-means seeks the partition that minimizes the sum of squared distances between each breakpoint and the center of its cluster.
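A compact sketch of K-means on breakpoint coordinates is given below. It follows the usual Lloyd iteration (assign each point to the nearest center, then move each center to the mean of its members), which is an assumption about details the text does not spell out. The function name `kmeans`, the `seed` parameter and the sample coordinates are all hypothetical.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster 2-D points into k groups.

    Returns (labels, centers). Initial centers are k distinct points
    drawn at random, as described in the text; the seed only makes the
    example repeatable.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [None] * len(points)

    def sq_dist(p, c):
        return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2

    for _ in range(max_iter):
        # Assignment step: nearest center wins.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # no breakpoint changed cluster
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels, centers


# Two gaps, each bounded by a pair of nearby breakpoints.
breakpoints = [(10, 10), (14, 11), (80, 42), (85, 40)]
labels, centers = kmeans(breakpoints, k=2)
print(labels)
```

Because the result depends on the random initial centers, distant gaps such as these separate reliably, while closely spaced gaps may not, which is one reason an invalid-breakpoint check is needed afterwards.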

Road breakpoints clustering rules:
To determine a rule for the number of clusters, this paper examines the relationship between the classification ratio and the accuracy of road breakpoint clustering. Four pictures m1, m2, m3 and m4 are selected for cluster analysis. For each picture, the number of clusters k is set to each value in [1, N], and the numbers of correct classifications and misclassifications are counted. Table 1 shows the statistics for one picture. In Table 1, bn is the number of breakpoints, cn is the number of classes, rcn is the real number of classes, ccn is the number of correct classifications, ecn is the number of erroneous classifications, and cr is the classification ratio, which we define as:

cr = cn / bn
To show the relationship between cr and accuracy more clearly, we plot these two variables from the four statistical tables as line charts, as shown in Figure 3. When cr is in the interval [0.4, 0.7], the accuracy is higher. Based on the experiment, cr = 0.5 is selected as the computational coefficient, i.e. two breakpoints per category on average. Therefore, for a given number of breakpoints N, the number of categories is Int(0.5*N).
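The rule for the number of categories is a one-line computation. In the sketch below the function name `cluster_count` is hypothetical, `Int` is read as truncation toward zero, and the lower bound of one category is an added assumption so that a single breakpoint does not yield zero clusters.

```python
def cluster_count(num_breakpoints, cr=0.5):
    """Number of K-means categories for a given number of breakpoints.

    Implements Int(cr * N); the max(1, ...) guard is an assumption that
    is not stated in the text.
    """
    return max(1, int(cr * num_breakpoints))


for n in (1, 7, 8, 16):
    print(n, cluster_count(n))
```

With an odd number of breakpoints one category necessarily holds a third point, which is the situation the invalid-breakpoint elimination is meant to resolve.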

Elimination of Invalid Breakpoints:
Some invalid breakpoints remain after clustering. They result from misclassification and need to be eliminated. In this paper, there are two main types of invalid breakpoints:
(1) Breakpoints on the same road line. These breakpoints are not filtered out during noise point filtering. After clustering, the two endpoints are assigned to the same class, as shown in Figure 4(a); both breakpoints are invalid.
(2) Breakpoints in the same tangential direction. When two or more parallel roads are cut in the same horizontal direction and the distance in the vertical direction is greater than that in the horizontal direction, three breakpoints are assigned to the same category after clustering. Alternatively, there is an intersection not far from the road breakpoint, with a breakpoint at the intersection, and again three breakpoints are assigned to the same category after clustering. As shown in Figure 4(b), the breakpoint marked with a red circle is an invalid breakpoint.

Figure 4. Schematic diagram of invalid breakpoints
The first type of invalid breakpoint is easy to eliminate by checking whether both breakpoints belong to the same connected road. For the second type, the tangential direction of the breakpoint is used. From each breakpoint, the skeleton is backtracked by 10 pixels; if two directions appear during backtracking, backtracking is terminated, and the point where it ends is used as the tangent endpoint. A line drawn from the breakpoint to the tangent endpoint gives the tangent direction. Taking the horizontal line to the right as the axis direction, the angle between the tangent and the axis direction is defined as ∠α, with clockwise as the positive direction, so the angle is limited to [0, 2π]. When the angle difference between two breakpoints is less than π/2, one of them is considered an invalid point. The specific invalid breakpoint is determined by calculating the differences between its tangent angle and those of the other breakpoints in the same category, and it is then eliminated.
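The tangent-angle test can be sketched with `math.atan2`. The sketch assumes image coordinates with y growing downward, so that an angle measured from the rightward horizontal axis increases clockwise; it also assumes that the angle difference is taken as the smaller of the two arcs between the directions, which the text does not specify. The function names are hypothetical.

```python
import math


def tangent_angle(breakpoint, tangent_end):
    """Angle of the line from the breakpoint to its tangent endpoint,
    wrapped to the range 0 to 2*pi.

    Points are (x, y) in image coordinates (y grows downward), so the
    angle increases in the clockwise direction on screen.
    """
    dx = tangent_end[0] - breakpoint[0]
    dy = tangent_end[1] - breakpoint[1]
    return math.atan2(dy, dx) % (2 * math.pi)


def angle_difference(a, b):
    """Smaller arc between two directions, in the range 0 to pi
    (assumed convention)."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def same_direction(a, b, threshold=math.pi / 2):
    """True when two tangent angles differ by less than the threshold,
    i.e. the pair contains an invalid breakpoint under the rule above."""
    return angle_difference(a, b) < threshold


# Two breakpoints facing each other across a gap: tangents point in
# opposite directions, so the pair is kept.
left = tangent_angle((10, 5), (0, 5))    # road runs back to the left
right = tangent_angle((14, 5), (24, 5))  # road runs back to the right
print(same_direction(left, right))  # False
```

Two ends that should be joined have tangents pointing away from each other (difference near π), whereas ends of parallel roads cut by the same obstacle point the same way, which is why a small difference signals an invalid member.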

Broken road fitting
Solving for the coefficients of a polynomial curve requires multiple coordinate pairs. In this paper, a cubic polynomial is used to fit the broken road, so at least five coordinate pairs are required, combined with the pair of breakpoints. To ensure that the fitted curve is close to the real road, five points are backtracked from each breakpoint. Backtracking is terminated when two possible backtracking directions are met or when the given number of backtracking points is reached.
The cubic polynomial is used to fit the road breakpoints because it takes the curvature of the road into account. Its mathematical expression (Soumen et al., 2016) is as follows:

$$y = a_0 + a_1 x + a_2 x^2 + a_3 x^3$$

where (x, y) are pixel coordinates and the unknown coefficients a_0 to a_3 are obtained by the least squares method from the coordinate pairs of the backtracking points.
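The least squares fit and the generation of gap pixels can be sketched in plain Python. The sketch assumes the cubic is written as y = a0 + a1*x + a2*x^2 + a3*x^3 with y as a function of x, so a near-vertical road would need the axes swapped first; it solves the 4x4 normal equations directly, which is one of several possible ways to do least squares. All function names and sample coordinates are hypothetical, and filling to the road width is not shown.

```python
def fit_cubic(points):
    """Least squares fit of y = a0 + a1*x + a2*x^2 + a3*x^3.

    points is a list of (x, y) pairs with at least four distinct x
    values. Solves the normal equations by Gaussian elimination with
    partial pivoting and returns [a0, a1, a2, a3].
    """
    n = 4
    # Augmented normal-equation matrix [A^T A | A^T y].
    m = [
        [sum(x ** (i + j) for x, _ in points) for j in range(n)]
        + [sum(y * x ** i for x, y in points)]
        for i in range(n)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (m[i][n] - known) / m[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Value of the fitted polynomial at x."""
    return sum(a * x ** i for i, a in enumerate(coeffs))


def gap_pixels(coeffs, x_start, x_end):
    """Integer (x, y) pixels on the fitted curve from x_start to x_end."""
    step = 1 if x_end >= x_start else -1
    return [(x, round(evaluate(coeffs, x)))
            for x in range(x_start, x_end + step, step)]


# Five backtracked points on each side of a gap between x = 4 and x = 8.
samples = [(0, 20), (1, 21), (2, 21), (3, 22), (4, 22),
           (8, 25), (9, 26), (10, 26), (11, 27), (12, 28)]
coeffs = fit_cubic(samples)
print(gap_pixels(coeffs, 4, 8))
```

Using five points from each side gives ten equations for four unknowns, so the fit averages out pixel-level jitter in the skeleton rather than passing exactly through every point.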

EXPERIMENT AND ANALYSIS
The experimental data are four road images extracted by the U-Net network from the Massachusetts Roads dataset (Mnih, 2013). The broken road problem is an important factor affecting road shape integrity and road extraction accuracy. To solve it, this paper combines breakpoint detection, breakpoint clustering and curve fitting on the basis of spatial connectivity. The broken road repair model and the final repair results are shown in Figure 6, where the numbers in the repair model legend are the breakpoint matching labels obtained after the road breakpoints are clustered by K-means, and the white points are road breakpoints and backtracking points.

The qualitative description indicators recorded during the experiment are shown in Table 2. By visual inspection, image (a) in Figure 6 has seven broken roads repaired, with a good fitting effect. One broken road, located next to the 11th broken road, is not repaired: it lies at a road intersection, so the failure is caused by multi-value matching points. The overall repair accuracy is nevertheless high, reaching 88%. Image (b) contains six broken roads, of which four are repaired, giving a repair accuracy of 67%. The main reason is an erroneous cluster (class 7) in the upper left corner of the image: the broken road on the right side of class 7 should have been repaired. In addition, one broken road line remains unmatched, which is a defect of the algorithm. Image (c) is repaired best: all four of its broken roads are repaired, the fitted curves faithfully reflect the bending of the road, and the repair accuracy reaches 100%. Image (d) has four broken roads, of which three are repaired well. One broken road, slightly to the left of the class 5 broken road line, is not repaired because of the multi-value matching point problem, so the overall repair accuracy is 75%. Overall, the broken road repair method based on spatial connectivity, which combines breakpoint detection, breakpoint clustering and curve fitting, clearly improves road integrity.
The quantitative results show that the broken road repair method is practicable: it preserves shape integrity and improves road extraction accuracy, with repair accuracy above 70% for three of the four images. It can meet the needs of broken road repair in practical work. The method can repair broken road lines with unique matching points, but it has defects in handling non-matching points and multi-value matching points.

CONCLUSION
Starting from a defect in local feature expression that is common in road extraction from remote sensing images, namely disconnected roads, this paper integrates breakpoint detection, breakpoint clustering and curve fitting on the basis of spatial connectivity. The proposed method is practicable, and the experimental results can be directly applied to spatial analysis. The method is suitable for post-classification processing of linear features and has reference value for the classification and post-processing of roads, power grids and tracks. It also has defects: non-matching points and multi-value matching points are not repaired correctly, and algorithm improvements for these cases will be studied in future work.