HIGH PRECISION AUTOMATIC EXTRACTION OF CULTURAL RELIC DISEASES BASED ON IMPROVED SLIC AND AP CLUSTERING

Automatic, high-precision detection and quantitative description of cultural relic diseases are an important part of the scientific and technological protection of cultural relics. To address the automatic extraction of disease boundaries on cultural relics, this paper proposes an adaptive SLIC0 method combined with AP clustering to detect disease areas with high precision. First, the framed disease region of an orthophoto is segmented with the SLIC0 algorithm, with Canny edge detection taken as the true value. The number of superpixels is increased iteratively until the accuracy meets the requirement, so that the superpixel edges fit the disease edges as closely as possible. Then the AP clustering method is used to merge the superpixels of the disease area and obtain the edge information. Finally, taking the surface shedding disease of painted cultural relics as an example, the method is applied to extract the disease edge with high precision and express it quantitatively. Comparison with the existing manual method demonstrates the correctness, feasibility and advancement of the algorithm. The method provides an efficient and high-precision means for the quantitative expression of cultural relic diseases and can provide accurate data support for their scientific and technological restoration.


INTRODUCTION
Like other material cultural heritage, grottoes have their own life cycle. During this life cycle they become 'sick', showing peeling, cracking, flaking, collapse, damage and so on. Some of the statues have also received 'treatment' in the form of protection and restoration. Protection and restoration can extend the life cycle, but if the health data are not accurate enough and the protection and restoration measures are therefore inappropriate, restoration can also accelerate the degradation of the statue. It is therefore urgent to investigate the diseases of cultural relics, to grasp accurately the type, quantity, distribution, degree and formation mechanism of the diseases, and to protect the relics scientifically (Andrey V et al., 2014; Anna M et al., 2012). At present, most cultural heritage protection units have carried out digital preservation of cultural relics and use digital products such as orthophotos to extract and record the disease status manually (Fang Mingzhu, 2009). However, this manual approach has low efficiency and low accuracy. Automatic extraction of the disease area from orthophotos is the main way to solve these problems. Currently, automatic region extraction relies mainly on edge extraction and image segmentation. Edge extraction algorithms mainly include differential edge detection and the Roberts, Sobel, Prewitt, Kirsch and Canny operators. However, an edge extraction operator extracts all possible edges; it cannot extract the closed edge of a region with specific properties, such as a disease, so its robustness for cultural relic diseases is weak. The traditional image segmentation methods mainly include the multi-threshold segmentation method (Otsu N., 1975), the clustering-based segmentation method (Davis L S., 1975), and the region-based segmentation method (Meyer F., 1990).
* Corresponding author: 862899353@qq.com
These segmentation methods are all pixel-based: they focus on gray-level changes in the image and do not consider the spatial relationship between pixels. They easily cause over-segmentation, under-segmentation and poor segmentation results. They can only locate the target roughly and cannot segment the exact edge of the target region. It is therefore necessary to develop segmentation methods with better edge accuracy and robustness. Around 2000, the super-pixel segmentation method appeared; Ren and Malik proposed the concept of the super-pixel (Ren Xiaofeng et al., 2003). Compared with the original image pixel, the super-pixel has the characteristics of homogeneity and irregular geometric deformation, and merging similar regions is more conducive to extracting the disease contour. Representative super-pixel segmentation methods include the Normalized cut algorithm (SHI Jian-bo et al., 1997). The main idea of this algorithm is first to construct an objective function and then to segment the image according to a segmentation criterion. The super-pixels produced by the Normalized cut algorithm are compact, but their edge fit is poor. Bergh et al. proposed the SEEDS algorithm in 2012 (Bergh M V D et al., 2012), but the edge fit of this method is not ideal either. In 2012, Achanta et al. proposed SLIC, a simple linear iterative super-pixel segmentation method. This method allows the number of super-pixels and the compactness to be set by the user, so that edge fit and segmentation compactness can be balanced. It is fast and robust. SLIC0 is an optimized version of SLIC that adaptively selects the compactness parameter for each superpixel (Chu Jinghui et al., 2017; Diniz P et al., 2018). By adapting the compactness parameter, SLIC0 generates regularly shaped superpixels in both textured and non-textured regions.
Segmentation algorithms are generally evaluated with performance indicators such as the edge recall rate, the under-segmentation error rate, the achievable segmentation accuracy, the computational complexity of the algorithm, the controllability of the number of image blocks, and the controllability of the compactness of image blocks (Perbet F et al., 2011; Schick A et al., 2012). The edge (boundary) recall rate is the coincidence rate between the target edge after super-pixel segmentation and the target edge after manual segmentation. The under-segmentation error rate compares the algorithm's segmentation with the manual segmentation and measures the extent to which the superpixel boundaries fail to coincide with the manual boundaries, expressed as a ratio relative to the area bounded by the manual standard segmentation (Buyssens P et al., 2014). Segmentation accuracy is the ratio of the number of pixels correctly labelled across all superpixels to the total number of pixels in the image; it also reflects the reliability of the algorithm. Controllability of the algorithm means that the controllability of the super-pixel number and of the super-pixel compactness is included in the evaluation system (Cheng), so that both can be tuned repeatedly in a practical application until they meet the expected conditions. Based on the above analysis, for the disease areas in orthophotos of cultural relics, this paper adopts the SLIC0 superpixel segmentation method, adaptively and iteratively adjusts the superpixel number parameter, and applies AP clustering to the segmentation results to obtain the accurate edge of the disease area automatically. The edge information extracted by the Canny operator is used as a reference to verify the accuracy of the algorithm.
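The segmentation accuracy described above can be made concrete with a small sketch. It assumes the usual "achievable segmentation accuracy" reading, in which each superpixel is credited with the pixels of the manual-segmentation region it overlaps most; the function name and the flat label lists are illustrative choices and are not taken from the paper.

```python
from collections import Counter, defaultdict

def segmentation_accuracy(superpixels, truth):
    """Share of pixels that are correctly labelled when every superpixel
    takes the manual-segmentation label it overlaps most.

    superpixels, truth: equal-length sequences holding one label per pixel.
    """
    overlap = defaultdict(Counter)
    for sp_label, gt_label in zip(superpixels, truth):
        overlap[sp_label][gt_label] += 1
    # For each superpixel, count the pixels of its dominant manual label.
    correct = sum(max(counts.values()) for counts in overlap.values())
    return correct / len(truth)

# Six pixels, two superpixels; superpixel 0 straddles the manual boundary.
print(segmentation_accuracy([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1, 1]))
```

A superpixel that leaks across the manual boundary lowers the score, which is why this indicator is sensitive to the edge fit discussed in this paper.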

2.1 SLIC0 segmentation algorithm and its advantages
SLIC is a simple linear iterative clustering segmentation algorithm. Its clustering feature vector is a 5-dimensional vector consisting of the X and Y coordinates of each original pixel and its L, a and b values in the CIELAB color space. The L, a and b values are obtained by converting the RGB image to be segmented. The SLIC algorithm has two parameters, both set as fixed values: one is the number of superpixels and the other is the compactness of the superpixels. SLIC0 is an improvement of the SLIC segmentation algorithm that adaptively selects the compactness parameter for each superpixel. Regardless of the image texture, it generates regularly shaped superpixels without affecting computational efficiency. The specific implementation steps of the SLIC0 algorithm are as follows.
(1) Initialize the clustering centers. Given the number of superpixels K, K superpixel clustering centers are evenly distributed over the image with N pixels. The size of each superpixel is N / K, and the distance between adjacent clustering centers is $S = \sqrt{N/K}$.
(2) Reselect each cluster center within its n × n neighborhood, generally with n = 3. The gray gradient of every pixel in the neighborhood is calculated, and the pixel with the smallest gradient value is taken as the adjusted clustering center.
(3) With each adjusted clustering center as the search center and twice the center spacing as the neighborhood search range, a center label is constructed for each original pixel of the image to determine the clustering centers to which the pixel may belong.
(4) Distance measurement. As mentioned above, the five-dimensional vector includes the X, Y values and the L, a, b values, which are divided into a spatial distance and a color distance according to their attributes. The calculation is shown in formula (1):
$$d_c = \sqrt{(L_j - L_i)^2 + (a_j - a_i)^2 + (b_j - b_i)^2}, \quad d_s = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2}, \quad D = \sqrt{\left(\frac{d_c}{N_c}\right)^2 + \left(\frac{d_s}{N_s}\right)^2} \quad (1)$$
In formula (1), $d_c$ is the color distance, $d_s$ is the spatial distance, $N_s$ is the maximum spatial distance, and $N_c$ is the maximum color distance. Because the search ranges overlap, each original pixel is likely to be compared with several clustering centers, so according to the 5-dimensional distance formula each pixel obtains n distance values. The center with the smallest distance value is selected as the clustering center of the pixel, and the pixel is labelled with the class of that center.
(5) Iterative optimization. The maximum spatial distance and the maximum color distance observed in the previous iteration are used as $N_s$ and $N_c$ in the current iteration. The iteration continues until the clustering center of each pixel no longer changes and the error converges.
(6) Enhance the connectivity of the superpixels after segmentation. Some superpixels may be too small, or a single region may be cut into many discontinuous superpixels; such fragments are merged with adjacent superpixels.
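The distance and assignment steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: pixels and centers are plain `(L, a, b, x, y)` tuples, the search window is taken as 2S × 2S around each center, and the helper names (`grid_spacing`, `slic_distance`, `assign`, `update_norms`) are hypothetical.

```python
import math

def grid_spacing(num_pixels, k):
    """Initial spacing between cluster centers, S = sqrt(N / K)."""
    return math.sqrt(num_pixels / k)

def slic_distance(pixel, center, n_c, n_s):
    """Normalized distance between two (L, a, b, x, y) feature vectors."""
    d_c = math.dist(pixel[:3], center[:3])   # color distance in CIELAB
    d_s = math.dist(pixel[3:], center[3:])   # spatial distance in the image plane
    return math.sqrt((d_c / n_c) ** 2 + (d_s / n_s) ** 2)

def assign(pixels, centers, norms, spacing):
    """Label each pixel with the nearest center whose 2S x 2S window contains it."""
    labels = []
    for p in pixels:
        best, best_d = None, math.inf
        for idx, c in enumerate(centers):
            if abs(p[3] - c[3]) > spacing or abs(p[4] - c[4]) > spacing:
                continue  # pixel lies outside this center's search window
            d = slic_distance(p, c, *norms[idx])
            if d < best_d:
                best, best_d = idx, d
        labels.append(best)
    return labels

def update_norms(pixels, centers, labels, norms):
    """SLIC0-style step: reuse each cluster's largest observed distances as (N_c, N_s)."""
    updated = []
    for idx, c in enumerate(centers):
        members = [p for p, lab in zip(pixels, labels) if lab == idx]
        max_c = max((math.dist(p[:3], c[:3]) for p in members), default=0.0)
        max_s = max((math.dist(p[3:], c[3:]) for p in members), default=0.0)
        # Keep the previous value when a maximum is zero, to avoid dividing by zero.
        updated.append((max_c or norms[idx][0], max_s or norms[idx][1]))
    return updated
```

Calling `assign` and then `update_norms` in a loop, with the centers recomputed as cluster means between the two calls, gives the iterative optimization of step (5); the gradient-based center adjustment of step (2) and the connectivity repair of step (6) are left out of this sketch.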

Table 1. Attribute | Segmentation methods based on graph theory | Segmentation methods based on gradient ascent
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France
... been introduced above. The GS algorithm finds an appropriate reference distribution and compares the dispersion of the data set with the dispersion under this reference distribution to obtain a Gap, namely the statistic. The optimal number of clusters is estimated by analysing this statistic. The superpixel lattices algorithm (SL) continuously searches for the optimal path to split the image into smaller regions in both the vertical and horizontal directions, and finally obtains the superpixel segmentation. The basic idea of Water-Shed (WS) is to construct watersheds from the local minimum points of the image along the ascending direction of the gradient, which separate the different catchment basins. The watershed lines are the segmentation lines of the image. Mean Shift (MS) generates superpixels by finding the local maxima of the density function and clustering all pixels that converge to the same mode. The basic idea of the Turbopixel algorithm (TP) is to initialize seed points and expand them gradually, so that the superpixels of the image form a grid under the constraint of a geometric flow. This method keeps the distribution of superpixels as regular as possible. Table 1 compares the graph-theory-based and gradient-ascent-based super-pixel segmentation methods in terms of control over the number of super-pixels, compactness, complexity, and number of parameters. SLIC0 achieves the best compactness. SLIC and SLIC0 achieve a better balance between edge fit and compactness, and they have the lowest complexity, the fewest parameters and the highest segmentation efficiency.

Adaptive SLIC0 segmentation algorithm
The adjustable parameter of the SLIC0 algorithm is the number of cluster centers K. K determines how many clusters are generated, and each superpixel represents one category. The influence of the parameter K on the SLIC algorithm is analysed below. The SLIC algorithm is applied to two selected cultural relic disease areas with the compactness parameter unchanged, and the number of clustering centers K is set to 50, 100 and 200, respectively. The classification results are as follows:

Fig 1. SLIC segmentation results of different clustering centers
Fig 1(a) shows the powdery peeling disease. The classification results show that the boundaries generated by the superpixels differ considerably from the disease boundary and do not fit it well. Fig 1(b) shows flake spalling. Here the gradient at the disease edge changes sharply, so the superpixel segmentation results are good, and the edge fitting rate increases with the number of superpixels. These experiments show that the number of superpixel clustering centers K has a significant impact on the segmentation results: K is positively correlated with the edge fitting rate of the disease, and the larger the K value, the better the edge fit. However, an excessive number of superpixels leads to serious over-segmentation, as shown in Fig 1(b)(3). The parameter of the SLIC0 algorithm therefore needs to be optimized dynamically to obtain the best segmentation accuracy. Based on this analysis, this paper proposes an adaptive SLIC0 segmentation algorithm with the following process:
(1) Extract Canny edge features from the orthophoto of the cultural relic disease.
(2) Segment the orthophoto of the cultural relic disease with SLIC0.
(3) Taking the Canny edge features as the true value, calculate the edge recall rate of the SLIC0 segmentation result. If the edge recall rate is at least 0.9, output the current number of clustering centers. If the edge recall rate is below 0.9, increase the number of clustering centers by K0 superpixels and repeat the segmentation of step (2).
(4) End the segmentation once the edge recall rate meets the requirement.
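The adaptive loop in steps (1) to (4) can be sketched as follows. This is a hedged illustration, not the authors' code: `segment` is a hypothetical callable that returns the superpixel boundary pixels for a given K (in practice it might wrap an existing SLIC0 implementation, for instance scikit-image's `slic` with `slic_zero=True`), the reference edges would come from a Canny detector, and the pixel tolerance and the `k_max` safety limit are assumptions added for the sketch.

```python
def edge_recall(reference_edges, boundary_pixels, tolerance=1):
    """Fraction of reference edge pixels that have a superpixel boundary
    pixel within `tolerance` pixels (measured along rows and columns)."""
    boundary = set(boundary_pixels)
    if not reference_edges:
        return 1.0
    offsets = range(-tolerance, tolerance + 1)
    hits = sum(
        1 for (r, c) in reference_edges
        if any((r + dr, c + dc) in boundary for dr in offsets for dc in offsets)
    )
    return hits / len(reference_edges)

def adaptive_k(segment, reference_edges, k_init=50, k_step=10,
               target=0.9, tolerance=1, k_max=2000):
    """Increase K by k_step until the edge recall reaches `target`.

    segment(k) -> iterable of (row, col) boundary pixels for K = k superpixels.
    k_max is a safety limit assumed here so the loop always terminates.
    """
    k = k_init
    while True:
        recall = edge_recall(reference_edges, segment(k), tolerance)
        if recall >= target or k + k_step > k_max:
            return k, recall
        k += k_step
```

The function returns both the final K and the recall that was reached, so a caller can tell whether the loop stopped because the target was met or because the safety limit was hit.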
Canny is a classical edge extraction algorithm that is often used as the standard and true value for edge detection. This paper directly calls the Canny algorithm in the OpenCV open-source library for edge detection. The mathematical model of the edge recall rate is given by formula (2), in which p is the number of edge pixels in the manual standard segmentation result, q is the number of edge pixels in the superpixel segmentation result, and the two sets that appear in the formula are the set of all edge pixels in the manual standard segmentation result and the set of all edge pixels in the superpixel segmentation result. The specific technical route is as follows:

AP Clustering Based on SLIC0 Segmentation
The adaptive improvement of SLIC0 yields the best edge accuracy for the segmentation. To obtain the final disease edge, the superpixels must also be clustered. AP clustering is a message-passing clustering algorithm proposed by Frey and Dueck and is a relatively new unsupervised clustering algorithm. Its basic idea is to treat every sample data point as a node in a network and to cluster the elements of the set by passing messages between the nodes (Xia Dingyin, 2010). Clustering based on message passing takes the similarity matrix between nodes as input. During clustering, two kinds of messages are passed between nodes, namely attraction and attribution. While the algorithm runs, attraction and attribution update the cluster centers through continuous iteration, and the other data points are assigned to their corresponding categories. The two messages are computed by formulas (3) to (5), where r(i, j) is the attraction, a(i, j) is the attribution, and i and j are two nodes. The attraction and attribution are then updated by formulas (6) to (8), where $r_{t-1}(i, j)$ and $a_{t-1}(i, j)$ are the attraction and attribution after iteration t-1, and λ denotes the damping coefficient; the greater its value, the faster the convergence rate.
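As an illustration of the message passing described above, the sketch below performs one damped iteration using the standard update rules from Frey and Dueck's formulation. It is assumed, not confirmed, that these match the paper's numbered formulas. In this sketch the damping coefficient weights the previous message; the opposite weighting also appears in practice, and the effect of λ on convergence speed depends on which convention is used. The function name `ap_step` and the list-of-lists matrices are illustrative choices.

```python
def ap_step(S, R, A, damping=0.5):
    """One damped affinity-propagation iteration on n >= 2 points.

    S: similarity matrix, with the preference p on the diagonal.
    R: attraction (responsibility) matrix, A: attribution (availability) matrix.
    Returns the updated (R, A) as new lists; the inputs are left unchanged.
    """
    n = len(S)
    new_R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            # Strongest competing candidate for point i, excluding k itself.
            rival = max(A[i][j] + S[i][j] for j in range(n) if j != k)
            new_R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)

    new_A = [[0.0] * n for _ in range(n)]
    for k in range(n):
        positive = [max(0.0, new_R[i][k]) for i in range(n)]
        others = sum(positive) - positive[k]   # positive attraction from all i' != k
        for i in range(n):
            if i == k:
                target = others
            else:
                target = min(0.0, new_R[k][k] + others - positive[i])
            new_A[i][k] = damping * A[i][k] + (1 - damping) * target
    return new_R, new_A
```

Repeated calls, starting from all-zero R and A, give the iteration; after each call the cluster center of point i can be read off as the index k that maximizes a(i, k) + r(i, k).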
Let R(i), i = 1, 2, 3, ..., P, be the superpixel blocks obtained by SLIC superpixel segmentation, and let $C(i) = \{\bar{L}_i, \bar{a}_i, \bar{b}_i\}$ be the mean value of the color components of R(i) in the color space. The element s(i, k) of the similarity matrix S is then computed from C(i) and C(k). The number of clusters generated by the AP clustering algorithm depends on the value of p; the p value is reduced step by step, which reduces the number of clusters, until the optimal clustering result is obtained. The image is clustered by the AP clustering algorithm to obtain K clustering blocks $B_k$ (k = 1, 2, 3, ...). Let a(i) be the average distance in color space between superpixel R(i) and the other superpixels in its own block, and let $d(i, B_j)$ be the average distance in color space from superpixel R(i) to the superpixels of another image segmentation block $B_j$. The contour coefficient of superpixel R(i) is:
$$Sil(i) = \frac{b(i) - a(i)}{\max\{a(i), b(i)\}}, \quad b(i) = \min_{j:\, R(i) \notin B_j} d(i, B_j)$$
The average contour coefficient represents the clustering quality of the AP algorithm: the larger the value, the higher the clustering quality and the better the image segmentation. According to the optimal clustering number theory, when n data points are clustered, the optimal number of clusters should be less than or equal to $\sqrt{n}$ (ALEXANDRE E B et al., 2015). The specific steps of the clustering algorithm are as follows:
(1) The optimized SLIC superpixel segmentation algorithm is used to segment the image, and K superpixel blocks are obtained.
(2) Calculate the mean value of the color components of all points in superpixel R(i) in the Lab color space, and use it as the color feature vector of the superpixel.
(3) Calculate the Euclidean distance between every pair of superpixel blocks according to formula (9) to obtain the similarity matrix S.
(4) Take p as the initialization parameter, set the damping coefficient λ = 0.5, and initialize both the attraction and the attribution to 0.
(5) Calculate the attribution and attraction according to formulas (3) to (5).
(6) Update the attribution and attraction according to formulas (6) to (8).
(7) Determine the clustering center by formula (12):
$$k = \arg\max_{k} \{a(i, k) + r(i, k)\} \quad (12)$$
When i = k in formula (12), i is itself a clustering center. If i ≠ k, k is the clustering center of i. If the number of iterations exceeds 1000, the iteration is terminated and the current clustering centers are taken.
(8) A cluster number of K = 2 means that the image is segmented into foreground and background, which is the best segmentation result. If the clustering has not converged to 2 clusters, calculate the contour coefficient according to formula (11) and continue the iteration with a decreased p value, $p = p + 0.1 \min(S)$. The clustering algorithm stops when the average contour coefficient has decreased more than three consecutive times or the number of clusters equals 2.
(9) The number of clusters corresponding to the maximum average contour coefficient is taken as the clustering result.
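The bookkeeping around the clustering steps can be sketched as below: building the similarity matrix from mean Lab colors, reading cluster centers from the two message matrices, and scoring a clustering with the average contour (silhouette) coefficient. This is an illustration under stated assumptions: the similarity is taken as the negative Euclidean distance, a singleton cluster is given a coefficient of 0, and all function names are hypothetical.

```python
import math

def similarity_matrix(colors, preference):
    """s(i, k) = -||c_i - c_k|| for i != k (assumed form); preference p on the diagonal."""
    n = len(colors)
    return [[preference if i == k else -math.dist(colors[i], colors[k])
             for k in range(n)] for i in range(n)]

def assign_centers(R, A):
    """For each point i, the index k that maximizes a(i, k) + r(i, k)."""
    n = len(R)
    return [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]

def max_clusters(n):
    """Upper bound on the number of clusters, floor(sqrt(n))."""
    return math.isqrt(n)

def mean_contour_coefficient(colors, labels):
    """Average silhouette value of a clustering of superpixel mean colors."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        return 0.0
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # singleton cluster, scored 0 by convention
            continue
        a = sum(math.dist(colors[i], colors[j]) for j in own) / len(own)
        b = min(
            sum(math.dist(colors[i], colors[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

With these helpers, the search over p amounts to rebuilding the similarity matrix with a lower preference, rerunning the message passing, and keeping the clustering with the highest average coefficient among those that do not exceed `max_clusters(n)` clusters.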

Comparative analysis of different segmentation algorithms
Five methods drawn from the two categories, GS, NC, TP, SLIC and SLIC0, were selected to segment an actual disease image. The results are shown in Fig 3 (panels: GS, NC, TP, SLIC, SLIC0).

Fig 3. Comparison of Different Segmentation Algorithms
The segmentation result map shows that, for cultural relic disease images, the GS segmentation method does not readily cluster to the desired result, and its edge data are difficult to extract. The NC algorithm is suitable for gray images, but it places requirements on image complexity: it suits diseases with clear boundaries and is not suitable for diseases with blurred boundaries. The Turbopixel algorithm produces compact, uniformly distributed super-pixel blocks, but its edge fit is not ideal. The SLIC and SLIC0 super-pixel segmentation results have high compactness, regular super-pixel blocks and controllable parameters, which makes later processing convenient.
The under-segmentation rate, edge recall rate and segmentation accuracy of the above classification results are calculated, and the results are compared and analysed as follows.

Fig 4. Comparison Graph of Segmentation Rate, Edge Recall Rate and Segmentation Accuracy
In terms of under-segmentation rate, edge recall rate and segmentation accuracy, the SLIC, SLIC0 and SEEDS algorithms stand out, but Table 2-1 shows that the SEEDS algorithm does not offer control over compactness. The SLIC0 algorithm is developed from the SLIC algorithm. With adaptive compactness, the improved SLIC0 algorithm differs little from the SLIC algorithm in edge recall rate, while the SEEDS algorithm is slightly worse than SLIC and SLIC0 in accuracy. In view of this analysis, the SLIC0 algorithm performs well in image segmentation quality, processing speed and superpixel compactness. This paper therefore uses the SLIC0 superpixel segmentation method for the subsequent research.

Feasibility analysis of this algorithm
In this paper, the adaptive SLIC0 combined with AP clustering is used to extract the edge information of the disease automatically, taking the orthophoto of a cultural relic disease area as an example. The following figure shows the segmentation process and results. Fig 5(b) is the edge feature extracted by the Canny operator. The initial values are K = 50 and m = 30, where m is the compactness parameter of the SLIC algorithm, which weighs the relative importance of spatial distance and color distance. The adaptive SLIC0 algorithm of this paper is applied to the iterative segmentation of Fig 5(a). The number of superpixels is increased by 10 in each iteration, and the compactness is set adaptively to the optimal value for each superpixel. Fig 5(c)-(j) are the results of each iteration. When K = 120, the edge recall rate reaches 0.9, and the iterative

Applicability analysis of the algorithm
Orthophotos of pigment layer peeling, scale peeling, combined powder peeling and upwarping, and powder peeling are used for superpixel segmentation and clustering. The results show that the proposed algorithm segments pigment layer peeling, flake peeling, pulverization peeling, fish scale peeling and similar diseases well, with an edge recall rate above 90 %. For more complex composite diseases, however, especially diseases with three-dimensional characteristics such as upwarping, the segmentation accuracy is not high. For the disease areas extracted automatically and manually, the area and perimeter were calculated respectively. The statistical results are as follows:

Fig 9. A comparison chart between this method and conventional manual method
The above comparative analysis shows that the difference rate between the automatically extracted disease area and the manually extracted disease area is less than 8 %, and the difference rate of the perimeter statistics is less than 17 %. If diseases with three-dimensional characteristics are excluded, the difference rate of the area statistics is less than 5 % and the difference rate of the edge length statistics is less than 7 %.
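The area and perimeter comparison can be illustrated with a short sketch. It assumes that a disease edge is available as an ordered list of `(x, y)` contour vertices and that the difference rate is the absolute difference relative to the manual value; the paper does not state its exact definition, so both the definition and the function names are assumptions.

```python
import math

def polygon_area(points):
    """Area enclosed by an ordered, closed contour (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_perimeter(points):
    """Length of the closed contour through the ordered vertices."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))

def difference_rate(automatic, manual):
    """Relative difference of an automatic measurement with respect to the manual one."""
    return abs(automatic - manual) / manual

# Hypothetical contours: a manually drawn square and a slightly smaller automatic result.
manual = [(0, 0), (10, 0), (10, 10), (0, 10)]
automatic = [(0, 0), (10, 0), (10, 9.6), (0, 9.6)]
print(difference_rate(polygon_area(automatic), polygon_area(manual)))
print(difference_rate(polygon_perimeter(automatic), polygon_perimeter(manual)))
```

The contours here are made-up rectangles used only to exercise the functions; they are not data from the paper.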

SUMMARY
To address the automatic edge extraction of cultural relic diseases, this paper proposes an automatic, high-precision extraction method based on orthophotos that combines adaptive SLIC0 with AP clustering. After analysing the advantages of super-pixel segmentation, the method builds on the SLIC0 algorithm, whose compactness parameter is adaptive, and adds an adaptive iteration over the super-pixel number parameter, so that the super-pixel segmentation results are well balanced between compactness and edge fit. The disease edge information is then obtained by AP clustering and can be analysed statistically. The experimental results show that the proposed algorithm has the lowest complexity and the fewest parameters among the compared methods and achieves the desired segmentation results. Comparison with the manual segmentation results shows the advantage of the automatic segmentation algorithm, which provides a new means and high-precision data support for the extraction and statistics of cultural relic diseases.