AN IMPROVED IMAGE MATCHING METHOD BASED ON SURF ALGORITHM

ABSTRACT: Many state-of-the-art image matching methods based on feature matching have been widely studied in the remote sensing field. These feature matching methods achieve high operating efficiency but suffer from low accuracy and robustness. This paper proposes an improved image matching method based on the SURF algorithm. The proposed method introduces color invariant transformation, information entropy theory and a series of constraint conditions to increase the number of detected feature points and the matching accuracy. First, a color invariant transformation model is applied to the two images to be matched in order to retain more color information during the matching process, and information entropy theory is used to select the most informative channel of each image. Then the SURF algorithm is applied to detect and describe feature points in the images. Finally, constraint conditions, including Delaunay triangulation construction, a triangle similarity function and a projective invariant, are employed to eliminate mismatches and thereby improve matching precision. The proposed method has been validated on remote sensing images, and the results show high precision and robustness.


INTRODUCTION
Image matching is the process of geometrically aligning two images of the same scene and is widely used in various applications (Zitova and Flusser, 2003;Arévalo and González, 2008), including image mosaicking, change detection and 3D reconstruction. Generally speaking, image matching algorithms can be classified into feature-based and intensity-based methods (Zitova and Flusser, 2003;Wu et al., 2015). Compared with intensity-based algorithms, feature-based algorithms are more robust to illumination changes and complex distortions, and they also take less time because the whole-image information is replaced by a finite number of points. Feature-based algorithms have therefore become a research hotspot in image matching. They first extract feature points from the two images, and a matching method then generates the matching result from the feature point pairs.
In recent years, feature-based algorithms have been extensively studied. Lowe proposed the scale-invariant feature transform (SIFT) algorithm (Lowe, 2004) to cope with changes in illumination, speckle, rotation, scale, translation, etc. However, the algorithm is not well suited to processing large numbers of images because of its complexity. Many researchers have therefore worked on improving SIFT (Mikolajczyk and Schmid, 2005;Juan and Gwun, 2009;Teke et al., 2011). The Speeded Up Robust Features (SURF) algorithm (Bay and Van, 2006) is an improved version of SIFT whose time efficiency is attributed to the use of integral images and box filters. However, it has the disadvantage of lower accuracy. Lee et al. (2010) presented a coarse-to-fine approach to image matching based on the Haar wavelet transform and the SURF algorithm. The coarse-to-fine strategy, from the Harris operator to normalized cross-correlation and the RANdom SAmple Consensus (RANSAC) algorithm, is used to obtain the fine points. Bouchiha et al. developed a matching method that separates the detector from the descriptor and proposed an extension to the SURF descriptor. Zheng et al. (2015) proposed an improved SURF method combining a color invariant model with a series of constraint conditions (CC_SURF); this method has a good matching rate and its feature points are well distributed. Anzid et al. (2017) proposed an improvement of the SURF algorithm that automatically removes outliers by means of a distance and orientation filtering strategy (DO_SURF). However, in the aforementioned methods the number of detected feature points remains limited, owing to the loss of color information caused by converting color images to grayscale during the matching process.
Therefore, this paper proposes an improved SURF method which introduces color invariant transformation, information entropy and constraint conditions, aiming to increase the number of feature points and the matching precision. First, an orthogonal color transformation model is introduced to generate the orthogonal color space of the two images from their RGB color space, and information entropy is used to select the channel richest in color information as the image for subsequent matching. Then the SURF algorithm is applied to extract and describe feature points from the image, and constraint conditions, including Delaunay triangulation construction, a triangle similarity function and a projective invariant, are employed to eliminate mismatches and increase matching precision and robustness.
The rest of this paper is organized as follows. Section 2 describes the proposed method. Section 3 compares the experimental results obtained on remote sensing images by the proposed method with those of other related SURF algorithms. Section 4 draws the conclusion.

Overview of the proposed method
The overall flowchart of the proposed method is shown in Fig. 1. It can be divided into three main steps: (1) color transformation and information entropy are introduced to obtain more information from the two images to be matched; (2) the SURF algorithm is used to detect and describe the feature points; (3) mismatch removal methods (Li and Zhang, 2009;Zheng et al., 2015), including Delaunay triangulation, a triangle similarity function and a projective invariant, are applied to eliminate mismatched feature points.

Color transformation and Information entropy
Because the RGB color space has no invariance properties, we transform it into an orthogonal color space with color invariant characteristics. To do so, the mean and the standard deviation of the pixel values are calculated for each color channel (R, G and B) of the image. The orthogonal color space is defined by equation (1), where $\mu_R$, $\mu_G$ and $\mu_B$ are the means of the pixel values in each channel; $\sigma_R$, $\sigma_G$ and $\sigma_B$ are the standard deviations of each band; $R$, $G$ and $B$ are the pixel values of each band in the original image; and $R'$, $G'$ and $B'$ are the pixel values of each band after the orthogonal transformation.
Information entropy is a measure of the information contained in an image: the greater the information entropy of an image, the more information it contains. Information entropy is therefore calculated for each channel, and the channel with the most information is selected as the input data for further processing, so that more feature points are detected by the SURF detector in both images. The information entropy is given by equation (2), where k indexes the three color channels; j is the pixel value level of a color channel; p(j) is the occurrence probability of level j; min and max are the minimum and maximum pixel values, respectively; and E(j) represents the information entropy of the color channels.
Based on equations (1) and (2), the color channel with the highest entropy is used for feature detection and extraction in the next step.
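The channel selection step can be sketched as follows. This is a minimal illustration, assuming 8-bit channels and the standard Shannon entropy over the grey-level histogram; the function names `channel_entropy` and `select_channel` are hypothetical and not taken from the paper, and the orthogonal color transformation of equation (1) is not reproduced here.

```python
import numpy as np

def channel_entropy(channel, levels=256):
    """Shannon entropy (in bits) of one 8-bit channel (assumed definition)."""
    hist = np.bincount(channel.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()          # occurrence probability of each level
    p = p[p > 0]                   # empty levels contribute nothing
    return float(-(p * np.log2(p)).sum())

def select_channel(image):
    """Return (index, entropy) of the channel with the highest entropy."""
    entropies = [channel_entropy(image[..., k]) for k in range(image.shape[-1])]
    best = int(np.argmax(entropies))
    return best, entropies[best]
```

For example, `select_channel(img)` on an H×W×3 `uint8` array returns the index of the channel that would be passed on to the detector.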

SURF detector and descriptor
SURF detector:
For the two images to be matched, we calculate the information entropy of all channels and select the channel with the maximum entropy as input data. The SURF detector is then used to detect the feature points of the two images. An integral image is set up and a multi-scale space is built with box filters. The Hessian matrix is applied because of its good performance and accuracy. It can be defined as:

$$H(\mathbf{x}, \sigma) = \begin{bmatrix} L_{xx}(\mathbf{x}, \sigma) & L_{xy}(\mathbf{x}, \sigma) \\ L_{xy}(\mathbf{x}, \sigma) & L_{yy}(\mathbf{x}, \sigma) \end{bmatrix}$$

where $\sigma$ is the scale, $L_{xx}(\mathbf{x}, \sigma)$ is the convolution of the Gaussian second-order derivative with the image at point $\mathbf{x}$, and similarly for $L_{xy}(\mathbf{x}, \sigma)$ and $L_{yy}(\mathbf{x}, \sigma)$.
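The integral image mentioned above is what makes box filtering cheap: once it is built, the sum over any rectangle costs four lookups regardless of the rectangle size. A minimal sketch, with hypothetical helper names `integral_image` and `box_sum`:

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row and column prepended."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] using four table lookups."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]
```

The box filters that approximate the second-order derivatives are weighted combinations of such rectangle sums, which is why larger filters cost no more than small ones.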
To improve computational efficiency, a Fast-Hessian matrix is constructed in the SURF detector. We equate the initial [...] Finally, an appropriate threshold is selected and non-maximum suppression in a 3×3×3 neighborhood of each point is applied to detect extremum points. A point is regarded as a candidate only if its response value is larger or smaller than those of all 26 neighbors at the current and adjacent scales. The stable feature point location and its scale are then obtained by interpolation in scale space and image space.
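The 3×3×3 non-maximum suppression can be illustrated with a short sketch. It assumes the detector responses are already stacked into an array of shape (scales, rows, cols) and only looks for maxima; minima can be found the same way on the negated responses. The function name `local_maxima_3d` is hypothetical, and the sub-pixel interpolation step is not shown.

```python
import numpy as np

def local_maxima_3d(responses, threshold):
    """Return (scale, row, col) indices of interior points whose response
    exceeds the threshold and is strictly larger than all 26 neighbours."""
    n_scales, n_rows, n_cols = responses.shape
    keep = []
    for k in range(1, n_scales - 1):
        for i in range(1, n_rows - 1):
            for j in range(1, n_cols - 1):
                v = responses[k, i, j]
                if v <= threshold:
                    continue
                cube = responses[k - 1:k + 2, i - 1:i + 2, j - 1:j + 2]
                # strict maximum: v is the largest value and occurs only once
                if v >= cube.max() and (cube == v).sum() == 1:
                    keep.append((k, i, j))
    return keep
```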

SURF descriptor:
To make the feature points rotation invariant, a dominant orientation is determined for each feature point. First, Haar wavelet responses are calculated for the pixels within a circular neighbourhood of radius 6s around the feature point, where s refers to the scale at which the point was detected. The responses are then weighted with a Gaussian function centered at the feature point, so that responses closer to the circle center receive greater weight and contribute more to the orientation. The dominant orientation is estimated by calculating the sum of all responses within a sliding orientation window covering an angle of 60°: the x- and y-responses within the window are summed to produce a new vector. After the entire circular neighbourhood has been scanned, the orientation of the longest vector is selected as the dominant orientation of the feature point.
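The sliding-window search can be sketched as follows, assuming the Gaussian-weighted Haar responses are already available as two arrays `dx` and `dy`. The 5° step between window positions and the function name `dominant_orientation` are assumptions made for illustration; the text only fixes the 60° window width.

```python
import numpy as np

def dominant_orientation(dx, dy, window=np.pi / 3, step=np.deg2rad(5)):
    """Angle (radians) of the longest summed response vector over all
    positions of a sliding orientation window of the given width."""
    dx = np.asarray(dx, dtype=float)
    dy = np.asarray(dy, dtype=float)
    angles = np.arctan2(dy, dx)
    best_len2, best_angle = -1.0, 0.0
    for start in np.arange(-np.pi, np.pi, step):
        # angular offset from the window start, wrapped into [0, 2*pi)
        offset = (angles - start) % (2 * np.pi)
        inside = offset < window
        sx, sy = dx[inside].sum(), dy[inside].sum()
        len2 = sx * sx + sy * sy
        if len2 > best_len2:
            best_len2, best_angle = len2, float(np.arctan2(sy, sx))
    return best_angle
```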
To generate the feature descriptor, a square region with side length 20s centered on the feature point is selected and oriented along the dominant orientation of the feature point. The region is divided into 4×4 = 16 smaller square sub-regions, and 5×5 sampling points are selected in each sub-region to calculate the corresponding Haar wavelet responses $d_x$ and $d_y$. The Haar wavelet responses and their absolute values are then summed separately over each sub-region. This gives a four-dimensional descriptor vector $v = (\sum d_x, \sum d_y, \sum |d_x|, \sum |d_y|)$ for each sub-region, and together the sub-regions form a 64-dimensional (4×4×4) feature vector.
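The assembly of the 64-dimensional vector can be sketched as follows, assuming the Haar responses have already been sampled on a 20×20 grid aligned with the dominant orientation. The function name `surf_descriptor` is hypothetical, and any final normalisation of the vector is left out because the text does not describe one.

```python
import numpy as np

def surf_descriptor(dx, dy):
    """Concatenate (sum dx, sum dy, sum |dx|, sum |dy|) over 4x4 sub-regions
    of 5x5 samples each, giving a 64-dimensional vector."""
    assert dx.shape == dy.shape == (20, 20)
    parts = []
    for r in range(0, 20, 5):
        for c in range(0, 20, 5):
            bx = dx[r:r + 5, c:c + 5]
            by = dy[r:r + 5, c:c + 5]
            parts.extend([bx.sum(), by.sum(), np.abs(bx).sum(), np.abs(by).sum()])
    return np.array(parts)
```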

Mismatch points removal
The SURF detector and descriptor only detect and describe feature points; they cannot by themselves eliminate the many mismatched points. For this reason, this paper introduces Delaunay triangulation, a triangle similarity function and a projective invariant into the matching process.

Delaunay triangulation construction:
In this paper, the incremental insertion algorithm (Lawson, 1977), which is simple in concept, easy to implement and efficient, is used to build a triangulation network over all feature points in both images. LOP (Local Optimization Procedure) is then used to optimize the quality of the Delaunay triangulation. The triangulation building and LOP optimization procedures are as follows:
(1) Incremental insertion algorithm: As shown in Fig. 2, the basic steps of building the triangulation with the incremental insertion algorithm are described in Table 1:
Step1 Build a large triangle containing all of the points as the initial triangle.
Step2 Select any one of the points as the insertion point in the large triangle.
Step3 Find the triangle that contains this point, then link the point to the three vertices of that triangle to generate three new smaller triangles.
Step4 Call the LOP optimization procedure to update all the triangles generated in Step3.
Step5 Repeat Step2 to Step4 until all remaining points have been processed.
Step6 Delete the triangles that contain vertices of the initial triangle to obtain the Delaunay triangulation network (Fig. 2(f)).
(2) LOP (Local Optimization Procedure): LOP is based on the max-min angle property of the Delaunay triangulation, which means that the minimum of the six angles of the two triangles forming a convex quadrilateral cannot be increased by exchanging the two diagonals of the quadrilateral. The inscribed angle theorem can therefore be used to judge whether the diagonals should be exchanged. As shown in Fig. 3(a), when point D lies on the circumscribed circle of triangle ABC, ∠A+∠D=π and sin(∠A+∠D)=0, so the two diagonals of the convex quadrilateral are not exchanged. In Fig. 3(b), when point D lies outside the circumscribed circle of triangle ABC, ∠A+∠D<π and sin(∠A+∠D)>0, so the two diagonals are not exchanged either. When point D lies inside the circumscribed circle, as shown in Fig. 3(c), ∠A+∠D>π and sin(∠A+∠D)<0, and the two diagonals are exchanged to satisfy the max-min angle property.

Triangle similarity function:
For two similar triangles ABC and A'B'C' (A corresponds to A', and similarly for B and C), the similarity Ia of ∠A (with value a) and ∠A' (with value x) is defined by equation (7), where $d(x) = \exp\{-(x-a)^2 / (2\sigma^2)\}$ and $\sigma = a/6$.
For a pair of triangles, the similarities of the other two angles can also be calculated by equation (7), and the similarity of the two triangles is then given by equation (8).
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-3, 2018 ISPRS TC III Mid-term Symposium "Developments, Technologies and Applications in Remote Sensing", 7-10 May, Beijing, China
In this paper, we search for and select the pairs of triangles whose similarity is greater than 0.75.
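Two tests from this subsection lend themselves to a short sketch: the LOP diagonal-exchange check based on ∠A+∠D, and the angle-based similarity score with its 0.75 threshold. The sketch assumes triangles ABC and DBC share edge BC with A and D on opposite sides. The aggregation used in `triangle_similarity` (a plain mean of d(x) over the three corresponding angle pairs) is an assumption standing in for equations (7) and (8), which are not reproduced in this text; all function names are hypothetical.

```python
import numpy as np

SIMILARITY_THRESHOLD = 0.75  # threshold quoted in the text

def angle_at(p, q, r):
    """Interior angle (radians) at vertex p between rays p->q and p->r."""
    p = np.asarray(p, dtype=float)
    u = np.asarray(q, dtype=float) - p
    v = np.asarray(r, dtype=float) - p
    cosang = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.arccos(np.clip(cosang, -1.0, 1.0)))

def should_exchange_diagonal(a, b, c, d):
    """LOP test: exchange BC for AD when angle A + angle D > pi,
    i.e. when D lies inside the circumscribed circle of ABC."""
    return bool(angle_at(a, b, c) + angle_at(d, b, c) > np.pi)

def angle_closeness(a, x):
    """d(x) = exp(-(x - a)^2 / (2 sigma^2)) with sigma = a / 6, as in the text."""
    sigma = a / 6.0
    return float(np.exp(-((x - a) ** 2) / (2.0 * sigma ** 2)))

def triangle_similarity(tri1, tri2):
    """ASSUMPTION: mean of d(x) over corresponding angles; not the paper's
    exact equations (7)-(8). Vertex i of tri1 corresponds to vertex i of tri2."""
    def angles(t):
        return [angle_at(t[i], t[(i + 1) % 3], t[(i + 2) % 3]) for i in range(3)]
    pairs = zip(angles(tri1), angles(tri2))
    return float(np.mean([angle_closeness(a, x) for a, x in pairs]))
```

A candidate pair would then be kept when `triangle_similarity(t1, t2) > SIMILARITY_THRESHOLD`.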

Projective invariant:
According to the projection relation between the two images, a projective invariant can be used to judge whether points retain the same properties and quantities after a projective transformation. Following the steps above, the projective invariant allows a fine matching of the two images to be obtained. In this paper, the cross-ratio is used to analyze a pair of triangles from the two images. The cross-ratio of the straight lines through point A and through the corresponding point A' is taken as an example, as shown in Fig. 4. In Fig. 4(b), the cross-ratio at point A is IA=(sin∠FAC*sin∠BAE)/(sin∠FAE*sin∠BAC), and the cross-ratio at point A' is IA'=(sin∠F'A'C'*sin∠B'A'E')/(sin∠F'A'E'*sin∠B'A'C'). The straight lines that make up the triangles are regarded as correct matches only if IA equals IA'. Finally, the feature points that lie on these straight lines are extracted from both images.
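The cross-ratio check can be sketched as follows. The sketch uses signed sines (computed from the 2D cross product) rather than unsigned angles, which is a choice made here so that the value is unaffected by the direction in which a ray is traversed; the helper names and the homography used in the usage note are hypothetical.

```python
import numpy as np

def signed_sine(apex, p, q):
    """Signed sine of the angle from ray apex->p to ray apex->q."""
    apex = np.asarray(apex, dtype=float)
    u = np.asarray(p, dtype=float) - apex
    v = np.asarray(q, dtype=float) - apex
    cross = u[0] * v[1] - u[1] * v[0]
    return cross / (np.linalg.norm(u) * np.linalg.norm(v))

def cross_ratio(a, b, c, e, f):
    """I_A = sin(FAC) * sin(BAE) / (sin(FAE) * sin(BAC))."""
    num = signed_sine(a, f, c) * signed_sine(a, b, e)
    den = signed_sine(a, f, e) * signed_sine(a, b, c)
    return num / den

def apply_homography(h, p):
    """Map a 2D point through a 3x3 projective transformation."""
    x, y, w = h @ np.array([p[0], p[1], 1.0])
    return np.array([x / w, y / w])
```

In practice exact equality of IA and IA' cannot be expected with noisy coordinates, so a comparison such as `np.isclose(cross_ratio(...), cross_ratio(...), rtol=...)` with an application-specific tolerance (an assumption, not a value from the paper) would replace the equality test.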

EXPERIMENTAL RESULTS
To verify the reliability and advantages of the proposed method, three reference methods (SURF+RANSAC, CC_SURF, DO_SURF) and the proposed approach are compared in terms of their performance and results. The evaluation criteria, namely the number of correct matches (N) (Li et al., 2015;Ma et al., 2017) and the root mean square error (RMSE) (Gong et al., 2014;Kupfer et al., 2015;Ma et al., 2017), are used to verify the accuracy and robustness of the proposed method. We selected two images taken from different perspectives, shown in Fig. 5 (Zheng et al., 2015). The four methods were applied to match the two images, and the corresponding results of feature point detection, feature point matching and final matching are shown in Fig. 6, Fig. 7 and Fig. 8, respectively. The corresponding values of N and RMSE are shown in Table 2.
From Fig. 6, we can see that the proposed method detected about ten times as many feature points as the other three methods, owing to the richer color information obtained by color invariant transformation and information entropy. Fig. 7 and Fig. 8 show that all four methods removed many outliers and retained correct matching point pairs. However, the proposed method obtained more correct matches than the three comparison methods, and its matched points are more evenly distributed. In addition, as shown in Table 2, the RMSE values of SURF+RANSAC, CC_SURF and DO_SURF are larger than that of the proposed approach, whereas their numbers of correct matches (N) are smaller.
It can be concluded that the proposed approach outperforms the three comparison methods in terms of the number of correct matches and matching accuracy.
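For reference, the RMSE criterion can be computed as in the sketch below. The paper does not state its exact formula, so this uses a common definition: the root of the mean squared distance between the transformed points of one image and their matches in the other. The choice of a 3×3 homography as the transformation model and the function name `rmse` are assumptions.

```python
import numpy as np

def rmse(src_pts, dst_pts, h):
    """Root mean square distance between h-projected src_pts and dst_pts."""
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    homog = np.hstack([src, np.ones((len(src), 1))]) @ h.T
    proj = homog[:, :2] / homog[:, 2:3]   # back to Cartesian coordinates
    return float(np.sqrt(np.mean(np.sum((proj - dst) ** 2, axis=1))))
```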

CONCLUSION
In this paper, we propose a method based on SURF to improve feature detection and matching performance. The proposed method introduces color invariant transformation and information entropy to preserve color information and detect more feature points. A series of constraint conditions, including Delaunay triangulation, a triangle similarity function and a projective invariant, are then used to filter out mismatched feature points and to ensure high accuracy of the matching results. The experiments also showed that the robustness and precision of the proposed approach are superior to those of the other three methods.