IMPROVING RANSAC FEATURE MATCHING BASED ON GEOMETRIC RELATION

Feature matching between images is an essential task in many computer vision and photogrammetry applications, such as Structure from Motion (SFM), Surface Extraction, Visual Simultaneous Localization and Mapping (VSLAM), and vision-based localization and navigation. The matched point pairs typically include false positive matches, so outlier detection and rejection are important steps in any vision application. RANSAC is a well-established approach for outlier detection. The outlier ratio and the number of correspondences required to fit the model determine the number of RANSAC iterations needed, which ultimately determines the computation time. We propose a simple algorithm (GR_RANSAC) based on the two-dimensional spatial relationships between points in the image domain. The assumption is that the distances and bearing angles between the 2D feature points should be similar in images with small disparity, as is the case for video image sequences. In the proposed approach, the distances and angles are measured from a reference point in the first image and from its correspondence in the other image, and points with significant differences are considered outliers. This process pre-filters the matched points and thus increases the inlier ratio. As a result, GR_RANSAC can converge to the correct hypothesis in fewer trial runs than ordinary RANSAC.


INTRODUCTION
A key aspect of all vision applications, such as image registration and alignment, structure from motion, Visual Simultaneous Localization and Mapping, and vision-based localization and navigation, is finding correct correspondences between images; feature matching therefore plays a pivotal role in these applications. A primary concern of feature matching is the correctness of the matched point pairs, so one of the biggest challenges is refining the correspondences by rejecting the mismatched point pairs. Traditional methods for outlier rejection, as shown in Figure 1, rely on RANSAC. The performance of RANSAC (Yang and Li, 2013) depends primarily on the features, which are obtained from different feature detection and extraction methods. Scale-Invariant Feature Transform (SIFT) (Lowe, 2004), Speeded-Up Robust Features (SURF) (Bay et al., 2008), and Oriented FAST and Rotated BRIEF (ORB) (Rublee et al., 2011) are common feature detection and extraction methods. The main challenge for these features is to be invariant to both scale and rotation changes so that the same features can be detected for the same object under different projections. Several feature matching methods have also been proposed. The most popular ones are brute-force matching, approximate nearest neighbor, and locality-sensitive hashing (Li et al., 2015). RANSAC has been widely adopted in many computer vision solutions, such as estimating the fundamental matrix. The fundamental matrix is a key factor since it encodes the relative transformation between image pairs, which helps in the projective reconstruction of a scene (Bharati et al., 2018). Many algorithms have been introduced to estimate the fundamental matrix from the correspondences. These algorithms can be categorized into three approaches: linear, iterative, and robust estimation (Lowe, 2004). RANSAC is considered a robust estimation method.
It is the most widespread method used to enhance the correspondences and robustly fit a model to a dataset in the presence of outliers. RANSAC has also been proposed to remove false positive pairs (Brown et al., 2005; Turcot et al., 2009; Zhang et al., 2011). However, RANSAC performs poorly and requires more iterations as the outlier ratio increases. For an image set with a high outlier ratio, RANSAC might not find a good hypothesis, and can produce poor results, even after a large number of iterations (Bhattacharya et al., 2012). RANSAC has been studied extensively, which demonstrates the importance of the algorithm. Several approaches were introduced before RANSAC, such as the M-estimator, L-estimator, R-estimator, and least median of squares (LMedS) (Fotouhi et al., 2019), which use nonlinear minimization techniques and complex loss functions. Several studies focus on optimizing RANSAC. The NAPSAC algorithm was proposed as a guided sampling approach to speed up RANSAC (Myatt et al., 2002). The PROSAC algorithm uses prior information to generate a matching score that guides the sampling. Spatially consistent random sample consensus (SCRAMSAC) was proposed as a spatial filter (Sattler et al., 2009). Various algorithms were proposed to speed up RANSAC, such as GroupSAC (Kai Ni et al., 2009), GASAC (Rodehorst et al., 2006), and ANSAC (Otte et al., 2014). Other studies target robustness, such as AMLESAC (Konouchine et al., 2005) and the work of Choi and Medioni (2009). Further algorithms aim to improve the accuracy of RANSAC, such as MAPSAC (Torr, 2002), IMPSAC (Torr and Davidson, 2003), and LO-RANSAC (Lebeda et al., 2012; Chum et al., 2003).
More recent work includes SuperGlue (Sarlin et al., 2020), which learns feature matching with graph neural networks, and LP-RANSAC (Wang et al., 2020), which uses RANSAC with a locality preserving constraint. In this paper, we propose a reliable, fast, and simple algorithm (GR-RANSAC) for removing outliers, as shown in Figure 2. The primary purpose of GR-RANSAC is to pre-filter the data so that RANSAC fits the model correctly and quickly. GR-RANSAC utilizes the geometric relationship between local features to generate an acceptable model in fewer iterations. The assumption is that the distances and bearing angles between the 2D feature points should be similar in images with small disparity, as is the case for video image sequences. This study examines the use of RANSAC to fit a homography model, using the geometric relations between the 2D feature points in the image domain to construct a refined set of matches. The GR-RANSAC algorithm has been applied to a variety of images from the Oxford datasets to examine its efficiency in removing mismatched pairs. Results indicate that GR-RANSAC fits the model correctly in fewer iterations than ordinary RANSAC and therefore requires considerably less processing time.

Feature Extraction and Matching
Feature extraction methods play a vital role in computer vision and photogrammetry. The most common feature extraction algorithms are SIFT, SURF, and ORB. SIFT is the most widely used method because of its invariance to image scaling and rotation. SIFT is particularly useful in datasets with significant changes in illumination and real-time applications. SIFT uses the Difference-of-Gaussian (DoG), which yields a faster solution than a normalized Laplacian of Gaussian (LoG). SURF is a fast and robust feature detection and extraction algorithm. ORB combines oriented FAST (Features from Accelerated Segment Test) and rotated BRIEF (Binary Robust Independent Elementary Features), with some modifications to enhance the performance of keypoint identification. The process of finding correspondences between two images of the same scene or object is called keypoint matching. Matches are obtained by comparing the feature descriptor sets of an image pair using the nearest neighbor method. A correspondence is accepted when the ratio between the shortest descriptor distance and the second shortest descriptor distance is smaller than a given threshold.
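The nearest neighbor ratio test described above can be illustrated with a minimal sketch. It assumes descriptors are plain sequences of floats compared with the Euclidean distance; the function name `ratio_test_matches` and the default ratio of 0.75 are illustrative assumptions, since the text only refers to "a given threshold".

```python
import math


def ratio_test_matches(desc_a, desc_b, ratio=0.75):
    """Match descriptors of image A to image B with the nearest neighbor ratio test.

    desc_a, desc_b: lists of equal-length float sequences (hypothetical input format).
    ratio: acceptance threshold (assumed value, not taken from the paper).
    Returns a list of (index_a, index_b, ratio_score) tuples.
    """
    matches = []
    for i, da in enumerate(desc_a):
        # Distances from descriptor i in image A to every descriptor in image B.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio test needs a second nearest neighbor
        (best, j), (second, _) = dists[0], dists[1]
        # Accept only if the best match is clearly better than the runner-up.
        if second > 0 and best / second < ratio:
            matches.append((i, j, best / second))
    return matches
```

A lower ratio score indicates a more distinctive match, which is why the score can later be reused to rank candidate reference points.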

RANSAC Algorithm
RANSAC (Random Sample Consensus) is a robust estimator proposed by Fischler and Bolles (1981) to fit a model to data; it can remove the false positive matches from a set of matched features. Iteratively, a random subset of matched pairs is picked from the list of matched points to fit the model. The size of this subset depends on the model: a minimum of five matched points is required to estimate an essential matrix (Lowe et al., 1999), and a minimum of seven matched pairs is required to estimate the fundamental matrix (Luong and Faugeras, 1996; Yang and Li, 2013). The resulting model is applied to the other matches in the list. The matches that fit the model are considered hypothetical inliers, whereas the correspondences that do not fit the model are labeled as hypothetical outliers. After many iterations, the model with the highest number of hypothetical inliers is considered the best model. The number of iterations depends on the inlier ratio in the dataset. As a result, when the matched list contains a lot of noise (false positive matches), many iterations may be required before RANSAC finds a correct hypothesis. The number of iterations M required by the RANSAC algorithm to achieve a certain performance level is as follows:
$$M = \frac{\log(1 - P)}{\log\left(1 - (1 - \varepsilon)^{m}\right)}$$
where ε is the outlier ratio, m is the minimum number of points necessary to fit the model, and P is the probability that at least one out of M samples does not include an outlier. The number of iterations therefore depends on the number of matches necessary to fit the model and on the outlier ratio, and it increases exponentially with increasing outlier ratio. Figure 3 demonstrates this exponential increase for different numbers of points necessary to fit the model, where P = 0.95.
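To make the dependence on the outlier ratio concrete, the standard RANSAC iteration count can be evaluated with a short sketch. The function name `ransac_iterations` and its argument names are illustrative; the default P = 0.95 follows the value used for Figure 3, and m = 4 in the usage note is the minimal sample size for a homography.

```python
import math


def ransac_iterations(outlier_ratio, sample_size, confidence=0.95):
    """Number of RANSAC iterations M for outlier ratio eps, sample size m, confidence P.

    Evaluates M = log(1 - P) / log(1 - (1 - eps)^m), rounded up.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # Probability that one random minimal sample contains only inliers.
    all_inlier_prob = (1.0 - outlier_ratio) ** sample_size
    if all_inlier_prob <= 0.0:
        return math.inf  # no outlier-free sample can ever be drawn
    if all_inlier_prob >= 1.0:
        return 1  # every sample is outlier-free
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - all_inlier_prob))
```

For example, comparing `ransac_iterations(0.75, 4)` with `ransac_iterations(0.5, 4)` shows how strongly a reduction of the outlier ratio shrinks the iteration count, which is the effect a pre-filter aims for.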

THE PROPOSED ALGORITHM (GR_RANSAC)
A key objective of this algorithm is to use geometrical relations between local features to remove the false positive correspondences. The focus is on geometrical relations because the distances and bearing angles between the 2D feature points should be similar in adjacent images acquired in a sequence. Therefore, a match is considered an outlier if there is a significant difference in the geometrical relations, as shown in Figure 4. An important question that emerged at the initial stages of the algorithm development was how to choose the reference point from which the geometric relations are measured. The best candidate reference point is the one with the best ratio test score (the ratio of the closest descriptor distance to the next closest) (Lowe, 2004). A random set of matches is chosen from the list of correspondences, and the distances and bearing angles between the candidate reference point and the rest of the random set are measured in each image separately. If there is no significant difference between the two sets of measurements, the candidate pair is accepted as the reference point; otherwise, the candidate pair with the second-best score is chosen, and the process is repeated until an acceptable point has been found. Figure 5 shows the reference pairs. In summary, the algorithm starts by measuring the geometrical relations between a candidate reference point and the rest of the random set, and it accepts the candidate as the reference point when the differences are below the threshold.
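The reference point selection described above can be sketched as follows. This is only an illustration under stated assumptions: matches are given as pairs of pixel coordinates, `ratio_scores` holds the ratio test score of each match (lower is better), and a candidate is accepted when at least a fraction `min_agree` of the random set agrees within the tolerances. The agreement fraction, the tolerance values, the angle wrap-around handling, and all function names are hypothetical choices that the text does not specify.

```python
import math
import random


def _polar(ref, pt):
    """Distance and bearing angle (radians) from ref to pt."""
    dx, dy = pt[0] - ref[0], pt[1] - ref[1]
    return math.hypot(dx, dy), math.atan2(dy, dx)


def _angle_gap(a, b):
    """Smallest absolute difference between two angles, handling wrap-around."""
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)


def choose_reference(matches, ratio_scores, sample_size=20, dist_tol=10.0,
                     angle_tol=math.radians(5.0), min_agree=0.5, seed=0):
    """Return the index of the accepted reference match, or None if none qualifies.

    matches: list of ((x1, y1), (x2, y2)) coordinate pairs.
    """
    rng = random.Random(seed)
    # Try candidates from the best (lowest) ratio test score to the worst.
    for cand in sorted(range(len(matches)), key=lambda k: ratio_scores[k]):
        others = [k for k in range(len(matches)) if k != cand]
        sample = rng.sample(others, min(sample_size, len(others)))
        if not sample:
            return None
        ref1, ref2 = matches[cand]
        agree = 0
        for k in sample:
            d1, a1 = _polar(ref1, matches[k][0])  # relation in the first image
            d2, a2 = _polar(ref2, matches[k][1])  # relation in the second image
            if abs(d1 - d2) <= dist_tol and _angle_gap(a1, a2) <= angle_tol:
                agree += 1
        if agree / len(sample) >= min_agree:
            return cand
    return None
```

Requiring only a fraction of the random set to agree is a design choice made here because the random set itself may contain mismatches; a stricter or looser acceptance rule would fit the description equally well.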
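After finding the reference points, the geometric relations between these points and all other points are calculated. Given a set of N points, we calculate the Euclidean distance $d_i = \sqrt{(x_i - x_r)^2 + (y_i - y_r)^2}$ from the reference point $p_r = (x_r, y_r)$ to every other point $p_i = (x_i, y_i)$ in the same image, with $i \neq r$, in addition to the bearing angle $\theta_i = \operatorname{atan2}(y_i - y_r,\ x_i - x_r)$, and the same for the other image, giving $d'_i$ and $\theta'_i$, as shown in Figure 5a and 5b. If the difference in the distance or in the angle is higher than the predefined tolerance threshold, the point is labelled as an outlier:
$$\Delta d_i = |d_i - d'_i| > t_d \quad \text{or} \quad \Delta\theta_i = |\theta_i - \theta'_i| > t_\theta$$
where $\Delta d_i$ is the distance difference, $\Delta\theta_i$ is the angle difference, and $t_d$ and $t_\theta$ are the distance and angle thresholds.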
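The pre-filtering step can be sketched as below, assuming the reference match is already known. The function name `gr_prefilter`, the input format, and the wrap-around handling of angle differences are illustrative assumptions. The default tolerances of 10 pixels and 30 reuse the values reported with the result figures, with the angle interpreted here as degrees, which is also an assumption.

```python
import math


def gr_prefilter(matches, ref_index, dist_tol=10.0, angle_tol_deg=30.0):
    """Split matches into inlier and outlier indices using distance and bearing angle.

    matches: list of ((x1, y1), (x2, y2)) coordinate pairs (hypothetical format).
    ref_index: index of the reference match.
    """
    ref1, ref2 = matches[ref_index]
    inliers, outliers = [], []
    for k, (p1, p2) in enumerate(matches):
        if k == ref_index:
            inliers.append(k)  # the reference pair is kept by construction
            continue
        # Distance and bearing angle from the reference point in each image.
        d1 = math.hypot(p1[0] - ref1[0], p1[1] - ref1[1])
        d2 = math.hypot(p2[0] - ref2[0], p2[1] - ref2[1])
        t1 = math.atan2(p1[1] - ref1[1], p1[0] - ref1[0])
        t2 = math.atan2(p2[1] - ref2[1], p2[0] - ref2[0])
        dist_diff = abs(d1 - d2)
        # Wrap the angle difference into [0, pi] so that -179 and 179 degrees are close.
        angle_diff = abs(t1 - t2) % (2.0 * math.pi)
        angle_diff = min(angle_diff, 2.0 * math.pi - angle_diff)
        if dist_diff > dist_tol or math.degrees(angle_diff) > angle_tol_deg:
            outliers.append(k)
        else:
            inliers.append(k)
    return inliers, outliers
```

Only the matches returned as inliers would then be passed to RANSAC, so the model is estimated on a set with a lower outlier ratio.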

RESULTS
The Oxford landmark dataset was used to evaluate the performance of the GR_RANSAC algorithm. The steps of the algorithm were applied to several image pairs to reject the false positive pairs in the matched point set, as shown in Figure 6. Analysis of the results over many image pairs shows that the number of inliers depends strongly on the selected distance and angle thresholds. If a high threshold is applied, the detected outliers could include inliers. The appropriate threshold values depend on how close the two images are to each other in terms of disparity. The case with significant rotation or translation between images, as well as larger thresholds, is shown in Figure 7. Figure 8 shows that the number of outliers increases significantly with increasing threshold values.
(a) Matched points between two images of the Ashmolean Museum, Oxford.
(b) Inliers matched with distance and angle thresholds of 10 pixels and 30, respectively.
(c) Outliers detected with distance and angle thresholds of 10 pixels and 30, respectively.
(d) Inliers matched with distance and angle thresholds of 5 pixels and 1, respectively.
(e) Outliers detected with distance and angle thresholds of 5 pixels and 1, respectively.
The main objective of this algorithm is to exploit the geometrical relations to define a subset of matched pairs, and thus reduce the outlier ratio and make RANSAC execute faster. In this investigation, we tested the proposed method on images from the Oxford landmark image database. Figure 9 shows the required number of iterations of RANSAC with and without running GR_RANSAC. The results clearly demonstrate the benefit of the proposed preprocessing to filter outliers from the matched pairs. We set P = 0.95 and the outlier ratio to 0.75.
Figure 9. Comparison of execution time obtained from GR_RANSAC and ordinary RANSAC.

CONCLUSION AND FURTHER WORK
The main goal of this study was to reduce or eliminate the outliers from the matched points by exploiting the two-dimensional relationships between points in the image domain. RANSAC can address this problem but performs poorly with high outlier ratios. The GR-RANSAC method utilizes the 2D distance and angle relations in the image domain. This allows outliers to be detected and removed from the list of matched pairs produced by feature matching prior to running RANSAC. The GR-RANSAC algorithm outperformed the RANSAC algorithm on the Oxford dataset, producing matched point sets with more inliers in fewer iterations. The proposed method could be beneficial to computer vision and photogrammetry applications that depend heavily on time-dependent RANSAC operations or that utilize large datasets. Future work may include optimizing the algorithm and testing it on more datasets, including images acquired in challenging environments, such as snow or dense forests, which are of particular interest because of the low number of features extracted and the higher rate of false positive matches. Another possible direction is testing different threshold values and comparing the results with the different RANSAC algorithms. The limitations of this experiment are tied to the dataset used in the comparison. GR-RANSAC was calibrated with thresholds optimized for the Oxford dataset. Using GR-RANSAC on other datasets will require recalibration of the thresholds for optimal detection of inliers and outliers.