TRIPLE EPIPOLAR IMAGES GENERATION AND MATCHING IN TRISTEREO IMAGING ON THE SAME ORBIT MODE

This paper proposed the method to generate triple epipolar images in tristereo imaging on the same orbit mode. It generated projection tie points based on double projection surface in object space, and designed conversion model from epipolar images to its corresponding original images on the basis of projection tie points and created the RPC of epipolar images. During stereo matching one stereo image pairs are used as the basic pair and proposed the rapid transformation model between the current matching points and the corresponding point on the third image, which accomplished the combination of epipolar stereo matching and object space geometric constraints. After left/right bidirectional matching and occlusion area checking, a third matching was implemented. The cost of occlusion area will use the smallest one among the three and these areas were revamped. SGM was used as the optimization strategy. Experiments in three typical areas using Pleiades and ZY-3 as data recourses show that the loss of precision is negligible in the proposed epipolar image generation method. The proposed matching method combines the best of both object geometric constraint method and isolate epipolar image pair method, and the computational efficiency significantly better than these two methods. Lastly, the precision of the proposed method is significantly better than these two methods in building area and weak texture area.


INTRODUCTION
In the process of dense matching of aerial and satellite stereo images to obtain DSM, data and algorithm are complementary and indispensable. In terms of data alone, in addition to the geometric and radiation quality of a single image, redundant observation from three or more views is also an effective means to improve the matching accuracy and robustness. In particular, the tristereo images obtained on the same obit can achieve a large degree of stereo overlap and the maximum consistency of acquisition time. The same obit tristereo image includes two kinds: acquiring by swinging along the obit direction through a single camera and acquiring by three-line-array cameras. The former is represented by Pleiades tristereo images, while the latter includes Alos, ZY-3, etc. The use of object geometric constraints in three view stereo matching can include two to three pairs of matching costs at the same time, which is a strong constraint condition, but the processing of the occlusion area is not ideal, and the establishment of image point correspondence through object projection transformation requires a certain amount of calculation. Another strategy is to generate the epipolar image pair first, match the image pair by pair, and fuse the matching results. This method does not need to establish the corresponding relationship of matching points through object projection transformation, and can detect the wrong matching caused by occlusion through the consistency of left and right checking. However, it is not easy to introduce object geometric constraints, and each image pair needs two independent matches. Based on the two characteristics of image deformation under the condition of same orbit stereo imaging, which mainly focuses on the direction along the orbit and the stability of satellite platform operation, this paper proposes a three view epipolar image production and coordinate transformation model, which realizes the combination of algorithms and data.

Projection connection point
The projection connection point between the reference image and a matching image and the corresponding ground point on an elevation reference plane are obtained through two projection datum planes, but this paper extends the algorithm to the same orbit three view mode, and uses a completely different epipolar image coordinate conversion model. The specific steps are as shown in Figure 1. The forward view image point P 1 calculates the ground point Pg 1 through the forward view image RPC (rpc N ) and elevation plane H 1 , projects Pg 1 onto the backward view image through the backward view image RPC (rpc B ), calculates the ground point Pg 2 through the rpc B and elevation plane H 0 , calculates the forward view point P 2 through the rpcn, and calculates the ground point Pg 3 through P 2 , rpc N and elevation plane H 1 , Calculate the backward view point of Pg 3 through rpc B , and obtain the projection connection point between front view and rear view and the corresponding ground point Pg n (n = 1,3,5,...) on elevation plane H 1 through this "catch-up" method. The obtained ground point Pg n (n = 1,3,5,...) is projected onto the forward-looking matching image through the forward-looking image RPC (rpc F) , and the projection connection point of the third image is obtained. The specific steps are as follows: (1) Determine the sequence of "left and right" slices and the average inclination of epipolar Since the order of actually importing images may not be determined, the order of "left and right" images is required. The purpose is to determine the positive or negative correlation between the elevation change of elevation datum and the change of projection coordinate, which is expressed by a parameter s of 1 or -1. The method is to obtain a projection connection point from one end of the image (the lower end in this paper). If the line coordinate of the point is less than the initial coordinate s = 1, otherwise s = -1. According to parameter s, the initial elevation plane is reset to obtain a projection connection point on the projection trajectory line. Calculate multiple dip angles with multiple groups of adjacent connection points, and take the average value to obtain the average core line dip angle.
(2) The range of epipolar image and other auxiliary parameters are determined according to the inclination of epipolar Before calculating the projection connection point, several parameters need to be calculated first, including the elevation resolution parameter dh, the distance dh between the two projection planes, and the expansion range of the epipolar image in two directions relative to the original image. The centre coordinates of the reference image are calculated according to the HEIGHT_OFF of the reference image RPC, obtain the projection coordinate p 0 of the right image (the first matching image), and project the centre coordinate of the reference image according to the right image (the first matching image) by elevation HEIGHT_OFF+1, and obtain p 1 . The image plane distance between P0 and P1 is dp, the elevation resolution is dh = 1.0/dp (m). The elevation interval between the two projection datum planes H 0 and H 1 is set as hoff = H/ M × dh (M is the number of points on each projection trajectory line). The outward expansion range of column and row direction is: The number of reference points (starting points) of the projection starting point in the line direction of the epipolar line image is set as n, so the distance between adjacent projection starting points in the line direction of the epipolar line image is 2 W ⋅ ∆ / n.
(3) Generate projection connection points covering the whole epipolar image range As described above, n projection trajectory lines need to be generated between the reference image and the first matching image. The starting point of projection trajectory n on the reference image is: Where epioff is the average interval between two adjacent obits perpendicular to the direction of the projected obit. It depends on the number of projected obit lines. In this paper, the total number of obit lines n is set to 10, so epioff = w / 10. (4) Generates the projection connection point of the third image Project the projection connection point on the reference image onto the third image through the elevation reference plane H 0 (as shown by the red arrow in Figure 1) to obtain the projection connection point of the third image.

Figure 1
Process diagram of obtaining projection connection points by double projection plane method.

Coordinate transformation model
After obtaining the projection connection points of the images from three perspectives, how to establish the coordinate transformation model between the epipolar image and the corresponding original image is still a key problem to be solved. (2) Calculate the "offset scale coefficient" of the image along the epipolar and the image perpendicular to the epipolar on the matched image In this paper, the scaling coefficient is expressed by the average value of the distance ratio between the first and last two projection points on each projection obit of the matched image and the corresponding projection point of the reference image.
The scaling coefficient can also be calculated in segments and expressed as a function of image plane coordinates. The symbols sclc i and sclr i (i = 1, 2, 3 for image serial number) are used to represent the average proportion along and perpendicular to the projection trajectory. Where sclc 0 = 1, sclr 0 = 1.
(3) Calculate projection trajectory tilt parameters The projection trajectory tilt parameter represents the angle between the projection trajectory line and the row direction of the original image. It is calculated by using the coordinates of the starting points of the first and last projection trajectories. The calculation method is the row coordinate difference of two points divided by the column coordinate difference, which is represented by the symbol slp.  The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France

Generate RPC for epipolar image
To calculate the epipolar image RPC, we need to first calculate the image point coordinate ground coordinate mapping grid of the epipolar image. In this paper, the object geodetic coordinates (L, B) are calculated by converting the image plane coordinates (c, r) of the epipolar line image to the image plane coordinates (c o , r o ) of the original image, and then through the RPC coordinates of the original image and the object elevation H. (c, r, l, B, H) constitute a control point for calculating the core image RPC. The image coordinate density is 21×21, the object elevation plane is divided into 5 layers, and the elevation of the middle layer is set as the HEIGHT_OFF parameter of the original image RPC. The interval between two adjacent floors is height_ Half of the scale parameter. When calculating the control points, we do not directly use the rules such as square grid coordinates and object square regular elevation surface, but add a random number on the basis of the regular position. The range of adding random number to image square coordinate is 0 to half of the space between image square grids, and the range of adding random number to object square elevation is 0 to half of the space between two regular elevation planes. In addition to the control grid, the checkpoint grid is generated at the same time to check the RPC accuracy.

Epipolar image resampling
According to the coordinate transformation relationship from the epipolar image calculated in 2.2 to the original image, calculate the image point coordinates of the original image corresponding to the image point of the epipolar image point point by point, and obtain the DN of the point through bilinear interpolation as the DN of the epipolar image.

Global optimization and semi-global optimization methods
The most widely used global optimization strategies in the field of stereo matching can be divided into two categories: global optimization method based on message propagation and global optimization method based on graph. The most representative of the former is the belief propagation (BP) algorithm, and the latter generally refers to the graph cuts (GC) algorithm. The real global optimization is a NP hard problem. Most so-called global optimization methods can only approach the global optimization as much as possible to obtain the suboptimal solution. Although these methods have achieved good experimental results, the amount of calculation is large. Hirschmüller's semi-global matching algorithm (SGM) approximates global energy minimization through multidirectional confidence propagation and matching cost aggregation. The idea of the algorithm is similar to that of BP. The main contribution is to use P 1 and P 2 smoothing parameters to punish small parallax changes and large parallax changes, and to obtain the optimal solution through the matching cost propagation and aggregation of multiple paths. The algorithm has achieved satisfactory results in accuracy and execution efficiency, which is the optimization strategy adopted in this paper.

Matching cost
Matching cost represents the similarity between two or more matching points. Common matching costs include normalized correlation coefficient, color difference, mutual information, census distance and so on. Census distance is the matching cost based on window, which not only has good edge retention characteristics, but also has strong robustness. In this paper, census distance under 9 × 7 windows is used as the matching cost. The census transformation can be carried out independently on the whole scene image. The coordinates of two points are known, and the census distance between them is the bit comparison process of two census values, so it can realize fast calculation.

Pyramid matching strategy
The image pyramid is established by wavelet transform, matched layer by layer, and the matching result of the current layer is transmitted to the next layer as the initial value. The search space of each point is determined by counting the parallax range in the neighbourhood of the point, that is, each point has an independent search range. Each pyramid image is matched three times. 1, 2 and 3 represent the image sequence of the first matching, then the image sequence of the second matching is 2, 1 and 3. The first two matching uses the sum of the three matching costs between 1-2, 1-3 and 2-3, and checks the suspicious areas through the consistency of the two matching results. These areas are generally occluded areas. The sequence of the third matching image is the same as that of the first one, but the smallest of the three matching costs will be used for the area tested as suspicious. view (reference image) image points are projected to multiple elevation planes within a certain range. Projecting these coordinates onto the matching image in turn will form a projection line. The correct matching point is on the projection line, and the correct matching points on multiple matching images correspond to the same object elevation, that is, to the same object coordinate. Although this calculation process can be simplified by local fitting algorithm, there is still a large amount of calculation, which affects the speed of matching. After the triepipolar image is generated, this object geometric constraint is still applicable, but this will lose the main value of the three view epipolar image. After the triepipolar image is generated by the method in this paper, the geometric constraints can be directly incorporated into the image side without the projection transformation of the object side, so as to improve the calculation speed. As shown in Fig. 2 (b), it is assumed that the image point coordinates of the reference image used to calculate the matching cost are P N (r N , c N ), and the search point coordinates on the matching image are P F (r F , c F ). The ground coordinates are obtained by the front intersection of the two. The image point coordinates P B (r B , c B ) are obtained by projecting the coordinates through the RPC of the second image. Under the condition of triepipolar image, r N = r F = r B . Although this is rigorous in theory, it still needs to obtain the coordinate mapping relationship through the object projection, which has no advantage over the step in Fig. 2 (a). In practical application, the known coordinates P N and P F can be quickly calculated by the model c B = a + b × (c F -c N ) or c B = a + b× (c F -c N ) + c×r N + d×c N , where a, b, c and d are fitting parameters. Generally, the fitting error can be controlled at about 1‰ pixel only by translation parameter a and scaling parameter b. Parameter b depends on the image sequence. In this paper, the fitting parameters are calculated by the coordinates of the four corners of the image and the coordinates projected onto the second and third images.

Checkpoint error of epipolar image RPC
The method mentioned in 2.3 is used to calculate the checkpoint. The image plane coordinates of the checkpoint are randomly floating on the basis of regular grid, and the object elevation is also randomly floating on the basis of average elevation. The distribution and size of checkpoint error are represented by twodimensional surface graph. Figure 3 shows the error distribution (in pixels) of RPC checkpoints in the core image of ZY-3 (range about 50km × 50km) and Pleiades image (range about 4km × 4km). It can be seen that the inspection point error of ZY-3 with a large image range is significantly greater than that of Pleiades epipolar image, which is about 2‰ pixel. The inspection point error of Pleiades epipolar image is significantly less than this order of magnitude, but the distribution characteristics of single error area are very similar to that of ZY-3. The reason for this feature is uncertain, which is estimated to be caused by the RPC solution process. The error shown in Figure 3 reflects the accuracy of the epipolar image RPC relative to the original image RPC, which is also the basis of the follow-up work.  Figure 3 Distribution of RPC checkpoint error of epipolar line image.

Statistics of line direction error of triple epipolar image
The greatest significance of the triepipolar image line is to restrict the image points with the same name to the same image line of three images. Therefore, it is necessary to evaluate the line direction accuracy of the epipolar line image by comparing the deviation between the lines coordinates of the image points of the reference image and the line coordinates of the matched image projected by the object side. Figure 4 shows the experimental results of a scene ZY-3 triepipolar image. The accuracy is counted through the grid evenly distributed on the image and expressed in the form of three-dimensional surface map. It can be seen that the row direction error of front view, front view and back view can be controlled within 2‰ pixels. It should be noted that this is only the accuracy evaluation in the sense of orientation parameters. If there is a certain error in orientation parameters (mainly the error of obit crossing direction under the condition of three-dimensional on the same obit), there will also be a certain line coordinate error in the real image point with the same name. This requires that the relative error of the orientation parameters should be controlled at a small order of magnitude as far as possible in the orientation parameter correction stage. It can also be corrected in the triepipolar production stage through the image points with the same name obtained in advance, but the former is the preferred strategy.

Accuracy statistics of coordinate mapping model
In this paper, the relative error of image point coordinates of the second matching image obtained by the two methods (b) and (c) in Figure 2 is also statistically analysed. Figure 5 is the statistical results of two typical regions. The terrain of region 1 is flat, while the terrain of region 3 fluctuates greatly. It can be seen that the maximum error is less than 1‰ pixel, which is a small amount and independent of the terrain. It is proved that the coordinate mapping model used in this paper can completely replace the method of material transformation, so as to reduce the amount of calculation under the condition of ensuring the accuracy.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France (a) Zone 1 (b) Zone 3 Figure 5 Error distributions of two typical regional coordinate mapping models.

Evaluating indicator
The accuracy evaluation indexes used in this paper include mean (mean), median (MED), standard deviation (STDEV) and normalized absolute deviation median (NMAD). Among them, the mean and median represent the systematic error, the standard deviation is the accuracy evaluation index excluding the average error factor, and the median of standardized absolute deviation is the accuracy evaluation index with strong robustness.

Experimental analysis
DSM extraction and accuracy evaluation use the Pleiades three view stereo image of three regions in a certain region of Europe and the ZY-3 three-linear-array stereo image of the same region. The reference data in this paper is lidar data with an interval of 0.5m, and the plane coordinates are UTM projection coordinates. Therefore, it is necessary to convert the matching points into the ground point cloud data under UTM projection coordinates, and then generate the regular grid DSM through the reciprocal distance weighting method. The interval between the generated Pleiades DSM plane and the ZY-3 DSM plane is 1m and 5m respectively. The points whose absolute deviation between DSM and lidar is less than 8m participate in the statistics. Three matching strategies are used for comparative analysis, which are object geometric constraint method, independent epipolar image pair method and epipolar image pair method with geometric constraints. The object geometric constraint method refers to a matching method that forms a projection trajectory on two matching images by transforming the ground elevation corresponding to the image points of the reference image, then establishes the corresponding relationship between the matching points and calculates the matching cost. The independent epipolar image pair method refers to the matching of the reference core image with the matching image 1 epipolar image and the matching image 2 epipolar images respectively, and the parallax is converted into elevation and fused together in the image side, or the parallax is converted into ground point cloud and fused together in the object side.
The former is adopted in this paper. The epipolar line image pair method with geometric constraints refers to the matching image 2 eppolar line image is also included through the method shown in Fig. 2 (c) based on the epipolar line image pair of the reference image and the matching image 1, and three pairs of matching costs are integrated for matching. The main difference of these three matching methods is the calculation method of matching cost, which can also be said to be the establishment method of coordinate mapping relationship. The following global optimization strategies adopt semi-global optimization method. The final method to obtain the ground point cloud is the fitting calculation method in 3.5. Table 1 and table 2  It can be seen from Tables 1 and 2 that the mean and median values of the three methods are quite different, which is caused by the working mechanism of the three methods and the residual error of image orientation parameters. The difference between the mean value and median value of the same method is often greater, which may lead to the "contradiction" between STDEV and NMAD. This is because all errors in STDEV calculation process subtract the systematic quantity of mean error, while all errors in NMAD calculation process subtract the systematic quantity of median MED. As shown in the results of area 1 in Table 2, compared with methods ① and ③, STDEV index is improved, while NMAD index is reduced. However, in general, the corresponding statistical indexes of the three methods are not significantly different, but the calculation speed of methods ② and ③ is significantly faster than that of the geometric constraint method. The calculation speed of method ③ introduced in this paper is only slightly lower than that of method ② (less than 15%). However, method ③ only needs three matches under the condition of considering the left-right consistency detection and the refined matching of suspicious areas, and the third match focuses on the areas that do not meet the left-right consistency detection, and the efficiency is significantly higher than the first two. The overall time is about 2.5 times that of single image pair matching in method ② independent image pair strategy. Method ② took about 1.6 times as long as method ③ for four independent matches. Therefore, method ③ not only realizes the geometric constraints of the object, but also greatly improves the calculation efficiency compared with the two methods. Figure 6 shows the cities and industrial areas of Pleiades image area 1 with dense buildings and easy to be blocked. Figure 7 and figure 8 show the weak texture areas of Pleiades image areas 2 and 3 in the shadow of mountains. Table 3 shows the statistical results of DSM accuracy indicators corresponding to these three special areas. It can be seen that in addition to the STDEV index of area 1, the methods ① and ③ containing the geometric constraint function of the object are higher than those of method ② , especially the decimetre accuracy improvement in weak texture areas 2 and 3, which fully shows the importance of the geometric constraint of the object. Comparing methods ① and ③, it can be seen that except that STDEV index method ③ in area 1 is lower than method ①, other index methods ③ are better than method ①. Therefore, compared with method ①, method ③ not only has the advantage of computational efficiency, but also has better accuracy in occlusion, weak texture and other areas. Fig. 9 is the DSM error diagram of method ③ in three regions, which shows that high-quality DSM extraction can still be carried out in these difficult regions. Fig.  10 is a comparison of DSM obtained by three matching methods of building occlusion area. It can be seen that method ① there are a large number of mismatches on the left side of the building due to the lack of occlusion detection function. Method ② although occlusion detection can be carried out, it does not include object geometric constraints, so it is rough. Method ③ has the function of geometric constraint and occlusion detection, so the effect is significantly better than methods ① and ②. Figure 11 is a comparison of DSM obtained by three matching methods in weak texture region. It can be seen that the results of methods ① and ③ are significantly better than those of method ②, which once again reflects the importance of geometric constraints.

Figure 6
Image of artificial building area in Pleiades area 1.

Figure 7
Weak texture area of Pleiades area 2.

Figure 11
Comparison of DSM obtained by three methods in weak texture area (the upper left is the reference image, and the upper right, lower left and lower right correspond to methods ①, ② and ③ respectively) .

CONCLUSION
In this paper, a method of generating triepipolar image based on projection connection points is proposed. Compared with the original image, the accuracy loss can be controlled in a very small order of magnitude. In stereo matching, based on an epipolar image pair, a fast coordinate transformation model between the current matching point pair and the corresponding point of the third epipolar image is proposed, which realizes the combination of epipolar image pair matching strategy and object geometric constraints. Compared with the object geometric constraints method based on object projection transformation, the calculation efficiency is significantly improved. The fast conversion from matching parallax map or elevation map to ground point cloud data is realized by fitting algorithm. The third matching is carried out on the basis of twoway matching to detect the occluded area. The occluded area adopts the smallest of the three matching costs, which can realize the occluded area detection and repair function of the independent core line image pair method, but the overall calculation efficiency is greatly improved. In the dense building area and weak texture area, the accuracy of the matching algorithm is obviously better than the object geometric constraint method and the independent epipolar image pair method.