THE PARALLEL IMPLEMENTATION OF ALGORITHMS FOR FINDING THE REFLECTION SYMMETRY OF THE BINARY IMAGES

In this paper, we investigate the exact method of searching for a symmetry axis of a binary image, based on a brute-force search among all potential symmetry axes. As a measure of symmetry, we use the set-theoretic Jaccard similarity applied to the two subsets of image pixels obtained by dividing the image with an axis. The brute-force search algorithm is guaranteed to find the axis of approximate symmetry, which can be considered the ground truth, but it requires a lot of time to process each image. As a first step of our contribution, we develop a parallel version of the brute-force algorithm. It allows us to process large image databases and obtain the desired axis of approximate symmetry for each shape in a database. Experimental studies carried out on the “Butterflies” and “Flavia” datasets have shown that the proposed algorithm takes several minutes per image to find a symmetry axis. However, real-world applications need computational efficiency that allows solving the symmetry axis search task in real or quasi-real time. So, for fast shape symmetry calculation on a common multicore PC, we elaborated another parallel program, which is based on the procedure suggested earlier in (Fedotova, 2016). That method takes as an initial axis the axis obtained by a superfast comparison of two skeleton primitive sub-chains. This process takes about 0.5 sec on a common PC, which is considerably faster than any of the optimized brute-force methods, including ones implemented on a supercomputer. In our experiments, in 70 percent of cases the found axis coincides exactly with the ground-truth one, and in the remaining cases it is very close to the ground truth.


INTRODUCTION
By analyzing binary images, it is easy to notice that the shapes of many objects, both artificial and natural, have the intrinsic property of reflection (bilateral) symmetry. Real-world images, however, are rarely perfectly reflection-symmetric. So, it is valuable to detect approximate reflection symmetry and to evaluate the symmetry measure of a shape (see Figure 1). Estimation of symmetry could be used in many computer vision applications, such as the analysis of plant growing conditions or of the bilateral symmetry of insects. The problem of approximate symmetry detection and symmetry measure calculation for binary images is well known, but there are only a few efficient methods for solving it. The main ones are based on: 1) Fourier series expansion of a parametric contour representation (van Otterloo, 1988), 2) contour representation by a turning function (Sheynin, 1999), 3) contour representation by critical points and computation of a similarity measure for two sub-contours via vectors of geodesic distances (Yang, 2008), 4) pair-wise comparison of sub-sequences of skeleton primitives (Kushnir, 2016). All of these methods exploit known algorithms for evaluating the dissimilarity (or similarity) of shapes.
In order to assess and compare the results of the above-mentioned efficient methods, the exact algorithm of reflection symmetry detection was proposed (Kushnir, 2016). It allows evaluating the ground-truth symmetry axis as well as the symmetry measure.
The exact algorithm is based on pair-wise complete enumeration of the points of the outer contour of a shape: lines are drawn through all possible pairs of points, and each line is considered as a potential symmetry axis as follows. A line divides a shape into two parts; each part is represented as a set of pixels. Similarity between the two sets is evaluated using the Jaccard measure (for binary sets also known as the Tanimoto measure):

$$\mu(B) = \frac{|S(B) \cap S(B_r)|}{|S(B) \cup S(B_r)|}, \qquad (1)$$

where B is the binary image, in which the brightness of black pixels is coded by 1 and that of white pixels by 0; B_r is the reflection of the binary image B relative to the straight line; and S(B) is the set of pixels of image B whose brightness code is equal to 1.
The line which divides a shape into the two most similar sets (with the greatest Jaccard measure among all candidate lines) is considered the sought-for symmetry axis. Unfortunately, the values of the Jaccard measure calculated for all pairs of shape contour points form an extremely complicated surface in the search space (see Figure 2), so we have to use the brute-force enumeration algorithm for the symmetry axis search. This algorithm is extremely time-consuming: it runs for dozens and even hundreds of minutes on an ordinary PC.
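The measure for a single candidate line can be sketched in C++ as follows. This is a minimal illustration rather than the authors' implementation: the `BinaryImage` structure, the function name `symmetry_measure`, and the rounding of reflected coordinates to the nearest pixel are all assumptions made here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Binary image in row-major order: 1 marks a shape (black) pixel, 0 the background.
struct BinaryImage {
    int width;
    int height;
    std::vector<unsigned char> pixels;

    bool is_set(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x] != 0;
    }
};

// Jaccard measure between S(B) and S(B_r), where B_r is B reflected about the
// line through (x1, y1) and (x2, y2). Reflected coordinates are rounded to the
// nearest pixel, which is an assumption of this sketch.
double symmetry_measure(const BinaryImage& image,
                        double x1, double y1, double x2, double y2) {
    const double dx = x2 - x1;
    const double dy = y2 - y1;
    const double length2 = dx * dx + dy * dy;
    assert(length2 > 0.0);  // the two points defining the line must differ

    std::set<std::pair<long, long>> original;
    std::set<std::pair<long, long>> reflected;
    for (int y = 0; y < image.height; ++y) {
        for (int x = 0; x < image.width; ++x) {
            if (!image.is_set(x, y)) continue;
            original.emplace(x, y);
            // Project the pixel onto the line, then mirror it through the foot point.
            const double t = ((x - x1) * dx + (y - y1) * dy) / length2;
            const double foot_x = x1 + t * dx;
            const double foot_y = y1 + t * dy;
            reflected.emplace(std::lround(2.0 * foot_x - x),
                              std::lround(2.0 * foot_y - y));
        }
    }

    // |intersection| / |union| of the two pixel sets.
    std::size_t common = 0;
    for (const auto& p : reflected) common += original.count(p);
    const std::size_t united = original.size() + reflected.size() - common;
    return united == 0 ? 0.0 : static_cast<double>(common) / united;
}
```

Comparing the whole shape with its reflection gives the same ratio as comparing one half with the mirrored other half, because both the intersection and the union are themselves mirror-symmetric about the line.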
For this reason, two accelerated versions of the algorithm were proposed in (Kushnir, 2016; Fedotova, 2016): the first is the optimization considering semi-perimeters of a shape, and the second is the optimization considering the center of mass of a shape. Although these optimizations are significantly less computationally expensive, their execution speed is not sufficient for processing large databases. They also require a correct set of input parameters to be chosen in order to achieve the most precise result, which makes the processing of large databases very complicated and laborious. In this paper, two parallel versions of the algorithms proposed in (Fedotova, 2016) are described. The first one is the parallel version of brute-force enumeration for a supercomputer system, which allows evaluating accurate symmetry measures and symmetry axes for large image databases within reasonable time. Such automatically labeled image databases could be used as ground truth in debugging and testing new fast procedures for symmetry axis search. The second algorithm is designed as the parallel version of one of such fast methods, the method of symmetry axis adjustment (Fedotova, 2016). This method takes as an initial axis the axis obtained by a superfast comparison procedure of two skeleton primitive sub-chains (Kushnir, 2016). The parallel version processes an image within sub-second time on a common PC with a multi-core processor. Such performance allows solving image analysis tasks in real-time conditions and getting the best or very close to the best result.

Parallel version of the reflection-symmetry axis detection algorithm for a supercomputer based on complete enumeration
The brute-force search algorithm always finds the best axis of symmetry, but its computational complexity depends quadratically on the number of points in the shape contour. It takes on average about two hours to process a single image with a resolution of 800 by 600 pixels on a normal laptop. We therefore decided to apply parallelization technologies in order to use significant computing power and achieve an appropriate evaluation time on large image datasets. In particular, we used the resources of the "Lomonosov" supercomputer complex (Sadovnichy, 2013; Voevodin, 2012) at Moscow State University (http://hpc.msu.ru/).
The serial implementation of the brute-force algorithm is based on complete enumeration of all lines that are candidates for the desired axis of symmetry:
1. The contour of the shape is represented as a sequence of its boundary points.
2. Take a point from the sequence.
3. Iterate through the remaining points; each of them, together with the point selected in step 2, determines a line. Calculate and store the value of the symmetry measure (1) relative to each resulting line. When all possible lines passing through the point chosen in step 2 have been examined, this point is excluded from further iteration.
4. Repeat steps 2 and 3 for the reduced sequence of contour points.
As a result, all possible lines crossing the shape will be examined. The sought-for symmetry axis is the line relative to which the maximum symmetry measure is obtained.
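The serial enumeration described above can be sketched as a double loop over contour points. The names `Point`, `AxisResult` and `brute_force_axis` are hypothetical, and the symmetry measure is passed in as a callable so that the sketch stays independent of any particular implementation of measure (1).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

struct Point { double x; double y; };

// Indices of the two contour points defining the best line, and its measure.
struct AxisResult { std::size_t i; std::size_t j; double measure; };

// Examines every pair (i, j), i < j, of contour points exactly once and keeps
// the line with the maximum symmetry measure.
AxisResult brute_force_axis(
        const std::vector<Point>& contour,
        const std::function<double(const Point&, const Point&)>& measure) {
    assert(contour.size() >= 2);
    AxisResult best{0, 1, -1.0};
    for (std::size_t i = 0; i + 1 < contour.size(); ++i) {       // step 2
        for (std::size_t j = i + 1; j < contour.size(); ++j) {   // step 3
            const double m = measure(contour[i], contour[j]);
            if (m > best.measure) best = AxisResult{i, j, m};
        }
        // Point i is not revisited: later iterations start from i + 1 (step 4).
    }
    return best;
}
```

For a contour of N points the inner body runs N(N - 1)/2 times, which is the quadratic cost mentioned in the text.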
It should be noted that the symmetry measure is calculated with respect to each line independently, which means that the algorithm exhibits data parallelism. This can be used in a natural way to develop a parallel version of the algorithm by distributing the handled lines among the application processes implementing the exact algorithm.
To distribute the complete set of L candidate lines among the P processes, a scheme based on the mod operation is used, where mod is the modulo operation, which gives the remainder after division of one integer by another.
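The distribution formula itself is not legible at this point in the text. One common scheme that is consistent with the mention of the mod operation is cyclic assignment, in which process p handles every line whose number l satisfies l mod P = p. The sketch below assumes that rule, as well as a particular numbering of the lines; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Number of candidate lines for a contour of n points: L = n(n - 1)/2.
std::size_t line_count(std::size_t n) { return n * (n - 1) / 2; }

// Maps a line number l in [0, L) to the pair (i, j), i < j, using the order
// (0,1), (0,2), ..., (0,n-1), (1,2), ... (an assumed numbering).
std::pair<std::size_t, std::size_t> line_to_pair(std::size_t l, std::size_t n) {
    assert(n >= 2 && l < line_count(n));
    std::size_t i = 0;
    std::size_t row = n - 1;  // number of lines that start at point i
    while (l >= row) { l -= row; ++i; --row; }
    return std::make_pair(i, i + 1 + l);
}

// Assumed cyclic rule: process p of P handles every line l with l mod P == p.
std::vector<std::size_t> lines_of_process(std::size_t p, std::size_t P,
                                          std::size_t L) {
    assert(P > 0 && p < P);
    std::vector<std::size_t> lines;
    for (std::size_t l = p; l < L; l += P) lines.push_back(l);
    return lines;
}
```

With such a rule the per-process workloads differ by at most one line, and no communication is needed to decide which process owns which line.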
The parallel implementation of the brute-force algorithm based on complete enumeration for P processes:
1. Each process defines a set of lines to handle according to the scheme described above.
2. Each process iterates through the lines from its set, calculates the symmetry measure for each line, and selects the line which gives the maximum value of the symmetry measure.
3. Each process transfers information about the selected line to the zero process.
4. The zero process receives information about the detected lines from all other P - 1 processes and selects among the P lines (including the one found by itself) the one which gives the maximum value of the symmetry measure.
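The four steps above amount to a local maximum search followed by a reduction at the zero process. The sketch below simulates the P processes sequentially in plain C++ so that it stays self-contained; in an MPI program the gathering step could be performed, for instance, with `MPI_Gather` or an `MPI_MAXLOC` reduction, although the text does not say which calls were actually used. The cyclic assignment of lines and all names are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

struct Candidate {
    std::size_t line;  // number of the best line, or L if the process had none
    double measure;    // its symmetry measure, or -1 if the process had none
};

// Steps 1-2 for process p: scan the lines l with l mod P == p (assumed rule)
// and keep the one with the maximum measure.
Candidate local_best(std::size_t p, std::size_t P, std::size_t L,
                     const std::function<double(std::size_t)>& measure_of_line) {
    assert(P > 0 && p < P);
    Candidate best{L, -1.0};
    for (std::size_t l = p; l < L; l += P) {
        const double m = measure_of_line(l);
        if (m > best.measure) best = Candidate{l, m};
    }
    return best;
}

// Steps 3-4: the zero process compares the P candidates it has gathered.
Candidate select_at_root(const std::vector<Candidate>& gathered) {
    assert(!gathered.empty());
    Candidate best = gathered.front();
    for (const Candidate& c : gathered)
        if (c.measure > best.measure) best = c;
    return best;
}

// Sequential stand-in for running the P processes and the final exchange.
Candidate simulated_parallel_search(
        std::size_t P, std::size_t L,
        const std::function<double(std::size_t)>& measure_of_line) {
    std::vector<Candidate> gathered;
    for (std::size_t p = 0; p < P; ++p)
        gathered.push_back(local_best(p, P, L, measure_of_line));
    return select_at_root(gathered);
}
```

Only one candidate per process is exchanged, so the communication volume is proportional to P and independent of the number of lines L.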
This construction guarantees that the line which gives the maximum value of the symmetry measure among the complete set of L lines will be found; this line is therefore the sought-for axis of symmetry. The algorithm was implemented in the C++ programming language using the MPI parallel programming technology (Michael, 2004) to organize data exchange between parallel processes. The choice of MPI is justified by the following factors: 1) the peculiarities of the algorithm, which contains many independent fragments requiring long calculations; 2) the peculiarities of the architecture of the computer system used: the "Lomonosov" MSU supercomputer complex has several thousands of eight-processor nodes. The developed implementation of the proposed parallel algorithm allows using the resources of the computer system in the most efficient way.
Section 3 shows the results of performance tests for the parallel implementation using the resources of the "Lomonosov" Moscow State University supercomputer complex. The parallel implementation reduces the symmetry axis search time to 1-5 minutes per image for images from the "Butterflies" database, compared with about 2 hours per image for the serial implementation. So, the obtained results show the efficiency of the proposed parallel method. It became possible to find symmetry axes for large image databases. For example, the entire Flavia database, which consists of 1907 binary images of plant leaves (Wu, 2007) with a resolution of 800 by 600 pixels, was processed in less than 9 hours using 1024 processes.
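As a rough worked check of the Flavia figures, the average wall-clock time per image on 1024 processes can be compared with an estimate of the total serial time. The serial estimate assumes that the figure of about two hours per 800 by 600 image also holds on average for this database; the symbols t_par and T_ser are introduced here only for this estimate.

```latex
\[
t_{\mathrm{par}} < \frac{9 \cdot 3600\ \mathrm{s}}{1907} \approx 17\ \mathrm{s\ per\ image},
\qquad
T_{\mathrm{ser}} \approx 1907 \cdot 2\ \mathrm{h} = 3814\ \mathrm{h} \approx 159\ \mathrm{days}.
\]
```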
However, work in production conditions requires speeds comparable to real time (tens of milliseconds in streaming video processing) or close to real time (hundreds of milliseconds for operations performed by industrial robots), and these speeds have to be achieved on personal computers. A parallel version of the fast symmetry axis search algorithm implemented for a personal computer is described below.

Parallel version of the algorithm based on adjustment of reflection symmetry axis found with fast numeric method implemented for a personal computer
A parallel implementation of the exact complete enumeration algorithm for symmetry measuring is not applicable to real-life decision-making problems, because it is impossible to involve a supercomputer in solving everyday tasks. So, another parallel implementation, for a conventional personal computer with a multicore processor, was developed. It is based on the numerical method which adjusts a reflection symmetry axis found by the fast skeleton primitive sub-chains comparison algorithm (Kushnir, 2016). This method, in general, gives an approximate solution, but has an extremely high operation speed.
Preliminary studies have shown that the axis found by the skeleton primitive sub-chains comparison algorithm usually gives a smaller value of the symmetry measure than the axis obtained by the exact complete enumeration algorithm, which always gives the maximum value of the symmetry measure for the same image. Nevertheless, the skeleton axis is located so that it crosses the contour of the shape in an ε-neighborhood of each of the points where the exact symmetry axis intersects the contour. Thus, the proposed approach is to refine the axis found by the skeleton method, i.e. to find a line in its neighborhood that gives a value of the symmetry measure larger than that of the axis being adjusted; in this way we improve the accuracy of the skeleton method. The skeleton axis being refined is called the seed axis, and any candidate axis examined during the search is called a probe axis. The symmetry axis necessarily crosses the contour of the object, so we consider only the boundary points of the shape to obtain probe axes. The boundary of the image is represented by a sequence of points numbered 0, ..., N - 1.
The results presented in (Fedotova, 2016) show that the second variant of the symmetry adjustment algorithm, based on the location of the center of mass of the figure, has the best timing and accuracy characteristics. The algorithm relies on the fact that the symmetry axis of a perfectly symmetric shape must pass through its center of mass.
As a rule, the axis of an approximately symmetric shape does not pass exactly through the center of mass, but through its neighborhood, which we consider to be a circle whose center coincides with the center of mass of the shape. The radius R of this circle is calculated as R = k_R D, where D is the distance from the center of mass to the outermost point of the contour and k_R is the coefficient of proximity to the center of mass. If there is a priori knowledge about the quality of the seed axis, it is enough to go through only those test axes that lie no further than the distance R, specified through the parameter k_R, from the center of mass. Only these axes cross the circle of radius R centered at the center of mass.
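The neighborhood test can be sketched as a point-to-line distance check against R = k_R D. The sketch assumes that the center of mass is the mean of the shape pixel coordinates; this choice and all function names are illustrative and not taken from the authors' code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Point { double x; double y; };

// Center of mass as the mean of the given pixel coordinates (an assumption).
Point center_of_mass(const std::vector<Point>& pixels) {
    assert(!pixels.empty());
    Point c{0.0, 0.0};
    for (const Point& p : pixels) { c.x += p.x; c.y += p.y; }
    c.x /= static_cast<double>(pixels.size());
    c.y /= static_cast<double>(pixels.size());
    return c;
}

// D: distance from the center of mass to the outermost contour point.
double farthest_contour_distance(const Point& c, const std::vector<Point>& contour) {
    double d = 0.0;
    for (const Point& p : contour) d = std::max(d, std::hypot(p.x - c.x, p.y - c.y));
    return d;
}

// Distance from point c to the infinite line through a and b.
double distance_to_line(const Point& c, const Point& a, const Point& b) {
    const double dx = b.x - a.x;
    const double dy = b.y - a.y;
    const double length = std::hypot(dx, dy);
    assert(length > 0.0);
    return std::fabs(dx * (a.y - c.y) - (a.x - c.x) * dy) / length;
}

// A probe line is kept only if it crosses the circle of radius R = k_R * D.
bool crosses_neighborhood(const Point& c, const Point& a, const Point& b,
                          double k_R, double D) {
    return distance_to_line(c, a, b) <= k_R * D;
}
```

The check costs a few arithmetic operations per probe line, so lines outside the circle can be discarded before the much more expensive symmetry measure is computed.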
This algorithm also has a certain amount of internal parallelism, connected with the search over all probe lines in the neighborhood and the calculation of the corresponding values of the symmetry measure.
These operations, as in the previous case, are independent and can be performed in parallel for different test lines. The use of parallel computations in this case leads to an additional increase in performance, which is the goal of this work.
The parallel version of the algorithm is intended for a conventional multi-core personal computer, which is why OpenMP (Michael, 2004) was chosen as the parallel programming technology. It is designed for shared memory systems and provides convenient tools for manipulating threads within a single application.
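The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W4, 2017. 2nd International ISPRS Workshop on PSBB, 15-17 May 2017, Moscow, Russia. This contribution has been peer-reviewed. doi:10.5194/isprs-archives-XLII-2-W4-179-2017

The algorithm for adjusting a symmetry axis found by the skeleton method, taking into account the center of mass of the figure, for T threads is as follows:
1. The seed axis is defined by two contour points p_1 and p_2.
2. Determine the points that lie in a predetermined ε-neighborhood of the first point: a = p_1 - ε, b = p_1 + ε. These two points (a and b) limit the finite set of contour points [a; b]. Determine the points that lie in a predetermined ε-neighborhood of the second point: c = p_2 - ε, d = p_2 + ε. These two points (c and d) limit the finite set of contour points [c; d].
3. On the segments [a; b] and [c; d] select the sets of equidistant points Q = {q_i} and S = {s_i}, i = 0, ..., n, where n is the number of segmentation parts and h is the stride of the partition, calculated as h = 2ε/n.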
4. Iterate through all pairs of points belonging to the sets Q and S obtained in step 3. This enumeration defines l = |Q| · |S| probe lines; their processing is distributed statically among T threads. Calculate the symmetry measure relative to each obtained probe axis located in the R-neighborhood of the center of mass, and store the one for which the maximum value of the symmetry measure is obtained.
5. Choose the line which gives the maximum value of the symmetry measure among the T lines reported by the threads.
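Steps 3 to 5 can be sketched with an OpenMP work-sharing loop. The sketch makes several assumptions that are not stated in the text: contour points are addressed by their indices 0, ..., N - 1 with wrap-around, ε is measured in index units, 2ε is divisible by n, and the supplied `measure` callable is thread-safe and returns a negative value for probe lines outside the R-neighborhood. All names are hypothetical. Compiled without OpenMP support the pragmas are ignored and the loop simply runs on one thread.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

struct Probe { long q; long s; double measure; };

// Step 3: n + 1 equidistant contour indices covering [center - eps; center + eps]
// with stride h = 2 * eps / n, wrapped modulo the contour length N.
std::vector<long> equidistant_points(long center, long eps, long n, long N) {
    assert(n > 0 && N > 0 && eps >= 0 && (2 * eps) % n == 0);
    const long h = 2 * eps / n;
    std::vector<long> points;
    for (long i = 0; i <= n; ++i) {
        long index = (center - eps + i * h) % N;
        if (index < 0) index += N;
        points.push_back(index);
    }
    return points;
}

// Steps 4-5: evaluate all |Q| * |S| probe lines, statically split among the
// threads, then pick the best of the per-thread maxima.
Probe best_probe(const std::vector<long>& Q, const std::vector<long>& S,
                 const std::function<double(long, long)>& measure) {
    const long cols = static_cast<long>(S.size());
    const long total = static_cast<long>(Q.size()) * cols;
    Probe best{-1, -1, -1.0};
    #pragma omp parallel
    {
        Probe local{-1, -1, -1.0};  // best probe line seen by this thread
        #pragma omp for schedule(static) nowait
        for (long k = 0; k < total; ++k) {
            const long q = Q[static_cast<std::size_t>(k / cols)];
            const long s = S[static_cast<std::size_t>(k % cols)];
            const double m = measure(q, s);
            if (m > local.measure) local = Probe{q, s, m};
        }
        #pragma omp critical
        {
            if (local.measure > best.measure) best = local;
        }
    }
    return best;
}
```

Keeping a thread-local maximum and merging once per thread avoids synchronization inside the loop; ties between equal measures may be resolved differently from run to run.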

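If the stride h is bigger than 1, the two contour points that define the line giving the maximum value of the symmetry measure are passed to step 2. Otherwise, if one of the points a, b, c or d belongs to the obtained line, this line is declared the seed one and its points p_1 and p_2 are passed to step 1; if not, this line is the sought-for symmetry axis.
The proposed algorithm finds an axis whose symmetry measure is not less than the measure obtained by the skeleton method.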
The proposed parallel version of the adjustment algorithm is implemented in C++ using OpenMP parallel programming technology.

EXPERIMENTAL STUDY
We carried out an experimental study of the proposed methods on the "Butterflies" database, which consists of 30 images with a resolution of 400 by 600 pixels (see Figure 3).

Figure 3. "Butterflies" database
The experiments with the exact brute-force algorithm were performed on the "Lomonosov" supercomputer at Moscow State University. Table 1 shows the results for all 30 images from the "Butterflies" dataset, namely, the values of the symmetry measure, the processing time of each image for the sequential and parallel versions, and the achieved speedup. The parallel version was examined on 64 processes. The implementation of complete enumeration algorithm allows parallelizing

CONCLUSION
The proposed parallel algorithm of exact search allows finding the reference symmetry axis within reasonable time using massive processing power (with the aid of a supercomputer).
Experimental studies carried out on the "Butterflies" and "Flavia" image datasets have shown that the proposed algorithm takes several minutes per image to find the symmetry axis. So, it is acceptable for automatic labeling of databases consisting of hundreds and thousands of images in order to debug and test new fast procedures for symmetry axis search.
However, real-world applications need computational efficiency that allows solving the symmetry axis search task in real time (dozens of milliseconds in video processing tasks) or quasi-real time (hundreds of milliseconds for industrial robots). Moreover, it is not possible to use supercomputers for solving everyday computer vision tasks. So, for fast symmetry calculation on a common multicore PC, we elaborated another parallel program, which is based on the procedure suggested earlier in (Fedotova, 2016). This method takes as an initial axis the axis obtained by a superfast comparison procedure of two skeleton primitive sub-chains. In this case we use the OpenMP technology, which makes it possible to find an axis that is close to the ground-truth axis of symmetry. This process takes about 0.5 sec on a common PC, which is considerably faster than any of the optimized exact brute-force methods, including ones implemented on a supercomputer.
According to Table 2, the adjustment algorithm finds the correct symmetry axis for 21 of the 30 images (green fill in the table).
For the other nine images the adjusted axes are very close to the reference ones (see Figure 4). An inexact final result is caused by a poor seed axis or by an overly complicated contour configuration in the search area (for example, the antennae of a butterfly).

Figure 1. Examples of shapes with the symmetry axis corresponding to the maximum symmetry measure according to the Jaccard similarity

Figure 2. An example of a binary image and the surface formed by the values of the Jaccard measure calculated for all N(N - 1)/2 pairs of contour points

(a and b) limit the finite set of contour points [a; b]. Determine the points that are in a predetermined ε-neighborhood of the second point: c = p_2 - ε, d = p_2 + ε. These two points (c and d) limit the finite set of contour points [c; d].
3. On the segments [a; b] and [c; d] select the sets of equidistant points
the line which gives the maximum value of the symmetry measure, are passed to step 2
d belongs to the obtained line, this line is declared the seed one and its points p_1 and p_2 are passed to step 1; otherwise this line is the sought-for symmetry axis.
The speedup obtained in our experiments is as much as 58 times, which is close to the theoretical estimate.

The average time over the test study shows an acceleration of 1.158 times for two threads.
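To put the reported accelerations in perspective, they can be converted into parallel efficiency, E = S / P, where S is the speedup and P is the number of processes or threads. This standard definition is not used explicitly in the text. The calculation below assumes that the 58-fold speedup refers to the runs on 64 processes and that the figure for two threads is 1.158 (read with a decimal comma).

```latex
\[
E_{\mathrm{MPI}} = \frac{58}{64} \approx 0.91,
\qquad
E_{\mathrm{OpenMP}} = \frac{1.158}{2} \approx 0.58.
\]
```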

Table 1. Results of sequential and parallel versions of the exact symmetry detection algorithm implemented on a supercomputer

The implementation of the parallel adjustment algorithm seeded by the skeleton procedure was tested on a PC (Intel® Core™ i7-2670QM CPU @ 2.2 GHz, 16 GB). Table 2 shows the results for all 30 images from the "Butterflies" database, namely, the reference values of the symmetry measure obtained by the exact brute-force algorithm, the values of the symmetry measure obtained by the adjustment algorithm, the processing time of each image for the sequential and parallel versions, and the achieved acceleration.

Table 2. Results of sequential and parallel versions of the symmetry detection algorithm implemented on a PC