EPIPOLAR LINE-BASED LATERAL VIBRATION MEASUREMENT BY USING TWO CAMERAS

Vibration measurement techniques can be categorized into contact-type and non-contact-type techniques. Both types can add mass loading to a lightweight structure and degrade its performance, because sensors, high-contrast speckles, or targets must be mounted on the structure. Moreover, non-contact-type vibration measurement techniques have only been tested with a single camera. However, the vibrations occurring at opposite sides of a rotating structure in a region of interest (ROI) can differ from each other. For 3-dimensional (3D) vibration measurement, the same position must therefore be used in the videos acquired from two cameras, because videos acquired by two cameras placed perpendicular to the structure can be used to detect vibrations in the x-direction as well as the y-direction. In this study, epipolar line-based corresponding point selection on a rotating cylindrical structure was performed to extract the same ROIs from videos recorded by two cameras. A fundamental matrix was constructed by using the targets attached to the structure and in the background. The coordinates of the mid-pixel of the ROI in the video acquired by one camera were used to determine the epipolar line for the same ROI in the video acquired by the other camera. Then an edge-based vibration measurement technique was applied to measure the vibration in the extracted ROIs. The results were used to reconstruct a 3D vibration signal. The 3D vibration measurement results can be used to effectively recognize the deformations that degrade the performance of a structure.


INTRODUCTION
The vibration measurement of a structure is one of the most extensively used techniques for structural health monitoring, mode characterization, and updating of structures (Doebling and Farrar, 1998). The current techniques can be categorized into contact-type and non-contact-type vibration measurement techniques (Shang and Shen, 2018). Contact-type sensors such as gap sensors or accelerometers must be mounted on a structure of interest to observe the amplitudes and frequencies of structural vibrations (Sohn et al., 2002). Although these sensors can provide accurate and sensitive vibration measurement results, they are discrete sensors and can provide vibration results only in one direction at the installed location. Furthermore, these types of sensors can add mass loading to a lightweight structure and degrade its performance (Nassif et al., 2005). Their installation can become costly and difficult if a structure is large or already placed in a hazardous area. Recently, non-contact-type vibration measurement techniques such as digital image correlation (DIC) (Yu and Pan, 2017; Huňady et al., 2019; Cuadrado et al., 2020) and point tracking techniques (Wang et al., 2019; Gwashavanhu et al., 2016; Baqersad et al., 2015) have been developed for the vibration measurement of various types of structures. However, high-contrast speckles or targets must be mounted on the surface of a structure before applying these techniques. This increases the surface preparation and target installation requirements, particularly when a region of interest (ROI) is large or unreachable.
Lately, target-less photogrammetric approaches have been introduced to detect and track internal features such as the edges of a structure for vibration measurement, because of their low computational cost and minimal requirements for additional preparation of structures (Ji and Chang, 2008; Cigada et al., 2014; Shan et al., 2015; Son et al., 2015; Son et al., 2021). However, vibration measurement results obtained by edge detection and tracking can be affected by an uneven background or brightness in a video. This problem was addressed by Javed et al. (2022), who used a subpixel-based edge detection method and introduced an edge tracking technique that combines frame-to-frame comparison with pixel-to-pixel comparison to remove the falsely or closely detected edges caused by an uneven background or brightness in a video. However, these techniques have only been tested with a single camera, whereas the vibrations occurring at opposite sides of a rotating structure in an ROI can differ from each other. For 3-dimensional (3D) vibration measurement, the same position in the videos acquired from two cameras should be used to measure the vibration. Because two cameras can image a rotating structure from multiple sides, the vibration on the rotating axis of a structure can be detected and monitored on the x-axis as well as on the y-axis in ROIs around the structure. These vibration signals can be used to reconstruct the 3D vibration measurement results, which can effectively reveal the deformations that are causing a structure to work abnormally.
In this study, we combined the edge-based vibration measurement technique introduced in our previous work (Javed et al., 2022) with epipolar line-based corresponding point selection, in order to detect vibrations in two directions in the same ROI by using the videos acquired by two cameras that recorded a rotating cylindrical structure from two different sides. The epipolar line-based ROI selection technique computes the epipolar lines for stereo images based on epipolar geometry. For the accuracy assessment, a simulation video was created by adding multiple sinusoidal vibrations to an image of a structure acquired by a single camera, and the result of the edge-based vibration measurement technique on this simulation dataset was compared with the reference data. Afterward, for the experiments on the real dataset, two videos were acquired by using two cameras that recorded a rotating cylindrical structure from two sides. Firstly, to extract the same ROI in both videos, epipolar line-based corresponding point selection was performed. Then the edge-based vibration measurement technique was applied to detect the vibrations in the extracted ROI in both videos. The results were then used to construct a 3D vibration measurement result.

METHODOLOGY
The method used in this study consists of two techniques: epipolar line-based ROI selection and edge-based vibration detection. Firstly, epipolar line-based ROI selection was performed to extract the same ROIs in both videos; then the edge-based vibration detection technique was applied to measure the vibrations in the extracted ROIs. Both techniques are explained in the subsections below.

Epipolar Line-based ROI Selection
Epipolar line-based corresponding point selection was performed by using the videos acquired by two cameras (i.e., Camera A and Camera B) imaging a structure from different sides, for the vibration measurement of a rotating cylindrical structure. The epipolar line-based ROI selection technique computes the epipolar lines for stereo images based on epipolar geometry. Epipolar geometry is the geometry of stereo vision: when two cameras view a 3D scene from two different positions, there are several geometric relations between the 3D points and their projections onto the 2D images, and these relations lead to constraints between the image points. If the cameras are uncalibrated, a fundamental matrix should be estimated from image correspondences, which are used to determine the projective 3D view of the imaged scene. The computation of the fundamental matrix uses the least median of squares method to find the inliers.
In this study, firstly, the fundamental matrix was estimated by using the targets attached to the structure and in the background. Then an ROI was selected in the first frame of the video acquired by Camera A, and the coordinates of the mid-pixel in that ROI were extracted. Afterward, these mid-pixel coordinates were used to compute the corresponding epipolar line in a frame from Camera B. This epipolar line was used to obtain the corresponding mid-pixel coordinates of the same ROI in Camera B.
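The mapping from a mid-pixel in Camera A to an epipolar line in Camera B can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the function names, the example matrix `F_example`, and the pixel coordinates are assumptions. The fundamental matrix itself would be estimated beforehand from the target correspondences; OpenCV's `cv2.findFundamentalMat` with the `cv2.FM_LMEDS` flag is one available least-median-of-squares estimator, although the text does not state which implementation was used.

```python
import numpy as np

def epipolar_line(F, point_a):
    """Return the epipolar line (a, b, c) in image B for a pixel in image A.

    The line satisfies a*x + b*y + c = 0 and is scaled so that
    a**2 + b**2 == 1, which makes point-line distances come out in pixels.
    """
    x = np.array([point_a[0], point_a[1], 1.0])   # homogeneous pixel
    line = np.asarray(F, dtype=float) @ x          # l' = F x
    return line / np.hypot(line[0], line[1])

def distance_to_line(line, point_b):
    """Perpendicular distance (pixels) from a pixel in image B to the line."""
    a, b, c = line
    return abs(a * point_b[0] + b * point_b[1] + c)

# Hypothetical fundamental matrix of an ideal, horizontally shifted stereo
# pair; in this special case every epipolar line is the image row y' = y.
F_example = np.array([[0.0, 0.0, 0.0],
                      [0.0, 0.0, -1.0],
                      [0.0, 1.0, 0.0]])

roi_mid_a = (120.0, 45.0)                      # assumed ROI mid-pixel in Camera A
line_b = epipolar_line(F_example, roi_mid_a)   # line to search in Camera B
```

A candidate mid-pixel in Camera B can then be accepted when `distance_to_line(line_b, candidate)` is close to zero, for example where the line crosses the edge of the structure.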

Edge-based Vibration Measurement
For the vibration measurement in the ROIs in the videos from Camera A and Camera B, the edge-based vibration measurement technique introduced in Javed et al. (2022) was used. This technique consists of three steps: frame magnification, subpixel-based edge detection (Trujillo-Pino et al., 2013), and edge tracking.
The vibration measurement results computed by a target-less photogrammetric approach such as edge detection and tracking can be affected by an uneven background and brightness in a video. Therefore, to remove the closely detected edges from the ROI around a structure, frame magnification was performed by using bicubic interpolation. Bicubic interpolation inserts empty positions into the given image and fills each of them from the 16 adjacent pixels (a 4×4 neighbourhood), weighting every pixel according to its distance from the point of interest. The smaller the distance, the greater the influence of that pixel on the selected point. The size of the output image depends on the interpolation ratio.
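The distance-dependent weighting of the 16 neighbours can be illustrated with the Keys cubic convolution kernel, a common form of bicubic interpolation. This is only a sketch under assumptions: the text does not state which kernel or library was used, and the kernel parameter `a = -0.5` and the function names are illustrative choices.

```python
import numpy as np

def cubic_kernel(d, a=-0.5):
    """Keys cubic convolution kernel evaluated at distance d (in pixels)."""
    d = np.abs(np.asarray(d, dtype=float))
    near = (a + 2) * d**3 - (a + 3) * d**2 + 1           # used for d <= 1
    far = a * d**3 - 5 * a * d**2 + 8 * a * d - 4 * a    # used for 1 < d < 2
    return np.where(d <= 1, near, np.where(d < 2, far, 0.0))

def bicubic_weights(tx, ty, a=-0.5):
    """4x4 weights of the 16 neighbours for a point at fractional offset (tx, ty)."""
    offsets = np.array([-1.0, 0.0, 1.0, 2.0])   # neighbour positions per axis
    wx = cubic_kernel(tx - offsets, a)
    wy = cubic_kernel(ty - offsets, a)
    return np.outer(wy, wx)

# New pixel halfway between existing samples, as occurs with 2x magnification.
weights = bicubic_weights(0.5, 0.5)
```

The weights sum to one, the four nearest pixels dominate, and the outer ring receives small negative weights, which is what gives bicubic interpolation sharper edges than bilinear interpolation.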
Then a subpixel-based edge detection technique was used to detect the edges in a frame at the subpixel level. To this end, the pixels belonging to the edges in a frame were determined by computing the gradient of the frame via partial derivatives. A threshold (Te) was then applied to the computed gradient image to differentiate between the pixels belonging to edges in a frame and the background. If a pixel value is less than or equal to Te in a frame in the gradient image, then it will be considered as a pixel belonging to the edge in the frame. For the subpixel position of edges, it was assumed that an edge can be represented by a straight line that divides a pixel or a region into two areas with different intensity values. If the coefficients of this straight line are calculated, the subpixel location of the edge can be extracted. To this end, a 5×3 window centered on a pixel of interest was considered, and the sum of each column was calculated. Moreover, the intensity values of the two areas divided by the edge were determined. All of this information was used to calculate the coefficients of the straight line. A detailed explanation of subpixel-based edge detection is given in Trujillo-Pino et al. (2013).
After the subpixel-based edge detection, an edge tracking technique that combines frame-to-frame comparison with pixel-to-pixel comparison was applied. This edge tracking technique removes the falsely detected edges from the ROI around a structure and detects the vibrations in the structure. For frame-to-frame comparison, the subpixel locations of the edges in an ROI were extracted on the x-axis in the previous frame, and they were averaged for the vibration measurement. Then the subpixel edge locations were extracted in the current frame, and the difference between the edge location in the current pixel of the current frame and the edge location in the same pixel of the previous frame was calculated. A threshold (Tf) was then applied: if the difference is less than or equal to Tf, the current edge location is considered an actual edge location; otherwise, it is considered a falsely detected edge and is removed from the ROI around the structure. For pixel-to-pixel comparison, the distance between the edge location in the previous pixel and each edge location in the current pixel of the current frame was calculated, and the edge with the minimum distance to the edge location in the previous pixel was selected as the actual edge.
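One possible reading of the two comparisons is sketched below. It is an interpretation rather than the implementation of Javed et al. (2022): the data layout (one list of candidate edge locations per pixel row of the ROI), the function names, and the handling of rows with no surviving candidate are all assumptions.

```python
import math

def track_edges(prev_edges, candidates, tf):
    """Pick one edge location per pixel row of the ROI in the current frame.

    prev_edges : accepted subpixel x-location per row in the previous frame
    candidates : list of candidate subpixel x-locations per row, current frame
    tf         : frame-to-frame threshold in pixels
    """
    tracked = []
    for prev, cands in zip(prev_edges, candidates):
        # Frame-to-frame comparison: keep candidates near the previous frame's edge.
        kept = [c for c in cands if abs(c - prev) <= tf]
        if not kept:
            tracked.append(math.nan)   # every candidate rejected as a false edge
            continue
        # Pixel-to-pixel comparison: prefer the candidate closest to the edge
        # accepted in the previous row (fall back to prev for the first row).
        valid = [e for e in tracked if not math.isnan(e)]
        anchor = valid[-1] if valid else prev
        tracked.append(min(kept, key=lambda c: abs(c - anchor)))
    return tracked

def mean_edge(edges):
    """Average the accepted edge locations to get one position per frame."""
    valid = [e for e in edges if not math.isnan(e)]
    return sum(valid) / len(valid) if valid else math.nan
```

Calling `track_edges` frame by frame and storing `mean_edge` of each result yields one displacement sample per frame, which is the vibration signal in pixels.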
For the vibration measurement of a structure, the aforementioned technique was applied to all the frames in a video of a rotating or vibrating structure one by one.

DATA ACQUISITION
To show the effectiveness of the proposed method, two datasets (i.e., simulation data and real data) were prepared. The simulation dataset was used to qualitatively and quantitatively show the performance of the edge-based vibration measurement technique. For 3D vibration measurement by using the proposed method, the real dataset was used.

Simulation Data
A simulation dataset was prepared by adding multiple sinusoidal (i.e., sine and cosine) distortions with different amplitudes and periods to an image of a cylindrical structure (shown in Figure 1), one distortion step per frame to be created. To this end, a video of 10 s with 600 frames and a frame rate of 59.94 frames per second (fps) was generated. Moreover, for the accuracy assessment, a reference vibration signal was created during the process of adding the distortions and creating the frames.
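A reference signal of this kind can be sketched as a sum of sine and cosine terms sampled once per frame. The frame count and frame rate come from the text; the amplitudes, periods, and function name below are hypothetical, since the values used for the simulation dataset are not given.

```python
import numpy as np

FPS = 59.94        # frame rate stated in the text
N_FRAMES = 600     # number of frames stated in the text

def reference_signal(components, n_frames=N_FRAMES, fps=FPS):
    """Sum sinusoidal displacements (pixels) sampled once per frame.

    components: iterable of (kind, amplitude_px, period_s), kind is "sin" or "cos".
    """
    t = np.arange(n_frames) / fps
    signal = np.zeros(n_frames)
    for kind, amplitude, period in components:
        wave = np.sin if kind == "sin" else np.cos
        signal += amplitude * wave(2 * np.pi * t / period)
    return t, signal

# Hypothetical amplitudes and periods, chosen only for illustration.
t, ref = reference_signal([("sin", 1.0, 0.5), ("cos", 0.4, 0.125)])
```

Each value of `ref` would then be the horizontal shift applied to the structure image when rendering the corresponding frame.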

Real Data
For the real dataset, two cameras (Camera A and Camera B) were placed perpendicular to the structure and recorded a rotating cylindrical structure from two different directions. Each camera recorded a video of around 14 s duration with a frame size of 2,160 × 3,840 pixels and a frame rate of 59.94 fps. A frame from the videos recorded by Camera A and Camera B is given in Figure 2(a) and 2(b), respectively.

Experiments on the Simulation Data
To show the effectiveness of the edge-based vibration measurement technique, the vibrations were detected in the simulation data. To this end, firstly, each frame was magnified to twice its original size; then the pixels belonging to edges in the frame were extracted with Te set to 16. Afterward, the subpixel locations of the edges were determined. An ROI of 20×20 pixels was selected, as shown in Figure 3.

Figure 3. A close-up view of the ROI selected for vibration measurement
An edge tracking technique was applied to measure and track the vibration in an ROI around the structure. During the edge tracking, Tf was set to 5 pixels for the frame-to-frame comparison. For the pixel-to-pixel comparison, the edge locations in the current pixel and the previous pixel were extracted, and the one with the minimum distance to the previous pixel was selected as the actual edge. Figure 4 shows the vibration measured in the simulation data video. Compared with the reference signal, the edge-based vibration measurement technique effectively detected the vibration signal in an ROI around the cylindrical structure in the simulation data. Moreover, the root mean square error (RMSE) between the reference signal and the detected vibration signal was 0.0358 pixels.
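For reference, the RMSE between two equally sampled signals can be computed as in the sketch below; the function name and argument order are illustrative.

```python
import numpy as np

def rmse(reference, measured):
    """Root mean square error between two signals of equal length (pixels)."""
    reference = np.asarray(reference, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.sqrt(np.mean((measured - reference) ** 2)))
```

Both inputs are assumed to hold one displacement value per frame, in pixels, so the result is in pixels as well.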
Frequency analysis based on the Fast Fourier Transform (FFT) was performed on the detected vibration signal and the reference signal. The FFT gives information about the frequencies at which vibrations are occurring in a structure. The frequency analysis result is given in Figure 5. Two peaks are visible in Figure 5, meaning that the vibrations are occurring at two different frequencies. Moreover, the peaks of the detected signal and the reference signal occurred at the same frequencies with similar amplitudes.
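A single-sided amplitude spectrum of a displacement signal can be obtained with NumPy as sketched here. The 4 Hz test tone and the function names are assumptions used only to exercise the code; the frame rate and frame count follow the simulation dataset.

```python
import numpy as np

def amplitude_spectrum(signal, fps):
    """Return (frequencies in Hz, single-sided amplitudes) of a real signal."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()               # drop the constant offset
    n = signal.size
    freqs = np.fft.rfftfreq(n, d=1.0 / fps)
    amps = 2.0 * np.abs(np.fft.rfft(signal)) / n
    return freqs, amps

def dominant_frequency(signal, fps):
    """Frequency (Hz) of the largest peak in the amplitude spectrum."""
    freqs, amps = amplitude_spectrum(signal, fps)
    return float(freqs[np.argmax(amps)])

fps = 59.94
t = np.arange(600) / fps
demo = 0.3 * np.sin(2 * np.pi * 4.0 * t)   # hypothetical 4 Hz test tone
peak_hz = dominant_frequency(demo, fps)
```

With 600 frames at 59.94 fps the frequency resolution is about 0.1 Hz, so peaks can only be located to within roughly that spacing.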

Figure 5. Frequency analysis of vibration signals
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France

Experiments on Real Data
For 3D vibration measurement, the real dataset containing two videos acquired by Camera A and Camera B was used. Firstly, epipolar line-based corresponding point selection was carried out. The fundamental matrix was estimated by using the targets attached to the structure and in the background. Then the matching points in both frames were extracted and outliers were removed, as shown in Figure 6. An ROI of 20×20 pixels around the edge of the rotating structure was selected in a frame from Camera A, and the coordinates of its mid-pixel were extracted. To extract the coordinates of the same position in Camera B, these mid-pixel coordinates were used to construct an epipolar line in a frame from Camera B. Figure 7 gives a close-up view of the ROIs selected in Camera A and Camera B with the epipolar line.
Figure 8 shows that the proposed method effectively detected the vibration in the two ROIs. Moreover, from the FFT (Figure 9) it can be seen that both vibrations are occurring at around 5.6 Hz, as shown by the peak in the graphs. However, the FFT also shows that there is some noise in the detected vibration signals. Therefore, before constructing a 3D vibration signal, the noise was reduced by magnifying the vibrations occurring between 5.2 Hz and 6.2 Hz in the videos through the phase-based motion magnification technique introduced by Wadhwa et al. (2013). Phase-based motion magnification amplifies the motions in a video within a specific frequency band, so that unwanted motions occurring at other frequencies become comparatively small. After motion magnification, the vibrations in both videos were detected again and frequency analysis was performed. Figure 10 and Figure 11 show the vibration signals detected after noise reduction and their frequency analysis results. From the FFT graphs, it can be seen that the noise in the vibration signals has been effectively removed and that the vibration signals are clearer than before.
Rotating structures can exhibit lateral vibrations with relatively large amplitude at a certain speed, and this large-amplitude vibration is called rotor whirl. Its diameter varies with the speed of rotation and the location of measurement along the shaft axis. Thus, the collection of rotor whirls can be expressed as mode shapes of the rotating structure. Therefore, for 3D vibration measurement, the two signals detected in the videos acquired by Camera A and Camera B were taken as the vibration of the structure in the x-direction and y-direction, respectively. The actual size of the rotating structure was 20 mm, which corresponded to 35 pixels in a frame. Using this ratio, we converted the vibration measurement results from pixels to mm. Then, by using these two signals, the 3D coordinates of the actual points in the object space were determined. Figure 12 shows the vibration trajectory at the extracted ROI on the rotating structure. It can be seen that the structure is vibrating in a circular motion with a diameter of around 0.167 mm.
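The unit conversion and the whirl diameter can be sketched as follows. The scale of 20 mm per 35 pixels (about 0.571 mm per pixel) is taken from the text; applying the same scale to both cameras, treating the two signals as synchronized, and estimating the diameter as twice the mean orbit radius are assumptions of this sketch, as are the function names and the synthetic input.

```python
import numpy as np

MM_PER_PIXEL = 20.0 / 35.0   # 20 mm structure spans 35 pixels (from the text)

def orbit_mm(x_px, y_px, mm_per_pixel=MM_PER_PIXEL):
    """Convert two pixel displacement signals to a centred x-y orbit in mm."""
    x = (np.asarray(x_px, dtype=float) - np.mean(x_px)) * mm_per_pixel
    y = (np.asarray(y_px, dtype=float) - np.mean(y_px)) * mm_per_pixel
    return x, y

def orbit_diameter(x_mm, y_mm):
    """Twice the mean radius of the orbit; exact for a circular whirl."""
    return float(2.0 * np.mean(np.hypot(x_mm, y_mm)))

# Synthetic check with a hypothetical circular whirl of radius 0.15 pixels.
phase = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
x_mm, y_mm = orbit_mm(0.15 * np.cos(phase), 0.15 * np.sin(phase))
diameter = orbit_diameter(x_mm, y_mm)
```

At this scale, the reported diameter of about 0.167 mm corresponds to roughly 0.29 pixels, which is why subpixel edge localization matters for this measurement.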

CONCLUSION
In this study, we have introduced a 3D vibration measurement technique for a rotating cylindrical structure that uses the videos acquired by two cameras placed perpendicular to the structure and imaging it from two different directions. To this end, epipolar line-based corresponding point selection was combined with an edge-based vibration measurement technique. The effectiveness of the edge-based vibration measurement technique was shown by experiments on the simulation data. Then, experiments were carried out on the real data. The same ROIs were extracted in both videos through the epipolar line-based corresponding point selection technique, and vibrations in the x-direction and y-direction were then detected in the two ROIs by the edge-based vibration measurement technique. The results showed that the proposed method can effectively extract corresponding ROIs even if a structure is recorded from two different directions, and that a 3D vibration signal can be effectively constructed. In the future, the results of the proposed method will be used to recognize the deformations that are causing a structure to work abnormally. Moreover, the proposed method will be tested on various types of structures such as buildings or bridges.