ON-SITE DATA-PROCESSING ALGORITHM AND OPTIMIZATION FOR AIRBORNE ICE SOUNDING RADAR CONFIGURED ON THE “SNOW EAGLE 601”

Airborne observation is an important approach for collecting data in remote, hostile Antarctica and for studying the relationship between Antarctica and global climate. During airborne observations, data processing and quality control must be conducted on site to evaluate the status of the airborne instruments in a timely manner, provide scientific clues, and plan subsequent airborne observations. As a critical airborne instrument, ice sounding radar can delineate sub-ice bedrock topography and internal layers, which cannot be achieved by other instruments. In this study, we present an on-site data processing algorithm that produces high-resolution, high signal-to-noise ratio (SNR) profiles from ice sounding radar data acquired by the “Snow Eagle 601”, the first fixed-wing airplane deployed by China for Antarctic expeditions. The algorithm is further optimized through static pre-allocation of memory, parallel computing, and block processing of data to increase processing speed and meet the requirements for quality control and analysis of on-site data. Finally, we test the optimized algorithm with different volumes of ice sounding radar data on different computer configurations, including Intel i7 and i5 CPUs with 8 GB and 16 GB of memory and the same disk. The results show that the average processing speed of the optimized algorithm is 5.143 times faster than that of the non-optimized algorithm on different computer configurations.


INTRODUCTION
In the context of global warming, it is critical to monitor and assess changes in polar ice sheets as well as their impact on global climate and sea level (Kennicutt et al., 2014). Because airborne platforms offer high data-acquisition efficiency and wide coverage, aerogeophysical surveys featuring synchronized observations by multiple scientific instruments have become an indispensable means of data collection in polar regions. In particular, airborne ice sounding radar, with its deep penetrating capability in ice, has been widely used to investigate the polar ice sheets since the 1960s (Waite and Schmidt, 1962). The electromagnetic wave in the very-high-frequency (VHF) band emitted by the ice sounding radar can penetrate to the bottom of the ice to identify the bedrock interface several kilometers beneath the ice sheet surface and obtain high-resolution distribution characteristics of ice layers. Hence, ice sounding radar is currently the most important method for measuring the ice thickness of the polar ice sheets and mapping subglacial topography, subglacial water, and internal snow and ice layers (Cui et al., 2009). Meanwhile, a high-resolution, high signal-to-noise ratio (SNR), and high-efficiency ice sounding radar data processing algorithm is needed to assist field data processing and quality control. The results help to assess whether the data are correct and reasonable, and thus to determine the actual performance and operating conditions of the airborne instruments in a timely manner, thereby providing a basis for possible instrument maintenance and flight planning. In 2015, China deployed the "Snow Eagle 601", a BT-67 airplane, for Antarctic expeditions; it carries an ice sounding radar similar to the High-Capability Radar Sounder (HiCARS) (Cui et al., 2018).
In the past five austral seasons, an international campaign has been launched to survey the East Antarctic Ice Sheet using the "Snow Eagle 601" (Cui et al., 2020a). An important prerequisite of the field work is that high-quality ice sounding radar profiles can be produced quickly and reliably on site. Therefore, an on-site, high-resolution, and high-SNR ice sounding radar data processing algorithm was developed to improve the imaging quality of the HiCARS data and meet the requirements of quality control (Cui et al., 2020b). Pulse compression, unfocused SAR implementation, and Doppler frequency analysis are all part of the on-site data processing (Peters et al., 2005; Cui et al., 2020b). Nonetheless, the massive amount of data acquired by the airborne ice sounding radar results in excessively long processing times, which delay the subsequent quality control and data analysis, as well as maintenance and adjustments to the following flight plans. A more efficient and adaptive ice sounding radar data processing algorithm is therefore urgently needed in the field. We thus propose a CPU-based parallel model of the ice sounding radar data processing algorithm. We rewrote the code in C, analysed the characteristics of the on-site data processing algorithm, and optimized its implementation through static pre-allocation of memory, parallel computing, and adaptive block processing to address the adaptability and performance problems. We tested the optimized algorithm with different volumes of ice sounding radar data on different computer configurations, including Intel i7 and i5 CPUs with 8 GB and 16 GB of memory and the same disk. The resulting radar profiles are exactly the same before and after optimization, and the average processing speed of the optimized program is 5.143 times faster than that of the non-optimized program on different computer configurations.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2021, XXIV ISPRS Congress (2021 edition)
In this study, we first briefly introduce the radar system configured on the "Snow Eagle 601" and the data acquisition, and then present the on-site processing algorithm developed for the radar data. We then describe the optimization of the algorithm in detail and, finally, report the test results.

RADAR SYSTEM AND DATA ACQUISITION
All data used in this study were acquired by the "Snow Eagle 601" airborne platform operated by the Polar Research Institute of China for the Chinese National Antarctic Research Expedition (CHINARE). The airborne ice sounding radar system is functionally similar to the HiCARS system, a phase-coherent radar system developed by the University of Texas Institute for Geophysics (UTIG). In the past decades, HiCARS has been widely used to measure the ice sheet in Antarctica (Blankenship et al., 2016, 2017; Young et al., 2011). Using a double flat-plate dipole antenna with a peak power of 8 kW and a center frequency of 60 MHz, the radar system can penetrate the Antarctic ice sheet to a depth of several kilometers while maintaining a relatively high vertical resolution to characterize the internal ice layers (Cui et al., 2018). Moreover, the radar system has two gain channels, i.e., a high-gain and a low-gain channel. The high-gain channel is mainly used for detecting the bed topography, while the low-gain channel detects shallow internal ice layers. During data acquisition, every 32 traces of the ice sounding radar data were stacked along the azimuth direction, reducing the data-acquisition frequency from 6250 to 195 Hz, which not only ensured the SNR of the data along the azimuth direction but also greatly reduced the storage space required for the ice sounding radar data. Data measured during each flight are divided into different PSTs (Project/Set/Transects) to facilitate data classification and management, thereby improving the efficiency of extraction, analysis, and interpretation of the airborne observation data.
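The 32-fold stacking during acquisition can be illustrated with a minimal sketch. It assumes that stacking means sample-by-sample averaging of consecutive traces; the source does not state whether the hardware sums or averages, and the function name `stack_traces` is hypothetical.

```python
def stack_traces(traces, n_stack=32):
    """Average every n_stack consecutive traces sample by sample.

    Assumption: 'stacking' is taken to be an average; a plain sum would
    differ only by the constant factor n_stack.
    """
    stacked = []
    for start in range(0, len(traces) - n_stack + 1, n_stack):
        group = traces[start:start + n_stack]
        # zip(*group) walks through the group one range sample at a time
        stacked.append([sum(column) / n_stack for column in zip(*group)])
    return stacked


# Trace rate after stacking 32 traces acquired at 6250 Hz
stacked_rate_hz = 6250 / 32
```

With 6250 traces per second at the input, one stacked trace is produced for every 32 input traces, which gives the roughly 195 Hz quoted above.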

ON-SITE DATA PROCESSING ALGORITHM
To meet the needs of data quality control, data analysis, and interpretation, high-quality radar echograms are needed. An on-site data processing algorithm was developed to improve the clarity of the radar data profile and reduce the interference of various types of noise. The data processing algorithm includes down-conversion, removal of direct current (DC) offsets, pulse compression, coherent stacking, and incoherent stacking. After data processing, the sharpness and SNR of the radar data are improved, as shown in Figure 1. The algorithm makes the internal reflection horizons (IRHs), bed topography, and other geometric features easier to identify.

Down-conversion
Because the subsequent signal-processing steps must be performed at baseband, the observed data were converted to the baseband frequency through down-conversion. Quadrature (orthogonal) detection was performed on the sampled signal to extract its amplitude and phase information while simultaneously down-converting it to the video signal. Traditional quadrature detection is performed by a phase detector, which requires the receiver to be equipped with an analog mixer and a low-pass filter (LPF). The two gain channels require a total of four detectors, which increases the complexity of the receiver and the system cost. Moreover, the analog circuit is greatly affected by differences among the devices. In addition, the phase imbalance between the in-phase and quadrature (IQ) channels caused by temperature drift produces an image component that is difficult to eliminate, degrading signal-processing performance. With the development of high-speed analog-to-digital converter (ADC) technology, intermediate-frequency (IF) sampling can be applied directly to this ice sounding radar system to solve the problems caused by analog quadrature phase detection. IF sampling with the low-pass filtering method directly mixes the sampled signal with a 70-MHz local oscillator (LO) signal to remove the carrier frequency, and then uses the LPF to filter out the upper sideband signal and higher harmonic signals generated by frequency mixing. A 128-order LPF, implemented as a finite impulse response (FIR) filter with a set cutoff frequency, was used to perform quadrature detection and down-conversion.
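A minimal sketch of digital quadrature down-conversion is given here for illustration. It is not the authors' implementation: the Hamming-windowed sinc design, the cutoff value, and the function names `lowpass_fir` and `down_convert` are assumptions, and frequencies are expressed in cycles per sample because the sampling rate is not stated in the text. Only the filter order (128) is taken from the source.

```python
import cmath
import math


def lowpass_fir(order, cutoff):
    """Hamming-windowed sinc low-pass taps (order + 1 taps, order even).

    cutoff is in cycles per sample (0 to 0.5). The window choice is an
    assumption; the source only specifies a 128-order FIR low-pass filter.
    """
    mid = order / 2
    taps = []
    for k in range(order + 1):
        x = k - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / order)
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # normalise to unity gain at DC


def down_convert(samples, f_carrier, taps):
    """Mix real samples with a complex LO, then low-pass filter.

    Returns complex I/Q samples (valid part of the convolution only).
    f_carrier is in cycles per sample.
    """
    mixed = [s * cmath.exp(-2j * math.pi * f_carrier * n)
             for n, s in enumerate(samples)]
    n_taps = len(taps)
    return [sum(taps[k] * mixed[n - k] for k in range(n_taps))
            for n in range(n_taps - 1, len(mixed))]


# Example: a real tone of amplitude 2 and phase 0.3 rad at 0.2 cycles/sample
tone = [2.0 * math.cos(2 * math.pi * 0.2 * n + 0.3) for n in range(400)]
iq = down_convert(tone, 0.2, lowpass_fir(128, 0.05))
```

Mixing splits the real tone into a baseband term and a term at twice the carrier frequency; the low-pass filter removes the latter, so each output sample carries half the input amplitude together with the original phase.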

Removal of DC Offset
DC offset refers to the radio-frequency (RF) signal that is leaked to the antenna and radiated through the RF switch and power line when the ice sounding radar transmitter is off. The generated RF signal is superimposed on the useful signals in the mixer. To remove the DC offset, the average noise level in each trace is estimated first, and the estimated value is then subtracted from the echo signal. Removing the DC offset shifted the overall waveform downward by approximately 500 dB, centering the waveform on the zero level and suppressing the interference of the DC offset near the null point on the overall signal.
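The estimate-then-subtract step can be sketched as follows. The choice of which samples count as noise only is an assumption (here a caller-supplied index window), since the text does not say how the noise portion of each trace is selected; `remove_dc_offset` is a hypothetical name.

```python
def remove_dc_offset(trace, noise_window):
    """Subtract the mean level of a noise-only window from a whole trace.

    noise_window is a (start, stop) index pair marking samples assumed to
    contain no echo, for example samples recorded before the surface return.
    """
    start, stop = noise_window
    noise = trace[start:stop]
    offset = sum(noise) / len(noise)
    return [sample - offset for sample in trace]


# Example: the first four samples are treated as noise around a level of 500
centred = remove_dc_offset([510, 490, 505, 495, 900, 100], (0, 4))
```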

Pulse Compression
Pulse compression improves the display of the ice surface and bedrock interfaces in the radar image and makes the subglacial topography in the acquisition area clearer, providing the high range resolution of the ice sounding radar profile required by quality control. Generally, the SNR of the received signal is increased by lengthening the transmitted signal to obtain sufficiently accurate target parameters; however, the extended signal is then longer than the signal length required by the range resolution, so echoes from closely spaced interfaces overlap and cannot be separated in the image. The pulse compression technique convolves the echo signal with the matched filter to obtain short pulses that meet the resolution requirements, thereby improving the display of the ice surface and bedrock interface in the ice sounding radar profile.
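As an illustration of matched filtering, the sketch below correlates an echo with a reference pulse, which is equivalent to convolving with the time-reversed complex conjugate of the reference. The baseband linear FM reference, its length and bandwidth, and the function names are assumptions chosen for the example; the actual transmitted waveform parameters are not given in the text.

```python
import cmath
import math


def chirp(n_samples, bandwidth):
    """Baseband linear FM pulse; bandwidth is in cycles per sample."""
    rate = bandwidth / n_samples
    centre = n_samples / 2
    return [cmath.exp(1j * math.pi * rate * (n - centre) ** 2)
            for n in range(n_samples)]


def pulse_compress(echo, reference):
    """Matched filter: slide the conjugated reference along the echo."""
    m = len(reference)
    return [sum(echo[n + k] * reference[k].conjugate() for k in range(m))
            for n in range(len(echo) - m + 1)]


# Example: one reflector whose echo starts 40 samples into a 200-sample trace
reference = chirp(64, 0.3)
echo = [0j] * 200
for k, value in enumerate(reference):
    echo[40 + k] = value
compressed = pulse_compress(echo, reference)
magnitudes = [abs(v) for v in compressed]
peak_index = magnitudes.index(max(magnitudes))
```

The 64-sample pulse collapses into a narrow peak located at the reflector delay, with a peak magnitude equal to the number of reference samples.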

Coherent stacking
Pulse compression effectively improved the range resolution of the ice sounding radar data. However, because of the low SNR, there is so much image noise in the ice sounding radar profile that the intermediate layers cannot be effectively recognized and the bedrock interface is blurred. Peters et al. (2005) phase-adjusted the echoes of a synthetic aperture ice sounding radar so that the echo signal was coherently superimposed over the aperture length of the focused synthetic aperture radar (SAR), which further increased the resolution of the data along the flight direction. Thus, the ice sounding radar data were further de-noised through coherent stacking; the number of signals that can be added before the first and last signals interfere destructively was calculated to obtain the maximum effective coherent stacking number. Coherent stacking can effectively reduce the noise of ice sounding radar data and improve the resolution of the interface and intermediate layers.
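Coherent stacking can be sketched as averaging complex traces so that phase is preserved: returns with a stable phase add up, while contributions with opposite phase cancel. This is a simplified illustration without the phase adjustment described by Peters et al. (2005); the group size and the name `coherent_stack` are assumptions.

```python
def coherent_stack(traces, n_stack):
    """Average groups of n_stack complex traces sample by sample.

    The complex values are summed before any magnitude is taken, so the
    phase of each sample takes part in the sum.
    """
    stacked = []
    for start in range(0, len(traces) - n_stack + 1, n_stack):
        group = traces[start:start + n_stack]
        stacked.append([sum(column) / n_stack for column in zip(*group)])
    return stacked


# A steady return of 1+1j with a disturbance that alternates in sign
noisy = [[1.5 + 1j], [0.5 + 1j], [1.5 + 1j], [0.5 + 1j]]
clean = coherent_stack(noisy, 4)
```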

Incoherent stacking
Coherent superposition effectively improves the SNR of the ice sounding radar data, but it also generates speckle noise in the ice sounding radar profile, which greatly interferes with the extraction and analysis of effective information in the echogram. Therefore, to remove speckle noise, incoherent stacking is required to eliminate some peaks and null values in the data.
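For contrast with coherent stacking, a minimal sketch of incoherent stacking is shown here. It assumes that incoherent stacking averages the power (squared magnitude) of adjacent traces, which is a common convention; the source does not give the exact formula, and `incoherent_stack` is a hypothetical name.

```python
def incoherent_stack(traces, n_stack):
    """Average the power |z|^2 of groups of n_stack complex traces.

    Phase is discarded before averaging, so speckle peaks and nulls in
    neighbouring traces are smoothed rather than reinforced or cancelled.
    """
    stacked = []
    for start in range(0, len(traces) - n_stack + 1, n_stack):
        group = traces[start:start + n_stack]
        stacked.append([sum(abs(z) ** 2 for z in column) / n_stack
                        for column in zip(*group)])
    return stacked


# Two traces whose first sample has opposite phase: the power is still kept
power = incoherent_stack([[1j, 2 + 0j], [-1j, 2 + 0j]], 2)
```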

OPTIMIZATION OF ON-SITE DATA PROCESSING ALGORITHM
The on-site data processing algorithm improves the imaging quality of the ice sounding radar data and meets the requirements of quality control. Nonetheless, the massive amount of data acquired by airborne ice sounding radar results in excessively long processing times, which delay the subsequent quality control and data analysis and thereby affect the formulation of follow-up flight plans, as well as the maintenance and adjustment of the observation equipment. The extreme cold in the field may also damage the computers in use. We optimized the implementation of the algorithm through static pre-allocation of memory, parallel computing, and block processing to address the adaptability and performance problems. Figure 2 shows the flowchart of the optimized on-site processing program for ice sounding radar data.

Static Pre-allocation of Memory
The on-site processing of the ice sounding radar data involves a large amount of matrix calculation, which requires many read and write operations by the central processing unit (CPU). When the CPU fetches data from memory, it also loads the data at contiguous physical addresses in the same memory block into the CPU cache. Because reads from the CPU cache are the most efficient, the program should keep the data to be processed in the cache as much as possible. There are two kinds of memory allocation: dynamic and static. Dynamically allocated memory can leave the physical addresses discontinuous, which makes it much harder to find the sequential ice sounding radar data to be processed in the cache. Moreover, the data may degrade to a lower priority once a read fails. In contrast, statically allocated memory places sequential ice sounding radar data at contiguous physical addresses, which significantly increases the cache hit rate when the CPU reads the next ice sounding radar data to be processed. Therefore, to improve the efficiency with which the CPU reads and writes ice sounding radar data, statically allocated memory was adopted in this study.

Figure 2. Flowchart of the optimized on-site processing program for ice sounding radar data.
Memory fragmentation can also affect the processing efficiency of ice sounding radar data. For example, suppose the memory is divided into ten address units. In the first time slice, the DC offset is removed. The ice sounding radar data before DC offset removal are stored in the first three units, and the average level of the noise is stored in units 4 and 5. After the DC offset is removed, the data are stored in units 6, 7, and 8, and the first five units are released. In the second time slice, pulse compression is conducted. The matched filter is read from the hard disk into the first three units, and the ice sounding radar data with the DC offset removed are then passed to the CPU for pulse compression. Although four units are free, no three contiguous units are available to store the pulse-compressed ice sounding radar data, so the CPU consumes a significant amount of resources reallocating memory addresses. Parallel computing greatly increases memory usage, and the memory fragmentation generated by high memory usage can significantly reduce memory usage efficiency. Hence, pre-allocated memory was used for the intermediate quantities to reduce memory fragmentation; that is, a fixed memory address was pre-allocated to each intermediate variable before processing the ice sounding radar data to ensure that all the intermediate quantities generated in each process had a corresponding memory address for storage. This approach avoided the memory fragmentation caused by repeated memory allocation and release, and effectively improved the processing efficiency of the parallelized ice sounding radar processing program.
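The allocate-once-and-reuse pattern can be sketched as follows. This is only an analogy: the authors' program is written in C, where static allocation controls the memory layout directly, whereas Python does not expose physical addresses. The `Workspace` class, its buffer names, and `remove_dc_into` are hypothetical.

```python
from array import array


class Workspace:
    """Intermediate buffers allocated once, before any trace is processed."""

    def __init__(self, n_samples):
        # One fixed, contiguous buffer of doubles per intermediate quantity
        self.raw = array("d", [0.0]) * n_samples
        self.no_dc = array("d", [0.0]) * n_samples


def remove_dc_into(ws, samples):
    """Remove the mean of a trace, writing into the pre-allocated buffers."""
    n = len(ws.raw)
    for i in range(n):
        ws.raw[i] = samples[i]
    offset = sum(ws.raw) / n
    for i in range(n):
        ws.no_dc[i] = ws.raw[i] - offset
    return ws.no_dc


# The same workspace serves every trace, so no buffer is allocated or
# released inside the processing loop.
ws = Workspace(4)
first = list(remove_dc_into(ws, [3.0, 5.0, 3.0, 5.0]))
second = list(remove_dc_into(ws, [1.0, 1.0, 1.0, 1.0]))
```

Because every intermediate result has a fixed home, the allocate and release cycle described in the ten-unit example never occurs during processing.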

Parallel Computing
The unmodified program processes each piece of data by looping over the algorithm. The CPU processes the data in order of acquisition: the next piece of data is processed only after the previous piece is finished. During this time only one processing core is occupied, leaving the other three processing cores idle, which wastes the computing power of the CPU. Thus, it is necessary to increase the utilization of the CPU processing cores and thereby the processing efficiency. Analysis of the on-site processing algorithm showed that the result for each piece of ice sounding radar data depends only on the original data, so different pieces can be processed independently. Therefore, we increased the utilization of the CPU processing cores by parallelizing the on-site processing program, thereby improving its processing efficiency. Because matrix computation accounts for over 90% of the total calculation, the ice sounding radar data processing algorithm is a compute-intensive task. Because compute-intensive tasks rely mainly on the computing power of the CPU, the serial algorithm should be parallelized with no more threads than are needed to occupy the cores, which reduces the time the system spends scheduling and switching between tasks. Therefore, thread counts exceeding the number of CPU cores were not considered. Moreover, to guarantee the highest efficiency of the CPU, one parallel thread was assigned to each CPU processing core, and a four-thread parallel on-site data-processing algorithm was adopted.
In this study, a parallel processing program for ice sounding radar data was designed. The program first read the total amount of ice sounding radar data, statically pre-allocated the memory for the four threads of data, divided the total amount of ice sounding radar data evenly according to the number of threads, and then simultaneously performed the reading and processing of the four threads of data until the tasks assigned to each thread were completed. Finally, the parallelized results were stored in the pre-allocated memory addresses to complete the data processing.
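The partitioning scheme described above can be sketched as follows. The sketch mirrors the structure only: in CPython, threads do not speed up CPU-bound work because of the global interpreter lock, unlike the threads of the authors' C program. The per-trace function is a placeholder, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(n_items, n_workers):
    """Return (start, stop) ranges that cover n_items as evenly as possible."""
    base, extra = divmod(n_items, n_workers)
    ranges, start = [], 0
    for worker in range(n_workers):
        stop = start + base + (1 if worker < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def process_trace(trace):
    """Placeholder for the per-trace processing chain (mean removal only)."""
    mean = sum(trace) / len(trace)
    return [sample - mean for sample in trace]


def process_parallel(traces, n_workers=4):
    """Process traces with one worker per contiguous share of the data."""
    results = [None] * len(traces)  # output slots reserved before threads start

    def work(bounds):
        start, stop = bounds
        for i in range(start, stop):
            results[i] = process_trace(traces[i])  # each slot has one writer

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(work, split_evenly(len(traces), n_workers)))
    return results
```

Because each trace depends only on its own input, the four shares never exchange data and the output is identical to that of the serial loop.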

Block Processing of Data
If the amount of ice sounding radar data processed at one time is too large or too small, the efficiency of the parallel processing program decreases. The parallel processing program consists of two parts, i.e., system scheduling and parallel data processing. Therefore, the total time consumed by the parallel processing program also consists of two parts, i.e., system scheduling time and parallel data-processing time. The system scheduling time is relatively fixed and consists mainly of the time taken by the main thread to start the four sub-threads. To improve the efficiency of the program after parallelization, the amount of data processed each time should therefore be as large as possible. Nevertheless, if the amount of ice sounding radar data for parallel processing is too large, the data no longer fit in memory and the CPU must read and calculate the data through the hard disk, which is the least efficient path. Therefore, to balance these constraints, it is important to block the data and find the optimal number of partitions. In this study, the optimal number of partitions was estimated by monitoring the hard disk usage rate in the ice sounding radar data-processing program. A significant increase in the hard disk usage rate usually indicates memory overflow, in which case the CPU cannot find the data to be processed in memory and can only read them through the hard disk. Determining the critical number of partitions that leads to memory overflow by monitoring the hard disk usage rate therefore yields the largest block of ice sounding radar data that can be processed while ensuring that no memory overflow occurs. Monitoring showed that the hard disk usage rate increased rapidly when the data were blocked in 1100 traces. Therefore, the on-site processing algorithm was the most efficient when each piece of ice sounding radar data was blocked in 1100 traces.
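Blocking a transect into fixed-size pieces can be sketched with a simple generator. The default of 1100 traces is the value reported above for the authors' test machines; on other hardware the threshold would need to be measured again. The name `iter_blocks` is hypothetical.

```python
def iter_blocks(n_traces, block_size=1100):
    """Yield (start, stop) trace index ranges of at most block_size traces."""
    for start in range(0, n_traces, block_size):
        yield start, min(start + block_size, n_traces)


# Example: a transect of 2500 traces is processed in three blocks
blocks = list(iter_blocks(2500))
```

Each block would then be handed to the parallel processing step in turn, so that only one block of data needs to be resident in memory at a time.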

Test of Optimized Algorithm
To verify the effect of the optimized algorithm, we tested it on different computer configurations. The benchmark results are shown as a series of graphs. Table 1 shows the average CPU usage rate before and after the optimization, and Table 2 shows the average memory usage rate before and after the optimization. Speed is measured in "MFLOPS", defined for a processing of size n as (5 n log2 n)/t, where t is the time in µs for one processing, not including one-time initialization costs. The optimized model improves the average usage of CPU and memory, and it is applicable to computers with different configurations. The optimized implementation is therefore effective, which is important for the subsequent quality control. The average processing speed of the optimized algorithm is 5.143 times faster than that of the non-optimized algorithm on different computer configurations (Figure 3).
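The speed metric defined above can be evaluated with a short helper. The function name and the example numbers are hypothetical and are not taken from the authors' measurements.

```python
import math


def mflops(n, t_us):
    """Speed in MFLOPS for a processing of size n taking t_us microseconds."""
    return 5 * n * math.log2(n) / t_us


# Example with made-up numbers: size 1024 processed in 512 microseconds
example_speed = mflops(1024, 512)
```

For n = 1024 and t = 512 µs, the numerator is 5 × 1024 × 10 = 51200, so the helper returns 100 MFLOPS.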

Figure 3. Processing of different amounts of data on computers with different configurations before and after optimization.

CONCLUSIONS
Airborne ice sounding radar provides the best way to map sub-ice properties in polar ice sheets, with the advantages of high data-acquisition efficiency and wide coverage. During an airborne survey, on-site data processing and quality control are necessary to assess whether the data are correct and reasonable, to evaluate the status of the airborne instruments in a timely manner, and to plan subsequent airborne observations. Therefore, an on-site ice sounding radar data processing algorithm is needed to assist field data processing and to generate high-SNR radar profiles efficiently for data quality control. In this work, we present an on-site data processing algorithm for the ice sounding radar configured on the "Snow Eagle 601", the first fixed-wing airplane deployed by China for the CHINARE. The algorithm includes down-conversion, removal of direct current (DC) offsets, pulse compression, coherent stacking, and incoherent stacking. Furthermore, the algorithm is optimized through static pre-allocation of memory, parallel computing, and block processing to improve on-site processing speed and field adaptability and to meet the requirements for quality control and analysis of on-site data. We tested the optimized algorithm with different volumes of ice sounding radar data on different computer configurations. While the ice sounding radar echograms remain exactly the same before and after optimization, the average processing speed of the optimized algorithm is 5.143 times faster than that of the non-optimized algorithm on different computer configurations. In the future, we will apply this optimization model to subsequent CHINARE campaigns.