TOWARDS DETECTION OF THERMAL ANOMALIES IN LARGE URBAN AREAS USING SIMULATION

Anomaly detection in imagery has widely been studied and enhanced towards the requirements of today’s available sensor data, whereas many of them require a background estimation in order to identify an anomaly or target. In this paper, we examine an analysis of simulation as background estimator for anomaly detection in thermal images of urban sceneries. We generate a surface temperature image and a sensor-like infrared image by combined image and elevation data and a thermal model suited for large scenes and fast simulation. With the simulated thermal image, we define anomalies as deviation between measurement and simulation. Pixel-wise image differencing of the measured and simulated temperatures and infrared images respectively are performed and evaluated concerning the full images as well as class-wise, including a material classification of the observed area. Our approach shows complementary results compared to RXD application on the measured infrared images. Metal roofs which appear warm in the thermal image and are not visually distinguishable from the residual image are detected.


INTRODUCTION AND PREVIOUS WORK
Anomaly detection algorithms have widely been studied and enhanced towards the requirements of today's available sensor data. The field of application thereof spans from civilian to military. Whilst the latter may deal with target detection in defense systems, e.g. small targets sensed by a shipborne warning system (Hui et al., 2016), or vessel detection (Islam et al., 2009), civilian applications include fire detection (Guo et al., 2016, Kato, Nakamura, 2017, earthquake analysis (Wu Lixin et al., 2006), urban heat islands (Sharma, Joshi, 2014), and many more. As much as the possible fields of application differ, as much differ the underlying detection techniques (Chandola et al., 2009). Many of them base upon machine learning, resemble classification approaches and try to define a normal behaviour in order to find data points deviating from it. When it comes to anomaly detection in images visualizing sensor data from other spectral ranges than the visible, statistics-based techniques appear to be the method of choice (Guo et al., 2016). The most commonly known statistical anomaly detector is the Reed-Xiaoli-Detector (RXD) (Reed, Yu, 1990), on the basis of which many further developments had been published until today, including modifications towards long-wave infrared (LWIR) imagery. The necessary assumption of the RXD is that an anomolous object appears rare in an image and shows a significant deviant signature from the background, which is assumed to follow a Gaussian distribution. However, in real data, the background does often not supply this condition due to a complex scene composition, shadows, reflections, or other clutter, which causes the RXD to result in a high false-alarm rate (Gao et al., 2014). Many RXD enhancements deal with refinement of either background or the target-background separation. The local-RXD (Gorelnik et al., 2010), for example, * Corresponding author only takes into consideration the nearest surrounding area of the pixel under test (PUT). The BACON algorithm (Billor et al., 2000) iteratively refines the background by representing it by a subset of pixels which undergoes an outlier detection procedure. The PAD method (Gao et al., 2014) evaluates a PUT not only in relation to the background pixels, but also in relation to the anomaly set of pixels to achieve a lower false-alarm rate. Separating target and background in feature space is used by the kernel-RXD (Heesung Kwon, Nasrabadi, 2005), which was also enhanced towards LWIR images (Mehmood, Nasrabadi, 2011). In the domain of LWIR imagery, concepts such as Mahalanobis distance (Islam et al., 2009) or visual saliency (Hui et al., 2016) were used similar to the visual domain to detect anomalies in a sea-sky-background. In quite unstructured environment, like a sea surface, it may appear feasible to accomplish the background estimation from random distributions (Yang et al., 2001), and the vast majority of unsupervised detection methods have in common that the background estimation is based on the image under test within which an anomaly would be detected. With usage of neural networks, background estimation based on a number of training data is enabled. Thereof deduced approaches may thus appear feasible in urban environments consisting of regions with piecewise constant geometrical (normal vector) and physical properties. (Khan et al., 2009) used a probabilistic neural network (PNN) as clutter rejector for background refinement. The PNN is trained by a set of IR images extracted from real IR video sequences, and a regionof-interest (ROI) selection on the image under test is carried out by a procedure of morphological operations. These ROI are then evaluated by the PNN as either clutter or target. A convolutional neural network was examined in (d 'Acremont et al., 2019) for target detection in mid-wave infrared images. The training data was supplied by simulation of IR images using OKTAL-SE as well as real IR video sequences. Both neural-network approaches require a preceeding target labeling in the training datasets, which makes them difficult employ if appropriate training data is not available or if a target can not be defined.
The unsupervised methods including the RXD and its derivatives share one drawback when it comes to LWIR imagery. Regarding a thermal image, i.e. a LWIR image, a conspicuous signature that falls into their detection range is mainly caused by a temperature difference. Depending on the surface, however, such a temperature difference might be hidden or falsely generated, i.e. by camouflage or polished metallic surfaces. These signatures will be missed or falsely classified. Furthermore, the anomaly detection schemes do not take into account the scenebased context of a thermal image, i.e. how the scene composition influences the temperature distribution. In this paper, we aim to tackle these issues. We present a novel approach for anomaly detection in LWIR imagery, where a synthetic thermal map of the observed scene is simulated. The synthetic thermal image represents the background and possible anomalies are found by comparison to the actual LWIR image under test. Our study focuses on the basic usage of simulation for anomaly detection, and outlines the similarities and differences to conventional anomaly detection by application of the RXD on LWIR images. To the best of our knowledge, simulation of thermal background for anomaly detection has not been studied before.
The rest of the paper is organized as follows. Section 2 introduces our methodology of simulated IR image generation and subsequent anomaly detection on real IR data. In section 3, our results evaluating our method on a dataset provided by the City of Melville (Council of City of Meville, 2017) are presented. Section 4 concludes with a short summary of our findings and an outlook to future work.

METHODOLOGY
In the first part of this section, we briefly describe how the data can be decomposed into infinitesimal patches (triangles) to which we assign important attributes, such as semantic class, material, and, finally, temperature. Next, we obtain the simulated thermal and infrared images. The final part presents our method for anomaly detection by combining simulated and real IR data, together with the following evaluation procedure.

Preprocessing
Several subsequent preprocessing steps had to be performed in order to apply our proposed simulation-based anomaly detection technique. The scene reconstruction and subsequent thermal simulation follow (Kottler et al., 2019) and (Bulatov et al., 2020). For the latter, actual environmental conditions as well as the temperature distribution at the beginning are required. The simulation model therein was chosen as it is easy to implement and performs fast. Focus was put on an average temperature distribution close to reality being calculated in short time rather than on high-precision simulation. In our opinion, this is sufficient and allows subsequent studies on the actually needed precision for our method.
2.1.1 Semantic triangle mesh generation: In order to perform the thermal simulation starting from raw sensor data, a semantic 3D model including the thermally relevant properties had to be established. Following the method of (Ilehag et al., 2018), a Random Forest Classifier (Breiman, 2001) is used to determine the occurring semantic structures and materials respectively. The selected classes are street, grass, soil, water, building with tile roofing, building with metal roofing, tree and forest boxes. While single trees with geo-specific 3D models (Bulatov et al., 2020) allow for a more precise simulation, forest boxes are derived from larger regions on the landcover map that have been assigned to the tree class. This allows to reduce the amount of polygons needed for representation and thus keeping low the overall simulation complexity. Triangulation and thermal property assignment are performed based on the material classification, resulting in the final semantic triangle mesh.

Thermal simulation
We used the thermal model of (Bulatov et al., 2020) to perform the heat simulation leading to the necessary surface temperatures. This model aims for fast simulation of larger scenes, and further focuses on surface temperatures. Each triangle in the reconstructed scene is converted into a narrow orthogonal prism in order to model heat transfer into the inner of an object. The heat equation for each prism is then given by assuming only orthogonal heat transfer. This is discretized in depth z, i.e. the prism is split in layers of depth d. As lower boundary condition, a fixed core temperature Tcore is assumed. The upper boundary condition is given by the heat exchange with the environment, including radiative and convective heat exchange. This yields where cv = specific heat capacity ρ = materials' density d = layer depth Ts = temperature r.h.s. = heat fluxes, as heat equation for the surface triangle. The heat flux consists of a convective term, where h1 = free convection heat coefficient h2 = forced convection heat coefficient v wind = average wind speed T air = average air temperature, a conductive term, where K = thermal conductivity T in = inner temperature and finally two radiative terms for solar (S) and atmospheric (R) heat exchange, The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2020, 2020 XXIV ISPRS Congress (2020 edition) where a = solar albedo Ic = clearness factor Esun = solar irradiation above atmosphere level θ = solar elevation angle γsun = occlusion factor where ε = thermal emissivity σ = Stefan-Boltzmann-constant γ sky = occlusion factor T sky = sky temperature.
The model does not consider latent heat or further heat sources in order to keep computational complexity low and to be easily adapted to other scenes. The material and semantic classification results of the preceding scene reconstruction is employed to assign generalized material parameters to each triangle of the scene mesh, including e.g. the thermal emissivity, heat conduction, or core temperature. Finally, the differential equation 2 together with suitable initial conditions is solved by Forward-Euler integration.

Generation and validation of thermal background for anomaly detection
Once the temperature has been assigned to every triangle of the mesh, a synthetic thermal image is rendered from the almost position as the input thermal image. In the two upcoming sections, the details of the rendering procedure (particularly focusing on the radiance image generation) and the evaluation strategy are provided.
2.2.1 Synthetic Thermal Map Generation: The semantic 3D model was orthogonally projected in order to get the simulated IR image of the scene. A simple Open GL renderer was used to get the visible triangles from an orthogonal view on the scene and to transfer the mesh of triangles to a 2D representation corresponding to the measured thermal map in pixel size and dimensions. Given an index image of visible triangles at each pixel, we directly establish the synthetic thermal map based on the thermal simulation results.
The resulting thermal map displays the simulated surface temperatures, however, an actually measured thermal image underlies certain restrictions due to its remote sensing character. This non-contact temperature measurement is based on an estimation from the emitted heat radiation of a surface. Poor heat radiators, e.g. polished metallic surfaces, therefore often result in false measurements. I.e. they appear as very cold objects even if their true temperature does not differ from their environment. We thus estimate the emitted radiance from the simulated temperatures in order to replicate this effect in the synthetic thermal image. From the material classification, the material-dependent heat emissivity is given and the heat radiance for each triangle can be derived as where σ = Stefan Boltzmann Constant T = temperature ε = thermal emissivity L = heat radiance Herein, we apply the well-known graybody-assumption. It implies that the surface is radiating uniformly in all directions independent on its temperature, and that the emissivity is independent on the wavelength.

Evaluation
Finally, we evaluate the synthetic thermal map in terms of average temperature difference, root mean squared error (RMSE), and mutual information (MI) as reference for our proposed anomaly detection method. While the average temperature difference as well as the RMSE have to be determined on the temperatures, MI can operate on the infrared images with different modalities and intensity ranges. Evolving from information theory, the MI of two images reflects their similarity such that given one random distribution, i.e. one image, it measures the uncertainty of a second distribution, i.e. the second image, (Viola, Wells, 1995). Zero MI therefore reflects independence of both images and the higher the MI value, the more similar the images.

Anomaly detection
Many different approaches on anomaly detection exist. To maintain our focus on the possibly usage analysis of simulation, we chose basic methods for image comparison and anomaly detection since their results are well comprehensible and traceable. In the following, the method of image differencing and our evaluation procedure are presented.
2.3.1 Image Differencing: In our approach, we define anomalies as differences between the simulated target-state and the measured actual state that deviate from the mean difference by a fixed threshold. Differences are calculated pixel-wise for the image pairs of measured and simulated thermal image and thermal map respectively. For each difference image, the mean value µ and standard deviation σ are determined. As in simple Gaussian-based outlier detection techniques, we chose a 3σ distance to the mean value as threshold (Chandola et al., 2009), i.e. every pixel representing a difference value being outside the µ ± 3σ range is marked as an anomalous pixel. We further perform class-wise anomaly detection in order to identify in-class anomalies. To do so, the binary mask for each class is generated based on the landcover map and then used to determine class-wise difference images. The mean and standard deviation values of these are calculated and evaluated, as before, by a 3σ distance measure.

Evaluation:
Detected anomalies are evaluated in terms of cross-validation. In our simulation-based approach, we recieve four anomaly images to evaluate, which correspond to • anomaly detection on the thermal map difference • anomaly detection on the thermal image difference • class-wise anomaly detection on the thermal map difference • class-wise anomaly detection on the thermal image difference.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2020, 2020 XXIV ISPRS Congress (2020 edition) Furthermore, we apply the RXD on the measured thermal image and thermal map for reference. In the present case of unimodal data, image differencing with anomalies in the tails is equivalent to the RXD with a corresponding threshold. Doing so, we finally recieve eight anomaly maps for further evaluation.
The anomalous pixels of each detection result are categorized into their underlying class to identify the ones containing the majority of anomalies. Absolute and relative frequencies of anomalies are determined and compared. The temperature and temperature gradients of anomalies are analysed by means of histogram analysis in order to identify the correlations. Additionally, anomalous pixels are visually inspected.

RESULTS
In this section, we present the results of our method for two urban areas. First, we introduce the dataset and the results of the simulated IR images. Second, the anomaly detection results including our proposed method and the RXD are presented and evaluated.

Dataset
Data was provided by the City of Melville, Perth, Australia (Council of City of Meville, 2017). It comprises multispectral data covering 5 bands (blue, green, red, red-edge, nearinfrared), a LiDAR point cloud and a thermal image of the area of the City of Melville. To translate the image pixel values to actual temperature values, a look-up table was provided. With this, we extract the corresponding thermal map. Two sub-areas had been chosen as test datasets for our proposed method to which we will refer to as area 1 and area 2 in the following. Figure 1 displays the corresponding orthophotos. The areas of size 2×1 km and 2,8×2 km respectively covers the most relevant urban structures such as streets, buildings with different kind of roofings in color, material, or shape, park areas, and trees. Additionally, area 2 also covers more advanced structures such as small lakes surrounded by dense forest, a highway bridge and seaside. Since the overall area of Melville had to be covered in two nights, area 2 exhibits a region of higher temperature due to a hotter day before flying and due to an earlier time of flight, around the first night and 8:30 pm the second night.

Synthetic Thermal Map
The results of the synthetic LWIR image generation together with the actual measured LWIR images are shown in Figures  2 and 3 for area 1 and area 2, respectively. From the thermal simulation, a discrete point in time had to be chosen to generate the LWIR images. Based on the time of measurement of the real LWIR images, the chosen point in time for simulated temperatures in area 1 was 00:00 am. Considering area 2, a slightly earlier time than the time of measurement was chosen, i.e. 8:00 pm, due to a conclusion drawn from area 1 analysis which will be clarified later in this section. At the coastal region of area 2, temperatures were only detected close to the land surface. In order to perform the image-based comparison to the simulation, the missing area, i.e. the sea water temperatures, had been filled in by a modification of harmonic inpainting (Kottler et al., 2016). For area 1, an interactive procedure for building outline correction using an OSM shapefile as supplementary input has additionally been performed. Details of   this procedure as well as performance are presented in (Bulatov et al., 2020).
Simulation and measurement likewise show that natural landcovers such as grass or trees have already cooled down in the early evening, in contrast to man-made surfaces such as streets or roofs that remain warm for a longer period of time. The simulated temperatures of the roofs in area 1, representing the temperatures at midnight, however appear overestimated. Therefore, we conclude that a deviation of the roofings heat capacity The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2020, 2020 XXIV ISPRS Congress (2020 edition)  From the measured and simulated surface temperatures, area 1 exhibits a temperature difference distribution with mean µ = 0.7429 • C and standard deviation σ = 5.3475, cp. Figure 4a. Area 2 yields a temperature difference distribution with an average value of µ = 4.1402 • C with a standard deviation of σ = 3.8415, cp. Figure 4b. The RMSE and MI are listed in Table 1. Taking into consideration the objective of the underlying thermal model, i.e. large-scale and fast simulation rather then high precision, the RMSE lie in a suitable range. The MI for area 1 increases with improvement of the simulation, i.e. regarding the radiance instead of the surface temperatures. On the contrary, area 2 shows a degradation by usage of radiance values. This agrees with the fact that area 2 shows many complex structures and steel roofs with high variance. A broader temperature distribution of steel roofs is caused by the increasing availability of modern surface treating, and thermal parameters (such as emissivity values) vary more than currently supposed by the model, which handles only constant parameter values. This effect is further increased by the classification results reaching not the precision as for area 1. Classification errors due to missed buildings or building parts pass not through the additional interactive post-processing step as was carried out for area 1. Since the consideration of the radiance strongly highlights metallic surfaces, and often new belatedly attached building parts imply metallic roofing, the decrease of radiance MI against temperature MI is reproducible.

Anomaly Detection
From the simulated temperatures and LWIR image, anomaly detection on the difference image of simulation and measurement is performed, to which we will refer to as CDT (change detection between temperatures) and CDR (change detection between LWIR images). CDTc and CDRc respectively indicate the corresponding inner-class anomaly detection. Furthermore, the RXD is applied on the LWIR image and the thereof deduced surface temperatures, to which we will refer to as RXT and RXR with the corresponding in-class approaches RXTc and RXRc.
The relative frequencies of anomalous pixels from each method for areas 1 and 2 are visualized in Figure 5 and Figure 6. They show a similar distribution, however the relative frequencies in area 2 are higher in each case, being based on the less precise classification results, cp. Sec. 3.2. The RXR and RXT lead to the lowest amount of detected anomalous pixels. These anomalies coincide with coldly appearing surfaces that significantly deviate from the residual, see Figure 7 (top and middle row, left). Thus, the output of the RXD on the measured data covers the expectation. Considering the RXTc and RXRc, the assumption of an anomaly appearing rare in the whole images is weakly satisfied, particularly in area 2, which might indicate necessary adaption of the anomaly threshold. However, the full percentage of anomalies is given by adding up all detected inner-class anomalies, thus the inner-class percentage of anomalies is better suited as parameter for threshold evaluation and will be regarded later on. RX-Detector applied on the LWIR image. CDT: change detection between the determined and simulated temperatures. CDR: change detection between the given and simulated LWIR image. RXTc, RXRc, CDTc, CDRc: as before, but with each class considered separately and finally merged.
Going into further detail, we analyze the classes of the detected anomalies. Table 2 summarizes the top three classes for each area. In five out of the eight anomaly detection approaches, including all non-classwise approaches, metal roofing appears most frequently as anomalies. However, the detection result of these five approaches do not coincide. Inspection of the resulting binary masks of each method and the corresponding temperature distribution yields cold surfaces being marked as anomalies in case of the RXT and RXR, which is, as stated before, conform to our expectation of steel roofings having cold LWIR signatures. In case of area 2, a small amount of very hot pixels is detected too, however, this still meets our expectation of RXD results since they lie in the upper temperature range of the whole image. Considering our proposed method of change detection between measurement and simulation, a shift of detected anomalies within the steel roof class occurs. While the CDT emphasizes 13.82 % and 11.41 %, respectively for area 1 and area 2, of all steel roofs as anomalous, which lies in the same range than the RXT and RXR, the CDR detects 78.9 %  definition of an anomaly as deviation from the expected, i.e. the simulation, is fulfilled.
For classwise anomaly detection, the inner-class percentage of detected anomalies is of additional interest and listed in Table 3. The amount of detected anomalies within a class allows to evaluate the quality of the chosen anomaly threshold. In the case of RXTc and RXRc, the hypothesis of an anomaly occurring rare within a class does not hold, i.e. the threshold would need sig-nificant adaption, in particular considering the water and forest classes.
The overall anomaly detection results, cp. Table 2, exhibit an emphasis of smaller structures by class-wise RXTc and RXRc. Soil and grass state the main anomalies whereas steel roofs disappear from the top three anomaly classes. From class-wise temperature histogram analysis and binary mask investigation, we conclude that the boundary area of connected pixels having the same class are detected as anomalous. Since a real LWIR image does imply softened borders or edges in comparison to a visual image, the environment of each regarded pixel per class influences the anomaly detection result. These softened borders arise from a combination of temperature gradients between adjoined surface materials and sensor resolution. The predominant detection of anomalies in boundary pixels becomes reduced in case of CDTc and CDRc.
The class-wise anomaly detection collectively shows a strong influence by the temperature jump in the measured thermal image due to the earlier time of data-taking as stated in Sec.3.1, in contrast to the non-classwise approaches.

CONCLUSIONS AND FUTURE WORK
Using simulation in order to estimate the thermal background for anomaly detection is considered as challenging task, since simulation mostly underlies certain errors in comparison to the ground truth. In this paper, we have shown that despite of these challenges, anomaly detection on a difference image of a measured and simulated thermal image can lead to complementary and expanded information. We did not restrict ourselves to a further definition of anomalies, i.e. labeling a certain class or object in the scene as anomalous. This allowed us to evaluate the proposed method in terms of what has been detected that would not have been detected otherwise. For the two given urban areas, conventional anomaly detection highlights cold surfaces such as metallic roofs. Our approach for LWIR images does not detect these, however steel roofings with warm signatures are detected. We conclude that these detection results represent modern steel roofings with adapted and optimized heat properties, which are common within the urban areas we tested and which are not considered by the underlying thermal model. This outcome could be interesting for today's city planners or city councils, e.g. to update or expand existing data records of their cities or to track urbanization relying on more sustainable and less heat-absorbing materials.
The class-wise evaluation showed rather disadvantages in our approach compared to the full-image evaluation. Sensitivity towards an overall continuous temperature change was higher, and most of the detected anomalous pixels were found in the boundary regions of the corresponding binary mask. The latter is stated by the fact that a measured LWIR image does not provide sharp edges, as it is given in the binary masks. We suggest to exclude a boundary layer or to add additional sensor-like blur to the simulated image in future work, which might also optimize the full-image results.
In summary, our approach makes use of commonly known and easy to implement algorithms such as the Reed-Xiaoli-Detector and image differencing. With these basic tools, extraction of new information was however possible. For future work, we pursue, on the one hand, the application of advanced anomalous change detection techniques, and, on the other hand, improvement of the thermal map generation.