AN APPROACH TO ALLEVIATE THE FALSE ALARM IN BUILDING CHANGE DETECTION FROM URBAN VHR IMAGE

Building change detection from very-high-resolution (VHR) urban remote sensing images frequently suffers from serious false alarms caused by different illumination or viewing angles in the bi-temporal images. This paper proposes an approach to alleviate false alarms in urban building change detection. Firstly, as shadows cast by urban buildings have distinct spectral and shape features, a supervised object-based classification technique is adopted to extract them. Secondly, on the side opposite to the sunlight illumination, a straight line is drawn along the principal orientation of the building in every extracted shadow region. Starting from this line and moving toward the sunlight direction, a rectangular area is constructed to cover part of the shadow and part of the rooftop of each building. Thirdly, a method based on algebraic and geometric invariants is used to extract the spatial topological relationship of the potentially unchanged buildings from the central points of all rectangular areas. Finally, based on an oriented texture curvature descriptor, an index is established to determine the actual false alarms in the building change detection result. The experimental results validate that the proposed method can serve as an effective framework to alleviate false alarms in building change detection from urban VHR images.


INTRODUCTION
Urban upgrading and sprawl are considered one of the worldwide surface component alterations (Hussain et al. 2013), and have become more and more significant with the implementation of the new urbanization policy in China. Up-to-date information about urban land use (especially man-made) is fundamental for urban planning, management and geographic information updating (Wen et al. 2016). Remote sensing data have become a major source for land-cover and land-use change monitoring (Hussain et al. 2013), and VHR remote sensing images, i.e., images with a spatial resolution of a meter or less, are more suitable for monitoring detailed urban changes occurring at the level of ground structures such as buildings (Huang, Zhang and Zhu 2014).
Change detection determines and analyses the changes of ground objects using multitemporal remotely sensed images. Automatic and accurate change detection is mainly rooted in the principle that the different spectral features caused by changes in the object of interest are separable from changes resulting from other factors, such as atmospheric conditions, illumination and viewing angles. Depending on the requirements of change detection from remote sensing images, various techniques have been developed; they can be mainly categorized into image difference, image transformation and classification-based approaches (Wu, Zhang and Zhang 2016), or into pixel-based and object-based methods according to the unit of analysis (Hebel, Arens and Stilla 2013). However, selecting the most suitable algorithm for change detection in a specific application is an increasingly difficult task (Tewkesbury et al. 2015).
As buildings are among the most dynamic structures in urban areas, building change detection has received more and more attention in recent years. Given the complex spatial arrangement and spectral heterogeneity within the class of buildings in VHR imagery, it is necessary to develop context-based methods that exploit spatial information for accurate change detection (Falco et al. 2013). Examples include integrating spectral and structural information to investigate changed buildings (Huang et al. 2014), dividing the whole image into blocks and using pulse-coupled neural networks together with the normalized moment of inertia feature to obtain the change map (Zhong et al. 2015), monitoring land-use transitions from a semantic scene view (Wu et al. 2016), and representing a complicated high-resolution scene by a set of low-dimensional semantic indexes (Wen et al. 2016). Buildings, although regarded as semantically homogeneous objects, always show inhomogeneous characteristics in VHR images because of the scattering contributions from sub-objects (Marin, Bovolo and Bruzzone 2015). Moreover, change detection methods for VHR images may be influenced by many factors, such as georeferencing accuracy, larger backscattering variability in each class, and different sensor viewing geometry and illumination angle (Hussain et al. 2013).
In particular, the geometrical differences of buildings caused by different sensor viewing angles and solar elevation angles in multitemporal images still pose huge challenges for change detection of urban buildings (Tang, Huang and Zhang 2013). Even when registration is applied effectively, certain unchanged man-made structures that are relatively tall may be detected as changed areas because of a different sensor position in the multitemporal images (Wang et al. 2015). That is to say, in urban buildings of a certain height the rooftop is always displaced to some extent relative to the footing. This displacement is inconsistent among multitemporal images acquired from different solar or sensor positions, which is prone to result in serious false alarms in building change detection. Besides, the complex and diverse features of building facades and rooftops make it difficult to deal with the false alarm problem.
Shadows convey a great deal of information about the structure of a building; therefore, accurately extracted shadow regions can be highly useful for the automated detection of buildings with arbitrary shapes (Ok 2013). In this paper, the shadows act as a clue to help locate the changed/unchanged buildings. A framework is proposed from a micro scene view for alleviating false alarms in change detection, which can become a useful supplement to building change detection.
The remainder of this paper is organized as follows. Section 2 describes the four main stages of the proposed method of false alarm alleviation in urban building change detection. In Section 3, we present the experimental results, followed by the conclusion in Section 4.

Shadow Detection
The shadow in a VHR image is a very important clue for detecting man-made buildings, as it normally accompanies them. The color and texture features of the image content are the main source of information for shadow detection in true color high-resolution satellite imagery (Elbakary and Iftekharuddin 2014). Increased spatial resolution does not by itself improve classification accuracy; however, object-based image analysis (OBIA), which incorporates spatial features, can deal with VHR images more efficiently (Qin 2015). In this paper, a moderate over-segmentation is first obtained using the Mean-shift algorithm (Comaniciu and Meer 2002) so that every shadow region is divided into several pure super-pixels. At object level, dark color and plain texture are regarded as the identifying features that distinguish shadow regions from other categories of region. As the components in true color RGB and false color Hue-Saturation-Intensity (HSI) can define shadow regions (Cretu and Payeur 2013), the original RGB components and two transformed features are used to identify shadow regions in this paper; the latter are calculated as:
The texture feature is extracted using multi-scale and multi-direction 2D discrete Gabor filters (Grigorescu, Petkov and Kruizinga 2002), each of which consists of a sinusoidal plane wave of a pre-specified frequency and orientation, modulated by a two-dimensional Gaussian. Here the Gabor texture feature is formulated with four scales and six directions.
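As an illustration of the texture stage, the following minimal sketch builds a bank of real-valued Gabor kernels with four scales and six directions. Only the 4 x 6 layout comes from the text; the wavelengths, the kernel size, the aspect ratio `gamma` and the ratio `sigma = 0.56 * wavelength` are assumptions chosen for the example, and the function names are hypothetical.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real part of a 2D Gabor filter as a size x size list of lists (size odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the pixel coordinates into the filter's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope modulating a sinusoidal plane wave.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def gabor_bank(wavelengths=(4.0, 8.0, 16.0, 32.0), n_orientations=6, size=15):
    """Four scales x six directions; all parameter values here are assumed."""
    bank = []
    for lam in wavelengths:
        for k in range(n_orientations):
            theta = k * math.pi / n_orientations
            bank.append(gabor_kernel(size, lam, theta, sigma=0.56 * lam))
    return bank

bank = gabor_bank()
```

Convolving each super-pixel's neighbourhood with the 24 kernels and averaging the response magnitudes per object would give one possible texture vector; the aggregation used in the paper is not stated.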
The Extreme Learning Machine (ELM), a class of feedforward neural networks with random weights, has recently become known as a successful supervised learning technique for the classification of hyperspectral images (Argueello and Heras 2015). In this work, the sample objects are first divided into six classes (shadow, building, road, bare land, water and vegetation), and then the ELM is adopted to recognize the shadow regions based on the spectral and textural features of all objects in the VHR image. Adjacent segmented shadow objects are first merged into one large connected region through binarization; the burrs around large shadows are then smoothed by applying the morphological operators Open and Close successively, and holes in the region are removed with a morphological filling operation. As a result, the shadow mask image is obtained.
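The Open and Close post-processing can be sketched on a binary mask as follows. This is a minimal pure-Python illustration that assumes a 3 x 3 square structuring element and treats pixels outside the image as background; the structuring element actually used is not stated in the text.

```python
def _neighbours(img, r, c):
    """Yield the 3 x 3 neighbourhood of (r, c); outside pixels count as 0."""
    rows, cols = len(img), len(img[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            yield img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0

def erode(img):
    """A pixel survives only if its whole neighbourhood is foreground."""
    return [[int(all(_neighbours(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]

def dilate(img):
    """A pixel is set if any pixel in its neighbourhood is foreground."""
    return [[int(any(_neighbours(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]

def opening(img):
    """Erosion followed by dilation: removes small specks and burrs."""
    return dilate(erode(img))

def closing(img):
    """Dilation followed by erosion: fills small holes and gaps."""
    return erode(dilate(img))

# A lone shadow pixel is removed by opening.
speck = [[0] * 5 for _ in range(5)]
speck[2][2] = 1
opened = opening(speck)

# A one-pixel hole inside a 5 x 5 shadow block is filled by closing.
block = [[1 if 1 <= r <= 5 and 1 <= c <= 5 else 0 for c in range(7)] for r in range(7)]
block[3][3] = 0
closed = closing(block)
```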
In order to remove pseudo shadow regions, both the area and the shape of each shadow region are taken into account. Specifically, for a shadow region in the shadow mask image to qualify, its area should be larger than a given value SA and its shape index should be below a given value SS. The shape is measured using an index SI calculated as: where A is the area and L is the longer edge of the minimum bounding rectangle of the shadow region.

Local Rectangle Construction
If a building is actually unchanged, then no matter how the scattered color or shape appears on its facade and rooftop, the edge between the rooftop and the shadow region should be clear and invariant. Based on this observation, a starting line (LS for short) is first drawn at the shadow side and then moved toward the rooftop side within a limited range. As a result, a rectangle covering part of the shadow and part of the rooftop is constructed, in which the discrimination of false alarms in change detection is then carried out. Two factors are crucial in this rectangle construction process: the orientation of LS and its moving distance DM.
In this paper, the LS is drawn on the basis of the dominant orientation of the shadow region, which mainly reflects the orientation of the building. The building orientation, also called the dominant building direction (in everyday terms, for example, a North-South orientation), was defined by minimizing the projections onto the dominant orientation and the second orientation (Zhang et al. 2013).
Here the outline of the shadow region is treated as a chain of footprint segments whose two orientations are likewise taken as the directions to be penalized, and the target function is minimized as follows: where d is the dominant direction, pi is the vector (carrying its length and slope information) of footprint boundary segment i, F represents the shadow outline, and d⊥ is the direction perpendicular to d. Equation (4) is solved using the Newton method. In the rectangle construction process, DM, the other key factor, is the only parameter that has to be set manually. Note that a DM that is too small yields a flat rectangle covering only a small, uninformative region, whereas a DM that is too large yields a rectangle covering many unwanted features of other classes. Since the edge between shadow and rooftop is the key to judging changed/unchanged buildings in this paper, a DM that covers a strip of rooftop about 3 meters wide is practical for VHR images.
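The idea of choosing the dominant direction by penalizing projections can be illustrated with a small sketch. Because the target function itself is not reproduced here, the cost below is an assumption: for each outline segment it takes the smaller of the projections onto d and onto its perpendicular, so the cost vanishes when every segment is parallel to one of the two directions. A plain grid search over [0°, 90°) is swapped in for the Newton method, and the function name is hypothetical.

```python
import math

def dominant_orientation(outline, step_deg=0.5):
    """Return the angle in degrees, in [0, 90), minimising the assumed cost."""
    n = len(outline)
    # Edge vectors of the closed outline polygon.
    edges = [(outline[(i + 1) % n][0] - outline[i][0],
              outline[(i + 1) % n][1] - outline[i][1]) for i in range(n)]
    best_angle, best_cost = 0.0, float("inf")
    k = 0
    while k * step_deg < 90.0:
        angle = k * step_deg
        t = math.radians(angle)
        dx, dy = math.cos(t), math.sin(t)
        cost = 0.0
        for ex, ey in edges:
            along = abs(ex * dx + ey * dy)      # projection onto d
            across = abs(-ex * dy + ey * dx)    # projection onto the perpendicular of d
            cost += min(along, across)          # assumed penalty per segment
        if cost < best_cost:
            best_angle, best_cost = angle, cost
        k += 1
    return best_angle

# Example: a 6 x 2 rectangle rotated by 30 degrees.
a = math.radians(30.0)
corners = [(0, 0), (6, 0), (6, 2), (0, 2)]
rotated = [(x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a))
           for x, y in corners]
angle = dominant_orientation(rotated)
```

A grid search is slower than Newton iterations but has no convergence issues for a one-dimensional search over a 90° interval, which makes it convenient for a sketch.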
Once the dominant orientation is determined, the LS and the rectangle can be obtained using the following four steps:
Step 1: For every shadow region, obtain the original outline, the Minimum Perimeter Polygon (MPP) and the centroid point of the MPP.
Step 2: Obtain the dominant orientation according to the MPP, and get its Minimum Bounding Rectangle (MBR).
Step 3: Taking the dominant orientation as the slope, draw a line through the centroid point until it reaches the two edges on opposite sides of the MBR; this line is the LS.
Step 4: Starting from the LS, move toward the rooftop side by a distance of DM to obtain the rectangle.
According to the above steps, each building normally has a one-to-one correspondence with a rectangle. However, in one particular case, a building may correspond to a partial rectangle, because the MPP cannot exactly cover a narrow shadow region.
Figure 1 illustrates a way to address this unusual situation.

Figure 1. Rectangle construction on a narrow shadow
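Steps 3 and 4 above can be sketched as follows. This is a simplified illustration under stated assumptions: the LS is given by the centroid, the dominant orientation `theta` and a `half_length` (standing in for clipping the line against the MBR), and `toward_roof` is any vector pointing from the shadow to the rooftop side. All names are hypothetical.

```python
import math

def build_rectangle(centroid, theta, half_length, dm, toward_roof):
    """Return the four corners [a, b, c, d] of the rectangle swept by the LS.

    a-b is the starting line LS; c-d is the LS shifted by dm toward the rooftop.
    """
    cx, cy = centroid
    ux, uy = math.cos(theta), math.sin(theta)   # unit vector along the LS
    nx, ny = -uy, ux                            # unit normal to the LS
    # Flip the normal if it points away from the rooftop side.
    if nx * toward_roof[0] + ny * toward_roof[1] < 0:
        nx, ny = -nx, -ny
    a = (cx - half_length * ux, cy - half_length * uy)
    b = (cx + half_length * ux, cy + half_length * uy)
    c = (b[0] + dm * nx, b[1] + dm * ny)
    d = (a[0] + dm * nx, a[1] + dm * ny)
    return [a, b, c, d]

# Horizontal LS of length 10 through the origin, moved 3 units toward +y.
rect_up = build_rectangle((0.0, 0.0), 0.0, 5.0, 3.0, (0.0, 1.0))
# Same LS, but the rooftop lies toward -y.
rect_down = build_rectangle((0.0, 0.0), 0.0, 5.0, 3.0, (0.0, -1.0))
```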

Spatial Topological Matching
Once all rectangles have been obtained, the next task is to match the corresponding rectangles in the bi-temporal images.
Unchanged buildings have the same ground locations in the bi-temporal VHR urban images, but their rooftop locations differ because of the different sensor viewing angle or solar angle. Generally, this displacement is limited to such a controllable range that the corresponding rectangles in the bi-temporal images partly overlap, and the displacement effect is consistent among all buildings in each temporal image. That is to say, if two corresponding buildings do not experience any change between the bi-temporal images, their corresponding rectangles probably overlap. Conversely, non-overlapping rectangles, or rectangles that overlap only to a very small extent (smaller than a given threshold of 20), convey actual building change information.
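The overlap test can be sketched with standard convex polygon clipping. The following minimal example uses the Sutherland-Hodgman algorithm and the shoelace formula; it assumes each rectangle is given as a list of corners in counter-clockwise order, and it assumes the threshold of 20 is an area in the same units as the coordinates, which the text does not state.

```python
def polygon_area(poly):
    """Shoelace formula; returns 0.0 for an empty polygon."""
    s = 0.0
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def clip_convex(subject, clip):
    """Sutherland-Hodgman clipping; clip must be convex and counter-clockwise."""
    def inside(p, a, b):
        # True if p lies on the left of (or on) the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p, q, a, b):
        # Intersection of line p-q with line a-b (lines are not parallel here).
        x1, y1 = p
        x2, y2 = q
        x3, y3 = a
        x4, y4 = b
        den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
        t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / den
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))

    output = list(subject)
    for i in range(len(clip)):
        a, b = clip[i], clip[(i + 1) % len(clip)]
        inp, output = output, []
        if not inp:
            break
        s = inp[-1]
        for e in inp:
            if inside(e, a, b):
                if not inside(s, a, b):
                    output.append(intersect(s, e, a, b))
                output.append(e)
            elif inside(s, a, b):
                output.append(intersect(s, e, a, b))
            s = e
    return output

def overlap_area(rect_a, rect_b):
    return polygon_area(clip_convex(rect_a, rect_b))

def may_be_unchanged(rect_a, rect_b, min_overlap=20.0):
    """Rectangles overlapping by less than min_overlap indicate a real change."""
    return overlap_area(rect_a, rect_b) >= min_overlap

r1 = [(0, 0), (4, 0), (4, 2), (0, 2)]
r2 = [(2, 1), (6, 1), (6, 3), (2, 3)]
r3 = [(10, 10), (12, 10), (12, 12), (10, 12)]
```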
Furthermore, a global view is adopted to investigate the correspondence of all overlapping regions in the bi-temporal images. Since the displacement is consistent across each whole image, the spatial topological relation of the buildings is well preserved in each temporal image. Firstly, all shadow regions are numbered, and the rectangles obtained from the same shadow are labelled with the number of that shadow. Then the centroid of every rectangle, or the mean centroid of the rectangles with the same number, is calculated. From the rectangle construction and numbering process described above, it follows that the rectangles encompassing the edge maintain, as a whole, a spatial topological relation similar to that of the buildings in each temporal image. Moreover, the sets of centroids of unchanged buildings exhibit the same spatial topological relation in the bi-temporal images, whereas actually changed buildings are likely to cause a different overall spatial topological relation between the bi-temporal images.
To extract the centroid subset of unchanged buildings in the bi-temporal images, this paper introduces a geometric invariant matching method (Qu 2012), which is designed to handle unequal point sets with redundant points under a rigid (isometric) transformation. Let P and Q denote the centroid sets in the bi-temporal images, and let the numbers of centroid points (c-points) in these two sets be N and M, respectively. The spatial distance between every pair of c-points in each temporal image is calculated by the following formula: where Pij is a square symmetric matrix of size N × N, Qij is a square symmetric matrix of size M × M, pi and pj are c-points in P, and qi and qj are c-points in Q. Based on Pij and Qij, the distance between all c-points in the bi-temporal images is calculated as: where Pi is the i-th row in P, Qj is the j-th column in Q, and σ is a noise variance. Pi and Qj denote the distances between the point of a given row (Pi) or column (Qj) and all points in set P or Q. The resulting matrix has size N × M and represents a coupling between these distances. σ needs to be set manually, and its empirical value range is: where δ is the amplitude of the noise and dmin is the minimum value in both P and Q. The spatial topological relation among the c-points in the bi-temporal images can be obtained by the formula: where WG is the value that is the maximum in both its row and its column. It should be noted that a few disturbances arising from incorrect shadow extraction may lead to unwanted correspondences. To address this problem, a further overlay analysis is carried out for all matched rectangle pairs, and only the rectangles whose overlapping part exceeds a given threshold PO are accepted and sent to the next stage.
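Two pieces of this matching stage can be sketched without the missing formulas: the pairwise distance matrices of the c-points, and the selection of entries that are maximal in both their row and their column. The sketch below assumes Euclidean distances and takes the N × M coupling matrix as a given input (`affinity`, a hypothetical name), since its Gaussian-weighted definition is not reproduced here.

```python
import math

def pairwise_distances(points):
    """Square symmetric matrix of Euclidean distances between c-points."""
    return [[math.dist(p, q) for q in points] for p in points]

def mutual_best_matches(affinity):
    """Return (i, j) pairs whose entry is the maximum of row i and of column j."""
    n_rows, n_cols = len(affinity), len(affinity[0])
    matches = []
    for i in range(n_rows):
        # Best column for row i.
        j = max(range(n_cols), key=lambda col: affinity[i][col])
        row_best = affinity[i][j]
        # Accept only if row i is also the best row for column j.
        col_best = max(affinity[r][j] for r in range(n_rows))
        if row_best == col_best:
            matches.append((i, j))
    return matches

dist_p = pairwise_distances([(0, 0), (3, 4)])
pairs_a = mutual_best_matches([[0.9, 0.1, 0.2],
                               [0.3, 0.8, 0.1]])
pairs_b = mutual_best_matches([[0.90, 0.1],
                               [0.95, 0.2]])
```

Requiring a mutual maximum means a c-point without a reliable counterpart, for example one belonging to a changed building, is simply left unmatched rather than forced onto its nearest candidate.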

Micro Scene Discrimination
This stage is based on the assumption that if no change occurs between the corresponding buildings in the bi-temporal images, the statistical features of the image content from the shadow side across the edge to the rooftop side probably remain consistent, and vice versa. Accordingly, an index is designed to detect the false alarm between every matched rectangle pair based on the texture descriptor Oriented Texture Curves (OTC) (Margolin, Zelnik-Manor and Tal 2014). OTC captures the texture of a patch along multiple orientations while maintaining robustness to illumination, geometric and local contrast variability. In this paper, every rectangle is regarded as a micro scene and divided into several strips with the same orientation as the LS; the statistical value of the pixels in every strip is formulated as: where θ is the orientation, i is the index of the strip, Sθ,i denotes a strip, |Sθ,i| is the number of pixels in this strip, and P(x) is the RGB value. In this paper, θ is chosen as the direction perpendicular to the LS. Furthermore, the following formulas are used to obtain the curvature from the shadow side to the rooftop side: where the accompanying sign function takes the value 1 or −1 depending on whether its argument is positive or negative. In some special cases, the LS is so close to the edge that the curvature of the scene cannot show an obvious variation from peak to trough. For this reason, the rectangle is expanded by pushing the starting edge backward by 0 to 10 pixels. Figure 2 shows the curvature curves of two corresponding micro scenes. Both the blue and the red curve exhibit one pronounced rise and fall, while the other parts are relatively flat or irregular. According to this observation, changed/unchanged buildings can be identified by comparing the amplitude between the peak and the valley with the wavelength between the peak and the valley, shown as "l" and "d" in Figure 2, using the formula λ = l/d. Based on λ, the index T1 is calculated as: where λ1 and λ2 represent the ratio of amplitude to wavelength for the two scenes, respectively. T1 ∈ [0, 1]; the closer T1 is to 1, the more similar are the architectural structure and the transition from shadow to rooftop, which indicates that no change has occurred in the corresponding buildings. The average T1 is used when several rectangles have the same number.
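The strip statistic and the amplitude-to-wavelength ratio can be sketched as follows. This is a minimal illustration under stated assumptions: the micro scene is a list of strips of grey values ordered from the shadow side to the rooftop side, the strip statistic is the plain mean, the curve passed to `peak_valley_ratio` is assumed to already be the curvature curve, and the peak and valley are taken to be its global maximum and minimum. The function names are hypothetical.

```python
def strip_means(scene):
    """Mean pixel value of every strip (one strip per inner list)."""
    return [sum(strip) / len(strip) for strip in scene]

def peak_valley_ratio(curve):
    """Amplitude l between peak and valley divided by their distance d."""
    i_peak = max(range(len(curve)), key=curve.__getitem__)
    i_valley = min(range(len(curve)), key=curve.__getitem__)
    l = curve[i_peak] - curve[i_valley]   # amplitude
    d = abs(i_peak - i_valley)            # wavelength, in strips
    return l / d if d else 0.0

means = strip_means([[1, 3], [2, 4], [10, 30]])
ratio = peak_valley_ratio([0.0, 0.5, 4.0, 1.0, -2.0, 0.0])
flat = peak_valley_ratio([1.0, 1.0, 1.0])
```

In the example curve the peak (4.0) lies at strip 2 and the valley (−2.0) at strip 4, so l = 6 and d = 2.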
On the other hand, another index T2 emphasizes the length similarity of the rectangles and is calculated as: where W1 and W2 are the widths of the starting edges of the two matched rectangles. Generally, if the two rectangles correspond to two unchanged buildings, the edge should keep its original shape, so T2 is close to 1. The summed T2 is used when several rectangles have the same number.
The final index for determining a false alarm is a combination of T1 and T2: T = ω·T1 + (1 − ω)·T2, where ω is the weight of the building structure measure and (1 − ω) is the weight of the building width measure. The range of T is [0, 1], and a larger T indicates two more similar rectangles. In practice, a threshold T0 is given to judge the false alarm information in building change detection.
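A minimal sketch of the final decision, assuming T is the weighted sum ω·T1 + (1 − ω)·T2 suggested by the weight definitions and that a pair counts as a false alarm when T reaches the threshold T0 (whether the comparison is strict is not stated). The default values ω = 0.55 and T0 = 0.5 and the sample values T1 = 0.7115 and T2 = 0.8889 are the ones reported in the experiments; the function names are hypothetical.

```python
def combined_index(t1, t2, omega):
    """Weighted combination of the structure index T1 and the width index T2."""
    return omega * t1 + (1.0 - omega) * t2

def is_false_alarm(t1, t2, omega=0.55, t0=0.5):
    """True if the matched rectangles are similar enough to be an unchanged building."""
    return combined_index(t1, t2, omega) >= t0

# Values reported for the 17th rectangle pair.
t = combined_index(0.7115, 0.8889, 0.55)
decision = is_false_alarm(0.7115, 0.8889)
```

With these inputs T is about 0.791, well above T0 = 0.5, which agrees with the pair being reported as an unchanged building.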
After all changed/unchanged rectangles have been identified through this process, the false alarm mask image is obtained as the final result. Overlaying this mask on the change detection result makes it possible to identify the false alarms in the latter and thus to improve it substantially.

EXPERIMENT RESULTS
A case study on WorldView-2 (WV2) data with 0.5 m spatial resolution is conducted to validate the proposed approach. The study area is located in Changsha city, Hunan province, China. After the shadows have been obtained, the parameters SA and SS are set to 150 and 0.05, respectively, to remove unwanted fragmentary or pseudo shadows; the final building shadows are shown in Figure 3(b, b'). At the stage of local rectangle construction, the distance DM by which the LS moves toward the rooftop is set to 20, with the results shown in Figure 3(c, c'). At the spatial topological matching stage, the noise parameter is set to σ = 3.25, and the area threshold PO that determines the effective matching rectangle pairs is set to 50; the matching results are shown in Figure 3(d, d'). To prevent a scene from covering only a small amount of shadow, the distance DO is set to 2. In the false alarm detection, ω is set to 0.55 and the threshold T0 is set to 0.5. Quantitatively, the accuracy of false alarm alleviation is 74.29 percent at object level and 88.34 percent at pixel level, as shown in Table 1. Combining Figure 6 with Table 1, we can conclude that the rectangles constructed by the proposed approach overlap the change detection results, and that the rectangles identified through micro scene discrimination are suitable for indicating most of the real false alarms.

                               Object    Pixel
The accuracy of alleviation    74.29%    88.34%

Table 1. The accuracy of the false alarm alleviation

CONCLUSION
The contribution of this study is a framework to alleviate the false alarms in urban building change detection that are mainly caused by different sensor view angles or solar elevation angles in bi-temporal VHR images. As urban buildings always show inhomogeneous characteristics in VHR images, it is difficult to extract the whole body (or even the whole rooftop) of a building exactly, which makes false alarm alleviation at the whole image level a huge challenge. Based on this observation, the proposed method takes only the micro scene encompassing the edge between shadow and rooftop as the target area for the subsequent stages. To obtain the micro scene, the shadows are first extracted, and then the rectangle covering part of the shadow and part of the rooftop is constructed by drawing a starting line and moving it within a limited range. The potentially unchanged buildings are determined using a geometric invariant matching method that captures their most similar spatial topological relation based on the centroid points of all rectangles. The transition of image content from the shadow side to the rooftop side is investigated by considering whether the major change shown in the curvature curve occurs in a similar strip range in the two corresponding micro scenes. The experimental results show that the proposed method can effectively reveal the actually unchanged buildings through a micro scene-based image analysis approach.
It should be noted that the proposed approach only alleviates false alarms occurring on an entirely unchanged building rather than on the unchanged part of an expanding building, and that, for the VHR images used, it is more effective for object-based methods than for pixel-based methods.
In future research, we plan to extend the proposed method to alleviate false alarms in more complex urban areas, and to develop it into a new scene-based change detection method that addresses false alarms more thoroughly.
Figure 1(c) is an MPP of the shadow region in Figure 1(b), but it covers only part of the region, producing a rectangle like the one in Figure 1(d). The white region in Figure 1(e) is the missed part, from which a new MPP is obtained to make another rectangle, shown in Figure 1(g). Note that if more than one shadow part remains, as in Figure 1(e), an area threshold of 50 pixels is applied to discard partial shadows that are too small. Finally, the two sub-rectangles are merged and treated as one, as Figure 1(h) shows.

Figure 2. The illustration of false alarm determining based on the curvature.
It can be seen from Figure 3 that most unchanged buildings at the same physical location show different spectral signatures on their surfaces, mainly due to the different levels of moisture on the different image acquisition dates.
Figure 3(e, e') show the results of micro scene discrimination. Figure 3(f, f') show the actual buildings corresponding to the rectangles in Figure 3(e, e'). Comparing Figure 3(f, f') with Figure 3(a, a'), it can be found that all extracted unchanged buildings denote potential regions involved in actual false alarms of change detection.

Figure 3. The results of false alarm detection. The first row is for I11 and the second row for I15; the 1st column shows the original images, the 2nd column the shadows, the 3rd column the rectangles, the 4th column the results of spatial topological matching, the 5th column the results of scene discrimination, and the 6th column the unchanged buildings corresponding to the results in the 5th column, which are visually emphasized.

Both the spatial topological matching and the micro scene discrimination are crucial stages in the proposed approach, since the former is designed to extract the matched rectangles and the latter is established to determine the truly unchanged rectangles based on the results of the former. Here the effectiveness of these two stages is tested. Firstly, all buildings in the two images are masked via visual interpretation, as shown in Figure 4: there are 39 buildings in I11 in Figure 4(a) and 30 in I15 in Figure 4(b), and the overlapping buildings are shown in Figure 4(c). The unchanged buildings are shown in Figure 4(d), and the changed buildings in I11 and I15 are shown in Figure 4(e) and Figure 4(f). There are 22 unchanged buildings (e.g., building 1 in both I11 and I15), and the changed buildings have mainly been newly built (e.g., buildings 23 and 27 in I11) or reconstructed (e.g., building 24 in I15).

Figure 5(a) is the overlay of Figure 3(d) and Figure 4(d), which shows that all 22 actually unchanged buildings are well matched. From Figure 5(a) we can see that almost every matched rectangle exactly overlaps the actual unchanged building, even though different disturbances exist on the rooftops in the bi-temporal images. For example, for the two 17th rectangles, which share the same part of the edge (Figure 5(b)), the curves reflect the overall building structure well, as the amplitudes between the rises and falls are similar in the two curves. Consequently, T1 = 0.7115 and T2 = 0.8889 indicate an unchanged building. On the other hand, there is only one rectangle at bottom-right of Figure 5(a)

Figure 4. The manual building masking. (a) shows all buildings in I11 marked in red, (b) shows all buildings in I15 marked in blue, (c) shows the overlap (marked in green at pixel level) of (a) and (b), (d) shows the entirely unchanged buildings, and (e) and (f) show the changed buildings at object level.

Figure 5. The effectiveness of rectangle matching and scene discrimination. (a) is the overlay of the matched rectangles and the actual unchanged buildings. (b) and (c) are examples of scene discrimination: (b) shows the 17th rectangle in both I11 and I15, and (c) shows the 34th in I11 and the 30th in I15.

In order to validate the proposed approach for false alarm alleviation in change detection, a morphological building index based method (MBI-based method for short) (Huang et al. 2014) is used to carry out building change detection on the same study images. Figure 6(a) shows the change detection results achieved by the MBI-based method, which contain 85 objects composed of 17825 pixels. The final result of the proposed approach (Figure 3(e)) is overlaid on the change detection results, as shown in Figure 6(b). In Figure 6(c), a total of 35 objects comprising 6723 pixels belong to the real false alarms obtained via visual interpretation, among which 26 objects comprising 5939 pixels belong to the alleviated false alarms.

Figure 6. The effectiveness of the proposed approach in MBI-based change detection. (a) is the change detection result of the MBI-based method; (b) is the overlap (marked in green) of the detection result (marked in blue) and the identified rectangles (marked in red); (c) shows the real false alarms, including the alleviated ones (marked in green) and the lost ones (marked in red).

                           Object    Pixel
Detected result            85        17825
Real unchanged building    35        6723
Alleviated false alarm     26        5939
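The reported alleviation accuracies follow directly from the counts above, assuming accuracy is the share of real false alarms (real unchanged buildings) that were alleviated. A short check, with a hypothetical function name:

```python
def alleviation_accuracy(alleviated, real_false_alarms):
    """Percentage of real false alarms that were alleviated."""
    return 100.0 * alleviated / real_false_alarms

object_level = alleviation_accuracy(26, 35)       # objects
pixel_level = alleviation_accuracy(5939, 6723)    # pixels
print(f"{object_level:.2f}% at object level, {pixel_level:.2f}% at pixel level")
```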