DISCRETE TOPOLOGY BASED HIERARCHICAL SEGMENTATION FOR EFFICIENT OBJECT-BASED IMAGE ANALYSIS: APPLICATION TO OBJECT DETECTION IN HIGH RESOLUTION SATELLITE IMAGES

With rapid developments in satellite and sensor technologies, there has been a dramatic increase in the availability of high resolution (HR) remotely sensed images. Hence, the ability to collect images remotely is expected to far exceed our capacity to analyse these images manually. Consequently, techniques that can handle large volumes of data are urgently needed. In many of today's multiscale techniques the underlying representation of objects is still pixel-based, i.e. object entities are still described and accessed via pixel-based descriptors, which creates a bottleneck when processing large volumes of data. Also, these techniques do not yet leverage the topological and contextual information present in the image. We propose a framework for discrete topology based hierarchical segmentation, addressing both the algorithms and the data structures that will be required. The framework consists of three components: 1) conversion to a dart-based representation, 2) Size-Constrained Region Merging to generate multiple segmentations, and 3) update of two sparse arrays, SIGMA and LAMBDA, which together encode the topology of each region in the hierarchy. The results of our representation are demonstrated on both a synthetic image and real high resolution images. Application of this representation to object detection is also discussed.


INTRODUCTION
Over the past few years, there have been significant improvements in our ability to capture high-resolution satellite images. For instance, the recent WorldView-2 sensor can capture images at < 0.5 m resolution with a collection capacity of 300,000 sq mi/day. At this rate, this instrument alone can cover the entire USA in 12 days. Further, over this decade, it is projected that 288 earth observation satellites from 42 countries are to be launched (Euroconsult, 2012). Our ability to collect high-resolution data far exceeds our capacity to analyse them manually. Consequently, techniques for automated production of geospatial information and assisted image analysis are urgently needed.
To deal with segmentation of high-resolution remotely sensed images, one of the core tasks of image analysis, a plethora of techniques have surfaced in the literature (Dey et al., 2010). The most sought after techniques are those that incorporate multiresolution models through the use of appropriate scale space representations (Dey et al., 2010). This is because in complex high-resolution images, an object of interest to an analyst may reveal itself at any size or scale of observation; therefore analysis at multiple scales is necessary. Although these methods contribute significantly in dealing with complex high-resolution images (Baatz and Schape, 2000; Chen et al., 2009; Syed et al., 2011), the underlying representation and information processing is still primarily pixel-based, thereby creating a bottleneck when processing large volumes of data. Further, these methods do not fully leverage the topological information present in the images in order to improve detection results. The ability to leverage contextual information requires examining a region's (potential object's) neighbourhood and exploring the arrangement of adjacent regions (Syed et al., 2012). For instance, Inglada and Michel (2009) demonstrated successful use of spatial reasoning techniques to quantify topological relationships between image objects in an object detection algorithm using their multi-scale segmentation. However, their method of topological information extraction involved pixel-based processing, which significantly limits real-time topological queries between any two regions. To mitigate the above mentioned problems, a discrete topology based framework for topological information extraction was proposed in (Syed et al., 2012). However, no framework was provided for using this model for multi-level topological queries.
To overcome the above mentioned issues, we propose a framework for a discrete topology based multi-scale segmentation of high resolution satellite images. The proposed representation builds on and improves our previous research on scale-space representation (Syed et al., 2011) and topological information extraction and encoding (Syed et al., 2012). Our goal is to provide an effective foundation that will assist analysts in tasks such as target detection and recognition, classification, change detection, and multi-sensor information fusion.
The remainder of this paper is organized as follows. Section 2 provides a description of the methods used in our framework. Section 3 presents the results of applying this framework to high resolution satellite images. Finally, Section 4 presents the conclusions.

FRAMEWORK FOR DISCRETE TOPOLOGY BASED HIERARCHICAL SEGMENTATION
The three main components of our framework are: 1) conversion to a dart-based representation, 2) Size-Constrained Region Merging to generate multiple segmentations, and 3) update of two sparse arrays, SIGMA and LAMBDA, which together encode the topology of each region in the multi-scale segmentation. Steps 2 and 3 take place in tandem as each level of the hierarchy is created. A block diagram of the proposed method is shown in Figure 1. For a more detailed explanation of how σ encodes the topology of the regions, see (Syed et al., 2012). The dart-based representation in Figure 2(b) uses only 24 darts, which not only describe the five regions but also encode information for topological inferences.

Merging Mechanism to Generate a Hierarchy
The second component is the region merging mechanism to generate multiple segmentations. In this framework, the Size-Constrained Region Merging (SCRM) algorithm (Castilla et al., 2008) is used to generate a hierarchy of segmentations by controlling the size of the objects that appear at any given level.
We have adapted the SCRM to generate segmentations at multiple scales, but the core idea of the algorithm remains the same. A summary of the algorithm is provided in Figure 3.
Find most similar neighbour
1 for each region in lbi[]
2   • find all its neighbours
3   • find its most similar neighbour msn(), based on spectral similarity
Iterative region merging step
4 for each region in lbi[]
5   if merge_condition based on size is true then
6     • merge region with its msn()
7     • update new region properties and SIGMA(i,j) and LAMBDA(i,j)
8     • enforce merging constraints on neighbours of merged regions
Figure 3: Algorithm for SCRM
The overall algorithm is based on iterative merging of regions, uniformly across the image, until all regions below a given size (area) are eliminated. The merge_condition in step 5 of the algorithm checks that the region being merged is smaller than the size constraint for that level, and also that neither this region nor its most similar neighbour has already been merged during the same iteration. Enforcement of merging constraints in step 8 of the algorithm allows for controlled aggregation, so that the resulting regions have the highest possible homogeneity given the size constraint. This means that homogeneous regions are formed first, and then dissimilar gaps smaller than the size constraint are progressively incorporated into them (Castilla et al., 2008). For more details on how the SCRM algorithm is used to generate a multiscale segmentation, refer to (Syed et al., 2011).
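The merging loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the region records (`size`, `mean`), the adjacency dictionary, the use of absolute difference of means as the spectral similarity, and the smallest-first visiting order are all assumptions made for illustration, and the constraint enforcement of step 8 and the SIGMA/LAMBDA updates of step 7 are omitted.

```python
def most_similar_neighbour(label, regions, adjacency):
    """Return the neighbour whose mean value is closest to this region's mean."""
    mean = regions[label]["mean"]
    return min(sorted(adjacency[label]),
               key=lambda n: abs(regions[n]["mean"] - mean))


def merge_pass(regions, adjacency, max_sz):
    """Run one pass over the regions and return the number of merges made."""
    merged_this_pass = set()
    merges = 0
    # Visiting the smallest regions first is an assumption of this sketch.
    for label in sorted(regions, key=lambda r: regions[r]["size"]):
        if regions[label]["size"] >= max_sz or not adjacency[label]:
            continue  # large enough, or nothing left to merge with
        msn = most_similar_neighbour(label, regions, adjacency)
        if label in merged_this_pass or msn in merged_this_pass:
            continue  # already involved in a merge during this pass
        small, target = regions[label], regions[msn]
        total = small["size"] + target["size"]
        # Size-weighted mean of the two merged regions.
        target["mean"] = (small["mean"] * small["size"]
                          + target["mean"] * target["size"]) / total
        target["size"] = total
        # Re-attach the neighbours of the absorbed region to the target.
        for n in adjacency.pop(label):
            adjacency[n].discard(label)
            if n != msn:
                adjacency[n].add(msn)
                adjacency[msn].add(n)
        del regions[label]
        merged_this_pass.add(msn)
        merges += 1
    return merges


def build_level(regions, adjacency, max_sz):
    """Repeat merge passes until no region below max_sz can be merged."""
    while merge_pass(regions, adjacency, max_sz) > 0:
        pass


# Hypothetical four-region strip: 1 - 2 - 3 - 4
regions = {1: {"size": 2, "mean": 10.0}, 2: {"size": 10, "mean": 12.0},
           3: {"size": 10, "mean": 50.0}, 4: {"size": 3, "mean": 48.0}}
adjacency = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
build_level(regions, adjacency, max_sz=5)
print(sorted(regions))
```

Calling `build_level` repeatedly with an increasing `max_sz` would yield one segmentation per level, which is the role the size constraint plays in the text.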

Data Structures & Procedures to Encode the Hierarchy
Finally, the third component is the procedure used to update the data structures that store the hierarchical scale-tree. As remotely sensed images grow larger (in area covered), storage and efficiency become more important. Using a dart-based representation for each level not only saves memory but also enables efficient access to and retrieval of the desired information from the hierarchy. The data structures are updated as the scale-tree is built from one level to the next (Section 2.2).
The sparse arrays SIGMA and LAMBDA together encode the hierarchy of the scale-tree. They are similar in function to their one-dimensional counterparts described in (Syed et al., 2012).
The columns of the SIGMA array are σ-permutations, which capture the incidence relations between the darts, while the columns of the LAMBDA array capture the region labels associated with each dart. Every column of these structures represents a level in the hierarchy. Only those rows of the new column (level) that differ from the rows of the previous column (level) are stored in the array, which makes for efficient storage of the hierarchy. Note that progressively fewer darts are needed to describe the regions as the levels go up.
For a given level, the permutation sigma "σ" and the region labels "λ" can be extracted from the structures by using equations (1) and (2), where "d" represents the dart label and "i" represents the level in the hierarchy.
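As a hedged illustration of this kind of lookup, the Python sketch below stores a value for a dart only at the levels where it changes and answers a query for level i by returning the most recent stored value at or below i. The class `SparseLevels` and its methods are hypothetical names, the change-list layout is an assumption rather than the paper's array layout, and the level at which σ(1) = -8 is first stored is invented for the example; only the values -8 and -5 come from the text.

```python
from bisect import bisect_right


class SparseLevels:
    """Per-dart history that records a value only at levels where it changes."""

    def __init__(self):
        self._levels = {}  # dart -> increasing list of levels with a stored value
        self._values = {}  # dart -> values stored at those levels

    def set(self, dart, level, value):
        """Record the value of a dart at a level (levels must not decrease)."""
        levels = self._levels.setdefault(dart, [])
        values = self._values.setdefault(dart, [])
        if values and values[-1] == value:
            return  # unchanged from the previous level: store nothing
        levels.append(level)
        values.append(value)

    def get(self, dart, level):
        """Return the most recent value stored at or below the given level."""
        levels = self._levels.get(dart, [])
        k = bisect_right(levels, level)
        return self._values[dart][k - 1] if k else None


sigma = SparseLevels()
sigma.set(1, 1, -8)   # first stored at level 1 (assumed for illustration)
sigma.set(1, 4, -8)   # unchanged at level 4, so nothing new is stored
sigma.set(1, 5, -5)   # changes at level 5
print(sigma.get(1, 4), sigma.get(1, 5))
```

Storing explicit change points avoids having to decide whether an empty entry means "unchanged" or "removed", which a zero-filled sparse array would need a separate convention for.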
For illustration, consider a hierarchy of regions generated for the image shown in Figure 2(a). The hierarchy contains five levels; the first level is shown in Figure 2(b) and the remaining four levels are shown in Figure 5. The corresponding SIGMA and LAMBDA arrays, which encode these five levels, are shown in Figure 6. Note that as the number of levels increases, the columns of the arrays become sparser. Removal of a region at a given level involves removing the darts associated with that region and updating the data structures SIGMA and LAMBDA. The steps used to update each of these structures are briefly described in the following sections.
Figure 6: (a) SIGMA (Σ) (b) LAMBDA (Λ). Recall that the arrays sigma "σ" and lambda "λ" encode the topology of a single segmentation (Syed et al., 2012). SIGMA and LAMBDA are their multi-dimensional equivalents.

Update of SIGMA
Recall that the permutation σ is stored as an integer array which encodes the dart incidence relationships in the counterclockwise direction (Syed et al., 2012), as shown in Figure 7(a). When using the dart-based representation of regions, the region merging operation consists of dart removal. Figure 7 illustrates the removal of dart -8 and the updated σ relationships. A detailed description of the removal operation (Brun et al., 2003) is beyond the scope of this paper, but the elementary algorithm is shown in Figure 8. The update procedure involves removing the dart set K = {-8, 8, 9, -9, 11, -11}. Comparing the columns Level4 and Level5 of Figure 6(a) shows the updates that were performed using Algorithm 2: σ_4(1) = -8 has been updated to σ_5(1) = -5, and σ_4(3) = -11 has been updated to σ_5(3) = -2. (Note that SIGMA(1,4) = 0, but the value of σ_4(1) can be inferred from its previous non-zero value in the table.)
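One common way to realise dart removal on a permutation, offered here only as a sketch and not necessarily the procedure of Figure 8, is to let every surviving dart skip over removed darts in its σ-cycle until it reaches another survivor. In the Python example below the function name `remove_darts` and the σ cycles are invented for illustration; only the dart set K and the two quoted updates (σ(1) from -8 to -5, σ(3) from -11 to -2) are taken from the text.

```python
def remove_darts(sigma, removed):
    """Return sigma restricted to surviving darts, skipping removed ones."""
    removed = set(removed)
    new_sigma = {}
    for d in sigma:
        if d in removed:
            continue
        nxt = sigma[d]
        # Follow the cycle until a surviving dart is reached. This always
        # terminates because the cycle eventually returns to d itself.
        while nxt in removed:
            nxt = sigma[nxt]
        new_sigma[d] = nxt
    return new_sigma


# Hypothetical sigma given as three cycles: (1 -8 -5), (3 -11 -2), (8 9 -9 11)
sigma_4 = {1: -8, -8: -5, -5: 1,
           3: -11, -11: -2, -2: 3,
           8: 9, 9: -9, -9: 11, 11: 8}
K = {-8, 8, 9, -9, 11, -11}
sigma_5 = remove_darts(sigma_4, K)
print(sigma_5)
```

In this toy map the cycle made up entirely of removed darts simply disappears, while the two cycles with survivors are shortened.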

Update of LAMBDA
As a new region is created at a level, its associated darts are labelled appropriately and stored in the array LAMBDA. Let R_l be the result of merging regions R_i and R_j; the algorithm to update the region labels associated with the darts is shown in Figure 9.

RESULTS
Using the framework described in Section 2, a hierarchical scale-space representation of the images is generated. The encoding of this hierarchy using LAMBDA and SIGMA allows us to efficiently reconstruct the multiple segmentations for any further processing that may be required. A visualization of the results of this representation is shown for both synthesized and real images. Note that for this visualization, each region is represented by a node in the tree. The position of the node on the x-y plane is the centroid of the region. The position of the node on the vertical axis is a function of its position in the scale-space.
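The node placement described above can be sketched in Python as follows. The input format (one 2-D grid of region labels per level) and the choice of the level index as the vertical coordinate are assumptions made for illustration; the text only states that the vertical position is a function of the position in scale-space.

```python
from collections import defaultdict


def scale_tree_nodes(label_maps):
    """Return (label, x, y, z) per region; label_maps lists levels finest first."""
    nodes = []
    for level, grid in enumerate(label_maps, start=1):
        sums = defaultdict(lambda: [0, 0, 0])  # label -> [sum_x, sum_y, pixel count]
        for y, row in enumerate(grid):
            for x, label in enumerate(row):
                entry = sums[label]
                entry[0] += x
                entry[1] += y
                entry[2] += 1
        for label, (sx, sy, n) in sorted(sums.items()):
            # Centroid on the x-y plane; level index used as height (assumed).
            nodes.append((label, sx / n, sy / n, level))
    return nodes


# Hypothetical two-level hierarchy on a 4x2 image
level_1 = [[1, 1, 2, 2],
           [1, 1, 2, 2]]
level_2 = [[3, 3, 3, 3],
           [3, 3, 3, 3]]
nodes = scale_tree_nodes([level_1, level_2])
print(nodes)
```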

Visualization of the Results
Figure 10: Scale-Tree of image shown in Figure 4.

Results of Using Hierarchical Representation on High Resolution Satellite Images
This image is a small section of an image of Le Faux, France, courtesy DigitalGlobe©. Figure 11(a) through (f) shows a few levels of the multi-scale segmentations that are stored within the hierarchy. In Figure 11(f), note how the man-made objects (houses and roads) have been separated from the natural parts of the image scene (trees and grass); such a delineation is not readily apparent in Figure 11(a). As a result, this hierarchical representation allows an image analyst to observe and exploit information in the image that occurs at multiple scales. Figure 11(g) is a scale-tree visualization of the image content where each region is represented as a node in the tree, positioned directly above the centroid of the region.

CONCLUSIONS
A framework for hierarchical segmentation using discrete topology was presented. First, the pixel-based multi-spectral image was converted to a dart-based representation. Then, the data structures and algorithms required to create and store the hierarchical representation were shown. The SCRM algorithm was used to generate the multiple segmentations of the hierarchy by controlling the size constraint parameter.
Application of this representation to four high resolution images from the WorldView-2 sensor was illustrated in the results section. Note that this representation was generated in a completely unsupervised fashion, and yet it provides the image analyst with a useful tool to sort and navigate the multiple segmentations of a scene. Since the image has been converted into a tree, the next stage of this research involves automatic detection of objects of interest through attributed tree matching techniques (Kriege et al., 2012).

Figure 1: Block diagram of overall framework.
2.1 Conversion from Pixels to Darts
The first component of the process is designed to generate a dart-based representation from the pixel-based region map. This step reduces the dimensionality of the data significantly by representing regions by their darts (a reduced boundary representation), thereby completely encoding the topology of the map. The resulting dart-based representation is a combinatorial map G = (D, σ), where D is the set of darts and σ is a permutation defined on D such that each cycle of σ, denoted by σ*, encodes a node. The permutation σ for the map shown in Figure 2(b) is the first column of the array shown in Figure 6(a). For a more detailed explanation of how σ encodes the topology of the regions, see (Syed et al., 2012).
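To make the role of the σ-cycles concrete, the following Python sketch decomposes a permutation on darts into its cycles, each of which would correspond to one node of the map. The permutation used here is a made-up six-dart example, not the σ of Figure 2(b), and the dictionary representation is an assumption.

```python
def sigma_cycles(sigma):
    """Decompose a permutation given as a dict {dart: next dart} into cycles."""
    seen = set()
    cycles = []
    for start in sorted(sigma):
        if start in seen:
            continue
        cycle = []
        d = start
        # Follow sigma from the start dart until the cycle closes.
        while d not in seen:
            seen.add(d)
            cycle.append(d)
            d = sigma[d]
        cycles.append(tuple(cycle))
    return cycles


# Hypothetical permutation on the darts {±1, ±2, ±3}
sigma = {1: 2, 2: 3, 3: 1, -1: -3, -3: -2, -2: -1}
print(sigma_cycles(sigma))
```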
Figure 2: Conversion from pixel-based representation of regions to a dart-based representation. The simple image in Figure 2(a) is of size 20x10 and therefore uses 200 pixels to describe the five regions contained within. Compare this to the dart-based representation shown in Figure 2(b).

Algorithm 1: Size Constrained Region Merging
Input:
• size constraint for the given level max_sz
• list of regions at a given level lbi[]
Output:
• list of regions at the next level lbj[]
• update of data structures
Find most similar neighbour
1 for each region in lbi[]
Figure 4: Application of the SCRM algorithm to a simple image shows how segmentations at multiple scales can be generated by altering the size constraint.
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XL-1/W1, ISPRS Hannover Workshop 2013, 21-24 May 2013, Hannover, Germany
Figure 5: Multi-level segmentation of image from Figure 2(a).
Figure 7: (a) σ_i for a node at level i (b) σ_j for a node at the next level j. Encoding and updating of σ from one level to the next.

Algorithm 2: Updating the permutation σ
Input:
• array σ_i
• darts to be removed K = {d_k}
Output:
• array σ_j
Update the incidence relationships in σ_j
Figure 8: Basic algorithm to update permutation σ
Consider the merging of regions R6 and R8 from Figure 5(c) to get region R9 in Figure 5(d).

Algorithm 3: Updating the λ array for each level
Input:
• darts d_i ∈ R_i and d_j ∈ R_j
• removal dart set d_k ∈ K
Output:
• dart set defining R_l = {d_l}
1 for each dart d_k ∈ K
2   • set λ(d_k) = null
3 for each dart d_s ∈ {(d_i ∪ d_j) - d_k}
4   • set λ(d_s) = l, i.e. new label assigned to surviving darts
Figure 9: Basic algorithm to update λ
Once again, consider the merging of regions R6 and R8 in Figure 5(c) to get region R9 in Figure 5(d). The update procedure involves removing the dart set {-8, 8, 9, -9, 11, -11}. Note that the LAMBDA entries in the last column of Figure 6(b) that correspond to the removal darts have been set to zero. The surviving darts {-1, -5, -4, -3, -2} that define region R9 have been given their proper label.
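The label update can be sketched in Python as below. This is an illustrative reading of the algorithm, not the authors' code: the function name `update_lambda` is hypothetical, λ is held in a dictionary, removed darts are marked with `None` rather than zero, and the assignment of individual darts to R6 and R8 is invented because the text does not state it. The removal set and the surviving darts match the example in the text.

```python
def update_lambda(lam, darts_i, darts_j, removed, new_label):
    """Return an updated copy of lam and the surviving darts of the merged region."""
    new_lam = dict(lam)
    removed = set(removed)
    # Removed darts no longer carry a region label.
    for d in removed:
        new_lam[d] = None
    # Surviving darts of both merged regions receive the new label.
    survivors = (set(darts_i) | set(darts_j)) - removed
    for d in survivors:
        new_lam[d] = new_label
    return new_lam, survivors


# Hypothetical split of darts between R6 and R8; dart 1 belongs to another region
darts_r6 = [-1, -5, -8, 9, 11]
darts_r8 = [-4, -3, -2, 8, -9, -11]
lam = {d: 6 for d in darts_r6}
lam.update({d: 8 for d in darts_r8})
lam[1] = 7
K = [-8, 8, 9, -9, 11, -11]
new_lam, survivors = update_lambda(lam, darts_r6, darts_r8, K, new_label=9)
print(sorted(survivors))
```

Returning a copy keeps the previous level intact, so the difference between the two dictionaries is exactly what a sparse per-level store would need to record.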

Figure 11: (a)-(f) A few levels contained in the multi-scale hierarchy of a high resolution image (g) The scale-tree of the image. Scene WorldView-2 sensor courtesy DigitalGlobe©

Figure 12: (a)-(f) A few levels contained in the multi-scale hierarchy of a high resolution image (g) The scale-tree of the image. Scene WorldView-2 sensor courtesy DigitalGlobe© (Image: Sydney Olympic Complex, Australia)