A COMPARATIVE ASSESSMENT OF EFFICACY OF SUPER RESOLVED AIRBORNE HYPERSPECTRAL OUTPUTS IN URBAN MATERIAL AND LAND COVER INFORMATION EXTRACTION

Urban areas, despite being heterogeneous in nature, appear as mixed pixels in medium to coarse resolution imagery, which renders their mapping highly inaccurate. A detailed classification of urban areas therefore needs both high spatial and high spectral resolution, which underlines the need for different kinds of satellite data. Hyperspectral sensors with more than 200 contiguous bands over a narrow bandwidth of 1-10 nm can distinguish spectrally similar land use classes. However, such sensors possess low spatial resolution. As the trade-off between rich spectral and spatial information is difficult to overcome at the hardware level, resolution enhancement techniques like super resolution (SR) hold the key. SR preserves the spectral characteristics and enables feature visualization at a higher spatial scale. Two SR algorithms, Anchored Neighbourhood Regression (ANR) and Sparse Regression and Natural Prior (SRP), have been executed on an airborne hyperspectral scene of the Airborne Visible/Infrared Imaging Spectrometer - Next Generation (AVIRIS-NG) for the mixed environment centred on Kankaria Lake in the city of Ahmedabad, thereby refining the spatial resolution from 8.1 m to 4.05 m. The super-resolved outputs have then been used to map ten urban material and land cover classes identified in the study area using supervised Spectral Angle Mapper (SAM) and Support Vector Machine (SVM) classification methods. Visual comparison and accuracy assessment on the basis of the confusion matrix and Cohen’s Kappa coefficient revealed that the SRP super-resolved output classified using radial basis function (RBF) kernel based SVM is the best outcome, thereby highlighting the superiority of SR over simple scaling up and resampling approaches.


INTRODUCTION

Background
Urban areas are defined by attributes that are made up of a non-uniform composition of artificial and naturally available materials. For instance, similar land use (LU) classes representing urban areas comprise man-made structures that can be spectrally distinguished at smaller scales. Such a human-conditioned environment is thus distinct from the features found in undisturbed natural surroundings. In spite of this heterogeneity, coarse or medium spatial resolution imagery cannot be used to map urban areas correctly owing to the presence of 'mixed pixels', i.e., pixels containing more than one feature type. Hence, an intricate urban categorization requires both high spatial and high spectral resolution.
Images acquired by hyperspectral sensors allow improved target detection owing to the presence of a large number of contiguous bands separated by a narrow wavelength interval of the order of 1-10 nm (Eismann et al., 2004). Combining this spectrally rich content with detailed spatial information incurs high costs. Because the spectrum is sliced so narrowly, only a small amount of the total radiant energy reaches the sensor in each band. Consequently, the pixel size on the chip and the pixel footprint on the surface have to be enlarged to obtain an acceptable signal-to-noise ratio (SNR; Gaidhani, 2011). The low spatial resolution of hyperspectral data compared to multispectral or panchromatic data thus becomes a major drawback. On the other hand, a large amount of spectral information is lost during the acquisition of multispectral images, as the entire scene radiance is integrated over broad spectral bands in order to achieve a high spatial resolution. Due to physical limits at the hardware level, the trade-off between elaborate spectral and spatial information is difficult to overcome. A solution to this problem is provided by resolution enhancement techniques, which comprise interpolation, fusion, restoration and SR (Nasrollahi and Moeslund, 2014).
In interpolation, the input low resolution (LR) image is transformed onto a high resolution (HR) grid and a function is used to estimate the missing values (Fernandez-Beltran, Latorre-Carmona and Pla, 2017). Restoration yields a better quality output, although its size remains the same as that of the input LR image (Park, Park and Kang, 2003). Fusion aims to generate an output with a high spatial resolution, but it always needs datasets from two sensors (Lanaras, Baltsavias and Schindler, 2015). Fused outputs also suffer from serious blurring in the case of hyperspectral data (Kwan et al., 2018). These shortcomings are overcome by SR processes, the class of resolution enhancement algorithms that try to recreate the original scene in HR from LR input image(s) of the same scene. Depending on the input information, SR can be multi-frame or single-frame. The output of multi-frame methods depends on the imaging model adopted, and a number of papers discuss the flow of multi-frame SR algorithms (Bioucas-Dias et al., 2013; Yue et al., 2016; Zhang et al., 2014). However, fewer surveys exist for single-frame SR. According to Fernandez-Beltran, Latorre-Carmona and Pla (2017), single-frame SR techniques can be of three types: reconstruction, image learning and hybrid.
In image learning techniques, a relationship between the HR and LR domains is established using external training data or the input image itself. The relationship learnt determines the quality of the output. The ANR and SRP methods are based on the principles of image learning. The former uses the idea of neighbourhood embedding for SR and incorporates sparse representation for dictionary construction (Timofte, De Smet and Van Gool, 2013), while the latter uses kernel based regression to establish the relation between the LR and HR spaces (Kim and Kwon, 2010). A post-processing step using a Natural Image Prior (NIP; Tappen et al., 2003) is also employed in SRP to remove the blurring and ringing artifacts introduced around major edges in the final super-resolved output.
SR algorithms, whether operating in the frequency or the spatial domain, have so far been tested only on natural images, and at most aerial images have been super-resolved (Suganya, Mohanapriya and Vanitha, 2013; Zhang et al., 2014; Kwan, Choi, Chan, Zhou and Budavari, 2017). Quality metrics have also been used to assess the effectiveness of SR algorithms. Vaiopoulos (2011) utilized spectral and spatial indices to evaluate the quality of fused outputs generated using the Hyperion, Advanced Land Imager (ALI) and Landsat Enhanced Thematic Mapper + (ETM+) sensors. These were Bias, Correlation Coefficient (CC), Difference in Variance (DIV), Erreur Relative Globale Adimensionnelle de Synthese (ERGAS), Entropy (E), Universal Image Quality Index (Q), Relative Average Spectral Error (RASE) and Root Mean Square Error (RMSE).
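As an illustration of how such indices work, two of the listed metrics, RMSE and CC, can be sketched for a single band as follows. This is a minimal sketch and not the implementation used by Vaiopoulos (2011); the function names and the toy reflectance values are assumptions made here, and the remaining indices are omitted.

```python
import math


def rmse(reference, test):
    """Root Mean Square Error between two equally sized bands (flat lists)."""
    n = len(reference)
    return math.sqrt(sum((r - t) ** 2 for r, t in zip(reference, test)) / n)


def correlation_coefficient(reference, test):
    """Pearson correlation coefficient between two equally sized bands.

    Undefined (division by zero) if either band is constant.
    """
    n = len(reference)
    mean_r = sum(reference) / n
    mean_t = sum(test) / n
    cov = sum((r - mean_r) * (t - mean_t) for r, t in zip(reference, test))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_t = sum((t - mean_t) ** 2 for t in test)
    return cov / math.sqrt(var_r * var_t)


# Hypothetical reflectance values of one band, flattened to a list.
reference_band = [0.10, 0.20, 0.30, 0.40]
enhanced_band = [0.12, 0.18, 0.33, 0.39]

band_rmse = rmse(reference_band, enhanced_band)
band_cc = correlation_coefficient(reference_band, enhanced_band)
```

A lower RMSE and a CC closer to 1 indicate that the enhanced band stays closer to the reference band.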
Advances in sensor technology have led to the evolution of information extraction methodologies for handling the high volume and rich quality of data acquired by hyperspectral sensors. A number of full-pixel and sub-pixel classifiers have been developed. They are generally based on statistical analysis (Mather, 1999), neural networks (Foody, 2000) and decision tree methods (Hansen et al., 1996). Spectral Angle Mapper (SAM), introduced by Kruse et al. (1993), is a popular algorithm for classifying hyperspectral data and performing spectral similarity analysis. SAM permits rapid classification by comparing the image spectra to a known spectrum or end member. It is insensitive to illumination, but it does not take the heterogeneity of the Earth's surface into account, as it assumes the end member to be a pure representation of the material under consideration (Moughal, 2013). Efficient discrimination on the basis of training pixels can be obtained with Support Vector Machines (SVMs), which select the hyperplane having the maximum margin of separation between classes. The hyperplane is represented by a kernel generally defined by a linear, polynomial, radial or sigmoidal function (Huang, Davis and Townshend, 2002), whereas the margin is the sum of the shortest distances from the separating hyperplane to the data points of both categories. Moughal (2013) states that the multiclass problem in hyperspectral imagery can be handled using SVMs. SVM was compared with SAM and ML in that study, and SVMs applied after a Minimum Noise Fraction (MNF) transform showed higher accuracy than the other classifiers.
As regards urban areas, very few studies have been performed using hyperspectral data. Hepner et al. (1998) acquired spectra of different urban land cover types using AVIRIS and interpreted their separability for urban land cover mapping. The significance of different spectral regions for mapping urban areas was discussed by Ben-Dor et al. (2001). Spectral mixture models have also been used to map urban materials at sub-pixel scales (Rashed et al., 2001; Wu and Murray, 2003). Recently, Kotthaus et al. (2014) derived an urban spectral library using a portable Fourier Transform Infrared (FTIR) spectrometer for 74 samples of various impervious urban materials found in the city of London. Plots of the short-wave reflectance (300-2500 nm) and long-wave (8-14 µm) emissivity spectra can be found in the London Urban Micromet data Archive (LUMA; http://micromet.reading.ac.uk/spectral-library).

Study Area
The study area, on the pre-processed scene of AVIRIS-NG with the colour combination of Red: 86, Green: 44 and Blue: 24, is shown in Figure 1.

Materials Used
An airborne hyperspectral image acquired by AVIRIS-NG of the Jet Propulsion Laboratory (JPL), National Aeronautics and Space Administration (NASA), under the ambit of the Indian Space Research Organization (ISRO) - NASA Airborne Hyperspectral Imaging (HySI) Programme has been used for this study. The dataset has a spatial resolution of 8.1 m and possesses 425 bands spaced at 5 nm over the wavelength range of 376.440002 nm - 2500.120117 nm. The scene was acquired on February 11, 2016 at 08:01:29 am and stretches over a latitudinal extent of 22°59'16.57"N - 23°3'21.04"N and a longitudinal extent of 72°25'19.94"E - 72°45'6.68"E.

METHODOLOGY
The methodology adopted has the following major stages: data pre-processing, resolution enhancement using SR, comparative analysis, classification and accuracy assessment. These steps are described in the forthcoming sub-sections.

Data Pre-processing
Sensor error correction was performed by removing bands that contained noise or no information at all. This process left only 353 bands in the Level 2 AVIRIS-NG reflectance file. Following this, a square patch of 272 lines and 272 samples was extracted to define the study area. The patch was then broken down into individual bands, each used as a separate input to the SR algorithm, so that the computational efficiency of the SR process could be assessed.
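The pre-processing steps described above can be sketched as follows. This is only an illustrative sketch on a toy cube held in nested Python lists; the function names, the bad-band indices and the crop offsets are hypothetical, since the actual band list and patch position are not given here. The last lines check that a 272 x 272 patch of 8.1 m pixels is consistent with the reported study area of about 4.85 sq. km.

```python
def drop_bands(cube, bad_band_indices):
    """Return the cube without the listed noisy or empty bands.

    cube is a list of bands; each band is a list of rows (lines).
    """
    bad = set(bad_band_indices)
    return [band for i, band in enumerate(cube) if i not in bad]


def crop_patch(cube, first_line, first_sample, size):
    """Extract a square patch of size x size pixels from every band."""
    return [
        [row[first_sample:first_sample + size]
         for row in band[first_line:first_line + size]]
        for band in cube
    ]


# Toy cube: 6 bands of 8 lines x 8 samples (the real scene has 425 bands).
toy_cube = [[[float(b) for _ in range(8)] for _ in range(8)] for b in range(6)]

clean_cube = drop_bands(toy_cube, bad_band_indices=[1, 4])  # hypothetical bad bands
patch = crop_patch(clean_cube, first_line=2, first_sample=3, size=4)
single_bands = list(patch)  # each element is passed to the SR algorithm on its own

# Arithmetic from the figures quoted in the text.
bands_removed = 425 - 353                 # bands discarded as noisy or empty
side_m = 272 * 8.1                        # side of the patch in metres
area_sq_km = side_m ** 2 / 1_000_000      # footprint of the patch
```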

Resolution Enhancement Using SR
Two single-frame SR algorithms based on the fundamentals of image learning have been implemented on the band-wise input LR data. They are ANR, proposed by Timofte, De Smet and Van Gool in 2013, and SRP, proposed by Kim and Kwon in 2010. Details of these techniques can be found in Timofte, De Smet and Van Gool (2013) and Kim and Kwon (2010). These methods have been chosen on the basis of the robust visual quality of their output, fast computation and ability to recover accurate spatial and spectral characteristics of the input LR image, as reported in the literature.
While running ANR, first and second order gradients have been used for patch representation, and the same set of images as in Zeyde et al. (2012) has been used for dictionary training. Further, the dictionary size has been set to 2048 atoms, and a maximum of 256 atoms has been allowed for neighbourhood formation around each atom. A scaling factor of 2 has been considered for both ANR and SRP.
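For orientation, the anchored regression at the core of ANR, as formulated by Timofte, De Smet and Van Gool (2013), can be written as below. The notation is introduced here and is not taken from this paper: $y_F$ is the gradient feature of an LR patch, $N_{l,j}$ and $N_{h,j}$ are the LR and HR neighbourhoods (here at most 256 atoms) of the dictionary atom $j$ nearest to $y_F$, $x$ is the reconstructed HR patch, and $\lambda$ is a regularization weight whose value is not stated in this paper.

```latex
\begin{align}
P_j &= N_{h,j}\left(N_{l,j}^{T} N_{l,j} + \lambda I\right)^{-1} N_{l,j}^{T},
      \qquad j = 1, \dots, 2048 \\
x   &= P_j \, y_F
\end{align}
```

Since each projection matrix $P_j$ depends only on the trained dictionary, all 2048 matrices can be computed once offline, and super-resolving a patch reduces to a nearest-atom search followed by one matrix multiplication.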
In SRP, a 7×7 input patch was used for the training and testing phases of kernel based regression, while the output patch size was 5×5. The weight and kernel parameters taken were 0.05 and 0.5 x respectively. About 300 basis points were taken to initialize the optimization of the kernel based regression. The upper and lower threshold limits for classifying an edge pixel as 'major' or 'minor' during the post-processing stage were set to 2.2 and 0.95 respectively.
Each SR process was run 353 times to generate 353 HR bands, which were stacked together and assigned the coordinate system of the input LR image. This gave the final super-resolved output.
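The band-wise workflow described above can be sketched as follows. The sketch is illustrative only: `upscale_band_nn` is a stand-in that replicates pixels (nearest neighbour scaling), not ANR or SRP, and all function names are assumptions made here. Any single-band SR routine with the same call signature could be plugged in instead.

```python
def upscale_band_nn(band, factor):
    """Stand-in upscaler: nearest neighbour replication of every pixel.

    This is NOT ANR or SRP; it only marks where an SR routine would be called.
    """
    out = []
    for row in band:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def super_resolve_cube(bands, sr_function, factor, pixel_size_m):
    """Run the SR routine once per band, stack the results and
    return the stacked cube together with the new pixel size."""
    hr_bands = [sr_function(band, factor) for band in bands]
    return hr_bands, pixel_size_m / factor


# Toy input: 3 bands of 2 x 2 pixels (the real input has 353 bands of 272 x 272).
toy_bands = [[[b + 0.0, b + 1.0], [b + 2.0, b + 3.0]] for b in range(3)]

hr_cube, hr_pixel_size_m = super_resolve_cube(
    toy_bands, upscale_band_nn, factor=2, pixel_size_m=8.1
)
```

With a scaling factor of 2, the 8.1 m input pixel becomes 4.05 m, and each 272 x 272 band would grow to 544 x 544 pixels over the same ground extent.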

Classification
The scaled-up image and the generated super-resolved outputs were classified using supervised SAM and SVM approaches. For the purpose of classification, training samples pertaining to the ten target classes were taken. The classes comprise five building rooftop materials (china mosaic, tin, concrete, Galvanized Iron (GI) sheet and tarpaulin), one pavement material (asphalt) and four natural surfaces (water, soil, vegetation and grass). These training samples represent the spectral pattern of the classes and help the classifier in assigning the image pixels to a particular class.
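The principle of SAM can be sketched as follows: each pixel spectrum is assigned to the class whose reference spectrum subtends the smallest spectral angle with it. This is a minimal sketch and not the software used in this study; the function names, the optional `max_angle` threshold and the four-band reference spectra are hypothetical.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between a pixel spectrum and a reference spectrum."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    # Clamp to [-1, 1] to guard against rounding just outside the acos domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_angle)


def sam_classify(pixel, references, max_angle=None):
    """Return the class with the smallest spectral angle, or None if
    even the best angle exceeds max_angle (pixel left unclassified)."""
    best_class, best_angle = None, float("inf")
    for name, reference in references.items():
        angle = spectral_angle(pixel, reference)
        if angle < best_angle:
            best_class, best_angle = name, angle
    if max_angle is not None and best_angle > max_angle:
        return None
    return best_class


# Hypothetical four-band reference spectra for two of the ten classes.
references = {
    "water": [0.08, 0.06, 0.04, 0.02],
    "vegetation": [0.05, 0.09, 0.04, 0.45],
}

# A vegetation pixel at half the brightness keeps the same spectral angle.
shaded_pixel = [0.5 * v for v in references["vegetation"]]
label = sam_classify(shaded_pixel, references)
```

Because the angle depends only on the direction of the spectrum and not on its magnitude, a uniformly darker or brighter pixel of the same material receives the same label.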
In the case of SVM, four types of kernels used for defining the hyperplane have been tested using the default parameters: radial basis function (RBF), linear, sigmoid (sig) and quadratic polynomial (poly).
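The four kernel types can be written in their commonly used forms as sketched below. The default parameter values and the software used are not stated in the text, so `gamma`, `coef0` and the sample values are illustrative assumptions; only the degree of 2 follows from the quadratic polynomial mentioned above.

```python
import math


def dot(x, y):
    """Inner product of two spectra given as equally long lists."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, gamma, coef0, degree=2):
    """Polynomial kernel; degree 2 gives the quadratic polynomial."""
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel based on squared Euclidean distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma, coef0):
    return math.tanh(gamma * dot(x, y) + coef0)


# Two hypothetical two-band spectra, used only to exercise the functions.
u, v = [1.0, 2.0], [3.0, 4.0]
k_linear = linear_kernel(u, v)
k_poly = polynomial_kernel(u, v, gamma=1.0, coef0=1.0)
k_rbf = rbf_kernel(u, v, gamma=0.5)
k_sigmoid = sigmoid_kernel(u, v, gamma=0.1, coef0=0.0)
```

The RBF kernel depends only on the distance between two spectra, so it equals 1 for identical spectra and decays towards 0 as they diverge, at a rate set by `gamma`.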

Accuracy Assessment
Accuracy assessment is a very important step in validating the classification results. An error matrix is used to identify the overall error and the misclassification occurring for each category. User's accuracy (UA), producer's accuracy (PA) and overall accuracy (OA) are calculated from this matrix.
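The accuracy measures derived from the error matrix, together with the Kappa coefficient reported in the results, can be computed as sketched below. This is a minimal sketch with a hypothetical three-class matrix of 100 validation points; it assumes rows hold the reference (ground truth) classes and columns the classified classes, and the function name is an assumption made here.

```python
def accuracy_metrics(matrix):
    """Compute OA, per-class PA and UA, and the Kappa coefficient.

    matrix[i][j] = number of validation points of reference class i
    that were assigned to class j by the classifier.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(n_classes)]
    reference_totals = [sum(row) for row in matrix]
    classified_totals = [
        sum(matrix[i][j] for i in range(n_classes)) for j in range(n_classes)
    ]

    overall = sum(diagonal) / total
    # PA: correctly classified points divided by the reference total of the class.
    producers = [d / r if r else float("nan")
                 for d, r in zip(diagonal, reference_totals)]
    # UA: correctly classified points divided by the classified total of the class.
    users = [d / c if c else float("nan")
             for d, c in zip(diagonal, classified_totals)]
    # Agreement expected by chance, from the marginal totals.
    chance = sum(r * c for r, c in zip(reference_totals, classified_totals)) / total ** 2
    kappa = (overall - chance) / (1 - chance)
    return {"OA": overall, "PA": producers, "UA": users, "kappa": kappa}


# Hypothetical error matrix for three classes and 100 validation points.
error_matrix = [
    [30, 3, 2],
    [4, 25, 1],
    [1, 4, 30],
]
metrics = accuracy_metrics(error_matrix)
```

Kappa discounts the agreement that would be expected by chance from the marginal totals alone, which is why it is lower than OA for the same matrix.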

RESULTS AND DISCUSSION
The results obtained as outcome of the methodology adopted in this research work are presented in the sub-sections to follow.

Super-Resolved Outputs
The super-resolved outputs generated have a spatial resolution of 4.05 m. Comparative analysis has been performed on the basis of visual inspection, basic statistics and computational time. One of the locations of patches taken for visual inspection is shown in Figure 2.

Urban Material and Land Cover Maps
Supervised SAM and supervised SVM have been employed for preparing urban material and land cover maps using the super-resolved outputs and the NN resampled image. The efficacy of the prepared maps has been assessed by validating 100 random points distributed over the classified results against 100 points collected as ground truth during a field visit to the study area. Visual inspection has also been performed for the purpose of intra-class, inter-classifier and inter-dataset comparison. As mentioned in Section 2.4, four types of kernels were employed for SVM using default parameters. The best SVM output was decided on the basis of the OA and Kappa coefficient results for each of the three datasets as well as the highest PA and UA values for each class under consideration. The best SVM output has then been used for inter-classifier and inter-dataset comparison. OA and Kappa coefficient results for the four kernels across the three datasets are reported in Table 1 below.
Table 1. OA and Kappa Statistics for Different Types of SVM Kernels
RBF based SVM gives the best results irrespective of dataset, reporting over 80% OA and a Kappa coefficient above 0.77 in each case. Also, for most of the features, whether natural or man-made, the RBF SVM classified outputs report the highest PA and UA, thereby affirming that RBF is the best kernel for performing SVM with the default parameters in the present context. Hence, the urban material and land cover map obtained using RBF kernel based SVM has been taken for comparative analysis.
The location of the patch taken for visual inspection of the classified results is shown in Figure 4, and magnified versions of the representative patches along with their Google Earth image are shown in Figure 5 and Figure 6 for SAM and SVM respectively. Visual examination of the urban material and land cover maps reveals that urban materials have been classified appropriately in the super-resolved outputs. However, a substantial amount of misclassification has taken place in the map generated from the NN resampled output, even though this resampling approach preserves the spectral information of the input dataset. For instance, china mosaic pixels appear in concrete structures, bare soil, and tin and GI sheet sheds in some patches, and vice versa, and water is classified as asphalt at a few places. Edges and boundaries of different man-made features can also be distinguished clearly in the SVM outputs, indicating that the enhanced spatial information in the super-resolved outputs has been taken into account. Irrespective of classifier, it can be observed that asphalt is mixed with the rooftop classes wherever the rooftop classes are accompanied by shadow. These portions of the image have been designated as asphalt instead of the respective rooftop material.
OA and Kappa coefficient values for the classification techniques used on the three datasets are shown in Table 2 and Table 3 below. The major rooftop building materials have also been classified well, with SVM showing the highest producer's and user's accuracies for china mosaic, tin and concrete. A similar pattern is reported for asphalt too. GI sheet reports moderate values, while tarpaulin shows the lowest values. This could be attributed to the occurrence of these materials as mixed pixels in the dataset and their consequent misclassification into other classes owing to the use of hard classifiers. As the accuracy values fall within acceptable limits, it can be reiterated that the spatial and spectral information has been preserved in the super-resolved outputs, with enhancements observed at many places in the scene.
Table 5. UA for Individual Class of Three Datasets
SVM gives better results because it does not need to estimate the statistical distribution of the feature classes; the classification model is established by margin maximization using only a few training pixels. It also has better generalization capability than other classifiers, producing the best results even from data with large dimensionality and a high amount of noise.

CONCLUSION
It can be concluded from this study that different built-up features like roads, railway tracks and buildings, which could not be detected in the input LR image, can be observed clearly upon super-resolution. It can also be said that SRP is the better of the two SR methods demonstrated in this study. The exercise of information extraction using super-resolved outputs has been a success, with the urban material and land cover map prepared using RBF kernel SVM over the SRP dataset yielding the best accuracy and visual examination outcomes.

Figure 1. Study Area
The study area occupies about 4.85 square kilometres (sq. km) in the eastern part of Ahmedabad, the largest city of Gujarat state in India. Extending from 22.99895292 N, 72.58969011 E to 23.01874265 N, 72.61159437 E, it is bounded by Kasturba Gandhi Marg and the adjoining walled city in the north, the industrial area of Rajpur-Gomtipur in the east, Gita Mandir Road in the west and the area of Maninagar in the south. The jurisdictional authority is the civic agency of Ahmedabad Municipal Corporation (AMC). The mixed environment centred on Kankaria Lake contains the maximum possible number of distinct features, such as roads, railway lines, a water body, and different types of vegetation and building rooftop materials.

Figure 3. Enlarged Version of Patches in NN Resampled Image and Super-Resolved Outputs

Table 2. OA Values for SAM and SVM

Table 3. Kappa Statistics for SAM and SVM
Lower overall accuracies and Kappa coefficients have been reported for SAM across all three datasets, ranging between 68%-71% and 0.63-0.67 respectively. On the other hand, SVM performed using the RBF kernel has given the best results in terms of overall accuracy and Kappa coefficient respectively across all the super-resolved and resampled outputs: 92.73% and 0.9119 for SRP, 80.64% and 0.7737 for ANR and 82.67% and 0.7937 for the NN resampled output. Class-wise PA and UA for the datasets Resample_NN, SRP Output and ANR Output across the employed classification techniques are shown in Table 4 and Table 5 respectively.

Table 4. PA for Individual Class of Three Datasets
The highlighted values indicate the best accuracies. A very high level of accuracy has been observed for the natural features of water body, soil and vegetation across all classifiers.