NO-REFERENCE IMAGE QUALITY ASSESSMENT FOR ZY 3 IMAGERY IN URBAN AREAS USING STATISTICAL MODEL

More and more high-spatial-resolution satellite images are being produced as satellite technology improves. However, the quality of these images is not always satisfactory for applications. Owing to complicated atmospheric conditions and the complex radiation transmission process involved in imaging, the images often suffer degradation. To assess the quality of remote sensing (RS) images over urban areas, we propose a general-purpose image quality assessment method based on feature extraction and machine learning. We use two types of features at multiple scales: one is derived from the shape of the histogram, and the other from natural scene statistics based on the generalized Gaussian distribution (GGD). A 20-D feature vector is extracted at each scale and is assumed to capture the quality degradation characteristics of RS images. We use a support vector machine (SVM) to learn to predict image quality scores from these features. For evaluation, we construct a medium-scale dataset for training and testing, with human subjects providing opinion scores for the degraded images. We use ZY3 satellite images over the Wuhan area (a city in China) to conduct experiments. Experimental results show that the predicted scores correlate with subjective perception.


INTRODUCTION
Chinese high-resolution remote sensing satellites generate a large number of remote sensing (RS) images every day. However, users still lack high-quality domestic RS data for applications. To change this situation, in addition to improving hardware, satellite image quality evaluation and enhancement are fundamental to commercial application.
In the optical image acquisition chain of RS satellites, the platform, the sensors, the atmospheric environment and the surface albedo all influence image quality. Of these factors, the impact of the satellite platform and sensors on geometric precision can be eliminated by sensor calibration, whereas the image quality degradation caused by path scattering and spectral mixing from the surrounding ground is difficult to eliminate, especially in urban areas. Compared with mountainous areas and cultivated regions, RS images of urban areas mainly feature artificial structures and surfaces such as buildings, roads, trees and lawns. RS city images often contain complicated structures, high-contrast objects and a large amount of information. Moreover, with industrial development and environmental pollution in cities, frequent fog and haze have made RS city image quality evaluation and improvement a very difficult task.
For one scene of an RS image, data quality can be divided into three parts: geometric quality, radiometric quality and the quality of auxiliary data. Geometric quality is generally evaluated by internal and external geometric positioning accuracy. Auxiliary data quality describes the RS image information in terms of data completeness, correctness and logic check results. The radiometric quality of the digital image expresses its ability to preserve the surface reflectance characteristics (Wang, 2014). Depending on whether human observers are involved, existing RS image quality evaluation is divided into two categories: subjective evaluation and objective evaluation. A great deal of previous work on objective RS image quality evaluation has concentrated on the following four aspects (Chen, 2011; Ma, 2014):
(1) Task- and application-oriented RS image quality evaluation. The most typical example is the U.S. National Imagery Interpretability Rating Scale (NIIRS), which is used to rate the quality of military RS images (Irvine, 1997). The quality of image fusion, the classification accuracy and the scale of surveying and mapping are all task oriented. However, whether an RS task is fulfilled successfully depends not only on the quality of the RS images but also on the data processing system and the visual interpretation done by operators.
(2) Methods based on imaging system performance evaluation. In addition to spatial, temporal, spectral and radiometric resolution, such methods focus on indices based on the Modulation Transfer Function (MTF) and on signal-to-noise ratio (SNR) assessment (Zhang, 2002; Ferzli, 2009). A good imaging system does generate high-quality RS images, but the ground targets and the imaging environment also play essential roles in RS image acquisition.
(3) Methods based on image feature extraction. Such methods are mostly inherited from the image processing field. They endeavour to find image features that represent the image quality degradation. Besides traditional image features such as mean, variance, entropy, edges, contrast and moments, recent features include JNB (Ferzli, 2009), CPBD (Narvekar, 2011), MLV (Bahrami, 2014), BRISQUE (Mittal, 2012) and RIQMC (Gu, 2016). From the point of view of information theory and signal processing, these features try to capture image edges and details, simulate image blurring and contrast degradation, and provide powerful tools for assessing image quality. However, the majority of these methods were designed for close-range visible images. They seldom consider the image degradation caused by atmospheric blurring, clouds, reflectance, sampling, quantization and other complicated conditions of the RS imaging process. It is therefore meaningful to adapt those methods to RS images and to explore features that effectively express the characteristics of RS images.
(4) Learning-based methods. Recent research includes a CNN method (Kang, 2014), a deep learning network (Gu, 2014), and a DOG model with random forest (Pei, 2015). These methods try to reveal the image quality degradation mechanism in feature space by training an empirical model; the resulting prediction model is then used to calculate the quality score of a test image. However, most of these methods depend strongly on the type of image distortion and place certain requirements on sample selection and sample numbers. In most cases the computational complexity is also high, which has restricted the use of such methods.
In addition, because of changing weather and air pollution, clouds and haze are often present in RS urban images, which leads to the loss of local feature information. RS urban image quality evaluation must therefore consider the degradation caused by cloud and haze. Existing cloud-index-based features for RS images adapt poorly to thin cloud and mist. Moreover, at least four bands are required to calculate the cloud index, which makes it unsuitable for panchromatic RS images.
In this paper we adopt a general-purpose framework to construct RS urban image quality indices from natural scene statistics (NSS) models of multispectral RS images. Inspired by the work of Moorthy (Moorthy, 2010a) and Mittal (Mittal, 2013), we establish a model that learns to predict human judgments of image quality from databases of human-rated degraded images. Image degradation caused by cloud and haze is the primary concern, but the model is not limited to it. We do not explicitly seek to characterize the structure of degradation using local filters, but instead use concepts from NSS to produce an approach that is easily extended to other kinds of distortion. Once trained, our algorithm requires no further knowledge of the distortion affecting the RS images to be assessed.
The first contribution of this work is a modular framework for quality assessment of multispectral RS urban images degraded by cloud and mist. The modularity of the proposed method means that the approach is extensible: distortion categories beyond cloud and mist can easily be accommodated. The second contribution is an image quality index for RS urban images. Because of the diversity of ground types, the multi-scale structures and the irregular spatial distribution of artificial buildings in urban areas, traditional filter-based feature representation is ineffective. Our model is founded on perceptually relevant spatial-domain NSS features extracted from local image patches, which effectively capture the essential low-order statistics of RS images. We construct a dataset of image patches extracted from images acquired by the multispectral sensor on board the Chinese ZY3 satellite. We invited remote sensing image interpretation experts to give a quality score for each image patch, and tested our algorithm on the dataset. We demonstrate that our algorithm performs well in terms of correlation with human perception.

Image normalization based on NSS
Compared with other images, RS urban images have rich feature categories, complex structures and abundant information content. To extract perceptually relevant spatial NSS features from the image patches, the spatial NSS model (Moorthy, 2010b) that we use begins by preprocessing the image with local mean removal and divisive normalization:

\[ \hat{I}(i,j) = \frac{I(i,j) - \mu(i,j)}{\sigma(i,j) + 1} \tag{1} \]

where $i$ and $j$ are spatial indices, and

\[ \mu(i,j) = \sum_{k=-K}^{K} \sum_{l=-L}^{L} w_{k,l}\, I(i+k,\, j+l) \tag{2} \]

\[ \sigma(i,j) = \sqrt{\sum_{k=-K}^{K} \sum_{l=-L}^{L} w_{k,l} \left[ I(i+k,\, j+l) - \mu(i,j) \right]^2} \tag{3} \]

estimate the local mean and contrast respectively, where $w = \{w_{k,l}\}$ is a symmetric Gaussian weighting function sampled out to 3 standard deviations ($K = L = 3$) and rescaled to unit volume. The normalized image (1) has been observed to reliably follow a Gaussian distribution when computed from natural images.
Consider, for example, one scene of an RS urban image of size 8856 by 8476. Image patches of size $P \times P$ are cropped randomly from the entire image, and the normalization is computed on the image patches.
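The local mean removal and divisive normalization described above can be sketched in a few lines of Python. This is a minimal illustration that uses only the standard library, not the implementation used in the experiments. The function names `gaussian_window` and `mscn`, the Gaussian standard deviation of 1 pixel (so that K = L = 3 spans three standard deviations), the stabilizing constant of 1 and the replicate-edge border handling are all assumptions made for the example.

```python
import math


def gaussian_window(half=3, sigma=1.0):
    """Return a (2*half+1) x (2*half+1) Gaussian window that sums to 1."""
    w = [[math.exp(-(k * k + l * l) / (2.0 * sigma ** 2))
          for l in range(-half, half + 1)]
         for k in range(-half, half + 1)]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]


def mscn(patch, half=3, sigma=1.0, c=1.0):
    """Local mean removal and divisive normalization of a 2-D patch."""
    rows, cols = len(patch), len(patch[0])
    w = gaussian_window(half, sigma)

    def pixel(i, j):
        # Assumption: pixels outside the patch replicate the nearest edge.
        return patch[min(max(i, 0), rows - 1)][min(max(j, 0), cols - 1)]

    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            nbrs = [(w[k + half][l + half], pixel(i + k, j + l))
                    for k in range(-half, half + 1)
                    for l in range(-half, half + 1)]
            mu = sum(wt * v for wt, v in nbrs)                # local mean
            var = sum(wt * (v - mu) ** 2 for wt, v in nbrs)   # local variance
            out[i][j] = (patch[i][j] - mu) / (math.sqrt(var) + c)
    return out
```

For a patch of constant gray level the sketch returns zeros everywhere, because the local mean equals every pixel value. For full 200×200 patches an array library would be much faster than these explicit loops.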

Multi-scale image quality indices
Given a collection of RS image patches, their quality is characterized by the features computed from each patch. The histogram indicates the probability distribution of image gray levels and, when computed over multiple scales, reveals statistical features in scale space. The types and degrees of image degradation caused by blur or cloud cover generally affect the mean and shape of the local histograms in diverse ways. As shown in Fig. 1, the image patch of Region 1 from a ZY3 multispectral image of Wuhan city has relatively high quality (according to an RS image interpretation expert). In Region 2, fog and blur are present in band 1 and band 2, so their histograms (blue and green plots in Region 2 (e)) are compressed into a very narrow range with high peaks owing to the loss of structural properties. This sharply peaked pattern in the histogram represents the influence of imaging noise or fog over the city. For band 4 (black plot in Region 2 (e)), in contrast, the histogram retains a salient structural shape because the infrared band is less affected by fog, so the image quality of band 4 is relatively high. To express this characteristic, we use the mean and skewness of the histogram as descriptive quality features. For each image patch we extract these two features ($f_1$, $f_2$) at each scale, yielding $2 \times NS$ features for NS scale levels.
Prior studies of NSS-based image quality have shown that the generalized Gaussian distribution effectively captures the behaviour of normalized natural images (Mittal, 2013). It has been observed that the model is violated when images are not derived from a natural source or when natural images are subjected to unnatural distortions. The degree of deviation can be indicative of the severity of perceptual image degradation. We therefore use GGD-based parameters as features to describe the changes in the gray-level distribution of urban images caused by system noise and irregular cloud and mist coverage.
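As an illustration of the two histogram features, the sketch below computes the mean and skewness of the gray levels of a patch at several scales. It is a hedged example and not the code used in the paper. Skewness is taken here to be the standardized third central moment, the low-pass filter applied before downsampling is assumed to be a 2×2 block average, and the function names are hypothetical.

```python
def mean_and_skew(patch):
    """Mean and skewness (standardized third moment) of the gray levels."""
    values = [v for row in patch for v in row]
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return mean, skew


def downsample(patch):
    """Halve each dimension by 2x2 block averaging (assumed low-pass filter)."""
    rows, cols = len(patch) // 2, len(patch[0]) // 2
    return [[(patch[2 * i][2 * j] + patch[2 * i][2 * j + 1]
              + patch[2 * i + 1][2 * j] + patch[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(cols)]
            for i in range(rows)]


def histogram_features(patch, num_scales):
    """Return [f1, f2] for every scale: 2 * num_scales values in total."""
    features = []
    for _ in range(num_scales):
        features.extend(mean_and_skew(patch))
        patch = downsample(patch)
    return features
```

A patch whose histogram is compressed into a narrow peak with a long tail gives a skewness of large magnitude, whereas a symmetric histogram gives a skewness near zero.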
The generalized Gaussian distribution (GGD) with zero mean is given by:

\[ f(x; \alpha, \sigma^2) = \frac{\alpha}{2\beta\,\Gamma(1/\alpha)} \exp\!\left( -\left( \frac{|x|}{\beta} \right)^{\alpha} \right) \tag{4} \]

where

\[ \beta = \sigma \sqrt{\frac{\Gamma(1/\alpha)}{\Gamma(3/\alpha)}} \]

and $\Gamma(\cdot)$ is the gamma function:

\[ \Gamma(a) = \int_0^{\infty} t^{a-1} e^{-t}\, dt, \quad a > 0 \]

The parameter $\alpha$ controls the 'shape' of the distribution. For example, $\alpha = 2$ yields a Gaussian distribution and $\alpha = 1$ yields a Laplacian distribution. The parameters of (4), $(\alpha, \sigma^2)$, can be estimated using a moment-matching approach (Sharifi, 1995). The mean of the distribution can also be used as a statistic. For each orientation four features are obtained; extracting estimates along the four orientations yields 16 parameters ($f_5, \ldots, f_{20}$). All features are computed at NS scales to capture multi-scale behaviour, by low-pass filtering and downsampling by a factor of 2, yielding a set of $20 \times NS$ features. This feature vector characterizes the distortion that the image patch is subject to.
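The moment-matching estimate of the GGD parameters can be illustrated as follows. For a zero-mean GGD the ratio $(E|x|)^2 / E[x^2]$ depends only on the shape parameter and equals $\Gamma(2/\alpha)^2 / (\Gamma(1/\alpha)\,\Gamma(3/\alpha))$, so the shape can be found by matching the sample ratio against this curve. The sketch below is a minimal version under stated assumptions: the search grid (0.2 to 10 in steps of 0.001) and the function names are chosen for the example and are not taken from the paper.

```python
import math


def ggd_ratio(alpha):
    """Theoretical (E|x|)^2 / E[x^2] of a zero-mean GGD with shape alpha."""
    return math.gamma(2.0 / alpha) ** 2 / (
        math.gamma(1.0 / alpha) * math.gamma(3.0 / alpha))


def estimate_ggd(samples, lo=0.2, hi=10.0, step=0.001):
    """Estimate (alpha, sigma^2) by matching the moment ratio on a grid."""
    n = len(samples)
    sigma_sq = sum(x * x for x in samples) / n    # second moment (zero mean)
    mean_abs = sum(abs(x) for x in samples) / n   # first absolute moment
    target = mean_abs ** 2 / sigma_sq
    steps = int(round((hi - lo) / step))
    # Assumption: a simple grid search stands in for a root finder.
    alpha = min((lo + k * step for k in range(steps + 1)),
                key=lambda a: abs(ggd_ratio(a) - target))
    return alpha, sigma_sq
```

Applied to the normalized coefficients of a patch, the two returned values would play the role of the GGD shape and variance features. Gaussian-like coefficients give a shape estimate near 2, and heavier-tailed coefficients give smaller values.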

SVM training and test
Machine learning has long been applied in the field of image quality assessment. To map feature vectors to predicted quality scores we use a support vector machine (SVM). SVMs are popular as classifiers because they perform well in high-dimensional spaces, avoid over-fitting and have good generalization capabilities. In our demonstration, to produce a quality index, support vector regression (SVR) is used to perform this regression (Schölkopf 2000). Specifically, an SVR is trained on the features and quality scores of the training-set image patches to learn the mapping from the feature space to subjective quality. When presented with a test image, the algorithm produces a quality score that correlates with human perception.

Dataset
To evaluate the performance of the proposed algorithm we construct a medium-scale dataset for training and testing. The dataset was derived from a set of source RS images acquired by the multispectral camera on board the ZY3 satellite. ZY3 is China's first high-resolution optical mapping satellite for civil use. It was launched on 9 Jan. 2012 and is equipped with four optical cameras: one panchromatic TDI CCD camera of 3.6 m resolution for nadir view, two panchromatic TDI CCD cameras of 2.1 m resolution for front and rear view, and one multispectral camera of 5.8 m resolution for nadir view. We used 4-band multispectral Level 1 images as the source, without enhancement or radiometric correction. Some information on one of the images used is shown in Table 1.

Table 1. Information on one source image used for the dataset

A total of 250 patches of 200×200 pixels from each band were cropped from three source images. When cropping, we ensured that the sub-images contain a variety of ground types such as water, vegetation, artificial buildings and roads, and have different levels of noise and varying degrees of fog coverage.
The subjects who took part in scoring are academic staff in remote sensing science and technology at the School of Remote Sensing and Information Engineering of Wuhan University, China. On average, three subjects rated each image. Each subject was individually told the goal of the scoring. To highlight the impact of fog on image quality, we assumed that images containing mist or fog have low scores. Therefore, among the four bands of one image, band 4 always had the highest score and band 1 the lowest. The subjects reported their judgements of quality by dragging a slider on a quality scale according to their estimate of image contrast, clearness and, more importantly, mist and fog content. The score is a decimal number in the range 0-1. All the scores for the same image were averaged. A raw difference score for an image was considered to be an outlier if it was outside an interval of deviations about the mean score for that image. After rejecting 33 outliers, a total of 217 images and their corresponding scores were kept and constitute our dataset. Figure 2 shows one example of the subjective scores for the blue, green, red and infrared bands of a randomly selected image patch.

Results
For each of the 217 images (four bands per image) in the dataset, we performed the normalization and calculated the 20-D features at NS scales for each band, as described in Sections 2.1 and 2.2. The high dimensionality of the features improves precision but also increases the running time of the algorithm. We also note that the quality of band 4 is always higher than that of the other bands, which is consistent with the scoring rules used in the subjective score acquisition process. With NS = 5 and 75% of the samples used for training, the linear correlation coefficient and the sum of absolute differences reach their best values in the experiments. The LCC values (0.8068, 0.8166, 0.9016 and 0.9212) show that the algorithm matches human subjective opinions of image quality well.
We believe that the current approach may not be ideal, and the algorithm should be compared more widely with other methods. Future work will involve finding more effective quality indices suited to the characteristics of RS images of urban areas. More experiments should also be carried out under our framework for comparison.

CONCLUSION
In this paper, we proposed a general-purpose image quality assessment method for RS images of urban areas. This is achieved by using two types of image features at multiple scales: one is derived from the shape of the histogram, and the other from natural scene statistics. A 20-D feature vector is extracted at each scale and is assumed to capture the RS image quality degradation caused mainly by cloud and mist. We use an SVM to learn to predict image quality scores from these features. The results show that the predicted scores correlate with subjective perception.
Figure 1. Two regions from a ZY3 multispectral image (four bands with spatial resolution 5.8 m) of Wuhan city and their histograms. The image patches are 200×200 and have been stretched and zoomed for display purposes.
The estimated GGD parameters are used as image quality features ($f_3$, $f_4$). The signs of the normalized image (1) have been observed to follow a regular structure, which is disturbed by distortion. This deviation can be captured by analysing the sample distribution of the products of pairs of adjacent pixel values computed along the horizontal, vertical and diagonal orientations.
The SVR parameters are set using 5-fold cross validation on the training set of images. We use the LIBSVM package (Chang 2011) to implement the SVR. All experiments use the radial basis function (RBF) kernel.

Figure 2. Four bands of an image patch with their subjective scores shown in the titles. Upper left: blue band; upper right: green band; lower left: red band; lower right: infrared band. Note that the infrared band has the highest quality score because it is less sensitive to mist.

Since our method is training based, the database needs to be partitioned into training and test sets. The training set is used to train the regression model and the test set is used to evaluate the performance of the algorithm. Training and testing were carried out on the dataset of each band separately. In our trial about 60% of the samples and their associated human subjective scores are used for training and the remaining 40% for testing. The training and test sets are disjoint and do not share content. Therefore, our algorithm is independent of content and of specific image degradation types. The performance indices are the Pearson (linear) correlation coefficient (LCC) and the sum of absolute differences (SAD) between the predicted quality scores and the subjective scores provided by the database. Better correlation with human perception means a smaller SAD and an LCC closer to 1. We designed three sets of experiments with training sets of different sizes. The first experiment uses a randomly selected 75% of the 217 images as the training set and the remainder as the test set; the second and third use 70% and 65% of the samples for training. The results show that the regression accuracy increases with the number of training samples. For NS = 3, the correlation coefficient for band 1 with 65%, 70% and 75% of the samples used for training is 0.4868, 0.6539 and 0.7297, respectively. For a given training and test split, the prediction accuracy increases with the number of scale levels.
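The two performance indices and the random partition described above can be written down directly. The sketch below is a minimal standard-library illustration and not the evaluation code used in the experiments; the function names and the fixed random seed are assumptions made for the example.

```python
import math
import random


def lcc(predicted, subjective):
    """Pearson linear correlation coefficient between two score lists."""
    n = len(predicted)
    mp = sum(predicted) / n
    ms = sum(subjective) / n
    cov = sum((p - mp) * (s - ms) for p, s in zip(predicted, subjective))
    var_p = sum((p - mp) ** 2 for p in predicted)
    var_s = sum((s - ms) ** 2 for s in subjective)
    return cov / math.sqrt(var_p * var_s)


def sad(predicted, subjective):
    """Sum of absolute differences between two score lists."""
    return sum(abs(p - s) for p, s in zip(predicted, subjective))


def split_indices(num_samples, train_fraction, seed=0):
    """Randomly split sample indices into disjoint train and test lists."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # seed is an assumption, for repeatability
    cut = int(round(train_fraction * num_samples))
    return indices[:cut], indices[cut:]
```

With 217 samples and a training fraction of 0.75, `split_indices` assigns 163 indices to training and 54 to testing. Because SAD is a sum over the test samples, it grows with the size of the test set, so SAD values are only directly comparable between runs that use test sets of the same size.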