SENTINEL-5P NO2 DATA: CROSS-VALIDATION AND COMPARISON WITH GROUND MEASUREMENTS

Sentinel-5P (S5P) data provide information on atmospheric pollutants daily, and, for higher latitudes, consequent orbits partially overlap the same day. Provided clear atmospheric conditions, these data can provide insights on emission hotspots and on spatial distribution of critical air quality issues. The purpose of this work is to analyse several aspects of NO2 data from S5P over the years 2019, 2020 and 2021, in particular: (i) yearly average values between S5P data and 624 ground measurement stations were tested for correlation; (ii) 387 pairs of images from overlapping orbits on the same day were used to test for correlation on consecutive images with four different methods – simple linear regression over all valid cell values across the two images, over a subset with a low cloud fraction, and linear and tree-based methods using multiple predictors; (iii) local maxima values extracted from yearly NO2 emission maps were analysed to check potential hotspots of NO2 emissions. Results show that ground measurements correlate with S5P values, with r-squared values of 0.37 and 0.43 and RMSE of 7.4 and 8.6 μmol/m respectively for 2019 and 2020. Simple linear regression of overlapping consequent images returned average and standard deviation (sd) on r-squared respectively of 0.50(sd=0.21) and for RMSE of 11.3(sd=4.2) μmol/m. Points from local maxima clearly detected 19 specific positions in large cities or nearby industrial areas, mostly in the north of Italy, with average NO2 values above 90 μmol/m in some cases consistently over the three years, proving that S5P imagery is a valid index for spatial distribution of NO2 concentration and air quality.


INTRODUCTION
Monitoring air pollution is one of the many drivers of earth observation (EO), as industrialized areas pollute the atmosphere with several emission sources, from industry to heating and vehicles. The paradox is that the sources of pollution have improved our wellbeing from one point of view (better heating, transportation, industry), but introduced harm in the long term do to the impact on health.

Air pollution effect on health
The main atmosphere pollutants that are of interest for human and environmental health in general are the following: nitrogen oxides (NOX), non-methane volatile organic compounds (NMVOCs), sulphur oxides (SOX), ammonia (NH3) and carbon monoxide (CO); particulate matter (PM) emitted directly to the air (primary PM): with a diameter of 2.5 μm or less (PM2.5; also called fine PM); PM with a diameter of 10 μm or less (PM10); total suspended particulates (TSPs); black carbon (BC), the most strongly light-absorbing component of PM (additional) (European Environment Agency, 2019). It is a proven fact that "pollution is the largest environmental cause of disease and premature death in the world today" and that air pollution specifically, "… endangers planetary health, destroys ecosystems, and is intimately linked to global climate change. Fuel combustion-fossil fuel combustion in high-income and * Corresponding author, filippo.tonion@phd.unipd.it middle-income countries and burning of biomass in low-income countries-accounts for 85% of airborne particulate pollution and for almost all pollution by oxides of sulphur and nitrogen. Fuel combustion is also a major source of the greenhouse gases and short-lived climate pollutants that drive climate change. Key emitters of carbon dioxide, such as electricity-generating plants, chemical manufacturing facilities, mining operations, deforestation, and petroleum-powered vehicles, are also major sources of pollution. Coal is the world's most polluting fossil fuel, and coal combustion is an important cause of both pollution and climate change." (Landrigan et al., 2018).
The main impacts of air pollution are on respiratory and heart disease. Regarding nitrogen dioxide (NO2), the World Health Organization (WHO) and European Environmental Agency (EEA) put the limit for human health to a concentration to 21 ppb or 40 µg/m 3 . Numerous projects link human disorders to air pollution, we report here only a few that are representative of the wide variety of impacts. In pregnancy (Ferrari et al., 2020), in asthma and lung function (Rusconi et al., 2011) and others that indirectly impact the human body. NO2 specifically was found to be linked to paediatric asthma also in cities with NO2 emissions lower then the guideline from the WHO (Achakulwisut et al., 2019).
Health organizations have highlighted the need for monitoring pollution with ground sensors and remote sensing. Ground sensors are often spatially distributed on areas that are considered hot-spots, such as industrial areas and large urban areas. They provide important information, but are not evenly distributed in space, with a bias. Remote sensing fills this gap by providing distributed data, but has other drawbacks such as confounding factors in the atmosphere such as cloud cover.

Remote sensing for atmospheric pollution
Satellite images that provide information on nitrogen dioxide (NO2) concentrations are available as open data from the National Aeronautics and Space Administration (NASA) and the European Space Agency (ESA). These two agencies provide data respectively with the Ozone Monitoring Instrument (OMI) sensor on the Aura satellite (minimum pixel size of 13x24 km 2 ) and the Tropospheric Monitoring Instrument (TROPOMI) sensor on the Sentinel-5 Precursor (S5P) satellite (minimum pixel size of 5.5x3.5 km 2 ).
The COVID-19 lockdowns were one of the most studied testbeds for checking the effect on atmospheric pollution. Trivially lockdowns decreased drastically some sources of pollution, such as vehicle and industrial (partially), and less other such as household heating. Results from remote sensing data show a significant reduction between the months of January and February in eastern and central China, and between February and March in northern Italy (European Space Agency, 2020). Air quality went back to previous levels when economic activities resumed to "business as usual" (Filonchyk and Peterson, 2020).

Sentinel-5 Precursor
In support of services related to air quality, in the frame of the Copernicus Programme, ESA implemented the Sentinel-5 Precursor (S5P) mission. The S5P satellite has been successfully launched on Friday, 13th October 2017 and completed its commissioning phase on 24th April 2018, and is currently in its operational phase.
The TROPOMI instrument carried by the Sentinel-5P satellite, which has a swath of ~2600 km and covers the whole planet in a single day with a sun-synchronous polar orbit at an altitude of ~824 km. It thus provides daily coverage at ~13:30 local solar time. The satellite actually passes the equator (orbits) 14 times per day, with small gaps at the equator and overlaps at higher latitudes. The exact same image geometry is achieved every 29 days (orbit cycle) with 412 orbits.

S5P Validation:
Validation of NO2 concentration estimates from S5P in the tropospheric segment of the atmosphere is not a simple task, as direct measurement of the number of molecules in large areas is not possible. It can be compared with other estimations with other ground-based instruments which have been validated and proven to provide good fits. One approach is to use differential light absorption spectra measured from UV and visible-light spectrometers. Each measured spectrum is compared with a reference from the spectrum measured from direct daily noon zenith light.
Spectral fitting software and inverse models are then used to provide atmospheric trace gas abundances. One such test to validate S5P NO2 concentration values over a city in Belgium is provided in (Dimitropoulou et al., 2020), which used the method reported in (Friedrich et al., 2019). The authors also report the expected error budget, which is a key information for further comparison with other methods; the total uncertainty ranged between 10% to 14% depending on the spectral range that use used in the inversion model. Results from (Dimitropoulou et al., 2020) varied depending on the season, with an overall underestimation of S5P measures in this particular case.
Different sources of uncertainty in are discussed thoroughly in (Boersma et al., 2004) and are here briefly mentioned. Clouds are a major source of error, which is largely, but not completely, accounted for by removing pixels with cloud cover. Nevertheless, there can be light clouds or pixels only partially covered that are still considered valid, but contribute to overall uncertainty. Surface albedo is another source of uncertainty of the NO2 values that are derived from S5P, as sensitivity of NO2 air mass factor (AMF) is markedly high; at low albedo values (0.0 to 0.2) 12% change in AMF for a 0.015 albedo change. ESA's Validation Data Analysis Facility (VDAF) for NO2 reports a bias and dispersion respectively of -34% and 2.7 Pmolec/cm 2 (~44.8 µmol·m -2 ).

Study area
The study area is the Italian peninsula. Its complex topography provides varied scenarios. The northern part is highly industrialized, with flat terrain and relatively slow air flux, which makes air quality an issue. Figure 2 below shows clearly the difference of the northern area with respect to the rest, on the worst period of the year, which is across the winter season.

Figure 2.
Study area with ground stations.

Sentinel-5P data:
In this investigation we use S5P data from Google Earth Engine (GEE). These data were processed by GEE from L2 to L3 by binning to a grid at 1.1 km resolution. The tropospheric_NO2_column_number_density band was used. For this product all pixels with a QA value below 75% were removed in the process. This means that only cloud-free and weakly cloudy pixels satisfy the recommended quality assurance. Each image is also tagged by ESA with two flags related to quality, "Processing Status" and "Product Quality"; both can have two values, "nominal" or "degraded". Images with "degraded" flags were not considered and were removed.
The GEE product does not include the "nitrogen dioxide tropospheric column precision" which provides the error estimate originating from the spectral fit and other retrieval aspects , which would add information to the data and the process itself. ESA's Validation Data Analysis Facility (VDAF) for NO2 reports a bias and dispersion respectively of -34% and 2.7 Pmolec/cm 2 (~44.8 µmol·m -2 ). Considering the orbital cycle of 14 orbits per day, this means that subsequent orbit stripes are a bit more then 1.5 hours apart and that high latitudes overlap partially. Figure 3 below shows the overlapping area over the study area, the Italian peninsula. Due to the orbit geometry, and to the way that GEE structured the data tiles, the study area intercepted several thousand datasets between 2018 and 2021, but many of them consisted in marginal data. Via a GEE script we created a filter to keep only images that did not have the "degraded" product quality flag and that contribute with more than 50% of data pixels in the study area. This resulted of a total of 1617 images, distributed as seen in To make the process of creating the image stack of S5P and harmonizing the data fully replicable and open, all data and code were processed using R and GEE Javascript on the earthengine API and are available upon request.

S5P processing:
The daily S5P images from GEE were stacked and aggregated to three single rasters per year with the following metrics: (i) annual average, (ii) median and (iii) 75 th percentile values. The years taken in consideration are 2019, 2020 and 2021.
To analyse potential emission points, local maxima over a 5x5 cells window (25x25 km) were extracted from each of the 2019, 2020 and 2021 averaged rasters. The result are coordinates that represent estimated NO2 hotspots sources and are further discussed in the next sections.

Overlapping orbits processing:
Over almost 3.5 years, 1617 orbit strips (see Figure 3 and nevertheless most should be highly correlated to a linear regression. Eq. 1 below represents the intersection of the twoconsequent image rasters with NO2 values, with and without a cloud fraction filter. Where In is the set of indices of cells in image n that have a valid real number (i.e. is not null/NaN). Jn is a subset from In where cloud fraction (Cf) is below 0.2. It1 and I t2 are the sets of valid cell indices of the two dates being compared, thus A is the resulting set of cell indices with valid values in both images and B is a subset of A that keeps only pixels with low values of cloud fraction. A and B indices are then used to extract the two sets of S5P values at the two consecutive times t1 and t2 as vectors (V) of size |A| and |B| respectively, and resulting in two matrices O and P with size 2·|A| and 2·|B| respectively. The next step consisted in splitting the two sets, with all valid values (O) and with low cloud fraction (P), in two equal parts, one for fitting (training) and one for validation.
The sets were further analysed for each single pair and also all together by simple linear regression, multivariable linear regression and machine learning with the following four combinations in table below. 1) Simple: simple linear regression between values from the two dates: Vt2 = b1Vt1 + a 2) Low clouds: same as (1) but with the indices in B (see eq. 1), i.e. the cells with cloud fraction below 0.2. In this case we want to test to see if the regression goodness of fit improves when only cloud-free (almost) areas are considered. 3) Multivariable: multivariable linear regression, adding to (1) three other predictor variables: where Hasl is height above sea level of the cell.

4)
Random forest: regression with a machine learning method, random forest, is applied with multiple variables like in (3). This was done to test if a non-linear nonparametric method can significantly improve the prediction. Regression results from the four approaches listed above and for multiple consequent image pairs were compared using the coefficient of determination (r-squared) and root mean square of residuals/errors (RMSE).
Ground stations and municipality data: Spatial association to the single coordinate of each station was complicated as the coordinates of the position of the measuring station are not included in the data. There is only a reference to the town were the station is located. Two approaches were therefore tested for geolocating the measurements: (1) reverse geocoding using an online service (Microsoft Bing); (2) joining each measuring station to the area of a municipality using the identification code used by the Italian administration (ISTAT code) which is included in the data. The first approach consists in providing the geocoding service a string consisting in the concatenation of the name of the measuring station and the name of the municipality. The name of the measuring station is usually related to the area, taking the name of the street or of the specific location. It must be noted that this method has to be checked manually for errors that are caused by a name of the measuring station that is not in any way connected to the spatial location. The second approach is quite trivial and provides an accurate match to the municipality area, but not to the exact coordinate of the measuring station. S5P average yearly values were aggregated to each area of the municipality using average, median and 75 th percentile.

Overlapping image pairs
Over almost 3.5 years, 1617 orbit strips (see Figure 3) overlap the study area. Of these, 393 pairs of consequent orbits overlap on the same day, but only 387 of these resulted to have more than 1000 valid pixels, so were further analysed. The final dataset consisted in two columns, one with all cell values from earlier satellite overpass and the second column with cell values the later overpass. The exact time difference between two consecutive images is 101.5 minutes.
Regression on all pairs together resulted in RMSE of 12.4 µmol·m -2 , r-squared of 0.74 and 40% or relative RMSE. Figure  4 shows the scatterplot and regression results. Each pair was tested for goodness of fit using the 4 combinations described in Table 2. Further analysis was conducted over all data together. Coefficient of determination and root mean square of residuals are calculated to assess how close each pair is, with some examples in Figure 6 and Figure 7.  Table 2: top left is simple linear regression (1) between NO2 values between the two dates; top right is regression with a subset of cells with a cloud fraction less then 0.2 (2); bottom left is regression with the NO2 values of one date as dependent variable with four independent predictors (NO2 values of the other date, cloud fractions of both dates and height above sea level of pixel) (3); bottom right is regression with random forest using the same set of independent predictors as the previous set (4).

Figure 7.
An example with a lower coefficient of determination and lower values of NO2 concentration. The figure structure is the same as described in Figure 6.
Distribution of r-squared and RMSE values resulting from regression of each pair of images from the four regression types are tested for significant difference with Mann-Whitney U test. This test was used because distributions are not normally distributed, as proven from testing the data with the Shapiro-Wilk test. Figure 8 below reports the results.  Table 2) with letters accounting for significant difference at 99% confidence level from the Mann-Whitney U test. Boxplot is interquartiles and median. Red dot and line are mean and standard deviation.

NO2 ground stations vs S5P data
Two approaches were chosen to analyse ground data: single station and aggregating station to municipality with a single average value. As mentioned in the methods section this was necessary because coordinates of single stations were estimated The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France from reverse geocoding; this method is not 100% accurate and might have given a few wrong results, which are discussed in the next section. On the other hand, assigning the municipality to the ground station is more reliable, but average in many cases more values to a large area. After aggregation to the polygon of the municipality, 394 and 425 municipalities have valid yearly average values of NO2 for 2019 and 2020 respectively.

Single station NO2 vs S5P:
The    (Table 3) for 2019 (left) and 2020 (right) from ground measures at station coordinates.

Municipality aggregated NO2 vs S5P:
The table below summarizes the correlation values and RMSE values. As can be seen by comparing with Table 3, the values are generally higher.

Local maxima hotspots:
The number of local maxima values around a 25 km by 25 km radius which are above 40 µmol/m 2 and 60 µmol/m 2 are respectively 443 and 167. They are located as depicted in Figure 11. Table 5 shows the 19 cities that have values above 90 µmol/m 2 . It must be noted that some cities are nearby large metropolis, like the first one in Table 5 is in the north of Milan.

DISCUSSION
The analysis of overlapping image pairs provides some insights on the uncertainty which seems to be 12.1 µmoles/m 2 40% which is very close to the precision requirement of 0.7 Pmolec/cm 2 (~11.6 µmol·m -2 ) as reported in the validation work of S5P NO2 data from Compernolle et al., (2018). The same work by (Compernolle et al., 2018) compared tropospheric NO2 with ground stations with NDACC UV-Vis Multi-Axis DOAS, Multi-Axis Differential Optical Absorption Spectroscopy (MAX-DOAS) measurements, finding higher values of RMSE in some stations, and lower RMSE in stations located in areas with clean air. Further correlation values between ground stations and S5P in Compernolle et al., (2018) show a r-squared value of 0.55, that is higher then what was found in this study, i.e. ranging from 0.33 to 0.30 when compared to single ground stations. It must be noted that the comparison in this study is with ground stations that use a very different measuring method, based on point-in-space measures in contrast with the MAX-DOAS measurements that use spectral models, that are closer to the way that S5P measures are provided.
Similar work, but comparing ground stations using ground-based MAX-DOAS instruments in (Verhoelst et al., 2021) reported a negative bias for the S5P measurements of tropospheric column data, of typically −23 % to −37 % in clean to slightly polluted conditions, but reaching values as high as −51 % over highly polluted areas. This is in line with what is reported in this work at Figure 5, where it can be seen that higher values (i.e. more polluted) have higher residuals. Of course, residuals are not the same as bias, and the methods are very different, but it can still be noted that if precision decreases over higher values, it is likely that higher biases might result when comparing with ground measurements.
Another similar work was done by (Ialongo et al., 2020) finding underestimation with respect to a spectrometer (Pandora) and to a single ground measuring station nearby. It must be noted also in this case that the spectrometer data is fitted with the DOAS model. Interestingly in this case also the measuring station data are used, that report data ppb and in µg/m 3 . Results report that the best correlation are, like in other investigations (Oxoli et al., 2020), at overpass time of the satellite (around noon). Ialongo et al., (2020) report a coefficient of correlation (r) of 0.69, i.e. rsquared 0.48, which means higher values than found in our study, which reach maximum 0.43. Oxoli et al., (2020) concentrated their study in the Lombardia region of Italy, and found an average r-squared value ranging between 0.37-0.72 (standard deviation of Pearson correlation). The results in this study fall on the lower side of this range, probably due to the much larger study area that is taken in consideration.
NO2 emission from agriculture is also another interesting aspect that drives a more careful assessment of farming practices (Mozzato et al., 2018), which are one of the many anthropic impacts that can be mapped to assess their effect on human but also on biodiversity in general (Piragnolo et al., 2014).
Regarding the ground measurements at national level, it is to note the importance of a national or, even better, European harmonization of data that are collected from the various environmental agencies. There are efforts in this sense, as all of the regional agencies (ARPAs) are collected to SNPA, but data are not harmonized and it is quite difficult to retrieve it. It is a trivial statement that collecting and sharing data through a unique portal would provide the scientific community with important data to use for cross-validation of other methods, such as remote sensing data. Standardized online services could make things easier and foster research efforts in several directions. An open collaborative approach could make use of city models that come from different projects e.g. (Prataviera et al., 2021) and that can provide insight on where to position low-cost sensors that monitor air quality from different aspects.

CONCLUSIONS
Sentinel-5P (S5P) data provide information on atmospheric pollutants daily, and, for higher latitudes, consequent orbits partially overlap the same day. These data can provide insights on emission hotspots and on spatial distribution of critical air quality issues. In this work several aspects of NO2 data from S5P are analysed over the years 2019, 2020 and 2021. Results show that ground measurements correlate with S5P values, with rsquared values of 0.37 and 0.43 and RMSE of 7.4 and 8.6 µmol/m 2 respectively for 2019 and 2020. Simple linear regression of overlapping consequent images returned average and standard deviation (sd) on r-squared respectively of 0.50(sd=0.21) and for RMSE of 11.3(sd=4.2) µmol/m 2 . Points from local maxima clearly detected 19 specific positions in large cities or nearby industrial areas, mostly in the north of Italy, with average NO2 values above 90 µmol/m 2 in some cases consistently over the three years, proving that S5P imagery is a valid index for spatial distribution of NO2 concentration and air quality.
As it is proven that S5P data effectively predict NO2 concentrations, future work will address other hypotheses and research questions more related to health issues, with support from the medical scientific community. Another aspect of the future work is to use the S5P daily time series to assess trends in time of NO2 emissions and effects of weather-related variables, such as precipitation, temperature and wind.