AIR QUALITY MONITORING AND DATA MANAGEMENT IN GERMANY – STATUS QUO AND SUGGESTIONS FOR IMPROVEMENT

This paper proposes a novel approach to facilitate air quality aware decision making and to support planning actors in taking effective measures for improving the air quality in cities and regions. Despite many improvements over the past decades, air pollutants such as particulate matter (PM), nitrogen dioxide (NO2) and ground-level ozone (O3) still pose one of the major risks to human health and the environment. Based on a general analysis of the air quality situation and regulations in the EU and Germany as well as an in-depth analysis of local management practices, requirements for better decision making are identified. The requirements are used to outline a system architecture following a co-design approach, i.e., besides scientific and industry partners, local experts and administrative actors are actively involved in the system development. Additionally, the outlined system incorporates two novel methodological strands: (1) it employs a deep neural network (DNN) based data analytics approach and (2) it makes use of a new generation of satellite data, namely Sentinel-5 Precursor (Sentinel-5P). Hence, the system allows, first, for providing areal and high-resolution (e.g., street-level) real-time and forecast (up to 48 hours) data to inform decision makers for taking appropriate short-term measures, and second, for simulating air quality under different planning options and long-term actions such as modified traffic flows and various urban layouts.


INTRODUCTION
Air pollution poses a tremendous health risk and is one of the main issues in cities worldwide. According to the European Environment Agency (EEA) and the German Environment Agency (Umweltbundesamt, UBA), tens of thousands of people die as a result of high air pollutant concentrations; the impact is quantified in terms of premature deaths and years of life lost (YLL) due to premature mortality. The main cause is long-term exposure to air pollutants via inhalation (EEA 2019: 8). The air pollutants that currently have the highest health impact on European citizens are particulate matter (PM), nitrogen dioxide (NO2) and ground-level ozone (O3). They lead to serious health problems, including neurological, cardiovascular and respiratory diseases (Dora et al. 2011: 2; Brunekreef and Holgate 2002: 1233 f.; Wei et al. 2019: 1).
The EEA publishes an annual report on air quality in Europe. The latest report (2019) analyses data from several thousand emission measuring stations (2,886 PM10 stations, 1,396 PM2.5 stations, 3,260 NO2 stations and 1,903 ground-level ozone stations). From 2015 to 2017, 13 % to 19 % of the EU-28 urban population was exposed to PM10 concentrations above the EU reference 24-hour mean value of 50 µg/m³, and 12 % to 29 % of the population was exposed to ground-level O3 concentrations above the daily 8-hour mean of 120 µg/m³ (EEA 2019: 7).
The enormous impact on human health, which has been analysed in a number of studies (EEA 2019; Brunekreef and Holgate 2002; Lelieveld et al. 2020; Schneider et al. 2018; Wei et al. 2019), underlines the urgency of reducing air pollution. The European Union has enacted several agreements and directives to improve air quality. A European solution is essential since air pollution does not stop at territorial borders: the behaviour of neighbouring countries influences the air quality in Germany. The main European achievements include the Gothenburg Protocol (1999). Considering the long-term development of the data from 1990 to 2017, all air pollutants have been reduced by at least 40 %. The emission of sulphur oxides, for example, declined by 90 % from the level of 1990. No further measures are necessary to reduce the sulphur oxides concentration (EEA 2019: 45). New technologies such as improved engines and exhaust systems, legal regulations, and the use of low-emission fuels have led to a demonstrably improved air quality in Europe. Despite these achievements, the concentrations of PM and NOx in particular regularly exceed the limiting values in European cities. The remainder of the paper is organized as follows: section 2 introduces the specific situation in Germany, section 3 presents the problem statement and the study areas, and section 4 proposes an architecture for a planning support system. Finally, conclusions and an outlook on further research are given.

Status Quo and historical development
Since the Industrial Revolution in the late 19th century, Germany has had to deal with increased concentrations of air pollutants. The focus has changed constantly: whereas SO2 was the main problem in the 1950s, black carbon dominated in the 1960s and lead in the 1980s. By taking expedient measures, the SO2 and lead concentrations were reduced to such a degree that they fell, and still remain, far below the limits of the EU directives as well as the recommendations of the WHO. From the 1990s onwards, particulate matter (PM) became an increasing problem.
The current focus in Germany is on three air pollutants: particulate matter, nitrogen dioxide and ground-level ozone. They have negative repercussions on human health, such as an increased risk of cardiovascular diseases, cancer and respiratory diseases. The PM10 and NO2 concentrations repeatedly exceed the limiting values, especially in urban and metropolitan areas. There is no statutory limiting value for ground-level ozone; the EU has only stated a target value. Before taking measures, the municipal authorities need more essential information: where exactly the air pollutants are emitted, how they spread over the area and which factors influence them the most. Therefore, relevant data needs to be collected, analysed, and visualized.

Air Quality Monitoring in Germany
Monitoring air quality is one of the main tasks of every federal state and the German Environment Agency (§ 44 BImSchG). Data on the different air pollutants, such as sulphur dioxide, nitrogen dioxide, particulate matter (PM10 and PM2.5), lead, benzene and carbon monoxide, are continuously collected at fixed air quality monitoring stations. The air monitoring stations are divided into three categories: background, industrial and traffic. Depending on the category, different air pollutants are measured. Traffic air monitoring stations, for example, collect nitrogen dioxide, PM10 and PM2.5 concentrations, as traffic is their main source. Background air monitoring stations additionally collect ground-level ozone, carbon monoxide, carbon dioxide, methane, volatile organic compounds (VOC), ammonia (NH3) and nitric acid (HNO3) to monitor the air quality in rural and urban backgrounds without the direct influence of industry or traffic.
To ensure that the collected data is comparable across the EU, only certain types of equipment may be used for monitoring air quality (UBA 2019b). Moreover, air monitoring stations must be set up at the most polluted places in cities. The required number of stations depends on the residential density in urban areas. Location criteria must be taken into consideration as well; for example, the distance to the nearest building must be at least 0.5 m, and the intake must not be in the immediate vicinity of the source of emission.
If the limiting values are exceeded, an air pollution control plan and an action plan must be developed by the county. The plan consists of suitable measures and goals to reduce the air pollutant concentration within a certain amount of time. Examples of such measures are establishing low emission zones (there are 58 low emission zones in German cities), extending environmentally friendly transport options (free or reduced-fare public transport, bicycle and pedestrian traffic, sharing options) and parking management (§ 45 BImSchG, § 27 BImSchV). A contentious issue in public and political discussions is the establishment of driving bans for vehicles with high emissions. Environmental Action Germany (Deutsche Umwelthilfe) has sued several German cities for repeatedly exceeding the limiting values and for insufficient measures. Most of the proceedings are still in progress. The first driving bans were established for the whole City of Stuttgart and for particular road sections in the City of Hamburg in January 2019. Further bans are to be expected. The driving bans affect diesel vehicles with an exhaust emission standard of 5 or lower.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIV-4/W2-2020, 2020. 5th International Conference on Smart Data and Smart Cities, 30 September - 2 October 2020, Nice, France.

PROBLEM STATEMENT AND STUDY AREAS
The 21st century has been called the age of cities (WBGU 2016: 1, 437) as well as the century of (big) data (Schüller and Förster 2017: 110). Since 2008, more than half of the world's population has lived in cities, and the share continues to rise. The German urbanization rate is 75 % (Pauleit et al. 2016: 2). The primary objective of cities is to provide a high quality of life while being environmentally friendly. A city has to respond to a multitude of challenges on different levels, i.e. economically, socially, culturally and environmentally: globalization, climate change, changing lifestyles and consumption patterns, increasing urbanization and traffic, and digitalization, to name only a few. Cities need to develop strategies and approaches to meet these challenges. Traditional city planning, which has not changed much since the 1970s, is not flexible and agile enough to respond to constantly changing (external) conditions within a short period of time. German bureaucracy and decision-making processes are still very complicated and protracted. Digitalization in the context of city planning is a challenge and an opportunity at the same time (Habbel 2017: 53 ff.). To make use of these opportunities, digital literacy is required.
With regard to air pollution, digitalization can be used on different levels. A lot of data is being collected that can be used to establish the source and distribution of air pollution. It needs to be analysed and combined appropriately in order to turn it into useful information (Witt 2010: 4 f.; Kitchin 2014). However, there is a lack of digital literacy, especially in municipal governments.

Research questions
Therefore, new approaches need to be developed and tested. For this, the following research questions are posed:
• What kind of tools could help to support planners and administrative decision makers most in their efforts to improve air quality in cities and regions?
• What information is necessary for local and regional decisions on appropriate short- and long-term measures?
• How can available official measurements be used in the best possible way and how can combinations with traffic, satellite, or land-use data improve their validity?
• What are the capabilities of an AI-based data analytics approach to forecast air pollution in the short term and to simulate long-term air quality scenarios?

Study areas
The study areas are the City of Stuttgart and the federal state of North Rhine-Westphalia (NRW). Because of its geographic location (in a valley, surrounded by the low mountain ranges Black Forest and Swabian Mountains), the City of Stuttgart often has low air quality. The wind velocity in the valley is often below 1.5 m/s. Stuttgart is one of the regions with the least wind and, at the same time, one of the regions with a high density of traffic. The traffic density is owing to the fact that Stuttgart has a long tradition of automobile manufacturing, with manufacturers having their headquarters in Stuttgart and the surrounding area. Stuttgart has seven official emission measuring stations; the station "Stuttgart - Am Neckartor" is probably the best-known measuring station in Germany. Nitrogen dioxide and PM10 concentrations regularly exceed the limiting values. In 2016, the daily limit value of 50 µg/m³ PM10 was exceeded on 68 days (35 days are permitted), and the annual mean nitrogen dioxide value was 82 µg/m³ (40 µg/m³ are permitted). Stuttgart is testing many different measures, for example a low emission zone, the "Feinstaubalarm" (particulate matter alert), road cleaning, lower speed limits, higher parking fees and moss walls (Landeshauptstadt Stuttgart 2016). Some measures reduced the air pollutants, but because of the nitrogen dioxide emissions Stuttgart had to introduce driving bans for diesel vehicles with an exhaust emission standard of 5 or lower by January 2019. The measure, imposed by decree, was very unpopular, and many demonstrations were held against it.
The second study area, North Rhine-Westphalia (NRW), is a federal state in Germany with a long history of air pollution because of its coal mining industry. Nowadays, the air quality problems especially affect the densely populated Rhine-Ruhr area. With 1,170 inhabitants per square kilometre, it is one of the densest areas in Germany and therefore also has a high density of traffic. There are 64 official emission measuring stations and, additionally, 74 discontinuously measuring passive nitrogen dioxide collectors. Many cities in NRW have problems with exceeding the limiting values for nitrogen dioxide.
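The limit checks quoted for Stuttgart can be made concrete with a small sketch. The thresholds (50 µg/m³ daily PM10 limit, 35 permitted exceedance days, 40 µg/m³ annual NO2 mean) are taken from the text; the function names and the input series are hypothetical, and the data are synthetic values chosen only to reproduce the 2016 figures, not real measurements.

```python
from statistics import mean

PM10_DAILY_LIMIT = 50.0   # µg/m³, daily mean limit quoted in the text
PM10_ALLOWED_DAYS = 35    # permitted exceedance days per year
NO2_ANNUAL_LIMIT = 40.0   # µg/m³, annual mean limit quoted in the text


def count_exceedance_days(daily_means, limit=PM10_DAILY_LIMIT):
    """Count the days whose daily mean lies strictly above the limit."""
    return sum(1 for value in daily_means if value > limit)


def pm10_within_limits(daily_means):
    """True if the number of exceedance days stays within the allowance."""
    return count_exceedance_days(daily_means) <= PM10_ALLOWED_DAYS


def no2_within_limits(values):
    """True if the mean over the given year of values respects the limit."""
    return mean(values) <= NO2_ANNUAL_LIMIT


# Synthetic year (366 days, as in 2016): 68 days above the PM10 limit,
# and a constant NO2 level equal to the reported annual mean.
pm10_daily = [60.0] * 68 + [30.0] * 298
no2_daily = [82.0] * 366

exceedance_days = count_exceedance_days(pm10_daily)
print(exceedance_days, pm10_within_limits(pm10_daily), no2_within_limits(no2_daily))
```

With these synthetic inputs both checks fail, which mirrors the situation reported for the station in 2016.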

Data supply
As stated earlier, data is collected from spot emission measuring stations. The concentration over the rest of the city is extrapolated. To give a short overview, figure 2 shows the distribution of the emission measuring stations for the City of Stuttgart.
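To illustrate why a handful of stations yields only a coarse picture, the following sketch estimates a concentration between stations by inverse distance weighting (IDW). The text does not state which extrapolation method the authorities use, so IDW is an assumption made purely for illustration; the station coordinates and values are invented.

```python
import math


def idw_estimate(stations, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    stations: list of (x, y, value) tuples in any consistent planar unit.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # exactly at a station: return its measurement
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical stations 2 km apart with NO2 readings in µg/m³.
stations = [(0.0, 0.0, 40.0), (2.0, 0.0, 80.0)]
midpoint = idw_estimate(stations, 1.0, 0.0)
print(midpoint)
```

The estimate depends only on distance to the stations; local effects such as a busy road between them are invisible to this kind of interpolation.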
Because of the federal system in Germany, data collection and supply differ in every federal state, which complicates a standardized measurement scheme. Whereas NRW passed an open data strategy (Open.NRW) in 2014, more effort is needed to get access to data in Baden-Württemberg, the federal state where the City of Stuttgart is located. Data supply in Baden-Württemberg is subject to user agreements and administrative responsibility.
Recently, many private initiatives have started to collect data autonomously, apart from the official emission measuring stations. For example, the platform luftdaten.info gives instructions for building one's own particulate matter measuring instrument. The collected data is displayed online at luftdaten.info and gives an overview of a wide area. However, the data is not validated and the sensors do not comply with the EU regulations. City councils cannot use and integrate the privately collected data, as they are required to use official data and data collecting systems.
As the current practices are not sufficient, new approaches are needed. In the following, the outline of a novel planning support system architecture is suggested, which makes use of the available data in the best possible way in order to inform both local and regional planners for better decision making.

SYSTEM ARCHITECTURE
The overarching aim is to facilitate air quality aware decision making and to support planning actors in taking effective measures to improve air quality in cities and regions. More specifically, the approach is twofold. The major goals are:
• (1) to provide areal, high-resolution (e.g., street-level) real-time and forecast data (up to 48 hours) to inform decision makers for taking short-term measures, and
• (2) to simulate air quality under different long-term measures such as a modified urban layout.
As the system is to be co-designed with the local actors and planning agencies, their requirements for this kind of planning support have been collected and are described in the following. Subsequently, a system outline designed to meet these requirements is presented.

Requirement analysis
The presented study follows a co-design approach, i.e., besides the industry and scientific partners, local experts and administrative actors are actively involved in the system development. Therefore, based on the analysis of the general air quality management practice (see sections 2 and 3) and interviews with local application partners (the City of Stuttgart and the federal state NRW), the following system requirements have been identified. The local application partners need a system that is capable of:
• complementing and improving the current spot-based measuring and projection,
• showing the direct impact of traffic volume on the concentration of air pollutants,
• identifying places with higher values,
• including satellite data for higher spatial resolution,
• demonstrating the success of measures,
• answering fundamental questions about relationships, for example the dependence of the air pollution contribution on weather situations or the impact of climate change on the air pollution distribution,
• simplifying the production of pollution maps,
• creating forecasts and scenarios, and
• meeting the European and German data privacy regulations.

Methodical approach
Based on the requirement analysis, a system architecture has been designed (figure 3) that integrates heterogeneous data sets that have not yet been used in this combination for air quality monitoring in Germany. To analyse these datasets, an artificial intelligence (AI) approach is used instead of classical physical dispersion modelling or extrapolation approaches. The input data comprises freely available data, in situ data and Earth observation data (figure 3). As stated earlier, the common practice is decision making based on spot-based measurements. However, compared to the spatial extent of the comparatively large and structurally heterogeneous administrative regions, the few measurement stations (figure 2) cannot reflect the small-scale variability of air quality. To overcome the limitation of few point-based measurements, the system will make use of Earth observation and modelled data. These include atmospheric data from Sentinel-5 Precursor (Sentinel-5P) and GOME-2 (cf. Taubenböck et al., 2020), as well as Earth observation based modelled data from the Copernicus Atmosphere Monitoring Service (CAMS), modelled COSMO data from the German Weather Service (DWD) and data from the Weather Research and Forecasting Model (WRF). These heterogeneous datasets provide historical and current information on various weather and air quality related parameters. Additionally, Earth observation based data describing the land surface is used. For example, a land cover classification derived from Sentinel-2 data (Weigand et al. 2020) as well as topographic information such as digital elevation models are integrated.
In addition to Earth observation data, other data relevant to the concentration of pollutants near the ground are collected. These are air quality and weather data from local measuring stations, data on traffic density, as well as data on population density and infrastructure (buildings). For all data, both current and historical values are recorded. With the help of deep neural networks (DNNs), the relationship between these data and the current concentrations of pollutants (NO, NO2, O3, PM) will be learned, and the concentrations will be predicted for the next 48 hours. The historical data will be employed to train the DNNs. The measured values from local stations are taken as reference. The aim is to achieve a higher degree of spatial detail compared with the current point-based pollutant measurements as well as to provide an air quality forecast for the next 48 hours. It is also planned to use the trained DNN to model the relationship between infrastructure data (building structures, land cover classification and population density) and air pollutants. In this way, the effect of different urban layouts on air pollution can be simulated.
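Training on historical data for a 48-hour forecast requires turning a continuous hourly series into pairs of input window and target window. The sketch below shows one common way to do this; the 48-hour horizon comes from the text, whereas the function name and the input length of 24 hours are assumptions chosen for illustration, and the series is synthetic.

```python
def make_training_windows(series, input_len, horizon):
    """Slice an hourly series into (input, target) pairs.

    Each input holds `input_len` consecutive values; the target holds the
    `horizon` values that immediately follow it.
    """
    samples = []
    last_start = len(series) - input_len - horizon
    for start in range(last_start + 1):
        inputs = series[start:start + input_len]
        targets = series[start + input_len:start + input_len + horizon]
        samples.append((inputs, targets))
    return samples


# Synthetic stand-in for 100 hourly station readings.
hourly_values = list(range(100))
samples = make_training_windows(hourly_values, input_len=24, horizon=48)
print(len(samples), len(samples[0][0]), len(samples[0][1]))
```

A series shorter than one input window plus one horizon yields no samples, which is a useful sanity check before training.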
Apart from the data collection from many different providers, the harmonization of the data is challenging. The complexity is high because of the various data sources and formats used at the different levels:
• data retrieval: web access (HTML), data transfer (FTP server), REST API,
• file formats: CSV, NC, XML,
• data format: datetime, geocode,
• designation: "NO2", "nitrogen dioxide", "Stickstoffdioxid" (German), "tropospheric_NO2",
• unit of measurement: ppm, mol/m², or µg/m³,
• different update procedures and update intervals.
Initially, the data sources need to be put into a standardized format for analysis. For every data source, two modules are implemented. One software module connects to the data source and downloads relevant data (Fetcher-Module); the second software module (Storage-Module) homogenizes the data to a standardized format and saves it to a PostgreSQL database. This modular construction guarantees the extensibility of the system for further data sources.
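Two of the harmonization problems listed above, differing designations and differing units, can be sketched as follows. The alias table uses the designations named in the text; the function names, the molar masses table and the reference conditions (293.15 K, 101 325 Pa) are assumptions for this illustration. A column density in mol/m² cannot be converted to a near-ground concentration without further assumptions about the vertical profile, so the sketch refuses that conversion instead of guessing.

```python
GAS_CONSTANT = 8.314462618  # J/(mol K)

UNIT_UG_M3 = "µg/m³"
UNIT_PPM = "ppm"
UNIT_MOL_M2 = "mol/m²"

# Molar masses in g/mol for the gaseous pollutants considered here.
MOLAR_MASS = {"NO": 30.006, "NO2": 46.0055, "O3": 47.998}

# Designations from the text, lower-cased, mapped to one canonical name.
ALIASES = {
    "no2": "NO2",
    "nitrogen dioxide": "NO2",
    "stickstoffdioxid": "NO2",
    "tropospheric_no2": "NO2",
}


def canonical_name(designation):
    """Map a source-specific designation to the canonical pollutant name."""
    return ALIASES[designation.strip().lower()]


def to_ug_m3(value, unit, pollutant, temperature_k=293.15, pressure_pa=101325.0):
    """Convert a near-ground concentration to µg/m³ (ideal gas assumption)."""
    if unit == UNIT_UG_M3:
        return value
    if unit == UNIT_PPM:
        air_mol_per_m3 = pressure_pa / (GAS_CONSTANT * temperature_k)
        # ppm is 1e-6 mol/mol and g -> µg is 1e6, so the factors cancel.
        return value * air_mol_per_m3 * MOLAR_MASS[pollutant]
    if unit == UNIT_MOL_M2:
        raise ValueError("column density needs a vertical profile assumption")
    raise ValueError(f"unknown unit: {unit}")


one_ppm_no2 = to_ug_m3(1.0, UNIT_PPM, canonical_name("Stickstoffdioxid"))
print(round(one_ppm_no2, 1))
```

Keeping the alias table and the unit conversion separate means that a new data source usually only adds entries to the table rather than new code.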
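The two-module pattern described above might look roughly like the following sketch. All class names, column names and the sample payload are hypothetical; the fetcher parses an in-memory CSV string instead of downloading anything, and an in-memory SQLite database stands in for PostgreSQL so that the example is self-contained. Treating the timestamps as UTC is likewise an assumption.

```python
import csv
import io
import sqlite3
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical raw export of one data source (German number/date formats).
SAMPLE = "Station;Datum;Wert\nSTATION_A;13.11.2019 08:00;61,5\nSTATION_A;13.11.2019 09:00;58,0\n"


@dataclass
class Observation:
    station: str
    pollutant: str
    time_utc: str       # ISO 8601 timestamp
    value_ug_m3: float


class CsvFetcher:
    """Fetcher-Module stand-in: yields raw rows of one data source."""

    def __init__(self, payload):
        self.payload = payload

    def fetch(self):
        return list(csv.DictReader(io.StringIO(self.payload), delimiter=";"))


class StorageModule:
    """Storage-Module stand-in: harmonizes raw rows and persists them."""

    def __init__(self, connection, pollutant):
        self.connection = connection
        self.pollutant = pollutant
        connection.execute(
            "CREATE TABLE IF NOT EXISTS observation "
            "(station TEXT, pollutant TEXT, time_utc TEXT, value_ug_m3 REAL)"
        )

    def harmonize(self, row):
        timestamp = datetime.strptime(row["Datum"], "%d.%m.%Y %H:%M")
        timestamp = timestamp.replace(tzinfo=timezone.utc)  # assumption: UTC
        value = float(row["Wert"].replace(",", "."))        # decimal comma
        return Observation(row["Station"], self.pollutant, timestamp.isoformat(), value)

    def store(self, rows):
        observations = [self.harmonize(row) for row in rows]
        self.connection.executemany(
            "INSERT INTO observation VALUES (?, ?, ?, ?)",
            [(o.station, o.pollutant, o.time_utc, o.value_ug_m3) for o in observations],
        )
        self.connection.commit()
        return len(observations)


connection = sqlite3.connect(":memory:")
storage = StorageModule(connection, pollutant="NO2")
stored = storage.store(CsvFetcher(SAMPLE).fetch())
print(stored)
```

Adding a further data source then amounts to writing one more fetcher and one more `harmonize` variant, while the table layout stays unchanged.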

[Figure 3. Overview of the system architecture (recoverable label: Land Use Data)]
To integrate the required data privacy standards into the system, a regulation, data and rights management concept has been elaborated. It complies with the GDPR according to the principles of privacy by design and privacy by default and ensures compliance with the license conditions of the different data owners.
The desired output of the system has been structured into three use cases in order to meet the requirements of the local application partners: (1) web-based (near) real-time, (2) forecast, and (3) simulation services. Figure 3 gives an overview of the approach.

Initial results
The analysis started with the prediction of pollutant concentrations at the level of the emission measuring stations. The availability of long-term historical data at these locations is beneficial for the pattern recognition.
A statistical analysis of the measured values revealed that the individual relevant pollutant concentrations exhibit a characteristic behaviour over time. Figure 4 shows, as an example, the weekly course of the NO2 concentration at the measuring station "Stuttgart - Am Neckartor", averaged over the years 2003 to 2019. The curve correlates strongly with the volume of traffic. Other time dependencies are the time of day, time of year (month), day of the week and public holidays. If an individual week is considered, however, it can differ considerably from the average course. At this point, individual influences such as weather conditions, actual traffic volume, population density, land use, etc. have to be taken into account. As a result, the concentrations of the pollutants NO, NO2 and PM10 are provided for the next 48 hours. Figure 5 shows an example of the prediction of the NO2 values at the station "Köln-Rodenkirchen" on the 13th and 14th of November 2019. It is promising that the predictions show the same trends as the actual measurements. The DNN used is based on the transformer network presented in the publication by Chiyang et al. (2019). Transformer networks were originally developed in the field of Natural Language Processing (NLP). The transformer is a novel architecture that aims to solve sequence-to-sequence tasks, which transform an input sequence into an output sequence.
The idea behind transformer networks is to handle the dependencies between input and output with attention alone, dispensing with recurrence (Vaswani et al. 2017). Recently, transformers have also been used for the analysis and prediction of time series. As a first approach, a transformer network with 8 encoder layers and a self-attention mechanism was applied. A separate network with individual weights was trained for each emission measuring station.
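The averaged weekly course discussed in this paragraph can be computed by grouping hourly readings by weekday and hour. The following sketch does this with synthetic data; the function name and the values (a morning peak on working days) are invented for illustration and are not the measurements behind figure 4.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def weekly_profile(records):
    """Average hourly readings per (weekday, hour); Monday is weekday 0."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value in records:
        key = (timestamp.weekday(), timestamp.hour)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Two synthetic weeks of hourly NO2 readings starting on a Monday.
start = datetime(2019, 11, 4)
records = []
for hour_index in range(14 * 24):
    timestamp = start + timedelta(hours=hour_index)
    first_week = hour_index < 7 * 24
    is_peak = timestamp.weekday() < 5 and timestamp.hour == 8
    if is_peak:
        value = 80.0 if first_week else 60.0
    else:
        value = 20.0
    records.append((timestamp, value))

profile = weekly_profile(records)
print(profile[(0, 8)], profile[(6, 8)])
```

The example also shows the limitation noted in the text: the averaged Monday peak of 70 matches neither individual week, which had peaks of 80 and 60.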
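The self-attention mechanism mentioned above can be illustrated with scaled dot-product attention as defined by Vaswani et al. (2017). This is a minimal pure-Python sketch and not the network used in the study: a real transformer derives queries, keys and values from learned linear projections of the input and uses several attention heads, both of which are omitted here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return, for each query, the attention-weighted mix of the values."""
    d_k = len(keys[0])
    output = []
    for query in queries:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
            for key in keys
        ]
        weights = softmax(scores)
        mixed = [
            sum(weight * value[j] for weight, value in zip(weights, values))
            for j in range(len(values[0]))
        ]
        output.append(mixed)
    return output


# Self-attention: the same sequence serves as queries, keys and values.
sequence = [[1.0, 0.0], [0.0, 1.0]]
attended = scaled_dot_product_attention(sequence, sequence, sequence)
print(attended)
```

Each output row is a weighted average of all time steps, so every position can draw on every other position directly, without stepping through the sequence as a recurrent network would.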
The next steps will be to consider further inputs, such as traffic flow, a land use classification, satellite data and infrastructure. Accordingly, the existing transformer network has to be adapted and extended. Since these data are available area-wide, the determination of air pollutant concentrations will be realized with a higher local resolution to overcome the limitation of point-based air pollution measurement.

CONCLUSION AND FUTURE RESEARCH
This paper presented a novel approach to facilitate air quality aware decision making and to support planning authorities in taking effective measures for improving the air quality in cities and regions. Against the background of a preliminary analysis of the air quality situation and regulations in the EU and in Germany, we focused on an in-depth analysis of local management practices. The requirement analysis revealed that there is a need for better data management, analysis and visualization strategies. In particular, overcoming the limitation of point-based air pollution measurement, reliable high-resolution forecasts of the expected air quality and long-term simulations of various planning options are of utmost importance. To address these issues, our system makes use not only of in situ measurements but also of Earth observation and modelled data. These heterogeneous datasets provide historical coverage of various weather and air quality related parameters. The selected DNN-based machine learning approach is suitable for handling and modelling the given data.
The co-design with local experts and administrative authorities guarantees a tailored system according to their needs and fosters the applicability and acceptance of the system. Further research comprises the integration of further data sources, the validation of the forecast capabilities in the local real-world application, and the long-term evaluation of the urban air quality simulations provided by the system.