DEFINITION OF A METHODOLOGY TO DERIVE ROAD NETWORK FUNCTIONAL HIERARCHY CLASSES USING CAR TRACKING DATA

Road network functional hierarchy classifies individual roads into several levels, for efficient traffic management and road network generalization purposes. Automatic and semi-automatic road network extraction methods exist, but the generated products normally lack information on functional hierarchy. This paper presents a methodology for automatically retrieving the functional hierarchy of an OpenStreetMap derived road network from Floating Car Data, obtaining evenly distributed (e.g. for generalization purposes) or dynamic (e.g. to take into account differences in traffic volumes at different times of the day) classifications. Road network elements are classified as a function of vehicle speed values: the class distribution generated with the proposed methodology follows a linear distribution that can be better exploited for generalization purposes. Furthermore, the methodology makes it possible to clearly distinguish different distributions at different times of the day and on different days of the week, supporting traffic management activities.


INTRODUCTION
Road network functional hierarchy classifies individual roads into several levels, in order to manage traffic efficiently by segregating through traffic from accessing, parking and non-motorized traffic (Goto et al., 2016).
Functional classes can also be used to support road network generalisation, in order to efficiently reduce the number of features represented at lower map scales, without losing relevant information. In cartography, map generalisation is the process of deriving from a detailed source spatial database a map or database the contents and complexity of which are reduced, while retaining the major semantic and structural characteristics of the source data appropriate to a required purpose. The primary aim of cartographic generalisation is for the resulting map to convey a clearly readable image that is aesthetically pleasing (Gülgen, 2014). One of the main purposes of road network generalisation is the reduction of the number of features portrayed, because it is not possible to show every road at smaller scales, particularly in urban areas (Regnauld et al., 2007).
Several methods for automatic and semi-automatic road network extraction from remotely sensed data generate good results concerning feature extraction (Wang et al., 2016), but the generated products normally lack information such as functional hierarchy.
Volunteered Geographic Information (VGI) initiatives, especially OpenStreetMap (OSM), represent in most cases the best option to retrieve a harmonised and as complete as possible road network. The OSM road tagging schema (https://wiki.openstreetmap.org/wiki/Key:highway) includes 7 main hierarchical classes, but those classes may not match common usage by other organizations, such as local road authorities. Looking at OSM <highway> key values, it is also clear that the worldwide distribution of values is biased, and this may cause problems for a generalization process based on those values, especially in urban environments: the first 5 classes in numerosity (residential, service, track, unclassified, footway and path) represent almost 80% of all the road elements (as of 16/04/2020) and are all related to the lowest hierarchical levels. Furthermore, the OSM <highway> tagging schema has some specific issues, such as:
- unknown road types, which is often the case when the vector feature is digitised from remotely sensed imagery without further processing or survey. In this case, according to OSM documentation, a generic <road> value should be used;
- the misuse of the <unclassified> value. According to OSM guidelines, this value is to be used for minor roads of a lower classification which serve a purpose other than access to properties. The word 'unclassified' is a historical artefact of the UK road system and does not mean that the classification is unknown; among OSM contributors, however, this leads to confusion and therefore to wrong assignments to this class, which is among the 5 most numerous classes.
A road classified in a low class can be more important than the others, e.g. if it plays a bridge role, without which a connected network may be broken into two parts. These roads should be preserved during the reduction process when deriving smaller scale maps or databases (Gülgen, 2014). This aspect is not specifically addressed in OSM mapping guidelines and it may lead to incorrect generalization results.
Floating Car Data (FCD) are acquired by On-Board Units (OBUs) mounted on vehicles, typically private cars linked to insurance policies and trucks/vans managed in a fleet environment. One of the main pieces of information acquired by an OBU is the position, obtained by means of a GPS receiver, using both a temporal and a speed sampling interval. Other data commonly acquired by this kind of service include speed, heading, GPS signal quality, engine status (on or off) and vehicle type. FCD are already widely used for traffic analysis and simulation (Ajmar et al., 2019). Jiang (2009) discussed the possibility of using taxi FCD to derive road hierarchy in a relatively small city in Sweden.
In this paper, a methodology is presented to automatically retrieve the functional hierarchy of an existing road network from a full set of FCD, including fleet and private cars, for a medium-sized city. The objective is to produce a functional classification that is based on the real traffic situation and is therefore more flexible, e.g. capable of producing more evenly distributed (e.g. for generalization purposes) or dynamic (e.g. to take into account differences in traffic volumes at different times of the day) classifications. Road network hierarchies could be highly beneficial in network analysis, since their exploitation generally leads to driving directions that are easier to follow: routes tend to have fewer diversions, and vertical signs are generally more visible on higher hierarchy roads. Furthermore, hierarchies would make it possible to better fit the preferences of different drivers, e.g. truck drivers normally try to avoid local roads. Additionally, as several routing algorithms take the road hierarchy into account as a parameter to speed up the search for the shortest path (Geisberger et al., 2012), extracting a functional classification based on the real traffic situation may be useful in cases of medium-term impacts on roads: a collapsed bridge or a prolonged closure of a road section affects normal traffic behaviour, which is also influenced by re-routing strategies applied by traffic managers. These strategies may also involve the structural characteristics of a road (e.g. from two-way to one-way street to increase capacity), which are not reflected in authoritative datasets, usually considered more stable. This could have a distorting effect on navigation, which a dynamic hierarchical classification could resolve.
Optimal paths computed by conventional path-planning algorithms are usually not "optimal", since realistic traffic information and local road network characteristics are not considered (Quingquan et al., 2011). Google Maps collects real-time traffic data with an impressive number of users (1 billion users, as of 10/04/2020) and daily updates (25 million as of 10/04/2020), but access to the dataset is not free. FCD do not represent a free alternative either, but GPS point collection through these devices, even if significantly lower in numerosity, is more controlled and stable, as the equipped fleet is known and the device is always on. Furthermore, user profiling can be exploited to perform different analyses focusing on different traffic modes: e.g. the vehicle type attribute can be exploited to differentiate between private and commercial traffic paths. Once it has been demonstrated, by comparing FCD with traffic data measured by fixed sensors, that FCD can represent traffic dynamics, if not in absolute values then at least in relative terms, the benefit of their usage becomes evident.
Functional classification is particularly relevant for road network datasets such as OSM (native or derived), because of the previously mentioned issues related to the criteria adopted for the <highway> tagging process. For authoritative datasets, or more generally for datasets generated with more formal acquisition specifications, the proposed approach is relevant because it makes it possible to generate a dynamic functional class.

METHODOLOGY
The FCD sample used consists of approximately 4 million records acquired by devices mounted on almost 19,000 vehicles, collected during an entire week (from 2018-10-05T22:00:00 to 2018-10-11T21:59:59 CET) in the city of Turin (Italy). The representativeness of this sample has been previously discussed (Ajmar et al., 2019).
As reference layer, the EL_STR class stored in the Banca Dati Territoriale di Riferimento degli Enti (BDTRE) has been used: this is an official and authoritative road network dataset released by the Regione Piemonte public administration. The BDTRE dataset is used in 2 ways:
− to refer FCD GPS positions to a road network feature. Each FCD position has been uniquely assigned to a single BDTRE network element by identifying the road feature nearest to the FCD position. Only FCD positions within 30 m of a BDTRE network feature have been considered: the 30 m threshold is considered the Advanced Transport Telematics (ATT) navigation accuracy requirement for generic services related to vehicle location (Ochieng, 2002);
− to have a benchmark functional class. The BDTRE dataset has a functional class definition stored in a field named "EL_STR_FRC" and subdivided into 6 classes. For the purpose of this work, only 5 classes have been used, as the pedestrian class is not relevant.
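The nearest-feature assignment with the 30 m cut-off can be illustrated with a minimal sketch. It assumes that both FCD positions and road polylines are expressed in a projected coordinate system in metres, and that the threshold is inclusive; the function names and the dictionary layout are hypothetical and are not taken from the tools used in this work. A production workflow would normally rely on a spatial index rather than this brute-force search.

```python
import math

MAX_DISTANCE_M = 30.0  # threshold stated in the text


def point_segment_distance(p, a, b):
    """Distance from point p to segment a-b (projected coordinates, metres)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: both vertices coincide
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p on the segment, clamped to [0, 1]
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_feature(point, roads, max_distance=MAX_DISTANCE_M):
    """Return the id of the nearest polyline, or None if it is too far away.

    `roads` maps a feature id to a list of (x, y) vertices (assumed layout).
    """
    best_id, best_dist = None, float("inf")
    for road_id, vertices in roads.items():
        for a, b in zip(vertices, vertices[1:]):
            d = point_segment_distance(point, a, b)
            if d < best_dist:
                best_id, best_dist = road_id, d
    return best_id if best_dist <= max_distance else None
```

Positions for which the function returns None correspond to the records discarded because they lie farther than 30 m from any road feature.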
The BDTRE road network dataset has been dissolved on the basis of the road name and the functional class, in order to derive geographically homogeneous entities with a relevant continuity (Figure 1).
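The attribute side of this dissolve step can be sketched as a simple grouping. The example below only aggregates segment identifiers and lengths for each (name, functional class) pair; a GIS dissolve would also merge the geometries, which is omitted here. The field names `id`, `name`, `frc` and `length_m` are hypothetical.

```python
from collections import defaultdict


def dissolve(segments):
    """Group segments sharing (name, functional class) and sum their lengths."""
    groups = defaultdict(lambda: {"segment_ids": [], "length_m": 0.0})
    for seg in segments:
        key = (seg["name"], seg["frc"])  # dissolve key: road name + class
        groups[key]["segment_ids"].append(seg["id"])
        groups[key]["length_m"] += seg["length_m"]
    return dict(groups)
```

Segments with the same name but a different functional class end up in separate dissolved features, which keeps each feature homogeneous with respect to the benchmark classification.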
Each FCD position has been assigned to the nearest BDTRE network element: positions farther than 30 m from any possible target feature have been discarded, in order to limit errors linked to GPS accuracy. An FCD density value has been calculated for each dissolved road network feature, by dividing the number of FCD positions by the length in meters of the feature. The density value has then been classified into 5 classes by applying the Jenks natural breaks classification method, which defines class breaks with the objective of minimizing the variance within each class and maximizing the variance between classes. Similarly, exploiting the speed value natively acquired by OBUs, a mean speed value has been calculated for each feature and classified with the same approach. Table 1 and Table 2 display the confusion matrices generated by comparing the actual BDTRE functional class with the one derived from the classifications based on density and mean speed, respectively. The total number of compared features (2526) is slightly lower than the total number of BDTRE features (2554), as 28 BDTRE features (approximately 1% of the total) had no associated FCD positions. The overall accuracy is similar in the 2 cases (51.4% for the analysis based on density and 50.1% for the one based on mean speed), with the omission error generally decreasing when moving to lower hierarchy and higher numerosity classes. Plotting the normalised distribution of the two values, density and speed, against BDTRE functional classes in a box and whisker plot (Figure 2), it is evident that in both cases mean values decrease when moving towards lower-level classes. It also appears that the classification based on mean speed differentiates more among classes. This can be partially explained by the fact that, in computing the density, road width has not been considered, as it was not available in the BDTRE dataset.
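As an illustration of the density computation and of the natural breaks idea, the sketch below finds the class limits that minimise the total within-class sum of squared deviations by dynamic programming. It is a generic formulation, not necessarily the implementation used to produce the results reported here, and all function names are hypothetical. Its cost grows with the square of the number of values, which is acceptable for a few thousand features.

```python
import bisect


def fcd_density(n_positions, length_m):
    """FCD positions per metre of road feature."""
    return n_positions / length_m


def jenks_breaks(values, n_classes):
    """Upper bounds of n_classes groups minimising within-class squared deviation."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")

    # Prefix sums give the squared deviation of data[i:j] in constant time
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(data):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def ssd(i, j):  # sum of squared deviations from the mean of data[i:j]
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: best cost of splitting the first j values into k classes
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j], split[k][j] = c, i

    # Walk back through the stored split points to recover the upper bounds
    bounds, j = [], n
    for k in range(n_classes, 0, -1):
        bounds.append(data[j - 1])
        j = split[k][j]
    return bounds[::-1]


def classify(value, bounds):
    """Zero-based class index of value, given the class upper bounds."""
    return min(bisect.bisect_left(bounds, value), len(bounds) - 1)
```

The same two functions apply unchanged to mean speed values: only the input list differs.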
Based on the above-mentioned considerations, a classification based on speed has been considered more appropriate.
Figure 2 - Normalised FCD values for density and mean speed (Y axes) plotted against BDTRE functional classes (X axes). "X" represents mean values, the middle line of the box represents the median, the bottom line of the box represents the 1st quartile, the top line of the box represents the 3rd quartile, and the whiskers (vertical lines) extend from the ends of the box to the minimum and maximum values.
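The overall accuracy and omission error figures quoted for the density-based and speed-based classifications follow the usual confusion matrix definitions. A minimal sketch, with hypothetical function names and with the benchmark class taken as the reference, is:

```python
from collections import Counter


def confusion_matrix(reference, predicted):
    """Count (reference class, predicted class) pairs."""
    return Counter(zip(reference, predicted))


def overall_accuracy(matrix):
    """Share of features whose predicted class equals the reference class."""
    total = sum(matrix.values())
    correct = sum(n for (ref, pred), n in matrix.items() if ref == pred)
    return correct / total


def omission_error(matrix, cls):
    """Share of features of reference class cls assigned to another class."""
    ref_total = sum(n for (ref, _), n in matrix.items() if ref == cls)
    missed = sum(n for (ref, pred), n in matrix.items()
                 if ref == cls and pred != cls)
    return missed / ref_total
```

Features without any associated FCD position have no predicted class and would be left out of both input lists.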

OpenTransportMap (OTM) is a road network dataset based on OpenStreetMap and accessible in a schema compatible with the INSPIRE Transport Network. The OSM tag values for functional classification are grouped and mapped into the 6 classes defined in the INSPIRE directive, as shown in Table 3 (Jedlička et al., 2016). In the city of Turin, OTM classes are highly biased, with the fourthClass representing almost 80% of all road features within the municipality (Figure 7). As with the BDTRE dataset, the OTM dataset has been dissolved based on the functional class and name field, in order to obtain more continuous but homogeneous features (Figure 2). The mean speed has been calculated for each OTM feature using the same approach applied to BDTRE features: in this case too, FCD positions have been uniquely assigned to OTM features based on proximity, excluding points at a distance greater than 30 m. As in the analysis of the BDTRE dataset, not all OTM features (2681) have an associated FCD position (2654), resulting in 27 features (again approximately 1% of the original dataset) that were not classified. A mean speed value has then been calculated for each OTM network element, and the OTM network has been subdivided into 6 speed classes using the Jenks natural breaks classification method. The number of classes has been selected to match the number of classes in the original OTM functional classification. Figure 4 shows the results of this classification: a comparison between the class numerosity in the original classification and in the one based on FCD speed shows that the distribution of the latter approximates a linear decrease as the hierarchical level increases (higher mean speed values). A more continuous distribution makes it possible to perform more effective thematic generalization, more adaptable to continuous map scale changes.
As mentioned in the introduction, a big advantage of setting up functional classes based on dynamic data, such as FCD, is the possibility of modifying such classes to adapt to specific traffic conditions. Figure 5 displays (with the same colour coding as Figure 4) 2 different functional classifications, one related to a reference situation for 7:00 AM on a working day, a typical morning rush hour in the city of Turin, and one related to 10:00 PM, corresponding to a situation with less congested traffic conditions. The 2 classifications clearly differ in the network elements attributed to the classes, especially the lower hierarchical ones. Applying different hierarchical classes in a route network solver may lead to substantial differences, as shown in the example displayed in Figure 6.
The graph in Figure 7 displays the difference in the number of OTM network features falling in the different hierarchical classes, comparing the original classification with the one based on FCD mean speed values considering the entire dataset available (FCD all), only those acquired from Monday to Friday (weekdays), only those acquired on Saturday and Sunday (weekends) or only those acquired within a specific 1 hour interval (02:00 AM to 03:00 AM, 07:00 AM to 08:00 AM, 12:00 PM to 01:00 PM, 05:00 PM to 06:00 PM, 10:00 PM to 11:00 PM). It is evident that all speed-based classifications have very similar patterns: weekdays 2:00 and weekdays 7:00 classifications are affected by a lower number of active vehicles.
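The temporal subsets compared here (all records, weekdays, weekends, single 1 hour intervals) can be obtained by filtering FCD records before averaging the speed per feature. The sketch below assumes that each record is a dictionary with hypothetical keys `feature_id`, `timestamp` (a `datetime` already expressed in local time) and `speed`; none of these names come from the original data model.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = {0, 1, 2, 3, 4}  # Monday to Friday, as returned by datetime.weekday()
WEEKEND = {5, 6}            # Saturday and Sunday


def mean_speed_by_feature(records, days=None, hour=None):
    """Mean speed per feature, optionally limited to some days and a start hour."""
    sums = defaultdict(lambda: [0.0, 0])  # feature id -> [speed sum, count]
    for rec in records:
        ts = rec["timestamp"]
        if days is not None and ts.weekday() not in days:
            continue
        if hour is not None and ts.hour != hour:  # keeps hour:00 to hour:59
            continue
        acc = sums[rec["feature_id"]]
        acc[0] += rec["speed"]
        acc[1] += 1
    return {fid: total / count for fid, (total, count) in sums.items()}
```

For example, `mean_speed_by_feature(records, WEEKDAYS, 7)` would return the mean speeds for the weekday 07:00 AM to 08:00 AM interval; features with no record in the selected window are simply absent from the result and remain unclassified for that subset.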
The graph in Figure 8 makes the same comparison but considering the total length of features in km. Here the difference between low traffic conditions (weekdays 2:00 and weekdays 22:00) becomes clearly evident.

CONCLUSIONS
From this analysis, the proposed method for defining road network functional classes seems to be effective in generating flexible and dynamic functional classifications, in support of various applications such as cartographic generalization and traffic management. Coupling classification results with more rigorous methods for ensuring path continuity would also make it possible to derive products better suited to automatic generalization, ensuring a higher level of connectivity within each functional class. Further research activities include additional studies on the representativeness of FCD-based traffic conditions, in order to understand whether systematic or local correction factors can be applied to cope with the relatively small sample of the vehicles actually circulating. This can be performed by comparing FCD positions with figures coming from physical sensors, if made available by the managing authorities.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B4-2020, 2020 XXIV ISPRS Congress (2020 edition)