COMPLEX ROAD INTERSECTION MODELLING BASED ON LOW-FREQUENCY GPS TRACK DATA

It is widely accepted that digital maps have become an indispensable guide for daily travel. Traditional road network maps are produced in time-consuming and labour-intensive ways, such as digitizing printed maps and extracting roads from remote sensing images. At present, the large volume of GPS trajectory data collected by floating vehicles makes it possible to extract highly detailed and up-to-date road network information. Road intersections are often accident-prone areas and are critical to route planning, and the connectivity of road networks is mainly determined by the topological geometry of road intersections. A few studies have paid attention to detecting complex road intersections and mining the attached traffic information (e.g., connectivity, topology and turning restrictions) from massive GPS traces. To the authors' knowledge, recent studies mainly used high-frequency (1 s sampling rate) trajectory data to detect crossroad regions or extract rough intersection models. It is still difficult to use low-frequency (20-100 s), easily available trajectory data to model complex road intersections geometrically and semantically. This paper therefore attempts to construct precise models of complex road intersections from low-frequency GPS traces. We propose to first extract the complex road intersections with an LCSS-based (Longest Common Subsequence) trajectory clustering method, then delineate their geometric shapes with a K-segment principal curve algorithm, and finally infer the traffic constraint rules inside the complex intersections.
* Corresponding author


INTRODUCTION
Digital maps are an indispensable tool for human production and life. They play an important role in the daily life of residents and in the construction of smart cities in China, and this is especially true of road network maps. High-precision and up-to-date road network maps are an important prerequisite for many intelligent transportation applications (e.g., driving safety assistance and smart traffic management). At present, digital maps are often produced by commercial companies that use special vehicles equipped with high-frequency, high-precision GPS loggers to collect road network data. These processes can be expensive, time-consuming and labour-intensive, and the resulting maps may not be up to date. With the development of wireless sensor and positioning technology, a large amount of location-related trajectory data is being produced (Goodchild, 2007), which can be used for digital map production, especially the GPS trajectory data collected by floating vehicles. These vehicles are equipped with single-frequency GPS loggers that record the vehicle location at regular intervals (Xintao Liu & Ban, 2013; Sainio, Westerholm, & Oksanen, 2015). The data contain a complete record of each vehicle's travel path, which carries a wealth of information about the road (such as steering restrictions, speed limits and road intersections) and gives fast access to road geometry and road semantic information.
Road intersections are often accident-prone areas because they connect different roadways, and they are critical to route planning in a routable road map (Wang, Wang, Song, & Raghavan, 2017). Road intersections take different shapes, such as T-shaped, cross-shaped, Y-shaped and ring-shaped junctions and overpasses. A few studies have paid attention to detecting road intersections and mining the attached traffic information (e.g., connectivity, topology and turning restrictions) from massive GPS traces. To the authors' knowledge, recent studies mainly used high-frequency (1 s sampling rate) trajectory data to extract rough intersection models that consider the complex shape of intersections (Wang et al., 2015; Wang et al., 2017). It is still difficult to use low-frequency (20-100 s), easily available trajectory data to model complex road intersections geometrically and semantically, especially overpasses, because of the low-frequency sampling problem and the complex geometry and traffic modes of road intersections (Figure 1). We detect the traffic rules of each cluster within each intersection, and then fit K-segment principal curves to the clusters of each intersection to extract its geometric shape while accounting for the low-frequency sampling problem.

The remainder of this paper is organized as follows. Section 2 reviews recent studies on road network and intersection modelling. Section 3 describes the proposed method in detail. Section 4 presents a series of intersection modelling experiments using datasets from Wuhan. The last section gives our conclusions and discussion.

RELATED WORKS
Road map construction is an active topic. Existing research can be divided into three categories. 1) Rasterization methods: this type of research converts GPS tracks to a raster map and then uses digital image processing methods such as mathematical morphology to obtain the lines of road segments (Chen & Cheng, 2008; Shi, Shen, & Liu, 2009). The resulting road map image does not contain topological information, so it is hard to use for route planning, and the low-frequency sampling problem is not well solved, especially for different kinds of road intersections. 2) Incremental methods: this type of research (Bruntrup, Edelkamp, Jabbar, & Scholz, 2005; Cao & Krumm, 2009; Kuntzsch, Sester, & Brenner, 2016; Li, Qin, Xie, & Zhao, 2012; Wang et al., 2015; Tang et al., 2015) adds one trajectory at a time to an existing road network. The input points of the trajectory and the existing road graph are adjusted with a physical attraction model (Cao & Krumm, 2009; Wang et al., 2015) or with weighted edges of the Delaunay triangulation of the input track and the existing graph (Tang et al., 2015) to determine whether a new node or edge must be created. These methods are computationally costly, are not well suited to modelling road intersections from low-frequency tracks, and sometimes need an existing road map (Zhang, Thiemann, & Sester, 2010). 3) Trajectory clustering: this type of research (Guo, Iwamura, & Koga, 2007; Karagiorgou & Pfoser, 2012; Xuemei Liu et al., 2012; Schroedl, Wagstaff, Rogers, Langley, & Wilson, 2004) computes road centrelines by spatially clustering nearby trajectories using a similarity measure between trajectories. These methods sometimes need an existing road map and cannot distinguish similar roads that are close to each other (Xuemei Liu et al., 2012). Traditional similarity measures cannot handle the low-frequency sampling problem, and road intersections are simply defined as the places where road centrelines meet. These studies rarely model complex intersections, such as overpasses.
At present, road intersections and roadways are often modelled separately to generate a routable road map (Wang et al., 2015); in this approach the intersections are not modelled geometrically and only their traffic rules are extracted (Wang et al., 2017). The shortcomings of these studies are that they need high-frequency (1 s) trajectory data and that only cross-type road intersections are modelled. However, high-frequency trajectory data are still difficult to obtain, which greatly limits the application of the methods proposed in these articles.
For these reasons, this paper proposes a novel method that makes full use of low-frequency vehicle GPS trajectory data to extract the traffic rules of different kinds of complex road intersections. The longest common subsequence and the directions of trajectories are used together to measure the similarity between trajectories, so that tracks within intersections are clustered more reliably under complex traffic modes and low-frequency sampling. For the geometric shape of intersections, K-segment principal curve fitting is used, again taking the low-frequency sampling problem into account.

Trajectory clustering within road intersections
Trajectories within each intersection are clustered to obtain the traffic rules and, further, to delineate the geometry of road intersections. In this paper, a similarity measure combining the longest common subsequence (LCSS) and the directions of trajectories is proposed, so that opposite travel directions on the same road can be distinguished, such as the two travel directions b→d and d→b in Figure 1-a. After track resampling, the common subsequence is defined with r as the distance threshold (Figure 2-b): a subsequence $s_i$ of T1 is common with T2 when each of its points $p_j$ satisfies $dist(p_j, T_2) \le r$, where $dist(p_j, T_2)$ is the shortest distance between $p_j$ and the points of T2. As illustrated in Figure 2-b, the common subsequence lcss1 of T1 with T2 consists of two subsequences, and its total length is the sum of their lengths. The longest-common-subsequence similarity $sim_{lcss}(T_1, T_2)$ is then defined from these lengths. Under this definition, the similarity can effectively measure how alike vehicle trajectories are under low-frequency sampling, as shown in Figure 1-c: trace1 and trace2 have the same travel direction, and their similarity can be clearly measured with this method.
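The threshold-based common subsequence described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, coordinates are assumed to be planar (for example projected metres), and the normalisation of the common length by the total length of T1 is an assumption, since the exact normalisation used in the paper is not reproduced here.

```python
import math

def path_length(points):
    """Total polyline length of a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def dist_to_track(p, track):
    """Shortest distance from point p to the points of a track."""
    return min(math.dist(p, q) for q in track)

def common_subsequences(t1, t2, r):
    """Maximal runs of consecutive points of t1 that lie within r of t2."""
    runs, current = [], []
    for p in t1:
        if dist_to_track(p, t2) <= r:
            current.append(p)
        else:
            if len(current) > 1:
                runs.append(current)
            current = []
    if len(current) > 1:
        runs.append(current)
    return runs

def lcss_similarity(t1, t2, r):
    """Common length of t1 with t2 divided by the length of t1.

    ASSUMPTION: this normalisation is chosen for illustration only.
    """
    total = path_length(t1)
    if total == 0:
        return 0.0
    common = sum(path_length(run) for run in common_subsequences(t1, t2, r))
    return common / total
```

With resampled tracks, a value near 1 indicates that almost all of T1 runs within r of T2, while a value near 0 indicates that the two tracks share little of their route.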
However, the longest-common-subsequence similarity cannot distinguish travel routes in opposite directions on the same road. The direction similarity of vehicle trajectories is therefore defined as the similarity of the direction changes over the whole driving trajectory. In this paper, a "distance-direction" function F is established to measure the similarity of track directions. The x-axis is the distance from the first track point to the track point in question, normalized to 0-1, and the y-axis is the heading direction; the normalization makes it possible to compare the directions of tracks of different lengths. The direction similarity of two tracks, $sim_{ori}(T_1, T_2)$, is measured from the differences between their heading directions. The overall similarity between T1 and T2 is defined as $sim(T_1, T_2) = 0.5 \cdot sim_{lcss}(T_1, T_2) + 0.5 \cdot sim_{ori}(T_1, T_2)$.
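A minimal Python sketch of a distance-direction comparison is given below. It treats each track as a step function of heading over normalised travelled distance and compares two such functions at evenly spaced sample positions. The function names, the number of samples, and the mapping of the mean heading difference to a 0-1 similarity are assumptions made for illustration; only the equal 0.5/0.5 weighting of the two similarities is taken from the text.

```python
import math

def heading_function(track):
    """Return [(x_start, heading_deg)] per segment, with x normalised to 0-1."""
    seg_len = [math.dist(a, b) for a, b in zip(track, track[1:])]
    total = sum(seg_len)
    if total == 0:
        raise ValueError("track has zero length")
    steps, travelled = [], 0.0
    for (a, b), length in zip(zip(track, track[1:]), seg_len):
        heading = math.degrees(math.atan2(b[1] - a[1], b[0] - a[0])) % 360.0
        steps.append((travelled / total, heading))
        travelled += length
    return steps

def heading_at(steps, x):
    """Heading of the segment that contains normalised distance x."""
    value = steps[0][1]
    for start, heading in steps:
        if start <= x:
            value = heading
    return value

def angular_difference(a, b):
    """Smallest absolute difference between two headings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def direction_similarity(t1, t2, samples=50):
    """ASSUMPTION: similarity = 1 - (mean heading difference / 180)."""
    f1, f2 = heading_function(t1), heading_function(t2)
    xs = [(i + 0.5) / samples for i in range(samples)]
    diffs = [angular_difference(heading_at(f1, x), heading_at(f2, x)) for x in xs]
    return 1.0 - (sum(diffs) / samples) / 180.0

def overall_similarity(sim_lcss, sim_ori):
    """Equal-weight combination of the two similarities, as in the text."""
    return 0.5 * sim_lcss + 0.5 * sim_ori
```

Two tracks along the same road in opposite directions would obtain a direction similarity near 0 under this sketch, even though their LCSS similarity is high, which is the case the combined measure is meant to separate.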
The overall dissimilarity matrix between every pair of trajectories is calculated and used in a hierarchical clustering framework (Müllner, 2013). In particular, the Davies-Bouldin index (DBI) is introduced to adaptively determine the number of trajectory clusters. It has been demonstrated that the DBI can efficiently balance the compactness of individual clusters and the degree of isolation between clusters (Davies & Bouldin, 1979). In detail, the DBI with K clusters is calculated as $DBI = \frac{1}{K} \sum_{i=1}^{K} \max_{j \ne i} \frac{EC_i + EC_j}{d(v_i, v_j)}$, with $EC_i = \frac{1}{N_i} \sum_{T \in C_i} d(T, v_i)$, where $N_i$ ($N_j$), $v_i$ ($v_j$) and $EC_i$ ($EC_j$) indicate the number of trajectories, the centroid and the compactness coefficient of an individual cluster $C_i$ ($C_j$), respectively, and K is the total number of clusters. We calculate the DBI for cluster numbers from n1 to n2 and select the number with the smallest DBI value as the optimal cluster number.
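The selection of the cluster number can be sketched in Python as follows. This is a sketch under several assumptions rather than the implementation used in the experiments: a naive average-linkage agglomeration stands in for the hierarchical clustering (the linkage criterion is not stated in the text), the medoid of each cluster stands in for its centroid because only a dissimilarity matrix is available, and all function names are hypothetical.

```python
def average_linkage(dissim, k):
    """Merge clusters until k remain; returns lists of trajectory indices."""
    clusters = [[i] for i in range(len(dissim))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = sum(dissim[i][j] for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def medoid(cluster, dissim):
    """Member with the smallest summed dissimilarity to its cluster.

    ASSUMPTION: used in place of the centroid v_i.
    """
    return min(cluster, key=lambda i: sum(dissim[i][j] for j in cluster))

def davies_bouldin(clusters, dissim):
    """Davies-Bouldin index computed from a dissimilarity matrix."""
    meds = [medoid(c, dissim) for c in clusters]
    scatter = [sum(dissim[m][j] for j in c) / len(c)
               for m, c in zip(meds, clusters)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max((scatter[i] + scatter[j])
                     / max(dissim[meds[i]][meds[j]], 1e-12)
                     for j in range(k) if j != i)
    return total / k

def best_clustering(dissim, k_min, k_max):
    """Clustering with the smallest index for k in [k_min, k_max], k_min >= 2."""
    if k_min < 2:
        raise ValueError("the index needs at least two clusters")
    candidates = [average_linkage(dissim, k) for k in range(k_min, k_max + 1)]
    return min(candidates, key=lambda c: davies_bouldin(c, dissim))
```

In practice the dissimilarity matrix would hold 1 - sim(T1, T2) for every pair of trajectories in an intersection, and the naive agglomeration would be replaced by an efficient library routine for large inputs.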

Traffic rules of road intersections
Each cluster obtained in Section 3.1 represents one turning mode within a road intersection. The turning restriction rule of each turning mode can be inferred from the spatial relationship between the vector from the starting point to the ending point of a cluster ($\vec{SE}$) and the points of the traces belonging to the cluster. As shown in Figure 4, whether a trace point P lies on the left or right side of $\vec{SE}$ is determined by the cross product of $\vec{SE}$ and $\vec{SP}$. If $\vec{SE} \times \vec{SP} > 0$ (that is, the z component of $\vec{SE} \times \vec{SP}$ is positive), P is on the left side of $\vec{SE}$ and the track is a right turn, as shown in Figure 4-a. If $\vec{SE} \times \vec{SP} < 0$, P is on the right side of $\vec{SE}$ and the track is a left turn, as shown in Figure 4-b. If $\vec{SE} \times \vec{SP} = 0$, P lies on the line of $\vec{SE}$ and the track is heading straight, as shown in Figure 4-c. U-turns are handled slightly differently: the difference between the heading directions at S and E is calculated, and the track may be a U-turn if the difference is close to ±180°, as shown in Figure 4-d.
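The cross-product test can be illustrated with the following Python sketch. The function names and the two tolerances are assumptions introduced here: with noisy GPS points the cross product is rarely exactly zero, so a perpendicular-offset tolerance (straight_tol, in the units of the coordinates) decides "straight", and uturn_tol (in degrees) decides how close to 180° the heading change must be for a U-turn. Planar coordinates with x to the east and y to the north are assumed.

```python
import math

def cross_z(s, e, p):
    """z component of SE x SP for planar points s, e, p."""
    return (e[0] - s[0]) * (p[1] - s[1]) - (e[1] - s[1]) * (p[0] - s[0])

def heading(a, b):
    """Direction of travel from a to b, in degrees within [0, 360)."""
    return math.degrees(math.atan2(b[1] - a[1], b[0] - a[0])) % 360.0

def classify_turn(track, straight_tol=5.0, uturn_tol=30.0):
    """Label one track as 'left', 'right', 'straight' or 'u-turn'.

    ASSUMPTION: both tolerances are illustrative values, not from the paper.
    """
    if len(track) < 2:
        raise ValueError("a track needs at least two points")
    s, e = track[0], track[-1]
    # U-turn: heading at the start and at the end differ by about 180 degrees.
    diff = abs(heading(track[0], track[1]) - heading(track[-2], track[-1])) % 360.0
    diff = min(diff, 360.0 - diff)
    if diff >= 180.0 - uturn_tol:
        return "u-turn"
    chord = math.dist(s, e)
    if chord == 0 or len(track) == 2:
        return "straight"
    # Signed perpendicular offset of each intermediate point from the line SE.
    offsets = [cross_z(s, e, p) / chord for p in track[1:-1]]
    extreme = max(offsets, key=abs)
    if abs(extreme) <= straight_tol:
        return "straight"
    # Points on the left of SE (positive cross product) mean a right turn.
    return "right" if extreme > 0 else "left"
```

For example, a track that first heads north and then east has its intermediate points on the left of the chord from start to end, so the sketch labels it a right turn, in agreement with Figure 4-a.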

Figure 4. Schemes of turning restriction rules inference
All trajectories in a cluster C are judged one by one according to the above schemes, and the final turning rule of cluster C is the one that accounts for the largest share of its trajectories. One intersection may have several turning modes, that is, several trajectory clusters, and thus multiple turning restriction rules.
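The majority vote over a cluster can be written in a few lines of Python. The sketch below assumes that each trajectory of the cluster has already been given a label such as "left" or "right"; the function name and the returned share are illustrative choices, not part of the original method description.

```python
from collections import Counter

def cluster_turning_rule(labels):
    """Return the most frequent turning label and its share of the cluster."""
    if not labels:
        raise ValueError("the cluster contains no labelled trajectories")
    counts = Counter(labels)
    rule, votes = counts.most_common(1)[0]
    return rule, votes / len(labels)
```

Returning the share alongside the rule makes it possible to flag clusters whose winning label has only weak support.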

Geometry and topology of intersections
Based on the clusters of different turning modes, the turning paths of these modes can be delineated to express the geometric shape of the whole road intersection. However, because of the fluctuating positional precision of GPS devices, trajectory points may not fall within the road intersection region, and the tracks of one turning mode may not reflect the true and precise geometry of the intersection (Figure 5-a), which can lead to incorrect delineation of the turning paths (highlighted in Figure 5-a). It is still challenging to fit real turning paths with widely available low-frequency trajectory data, and, on the other hand, high-frequency GPS data are expensive to obtain. To reduce the inevitable noise in GPS data, we generate K-segment principal curves from the discrete track points (Figure 5-b) rather than from trajectory lines. A principal curve, as proposed by Hastie (1984), is defined as a self-consistent curve that passes through the middle of the distribution of the observed point data. The K-segment fitting algorithm of Verbeek et al. (2002) is implemented to delineate the turning paths of the different turning modes, because the trajectory clusters of complex intersections contain crossing patterns (Figure 5-b) that other kinds of principal curve algorithms handle poorly. The curves need to be merged after the clusters of an intersection have been fitted, because the individual curves are not the exact geometry of the road intersection; they only express its traffic modes. For crossroads, T-junctions and overpasses, the curves that turn left or right are merged into the straight curves, and the straight curves are then merged together (Figure 6). The longest common subsequence and the heading similarity are used to detect adjacent curves with similar directions. Roundabouts are handled slightly differently. First, the longest common subsequences with similar directions are detected to obtain the "circle" of the intersection, and each trajectory is separated into an entry part and an exit part. Among the entry parts that join the circle, the shortest one is taken as the final entry part, and likewise for the exit parts. Then the entry and exit parts are clustered to obtain the road shapes, and the points on the circle are fitted to obtain the ring of the roundabout.

Study area and datasets description
The final result of this experiment is shown in Figure 8, with all kinds of road intersections detected.

Overpass
The geometry of overpasses is generally complex. The crossing roads of an overpass are generally at different heights, so special ring-shaped ramps for left and right turns need to be built at the junctions. The K-segment principal curve is used to fit the points of each cluster; Figure 10 shows the fitting result, the final extracted geometry, and the entry and exit points of the left/right turning roads of Guanggu Square after merging. The topology of the roads of the roundabout is preserved, which is suitable for navigable and routable road networks. The final map of the roundabout is close to the actual geometry of the road intersection.

CONCLUSIONS
In this paper, we propose a method to extract complex intersections from low-frequency trajectory data. We use the longest common subsequence and the travel directions of trajectories to measure the similarity between trajectories, which provides a basis for reasonable clustering of trajectories. Then the traffic rules at road intersections are extracted, and the K-segment principal curve fitting algorithm is used to extract the geometric shape of the road intersections from low-frequency tracking data. The experimental results show the effectiveness of the proposed method: it can extract the detailed geometric structure and traffic rules of road intersections. Our research is important because the connectivity of road networks is mainly determined by the topological geometry of road intersections, which is of great significance for the automatic production of navigable and routable electronic maps. The authors believe that the continuous enrichment of semantic and geometric information is an important development direction for future electronic maps, which are the foundation of smart cities.

Figure 1. Traffic model at complex road intersections (a: Traffic model; b: Vehicle driving routes from north to east; c: Low-frequency GPS track data of vehicles from north to east)

Therefore, it is a complicated task to model these complex intersections using low-frequency GPS trajectory data, because of the sparse sampling of the trajectories and the complicated traffic of overpasses. In this paper, we are motivated to model different kinds of road intersections using low-frequency GPS tracks, and we aim to complete the following work:
• The tracks within each intersection are clustered, using the longest common subsequence to measure the similarity of trajectories, considering the low-frequency sampling problem and the different traffic modes along each road.
Figure 2. Illustration of trajectory resampling (a: Original trajectory data; b: Trajectory data after resampling)

Figure 3. Two example tracks and the definition of the turning function of the two tracks

In this paper, four turning restriction rules, namely turn left, turn right, go straight, and U-turn, are mined with the proposed method.
Figure 5. Road intersections with low-frequency sampling tracks of cluster 11 (a: Low-frequency trajectories of a complex road intersection and its central trajectory; b: Principal curves of tracks)

Figure 6. The merging process of principal curves (a-c: merging process of crossroad; d-e: merging process of roundabout)

4. EXPERIMENTS AND DISCUSSION
4.1 Study area and datasets description
In this paper, a taxi tracking dataset covering part of Wuhan city (114.3°E, 30.48°N) on May 1, 2014, with about 0.6 million taxi tracking points in total, is used in the experiment to verify the effectiveness of the method. The sampling interval of the trajectory data is 20-100 seconds and the positioning precision is 5-20 meters.

Figure 7. Study area and tracks within the area in Wuhan.

Figure 8. Final geometry and traffic rules of different road intersections (red: entrances; blue: exits)

Figure 9-a shows the original trajectories of the Qingling overpass in Wuhan, Figure 9-b shows the clustering result, and each cluster of the result is shown in Figure 9-d-n.

Figure 9. Clustering results of Qingling overpass in Wuhan

Figure 10. Geometry of overpass and roundabout using K-segment fitting

Figure 11. Clustering results of Guanggu square roundabout in Wuhan (a: original tracks; b: clustering result; c: noise of cluster; d-n: each cluster is shown)