SEMANTIC LABELLING OF ROAD FURNITURE IN MOBILE LASER SCANNING DATA

Semantic labelling of road furniture is vital for large-scale mapping and autonomous driving systems. Much research has addressed road furniture interpretation in both 2D images and 3D point clouds. However, precise interpretation of road furniture in mobile laser scanning data remains largely unexplored. In this paper, a novel method is proposed to interpret road furniture based on the logical relations and functionalities of its components. Our work represents the most detailed interpretation of road furniture in mobile laser scanning data. Of the poles, 93.3% are correctly extracted and all of them are correctly recognised. Of the street light heads, 94.3% are detected and 76.9% of the detections are correctly identified. Despite errors arising from the recognition of other components, our framework provides a promising solution for automatically mapping road furniture at a detailed level in urban environments.


INTRODUCTION
Road furniture interpretation, which is significant for both road safety and large-scale mapping, has received much attention in recent years. Road furniture comprises entities such as traffic signs and traffic lights mounted on the road. The distribution of traffic lights and street lights has a strong effect on road safety. For instance, in Europe and the USA, departments of transportation have established protocols to regulate road infrastructure inventory in order to reduce traffic accidents. Road furniture, as an essential part of the road environment, also plays an important role in large-scale mapping, which can provide supporting services for autonomous driving systems, especially in bad weather conditions. Although much attention has been paid to the semantic labelling of road furniture, existing methods still interpret road furniture at the object level without further detail. Interpretation of road furniture in 2D images has been investigated. However, road furniture components labelled in 2D are not precise enough to generate 3D maps of road furniture by dense matching. Current work on mapping road infrastructure relies on visual interpretation and manual labelling, which is tedious and time-consuming. Therefore, fully automatic road furniture interpretation is urgently needed.
Much research has been carried out on road furniture recognition in point clouds. However, little attention has been paid to interpreting road furniture at the functional component level, namely the semantic labelling of road furniture components based on their functionalities. In this paper, we propose a method to semantically label road furniture based on topological relations and features. One example of road furniture interpreted by our algorithm is shown in Fig. 1.

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W7, 2017. ISPRS Geospatial Week 2017, 18-22 September 2017, Wuhan, China

RELATED WORK
Model-driven methods represent an early attempt to recognise objects alongside roads from Mobile Laser Scanning (MLS) data. Several techniques for recognising structures in point clouds have been reviewed by Vosselman et al. (2004), covering smooth surfaces, planar surfaces and parameterised shapes. These segmentation techniques have been widely used to model industrial installations, city landscapes, digital elevation models and trees. Xiong et al. (2011) propose a sequenced predictor for 3D scene analysis. However, the precision of pole and tree trunk recognition using M3N is low compared with the identification of other categories. Velizhev et al. (2012) present a method based on implicit shape models (ISM) to automatically localise and recognise cars and light poles. The spin image (SI) descriptor is employed as the feature representation for recognition. Yokoyama et al. (2013) propose a method to detect and classify pole-like road furniture from MLS data. Both the shape features of pole-like objects and the distribution of surrounding pole-like objects are used in this method. Yang et al. (2013), Huang and You (2015), Soilán et al. (2016) and Lehtomäki et al. (2016) employ SVM in combination with defined features to classify point clouds of urban scenes. Random forest is adopted with hand-crafted features to identify objects from MLS data by Fukano et al. (2015) and Hackel et al. (2016). Weinmann et al. (2015) propose an optimal-feature-based method to classify urban environment objects into different categories by using random forest. Yu et al. (2016). However, this method cannot undertake the semantic labelling of complex connected street furniture. A 3D convolutional neural network (ConvNet) is introduced by Song and Xiao (2016) to detect objects in RGB-D images. A 3D Region Proposal Network (RPN) and an Object Recognition Network (ORN) are first proposed in their work to learn objectness from geometric shapes and to extract geometric features in 3D and colour features in 2D.
Much effort has been put into road furniture recognition in point clouds. Compared with this research, the interpretation of road furniture in this paper is more detailed. In our research, we focus on the interpretation of pole-like street furniture, which consists of street lights, traffic lights and three types of traffic-related signs.

METHODOLOGY
In this research, a method is proposed to assign meaningful labels to decomposed road furniture. Decomposed road furniture is obtained by extracting poles and separating the components attached to them, which is explained in detail in our previous work. The methodology is described in three sections. In the first section (Section 3.1), we introduce the features used to distinguish different types of road furniture components. Section 3.2 describes the formulation of generic rules for the recognition of road furniture components. The process of road furniture semantic labelling is explained in the last section (Section 3.3).

Features notation
We first obtain the input data from the result of the decomposition, which is explained in our previous work (Li et al., 2016). In this method, seven discriminant features are used to differentiate the components attached to poles. The features are explained as follows.
Relative position. The relative position between a pole and its attachments is summarised as bottom, middle or top. It describes the topological relation between poles and their attachments, where attachments are the components connected to poles. The calculation of this feature is based on the percentage of the attachment that lies above the highest point or underneath the lowest point of the attached pole. If the percentage of the attachment that lies above the highest position of the pole is higher than a predefined threshold, the attachment is defined to be at the top of this pole. If the lowest position of the attachment is close to or lower than the lowest position of the pole, the attachment is defined to be at the bottom of this pole. Otherwise, the relative position is set to middle. The feature is designed to exclude attachments at the bottom of the pole, such as ground points.
Relative height. This feature is the relative height between an attachment and its attached pole. It is computed as the lowest height of the attachment minus the lowest height of the attached pole. This feature is reliable because the lowest height of an attachment reflects the usage of this attachment. It is the main constraint for a street light head connected to a vertical pole.
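The relative position and relative height features described above can be illustrated with a minimal sketch. It assumes that the heights (z values) of the pole points and the attachment points are available as plain lists; the function names and the values of `top_fraction` and `bottom_tolerance` are hypothetical and are not taken from the paper.

```python
def relative_position(attachment_z, pole_z, top_fraction=0.5, bottom_tolerance=0.1):
    """Return 'top', 'bottom' or 'middle' for an attachment relative to its pole.

    top_fraction and bottom_tolerance (metres) are illustrative assumptions.
    """
    pole_low, pole_high = min(pole_z), max(pole_z)
    # Share of attachment points lying above the highest pole point.
    share_above = sum(1 for z in attachment_z if z > pole_high) / len(attachment_z)
    if share_above > top_fraction:
        return "top"
    # Lowest attachment point close to or below the lowest pole point.
    if min(attachment_z) <= pole_low + bottom_tolerance:
        return "bottom"
    return "middle"


def relative_height(attachment_z, pole_z):
    """Lowest height of the attachment minus lowest height of the pole."""
    return min(attachment_z) - min(pole_z)
```

For example, an attachment whose points mostly lie above a 5 m pole would be labelled "top", while a cluster near the pole base would be labelled "bottom" and could be excluded as ground points.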
Geometric structure. This feature indicates the geometric dimensionality of an attachment, which is linear, planar or scattered. We use the definition of geometric features described by Vosselman et al. (2013), which is based on the three eigenvalues of the covariance matrix of the points of an attachment; these eigenvalues correspond to an orthogonal system of eigenvectors. The three eigenvalues are calculated for every attachment, and G_P denotes the resulting geometric structure of the attachment.
This feature describes the geometric shape of attachments. It is helpful for recognising planar components such as traffic signs.
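As an illustration of an eigenvalue-based geometric structure feature, the sketch below labels a point set as linear, planar or scattered. It is not the exact definition used in the paper: the linearity, planarity and scattering scores are the commonly used eigenvalue ratios and are an assumption here, as are all function names. The eigenvalues of the symmetric 3x3 covariance matrix are computed with the standard closed-form trigonometric solution so that no external library is needed.

```python
import math


def covariance_3x3(points):
    """Covariance matrix of a list of (x, y, z) points."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    cov = [[0.0] * 3 for _ in range(3)]
    for p in points:
        d = [p[i] - mean[i] for i in range(3)]
        for i in range(3):
            for j in range(3):
                cov[i][j] += d[i] * d[j] / n
    return cov


def eigenvalues_sym3(a):
    """Eigenvalues of a symmetric 3x3 matrix, in descending order."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # matrix is already diagonal
        return sorted((a[0][0], a[1][1], a[2][2]), reverse=True)
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # guard against rounding
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    e2 = 3.0 * q - e1 - e3
    return [e1, e2, e3]


def geometric_structure(points):
    """Label a point set 'linear', 'planar' or 'scattered' (assumed scores)."""
    e1, e2, e3 = eigenvalues_sym3(covariance_3x3(points))
    if e1 <= 0.0:
        raise ValueError("all points coincide; structure is undefined")
    scores = {
        "linear": (e1 - e2) / e1,
        "planar": (e2 - e3) / e1,
        "scattered": e3 / e1,
    }
    return max(scores, key=scores.get)
```

With these scores, points sampled along a bar give "linear", points on a sign plate give "planar", and points filling a volume give "scattered".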
Relative angle. This feature is the angle between the normal of an attachment and the principal direction of the pole to which it is connected. As depicted in Fig. 2, V1 is the principal direction of the pole and V2 is the direction of the normal of the attachment. This feature is mainly used to distinguish attached signs from other components.
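A minimal sketch of the relative angle computation is given below, assuming that the principal direction of the pole (V1) and the normal of the attachment (V2) are already available as 3D vectors. The function name is hypothetical. Because the sign of an estimated normal is arbitrary, the sketch folds the angle into the range from 0 to 90 degrees, which is an assumption and not something the text specifies.

```python
import math


def relative_angle_deg(pole_direction, attachment_normal):
    """Angle in degrees (0 to 90) between a pole direction and an attachment normal."""
    dot = sum(a * b for a, b in zip(pole_direction, attachment_normal))
    norm_v1 = math.sqrt(sum(a * a for a in pole_direction))
    norm_v2 = math.sqrt(sum(b * b for b in attachment_normal))
    if norm_v1 == 0.0 or norm_v2 == 0.0:
        raise ValueError("direction vectors must be non-zero")
    # abs() removes the ambiguity in the sign of the normal.
    cos_angle = min(1.0, abs(dot) / (norm_v1 * norm_v2))
    return math.degrees(math.acos(cos_angle))
```

A sign plate mounted on a vertical pole has a roughly horizontal normal, so its relative angle is close to 90 degrees, which corresponds to the "perpendicular" condition used in the rules.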
Ratio of high reflectance. This simple feature is the ratio of highly reflective points in a component. If the reflectance of an individual point is higher than a threshold, the point is marked as highly reflective. The ratio of highly reflective points in the component is then computed. Because traffic functional signs have high reflectance, this feature is adopted to distinguish traffic functional signs from other signs.
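The ratio of high reflectance can be sketched in a few lines. The function name is hypothetical, and the input is assumed to be a list of per-point reflectance values for one component; the threshold is dataset dependent.

```python
def high_reflectance_ratio(reflectances, threshold):
    """Share of points in a component whose reflectance exceeds the threshold."""
    if not reflectances:
        raise ValueError("component has no points")
    high = sum(1 for r in reflectances if r > threshold)
    return high / len(reflectances)
```

For instance, `high_reflectance_ratio([70, 80, 10, 20], 65)` counts two of four points as highly reflective.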
Figure 2. The relative angle between the principal direction of a pole and the normal of an attachment

Size. This feature is designed for planar attachments. Size is the area of the concave shape of an attachment after it is projected along its normal direction. This feature is used to distinguish different types of traffic functional signs.
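The size feature can be approximated as follows. This sketch deliberately swaps in a convex hull (Andrew's monotone chain) for the concave shape mentioned in the text, because a convex hull is simple to compute without external libraries; it therefore overestimates the area of non-convex signs. All function names are hypothetical, and the normal of the attachment is assumed to be known.

```python
import math


def _unit(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project_to_plane(points, normal):
    """Project 3D points along the normal into 2D in-plane coordinates."""
    n = _unit(normal)
    # Choose a helper axis that is not parallel to the normal.
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = _unit(_cross(n, helper))
    v = _cross(n, u)
    return [(_dot(p, u), _dot(p, v)) for p in points]


def convex_hull(points_2d):
    """Andrew's monotone chain; returns the hull vertices in order."""
    pts = sorted(set(points_2d))
    if len(pts) < 3:
        return pts

    def turn(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and turn(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and turn(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(polygon):
    """Shoelace formula for a simple polygon."""
    area = 0.0
    for i in range(len(polygon)):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % len(polygon)]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0


def attachment_size(points, normal):
    """Convex-hull area of an attachment projected along its normal."""
    return polygon_area(convex_hull(project_to_plane(points, normal)))
```

For a rectangular plate of 1 m by 2 m whose normal points along the y axis, `attachment_size` returns an area of about 2 square metres, regardless of small offsets of the points along the normal.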

Ratio of height to length. This feature gives the proportion between the height of an attachment and its largest extent in the horizontal plane. It is used to differentiate street signs from other signs.
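A sketch of the ratio of height to length is shown below. It interprets the largest variation in the horizontal plane as the largest horizontal distance between any two points of the attachment, which is an assumption; the function name is hypothetical, and the brute-force pairwise search is only suitable for small components.

```python
import math


def height_to_length_ratio(points):
    """Vertical extent divided by the largest horizontal point-to-point distance."""
    zs = [p[2] for p in points]
    height = max(zs) - min(zs)
    # Largest horizontal distance between any two points (O(n^2) brute force).
    length = max(math.hypot(a[0] - b[0], a[1] - b[1]) for a in points for b in points)
    if length == 0.0:
        raise ValueError("attachment has no horizontal extent")
    return height / length
```

An elongated street name plate yields a ratio well below 1, whereas a square or round traffic sign yields a ratio close to 1.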

Formulation of rules and features
In this paper, we categorise the components attached to street furniture into five classes: street lights, traffic signs, street signs (direction signs), traffic information signs and traffic lights. Instances of these components are shown in Table 1. In order to recognise components of road furniture, we characterise them by generic rules based on traffic regulations. Then, based on generic rules for assembling road furniture, we determine the topological relations between poles and their attached components. In this paper, the connectivity between a pole and an attachment is decided by their minimum distance. If this distance is smaller than a threshold, the pole and the attachment are considered to be connected.
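The connectivity test described above can be sketched as a minimum-distance check between two point sets. The function names and the default `max_gap` value are hypothetical. The brute-force search is quadratic in the number of points; for real MLS data a spatial index such as a k-d tree would normally be used instead.

```python
import math


def min_distance(pole_points, attachment_points):
    """Smallest Euclidean distance between any pole point and any attachment point."""
    return min(math.dist(p, a) for p in pole_points for a in attachment_points)


def is_connected(pole_points, attachment_points, max_gap=0.1):
    """True if the minimum distance is below the threshold (max_gap is an assumption)."""
    return min_distance(pole_points, attachment_points) < max_gap
```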
The following rules are defined to assign a semantic label to each component of road furniture. We start with the rules for components connected to a vertical pole.
Street lights connected to a vertical pole (R1): If a component is connected to a vertical pole, its relative height is larger than a threshold Hsl and it is at the top of this pole, this component is a street light. Hsl is the threshold of the discriminant feature used to recognise street light heads.

Traffic signs connected to a vertical pole (R2): If a component is connected to a vertical pole, it is not at the bottom of this pole, its relative angle is perpendicular, it is linear or planar, its area is smaller than Ats, its ratio (height to length) is close to 1 and its ratio of high reflectance points is larger than Rts, this component is a traffic sign. Ats is important for differentiating traffic signs from other traffic functional signs.
Street signs / direction signs connected to a vertical pole (R3): The conditions are the same as for traffic signs, except that the area is smaller than Ass and the ratio (height to length) is smaller than Rss. The component is then a street sign or direction sign. The ratio of height to length is significant for distinguishing street signs from other traffic functional signs.

Traffic information signs connected to a vertical pole (R4): Traffic information signs are usually large. The conditions are the same as for street signs, except that the area is larger than max(Ats, Ass) and there is no constraint on the ratio of height to length. The component is then a traffic information sign.

Traffic lights connected to a vertical pole (R5): If a component is connected to a vertical pole, it is not planar and its relative height is smaller than Hsl, this component is a traffic light.
The following rules apply to components connected to a horizontal pole.

Street signs / direction signs connected to a horizontal pole (R6): If a component is connected to a horizontal pole, its relative angle is perpendicular, it is linear or planar, its area is smaller than Ats, its ratio (height to length) is smaller than Rss and its ratio of high reflectance points is larger than Rts, this component is a street sign or direction sign.

Traffic information signs connected to a horizontal pole (R7): The conditions are the same as for street signs, except that the area is larger than max(Ats, Ass) and there is no constraint on the ratio of height to length. The component is then a traffic information sign.

Traffic lights connected to a horizontal pole (R8): If a component is connected to a horizontal pole and it is not planar, this component is a traffic light.
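Rules R1 to R8 can be read as a small decision procedure. The sketch below is one possible interpretation and not the authors' implementation. The class and field names, the order in which the rules are evaluated, the angular tolerance for "perpendicular" and the tolerance for "close to 1" are all assumptions. R6 is coded with the area threshold Ats as the rule states and with the label given in its heading. Components matching no rule receive no label.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attachment:
    pole_orientation: str      # "vertical" or "horizontal"
    position: str              # "bottom", "middle" or "top"
    relative_height: float     # metres
    structure: str             # "linear", "planar" or "scattered"
    relative_angle_deg: float  # angle between normal and pole direction
    reflectance_ratio: float   # share of highly reflective points
    area: float                # square metres
    height_to_length: float


@dataclass
class Thresholds:
    h_sl: float                   # Hsl
    a_ts: float                   # Ats
    a_ss: float                   # Ass
    r_ss: float                   # Rss
    r_ts: float                   # Rts
    angle_tol_deg: float = 15.0   # assumed tolerance for "perpendicular"
    square_tol: float = 0.3       # assumed tolerance for "close to 1"


def _sign_like(a: Attachment, t: Thresholds) -> bool:
    """Conditions shared by the sign rules."""
    perpendicular = abs(a.relative_angle_deg - 90.0) <= t.angle_tol_deg
    return (perpendicular and a.structure in ("linear", "planar")
            and a.reflectance_ratio > t.r_ts)


def label_attachment(a: Attachment, t: Thresholds) -> Optional[str]:
    if a.pole_orientation == "vertical":
        if a.position == "top" and a.relative_height > t.h_sl:
            return "street light"                                   # R1
        if a.position != "bottom" and _sign_like(a, t):
            if a.area < t.a_ts and abs(a.height_to_length - 1.0) <= t.square_tol:
                return "traffic sign"                               # R2
            if a.area < t.a_ss and a.height_to_length < t.r_ss:
                return "street sign"                                # R3
            if a.area > max(t.a_ts, t.a_ss):
                return "traffic information sign"                   # R4
        if a.structure != "planar" and a.relative_height < t.h_sl:
            return "traffic light"                                  # R5
        return None
    if _sign_like(a, t):
        if a.area < t.a_ts and a.height_to_length < t.r_ss:
            return "street sign"                                    # R6
        if a.area > max(t.a_ts, t.a_ss):
            return "traffic information sign"                       # R7
    if a.structure != "planar":
        return "traffic light"                                      # R8
    return None
```

Because a linear attachment can satisfy both a sign rule and a traffic light rule, the evaluation order matters; here the sign rules are tried first, which is a design choice of this sketch.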

Semantic labelling
Based on these rules, labels are assigned to the attachments. Poles are first detected using our previous work. Based on their principal direction, poles are classified as vertical or horizontal. Then the connectivity between attached components and poles is analysed. Based on the connectivity relations, the attachments of every pole are found. The features described above are then computed for every attachment. Finally, the attachments are given labels by matching the predefined rules against the computed features. Before labels are assigned, the parameters are optimised by selecting the best combination of parameters in the training area. An example of complex road furniture interpretation is shown in Fig. 1.
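The classification of poles into vertical and horizontal from their principal direction can be sketched as follows. The function name and the 45 degree cut-off are assumptions, since the text does not state the criterion used; the z axis is assumed to point upwards.

```python
import math


def pole_orientation(principal_direction, max_tilt_deg=45.0):
    """Return 'vertical' or 'horizontal' from the tilt of a pole against the z axis."""
    x, y, z = principal_direction
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("principal direction must be non-zero")
    # abs(z) makes the result independent of the sign of the direction vector.
    tilt = math.degrees(math.acos(min(1.0, abs(z) / norm)))
    return "vertical" if tilt <= max_tilt_deg else "horizontal"
```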

EXPERIMENTAL RESULT
The experiments are carried out on two datasets, which are described in Section 4.1. The results and their analysis are presented in Section 4.2 and Section 4.3.

Test sites
In order to evaluate the performance of the proposed method, we chose two test areas. Dataset A was collected in a medium-sized city in Europe. The data acquisition system is an Optech LYNX, which comprises two laser scanners mounted at the back of a moving vehicle. Dataset B is the Paris benchmark dataset, collected by the Stereopolis II system (IGN, 2013). There are many different types of road furniture in these two areas. Dataset A covers about 1.25 km of street scene, and dataset B covers approximately 0.43 km of road scene. The point density of dataset A is high and even. The distance between neighbouring points is 0.02 m in the X direction and 0.03 m in the Y direction. In contrast, the point density of dataset B is low and uneven. The point density along the scanline direction is much higher than the point density perpendicular to the scanline direction.

Results
Because the two datasets were collected with different configurations, their reflectance thresholds also differ. In addition, the height of the lowest street light differs between the two datasets. In order to obtain the optimal set of parameters, several types of road furniture are selected for training. The most favourable combination of parameters is then obtained automatically, based on the highest F1 score of recognition in the training dataset. The F1 score is used for evaluation in the training process because it balances precision and recall. The training process aims at tuning the parameters to which our defined rules are sensitive. These parameters are the relative height of an attachment and the reflectance value of an attachment. In dataset A, the height threshold and the reflectance threshold are set to 2.4 m and 65, respectively. In dataset B, the height threshold and the reflectance threshold are set to 3.4 m and -4, respectively. The reflectance value of some points in the Paris dataset is negative because it is corrected for distance attenuation.
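The parameter selection described above can be illustrated by a one-dimensional search that picks the height threshold with the highest F1 score. This is a simplified sketch: the paper selects a combination of parameters, whereas only one threshold is tuned here, and the function names, candidate values and training samples are hypothetical toy inputs, not data from the paper.

```python
def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def best_height_threshold(samples, candidates):
    """Return (threshold, F1) for the candidate with the highest F1 score.

    samples: list of (relative_height, is_street_light) pairs.
    An attachment is predicted to be a street light if its height exceeds the threshold.
    """
    best = None
    for h in candidates:
        tp = sum(1 for height, is_light in samples if height > h and is_light)
        fp = sum(1 for height, is_light in samples if height > h and not is_light)
        fn = sum(1 for height, is_light in samples if height <= h and is_light)
        score = f1_score(tp, fp, fn)
        if best is None or score > best[1]:
            best = (h, score)
    return best


# Toy training samples (illustrative only).
toy_samples = [(6.0, True), (5.5, True), (3.0, True),
               (2.0, False), (2.2, False), (3.5, False)]
threshold, score = best_height_threshold(toy_samples, [2.4, 3.4, 4.0])
```

Extending the search to a combination of parameters amounts to iterating over the Cartesian product of the candidate values, for example with `itertools.product`.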

Analysis
In the assessment, confusion matrices are used to evaluate the performance of the proposed framework (Table 2 and Table 3). The recognition of six types of road furniture components is evaluated. In these tables, P stands for poles, S1 for street lights, S2 for street signs, T1 for traffic signs, T2 for traffic information signs and T3 for traffic lights. M denotes road furniture components that are not given any label. T is the total number of visually interpreted components. R is the recognition rate of the algorithm. Other denotes other components that are misrecognised as road furniture components. Total is the number of road furniture components recognised by the algorithm. FP is the false positive rate.

In dataset A, 93.3% of poles are extracted by the decomposition described in our previous work. All the detected poles are true positives. One horizontal pole is recognised as a street light because it is not extracted as a pole at the decomposition stage. In addition, its relative height and relative position are similar to those of street lights, which gives rise to the misrecognition. 94.3% of street light heads are detected, and 50 of the 65 detected street light heads are correct. 50.9% of traffic signs are recognised, and 27 of the 36 detected traffic signs are correct. Many traffic signs are missed because their geometric structure and reflectance attributes are not calculated reliably. Through training, the reflectance threshold is set to 65. However, because a traffic sign can only be scanned from one side when its normal is perpendicular to the driving direction, a few traffic signs do not show high reflectance. Another reason is that the geometric structure of small traffic signs is computed as scattered, because small signs are more affected by noisy points. As shown in Fig. 6, the point cloud of the traffic sign (in the red circle) is scattered because of noisy points, which causes it to be misrecognised as a street light. 41.2% of street signs are extracted, and 7 of the 12 detected are correct. 27.3% of traffic information signs are recognised, and 3 of the 4 detected are correct. The recognition rate of traffic lights is 51.6%, and 16 of the 21 detected traffic lights are correct.

The recognition rates of street signs and traffic information signs are lower than 50.0%, and the recognition rate of traffic signs is 50.9%. One reason is that there are few instances of street signs and traffic information signs, so a single error has a large influence on the recognition percentage. Another reason is that the geometric structure, size and reflectance of street signs and traffic signs are not computed correctly, because the point clouds of these small sign components are incomplete. The recognition rate of traffic lights is not high because many traffic lights are scanned incompletely, which leads to incorrect structure features. Most of them are categorised as signs because of their high planarity. More effort will be devoted to distinguishing them in future work.
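The share of correct detections implied by the counts reported above for dataset A can be checked with a short calculation. The function and variable names are hypothetical; the counts are those quoted in the text. For street light heads, 50 correct out of 65 detected corresponds to the 76.9% stated in the abstract and conclusions.

```python
def correct_share(correct, detected):
    """Percentage of detected components that are correctly labelled."""
    return 100.0 * correct / detected


# (correct, detected) counts for dataset A as reported in the text.
dataset_a_counts = {
    "street light heads": (50, 65),
    "traffic signs": (27, 36),
    "street signs": (7, 12),
    "traffic information signs": (3, 4),
    "traffic lights": (16, 21),
}

for name, (correct, detected) in dataset_a_counts.items():
    print(f"{name}: {correct_share(correct, detected):.1f}% of detections correct")
```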

CONCLUSIONS AND FUTURE WORK
To conclude, this paper provides a method to interpret road furniture at a detailed level. In dataset A, 93.3% of poles are correctly extracted and all of them are correctly recognised. 94.3% of street light heads are detected and 76.9% of the detections are correctly identified. Street light heads are therefore extracted well. In contrast, other types of components are not well identified because incompletely scanned data affect the computation of the features. Another reason is that the quality of the data affects the recognition. For example, some components in dataset B cannot even be interpreted by visual inspection. In addition, the interpreted components are very similar to each other, which makes the task even more challenging.
Although there are errors, the result still provides a promising solution to assist large-scale street furniture mapping. Future work will focus on combining the relations between components with the relations between individual road furniture objects to generate 3D road furniture models.
Figure 1. Road furniture interpretation

Figure 2. Road furniture interpretation of dataset A

Figure 6. The point cloud of a traffic sign (left) and the street view (right)

In dataset B, 78.6% of vertical poles are correctly extracted by our previous work. 77.8% of street light heads are detected, and 7 of the 11 detected are correct. 53.8% of traffic signs are identified, and 7 of the 12 detected are correct. Owing to the limited number (only 2) of traffic information signs, we do not evaluate their recognition rate. The recognition rate of street signs is low because of noisy points. Only 20.4% of traffic lights are recognised because their scanned data are incomplete (Fig. 7). The first two images from the left are lateral views. The third one is the street view image of this road furniture. From the left image, we can see these traffic lights have

Figure 7. A road furniture object containing an unrecognised traffic light (white points) in dataset B

From the result, we conclude that poles can be extracted well. Street lights can be recognised reliably if the point cloud is dense and of good quality. Some street signs are labelled as traffic lights because these street signs are still connected with each other and not separated. The reason is that the data are too noisy. As illustrated in the second image of Fig. 8, component 1 (yellow area) and component 2 (red zone) should be separated. Because of the noisy data, component 1 and component 2 are not separated by the decomposition algorithm (the first image of Fig. 8), which results in the misrecognition.

Figure 8.One example of incorrectly recognized road furniture

Table 1. Examples of attached components