ON THE CHALLENGES OF MOBILITY PREDICTION IN SMART CITIES

The mass of data generated by people's mobility in smart cities is constantly increasing, creating a new business opportunity for large companies. These data are often used for mobility prediction in order to improve services and systems such as location-based services, personalized recommendation systems, and mobile communication systems. In this paper, we identify mobility prediction issues and challenges to serve as a guideline for researchers and developers in the field. To this end, we first identify the key concepts and classifications related to mobility prediction. We then focus on the challenges in mobility prediction drawn from an in-depth literature study. These classifications and challenges are intended to support further understanding, development and enhancement of the mobility prediction vision.


INTRODUCTION
In the recent past, the emergence of smart cities and Internet of Things (IoT) systems, along with new technologies (e.g., mobile networks (MNs), sensor networks) and new tools (e.g., smartphones), has led to an impressive growth in the amount of data and information produced. This amount of data tends to multiply rapidly in smart cities, which combine these new concepts, technologies and tools. Indeed, a smart city features the use of information and communication technology (ICT) infrastructure, human resources, social capital, and environmental resources for economic development and a high quality of human life. Thus, the analysis and mining of data sensed from dynamic cities is an important step towards making a city smart (Pan et al., 2013).
Among the data flowing through such a city, those linked to individual movements (mobility data) are of great interest to a large community of researchers and developers, especially in the mobility prediction field.
Predicting a mobile user's location is an inherently interesting and challenging problem in several domains such as the development of location-based services, personalized recommendations, suspicious target tracking, intelligent transportation and mobile communication systems (MCSs). For example, in MCSs, location prediction has received increased attention driven by applications in location management, call admission control, smooth handoffs, and resource reservation for improved quality of service (Samaan, Karmouch, 2005).
However, predicting mobility requires the availability of a large amount of data from very heterogeneous sources, especially in a smart city. Two levels of collection can be distinguished according to where the data are stored. The first level concerns data acquisition by a system or an application from a mobile device, as in the Mobile Crowd Sensing and Computing (MCSC) paradigm or with the use of a recommendation application on a smartphone. In this case, the data are stored on the mobile device or in an external storage location related to the application. The second level concerns the collection of data from a storage device used by a system, such as MCSs, smart card management systems, etc. In this case, the data, in particular mobility data, are stored in specific equipment (e.g., the home location register (HLR), the visitor location register (VLR), sensor nodes, etc.) and are collected directly from this equipment. However, because of the privacy concerns raised by this type of data, access to them may be subject to constraints and conditions. A major challenge is therefore to access and recover these data. In addition, data are often stored in a raw state, such as log files (Zheng et al., 2010). So, before being exploited, data stored in log files must be transformed into other formats such as GPX or PLT.
Once data are collected and stored, prediction requires a model that is coherent with the application domain and able to provide the best mobility prediction in terms of accuracy, cost, etc. A prediction model can be built using one of the usual prediction techniques, such as Markov chains (MCs) (Amirrudin et al., 2013a; Qiao et al., 2015), machine learning (ML) (Ozturka et al., 2019), Bayesian networks (BNs) (Dash et al., 2015), and data mining (DM) techniques (Mcinerney et al., 2013), most of which learn from previous data to predict user mobility.
Although mobility prediction has been the subject of several research works, which sometimes gave acceptable results in terms of precision (Amirrudin et al., 2013b) and cost, certain issues remain open and call for new contributions.
In this paper, we focus on the main concepts related to mobility, on the data required for mobility prediction and on related works on mobility prediction. We also aim to highlight issues that can orient researchers and developers towards open questions in mobility prediction.
The remainder of this paper is organized as follows. Section 2 presents the basics of mobility prediction. Section 3 gives an overview of mobility prediction works. In Section 4, we propose a classification for mobility prediction. In Section 5, we focus on a set of challenges related to mobility prediction. Finally, Section 6 gives a conclusion and some perspectives.

DEFINITIONS AND KEY CONCEPTS
In this section, we introduce the key points of mobility prediction. We start with a definition and a classification of mobility. Then we focus on the data required for mobility prediction.

Mobility
In general, the definition of mobility can be obtained from a dictionary or an encyclopedia. In the Larousse dictionary 3, for example, the definition can be translated as follows: "A Character of what is susceptible to movement, of what can move or be moved, change place, function." In the context of communication networks, we adopt the definition given by Samuel Pierre in his book (Samuel, 2003): "In the domain of communication networks, mobility can be defined as the ability to access, from any place, all the services normally available in a fixed and wired environment such as a home or an office. These services include, among other things, the possibility of conducting a telephone conversation while driving a car, being reached from a traditional telephone or an IP address anywhere in the world, or receiving e-mail, faxes or voice messages while traveling abroad."
From our viewpoint, in the context of a smart city, mobility can be defined as any movement an entity undergoes over time, where an entity can be an object or an individual (a person). This movement can range from a simple action (moving the hand, taking a seat, getting up...) to a real trip (walking from point A to point B). Mobility is therefore a relative notion, in the sense that it is closely linked to the size of the environment in which we want to define it. For example, if the environment consists of a city, mobility is considered as the movements from a departure point to an arrival point. However, when the environment is restricted to a limited space (house, bedroom…), mobility concerns elementary actions taking place in this space. Movements can also be repeated over time, either in the same way or in different ways: we therefore speak of regular or random movements, respectively.

3 https://www.larousse.fr/dictionnaires/francais/mobilité/51890

Mobility classification
Based on the size of the environment in which the movement takes place, on the way in which the movement is carried out and on works dealing with movement, we distinguish two types of mobility classification: according to the distance travelled and according to the nature of the movement (see Figure 1).
Classification according to the distance travelled (environment size): we distinguish extended mobility and restricted mobility. Extended mobility is characterized by movements spanning a long distance, such as the movement of an individual from their home to their workplace. Restricted mobility is characterized by movements taking place in a limited area (an office, a bedroom, etc.). Most of the research works dealing with mobility, in particular mobility prediction (Jiang et al., 2016; Liou, Huang, 2005), are directed towards extended mobility. However, some works address restricted mobility (Almeida, Azkune, 2018). In a smart city, restricted mobility can be related to connected objects (IoT).
Classification according to the nature of the movement: this refers to the way in which the movement recurs. Here, we distinguish regular mobility and random mobility. Regular mobility concerns movements that recur in the same way (e.g., taking the same path to get to work). In contrast, random mobility concerns either new movements or movements that do not recur in the same way (e.g., taking two different paths to get home). The number of works directed towards regular mobility (Nadembega et al., 2014) is greater than the number of works dealing with random mobility (Liu et al., 1998), or with both types of mobility (Liu, Maguire, 1996). Also, prediction accuracy is better in the case of regular mobility (Jiang et al., 2016). Random mobility is, therefore, a real challenge in the area of mobility prediction.

Mobility prediction data
Basic prediction models rely on historical mobility data to provide predictions (Anisetti et al., 2011). Other models rely on contextual data in addition to historical mobility data (Abu-Ghazaleh, Alfa, 2009). Another category includes models that use only contextual data (Samaan, Karmouch, 2005). However, little work has been done in this latter category.
In this section, we examine the provenance, storage and exploitation of mobility data. We also define contextual data and give an overview of their use for mobility prediction.

Mobility data: provenance, storage and exploitation
The study of the mobility of any mobile object or individual mainly requires a set of data related to its location at different moments. In a smart city, the mass of mobility data circulating in the city is constantly increasing. However, before being used, these data, which come from different sources (mobile devices, vehicles, magnetic cards, sensors), must be collected and processed using specific tools, techniques and technologies (GPS, RFID, etc.).

Data provenance
In (Pan et al., 2013), the main data sources are grouped into four categories, namely mobile devices (smartphones, laptops…), vehicles equipped with global positioning system (GPS) devices, smart cards (bank cards and transport cards), and floating sensors. Other sources exist in a smart city, such as cameras installed in the city, which provide a large quantity of video. Data acquisition differs depending on the source. The most used techniques and technologies are GPS, WiFi, GSM and radio frequency identification (RFID). Based on (Pan et al., 2013), we summarize these sources and technologies in Table 1.

Data storage
Data can often be collected from specific equipment that forms part of the architecture of the system to which the data sources are linked. For example, mobile devices are necessarily linked to a mobile network, and their location data are stored in specific equipment (HLR, VLR, etc.). Also, in the case of sensors belonging to a specific sensor network, the location is stored in the sensors themselves, in the base stations or even in the processing centers. Because of the huge amount of data, edge computing solutions and artificial intelligence techniques are to be considered.

Data exploitation
Collected data are initially in a raw state. In order to be used, they must be transformed and saved in an exploitable format. For example, in the GeoLife project (Zheng et al., 2010), mobility data related to individuals' trajectories are generated in the form of GPS logs and are transformed and saved in PLT files. The main fields recorded in a GPS log file are longitude, latitude, altitude, current date and time of day. The data in the GPS log files are saved in NMEA format. To be used, they must be converted to other formats such as GPX or PLT using tools, software or websites, such as the Logcat utility, the GPSBabel software or the GPS Visualizer website. Also, when exploiting data, the consideration of data models (e.g., the FIWARE data model) is very important in the context of a smart city. Figure 2 illustrates the steps for producing usable data.
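As a minimal sketch of the conversion step described above, the following Python code turns one NMEA GGA sentence into a point expressed in decimal degrees. The names `TrackPoint` and `parse_gga` are hypothetical and do not come from the GeoLife project or from any of the tools cited; checksum validation is skipped, the date is absent because a GGA sentence carries only the time of day, and the result is not written in the PLT or GPX layout.

```python
from dataclasses import dataclass


@dataclass
class TrackPoint:
    latitude: float   # decimal degrees, negative in the southern hemisphere
    longitude: float  # decimal degrees, negative in the western hemisphere
    altitude_m: float
    utc_time: str     # hh:mm:ss taken from the sentence


def _to_decimal_degrees(value: str, hemisphere: str, degree_digits: int) -> float:
    """Convert NMEA (d)ddmm.mmmm notation to signed decimal degrees."""
    degrees = int(value[:degree_digits])
    minutes = float(value[degree_digits:])
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal


def parse_gga(sentence: str) -> TrackPoint:
    """Parse a single GGA sentence (the checksum is ignored in this sketch)."""
    body = sentence.strip().split("*")[0]
    fields = body.split(",")
    if not fields[0].endswith("GGA"):
        raise ValueError("not a GGA sentence")
    raw_time = fields[1]
    return TrackPoint(
        latitude=_to_decimal_degrees(fields[2], fields[3], 2),
        longitude=_to_decimal_degrees(fields[4], fields[5], 3),
        altitude_m=float(fields[9]),
        utc_time=f"{raw_time[0:2]}:{raw_time[2:4]}:{raw_time[4:6]}",
    )


# Illustrative sentence, not taken from a real dataset.
point = parse_gga("$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47")
```

In practice, a dedicated converter such as GPSBabel would normally be preferred; the sketch only makes the degrees-and-minutes encoding of the raw log explicit.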

Contextual data
The context has been defined in several ways. The definition most often cited in the literature is that given in (Abowd et al., 1999): "Context is any information that can be used to characterize the situation of an entity. An entity is a person, place, object that is considered relevant to the interaction between a user and an application, including the user and applications themselves". This definition, referred to as Dey's definition, was proposed in the context of interaction between a user and an application. However, nowadays, with smartphones, a system is able to give relevant information to the user without explicit interaction. Zimmermann et al. (2007) propose a new definition specifying five categories of contextual information: "Context is any information that can be used to characterize the situation of an entity. Elements of description of this context information fall into five categories: individuality, activity, location, time, and relations". Figure 3 depicts an entity as well as its different contextual categories according to the vision of Zimmermann et al.
When it comes to mobility, the context concerns any information that can inform about individuals' movements or that influences their movements. In this perspective, an individual's movements depend on several parameters such as profile (age, profession, preferences, etc.), time, location, environment (weather information, information on road traffic, etc.), or means of travel (on foot, by car, public transport, etc.). For example, to get to work, a person can take different paths under different conditions (rain, obstacle, etc.). Thus, to predict an individual's movements, considering these conditions is necessary in order to improve the prediction accuracy.
Contextual information can be obtained from a variety of sources. Benouaret (2017) distinguishes three types of contextual information: explicit, implicit and inferred. For explicit sources, context information is already included in the data or directly requested from the user. The most obvious example here is a user registering on a system and providing personal information. For implicit sources, information is obtained from the data or from the environment in which a user is situated, without explicitly asking them for this information. For example, we can get the geographic location of a user from an application installed on their smartphone. For inferred sources, information is obtained using data exploitation and exploration methods such as data mining techniques.
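To make these notions concrete, the sketch below shows one possible way of representing a context record in Python. The class names `ContextSource` and `MobilityContext`, the field types and the sample values are all assumptions made for illustration; only the five category names and the three source types come from the definitions above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Optional


class ContextSource(Enum):
    EXPLICIT = "explicit"   # provided by the user or already present in the data
    IMPLICIT = "implicit"   # observed from the device or the environment
    INFERRED = "inferred"   # derived by data mining or similar analysis


@dataclass
class MobilityContext:
    # The five categories named in the definition of Zimmermann et al.
    individuality: dict = field(default_factory=dict)   # e.g. age, profession
    activity: str = ""                                   # e.g. "commuting"
    location: tuple = (0.0, 0.0)                         # (latitude, longitude)
    time: Optional[datetime] = None
    relations: list = field(default_factory=list)        # e.g. related entities
    # Hypothetical bookkeeping: which kind of source filled each category.
    sources: dict = field(default_factory=dict)


ctx = MobilityContext(
    individuality={"profession": "teacher"},
    activity="commuting",
    location=(43.7102, 7.2620),
    time=datetime(2020, 9, 30, 8, 15),
    sources={
        "individuality": ContextSource.EXPLICIT,  # entered at registration
        "location": ContextSource.IMPLICIT,       # read from the smartphone
    },
)
```

Keeping the source type next to each category is a design choice of this sketch; it lets a prediction model weigh declared, observed and inferred information differently.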

OVERVIEW OF THE RELATED WORKS
The establishment of a prediction model or algorithm represents a key point for mobility prediction. In this section, we summarize in Table 2 some mobility prediction works based on Markov models (standard and hidden MCs), BNs (standard and dynamic) and ML techniques (artificial neural networks (ANNs), deep learning (DL), recurrent neural networks (RNNs)). We also give the main conclusions obtained in the cited works. Table 2 is established on the basis of a study described in (Zhang, Dai, 2018), which summarizes some works dealing with mobility prediction. The following criteria are considered: objective, technique used, movement type, context consideration, precision, and complexity/costs generated (calculation time, memory space…). The "technique used" and "precision" criteria are reported from (Zhang, Dai, 2018). The remaining criteria are new and are useful for identifying mobility prediction issues.
From our study of the works cited in Table 2, we retain the following:
a-All works are oriented towards the application of mobility prediction to mobile networks (except Qiao et al., 2015).
b-Markov models (standard and hidden) are widely used in the field of mobility prediction. Standard MCs are simple and easy to implement (Zhang, Dai, 2018), but their performance in terms of precision is often subject to certain constraints (values of the transition matrix (Amirrudin et al., 2013a), movement type (Jiang et al., 2016), etc.). Hidden Markov models are also efficient in terms of accuracy (Zhang, Dai, 2018): about 53% in (Lv et al., 2014), sometimes above 80% in (Qiao et al., 2015), and even above 90% in (Amirrudin et al., 2013b). However, their complexity can increase with the number of hidden states and the size of the history (Lv et al., 2014), as in ultra-dense mobile networks (UDNs), because of the complexity of the transition matrix, which considers hidden and observable states (Zhang, Dai, 2018).
c-Approaches based on standard or hidden Markov models often use clustering to determine the regions of interest. In (Gambs et al., 2012) and (Jiang et al., 2016), the authors used the DJCluster algorithm; in (Qiao et al., 2015), the authors used a clustering method analogous to DBSCAN (Zhang et al., 2009).
d-Bayesian networks gave good results in terms of prediction accuracy, which has sometimes exceeded 75% (Dash et al., 2015). However, with the dense deployment of small cells (UDNs), the cell environment would be more complex, which would make it more difficult to build a BN (Zhang, Dai, 2018).
e-Neural networks rely on well-studied algorithms and are known for their adaptive and self-organization characteristics. However, their algorithms are also known for their high computational complexity, especially with many hidden layers, which require a lot of learning time to adjust the weights of the neurons. In addition, when using ANNs in the field of MCSs (in particular UDNs), the procedures for acquiring user positions are influenced by the deployment of this type of network, knowing that position is a paramount parameter for ANNs in mobility prediction (Zhang, Dai, 2018). Certain works based on neural networks provided acceptable results, such as the models proposed in (Liou, Huang, 2005) and (Wickramasuriya et al., 2017), in which the precision exceeded 95% and reached 98%, respectively. In contrast, the results provided by other works are not very satisfactory compared to the costs generated by such a technique, like the model proposed in (Parija et al., 2013), which provides results only for regular movements.
f-Deep learning is a class of ML models based on multi-layer (deep) neural networks. Currently, DL is present in several fields, notably in the medical sector (image processing and classification), in computer-assisted surveillance (facial recognition) and in mobility prediction (Ozturka et al., 2019). Models based on DL have also shown acceptable results, such as in (Ozturka et al., 2019).
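To illustrate why standard Markov chains are described above as simple and easy to implement, here is a minimal sketch of a first-order next-location predictor in Python. It is not the model of any cited work: the function names `fit_transitions` and `predict_next`, the place labels and the toy history are hypothetical, and the regions of interest are assumed to have been extracted beforehand (for example by clustering).

```python
from collections import Counter, defaultdict


def fit_transitions(trajectories):
    """Count observed transitions between consecutive locations."""
    counts = defaultdict(Counter)
    for trajectory in trajectories:
        for current, following in zip(trajectory, trajectory[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, current):
    """Return (most likely next location, estimated probability), or None if unseen."""
    followers = counts.get(current)
    if not followers:
        return None
    location, hits = followers.most_common(1)[0]
    return location, hits / sum(followers.values())


# Toy history: three past trips of the same (hypothetical) user.
history = [
    ["home", "station", "work"],
    ["home", "station", "work"],
    ["home", "station", "gym"],
]
model = fit_transitions(history)
prediction = predict_next(model, "station")
```

The normalized counts play the role of one row of the transition matrix. The sketch also shows the dependence on regularity noted above: a location never seen in the history yields no prediction at all.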
In the next section, we propose a classification of mobility prediction based on our study of the state of the art of works dealing with mobility prediction.

MOBILITY PREDICTION CLASSIFICATION
In the absence of a reference classification of prediction models and of standard criteria on which such a classification could be based, we propose a classification based solely on existing prediction works. For that, we define the following four criteria: the use of history, the technique used, the use of deduction rules and the data type used. According to these criteria, we propose the following classifications (Figure 4):

Figure 4. Mobility prediction classification
Classification according to the use of history: we distinguish historical-based models (Boc et al., 2011; Abu-Ghazaleh, Alfa, 2009) and knowledge-based models (Samaan, Karmouch, 2005). Historical-based models use historical data to predict mobility. Knowledge-based models use other data, such as future plans, future goals, etc., to predict mobility.
Classification according to the use of deduction rules: we distinguish direct prediction models and indirect prediction models. Direct prediction models use the data of the person concerned by the prediction. Most of the works cited in this paper are direct prediction-oriented. Indirect prediction models are models whose prediction is based on deduction rules, such as profile similarity (Chamek et al., 2012).
Classification according to the data type used: we distinguish mobility data-based models (Anagnostopoulos et al., 2012) and contextual data-based models (Göndör et al., 2013). Mobility data-based models use only mobility data to predict mobility. Contextual data-based models are models whose prediction takes into account other data, qualified as contextual data (such as environment information and means of travel), in addition to mobility data. In the next section, we focus on the challenges of mobility prediction by identifying six issues obtained from our analysis of works dealing with mobility prediction.

MOBILITY PREDICTION CHALLENGES IN SMART CITIES
In this section, we present open questions related to mobility prediction which we consider important according to our study. These challenges concern either issues that have already been treated but not resolved correctly or issues that have not been treated yet. They are organized according to trajectory size (the whole trajectory or a segment only), movement type, context consideration, application domains, evaluation of models and privacy of users' data.

Challenge 1 -Predicting a trajectory or a segment of a trajectory
Prediction of a single transition, a sequence of a trajectory or even an entire trajectory of a mobile individual depends mainly on the use of historical data on their mobility. Indeed, some works (Anisetti et al., 2011) have opted for the use of these data, while others (Lytrivis et al., 2011) do not rely on them.
Historical-based works are precise but suffer from high overhead costs linked to constant monitoring, which requires a more detailed analysis of the history using data mining and knowledge discovery techniques.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIV-4/W2-2020, 2020 5th International Conference on Smart Data and Smart Cities, 30 September - 2 October 2020, Nice, France
Non-historical-based works can predict only the final destination and not the trajectory towards this destination (Nadembega et al., 2014). These works do not require historical mobility data but require other essential data, such as contextual data, and can also generate very high costs linked to the immense amount of data which must be collected and processed (e.g., preferences, final objective, and planning).
Also, other works (Anagnostopoulos et al., 2012) have proposed models that consider both historical mobility data and current network conditions (a kind of hybridization between the two types mentioned above). These models suffer from some limitations. They offer either short-term predictions (one transition or two at most), or a long-term prediction with very significant additional costs linked to historical data and processing.
Thus, the challenge here is to be able to propose a long-term prediction model at the best cost (which ensures the best ratio: long-term prediction/costs incurred).
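The trade-off between prediction horizon and cost can be illustrated with a plain Markov chain, which is only one of the techniques discussed in this paper. In the sketch below, the function names and the transition probabilities between three places (home, work, shop) are hypothetical; predicting k transitions ahead amounts to applying the transition matrix k times to the current location.

```python
def step(distribution, matrix):
    """One Markov step: new[j] = sum over i of distribution[i] * matrix[i][j]."""
    size = len(matrix)
    return [sum(distribution[i] * matrix[i][j] for i in range(size)) for j in range(size)]


def predict_k_steps(start_index, matrix, k):
    """Distribution over locations after k transitions from a known start."""
    distribution = [0.0] * len(matrix)
    distribution[start_index] = 1.0
    for _ in range(k):
        distribution = step(distribution, matrix)
    return distribution


# Hypothetical transition probabilities; rows and columns: home, work, shop.
P = [
    [0.1, 0.8, 0.1],
    [0.6, 0.2, 0.2],
    [0.7, 0.2, 0.1],
]
one_step = predict_k_steps(0, P, 1)
three_steps = predict_k_steps(0, P, 3)
```

With these assumed values, the most likely location carries a probability of 0.8 after one transition but only about 0.54 after three, while each additional step costs on the order of n^2 multiplications for n locations. This is a toy illustration of why long-term prediction tends to be both less certain and more expensive.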

Challenge 2 -Movement type
Exploring the literature on mobility prediction allowed us to conclude that most works consider regular movements of individuals. In (Samaan, Karmouch, 2005), the authors state that most of the existing approaches assume that the user travels according to a previously known pattern with regularity. Works based on the assumption of regularity of movement offer acceptable results in terms of prediction accuracy (between 50% and 70% precision) (Jiang et al., 2016; Lv et al., 2014…), and sometimes very satisfactory ones (up to 90% accuracy) (Amirrudin et al., 2013b; Liou, Huang, 2005). These works were also tested with random movement. According to the authors of these models, prediction accuracy decreases when the regularity of the movement decreases. In other words, these models provide poor results when the movement is random. Although the assumption of regularity of movement is valid in many cases due to people's daily routine, the possibility that an individual's movements may be random should not be excluded. For example, a tourist who visits a country for the first time can take several different paths, especially during the first days after their arrival. Certain works, such as (Liu et al., 1998), consider random movement and propose models addressing this type of movement. To the best of our knowledge, the number of these works remains negligible compared to the models dealing with regular movement. Also, by analyzing the results of the last cited work, we note that the prediction precision is not satisfactory, although the authors claim to have obtained good results. Another category of works, like (Liu, Maguire, 1996), considers both regular and random movement at the same time. The results are acceptable only for regular movements.
Thus, a major challenge is to propose a model able to predict the mobility of a user who habitually moves in a random manner (tourist, transporter, etc.).

Challenge 3 -Context-based prediction
Most of the proposed works in the field of mobility prediction, such as (Ozturka et al., 2019; Wang et al., 2019), do not take into account contextual information or, in the best of cases, consider only some standard information, such as day of the week (work day or weekend day), date, time, speed and profile (preferences, goals,…) (Dash et al., 2015; Chamek et al., 2012; Du et al., 2011). For example, the profile is considered in (Chamek et al., 2012), only time is considered in (Dash et al., 2015), and time (hour of the day) and day of the week are considered in (Du et al., 2011). The second category of works gave better results in terms of precision.
Besides this standard information, there is other information related to more important parameters which directly influence individuals' movements but which, to our knowledge, have not been taken into account so far, except for the work described in (Göndör et al., 2013), which considers meteorological information. The main parameters having a direct impact on the movements of individuals relate to environmental (meteorological) information, means of travel and road traffic. If we take the example of an individual who goes to work, in rainy weather this individual will probably take a path different from their habitual path (a shortcut). Also, if they walk, they can take a different path than when they travel by car or by bicycle.
From the previous example taken from daily life, it is clear that contextual information plays an important role in predicting mobility and that certain types of information are more important than others. To reinforce this viewpoint, certain recent works dealing with the problem of mobility prediction, such as (Wang et al., 2019), plan in their perspectives to take into account contextual information, in particular meteorological information.
Thus, the challenge here consists in proposing a solution which considers contextual information, in particular meteorological information (weather), information related to the means of travel and those related to road traffic. However, collection of this type of data is in itself another challenge. In addition, it is very difficult to consider all of these types of contextual information at the same time.
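One simple way to picture context-based prediction is to condition transition counts on a context value. The sketch below does this for weather only; the function names, the place labels and the observations are hypothetical, and it is not the method of (Göndör et al., 2013) or of any other cited work.

```python
from collections import Counter, defaultdict


def fit_contextual(observations):
    """observations: iterable of (context, current_location, next_location)."""
    counts = defaultdict(Counter)
    for context, current, following in observations:
        counts[(context, current)][following] += 1
    return counts


def predict_with_context(counts, context, current):
    """Most frequent next location seen under this context, or None if unseen."""
    followers = counts.get((context, current))
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy observations: the same user leaves home under different weather.
observations = [
    ("dry", "home", "park_path"),
    ("dry", "home", "park_path"),
    ("rain", "home", "covered_shortcut"),
    ("rain", "home", "covered_shortcut"),
    ("rain", "home", "park_path"),
]
model = fit_contextual(observations)
rainy_choice = predict_with_context(model, "rain", "home")
dry_choice = predict_with_context(model, "dry", "home")
```

Each additional context dimension (means of travel, road traffic, etc.) multiplies the number of (context, location) keys, so the same history is spread over more counters. This gives a concrete view of why considering all types of contextual information at the same time is difficult.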

Challenge 4 -Application domains
Several applications of mobility prediction have been cited in the literature. According to (Wang et al., 2019), prediction can be applied to personalized recommendations, suspicious target tracking and intelligent transportation. Prediction also plays a big role in the field of mobile networks (Samaan, Karmouch, 2005;Liu et al., 1998…). According to (Samaan, Karmouch, 2005), the importance of mobility prediction techniques can be seen both at the network level (handoff management, resource allocation…) and service level (pushed online advertising, mapping/route guidance…). Also, in (Gambs et al., 2012), the authors proposed an extended prediction model having several potential applications such as the evaluation of geo-privacy mechanisms, the development of location-based services anticipating the next movement of a user and the design of location-aware proactive resource migration.
Although the application fields of mobility prediction are numerous, to the best of our knowledge, almost all works have been directed towards the application of mobility prediction in the field of MCSs, such as mobile networks (Ozturka et al., 2019; Nadembega et al., 2014). There are very few exceptions, like the work carried out in (Almeida, Azkune, 2018), in which mobility prediction (for elementary actions in a limited space) has been applied to detect behavioral abnormalities in elderly people. Therefore, the challenge here consists in proposing solutions (models, algorithms...) for mobility prediction which are applicable to other fields, such as personalized recommendation systems.
On the other hand, although the proposed prediction models are oriented towards mobile networks, some of these models do not support certain types of networks such as UDNs, which currently represent a new trend, as in 5G networks. UDNs are characterized by their high number of cells (base stations) (Zhang, Dai, 2018) and their management complexity (Samarakoon et al., 2016). This considerably increases the complexity of certain prediction models and algorithms (Zhang, Dai, 2018). Consequently, these models are not recommended for this type of network because of the costs involved. Indeed, in the case of Markov chain-based models, the complexity of the transition matrix (the main parameter in Markov models (Amirrudin et al., 2013a)) increases with the number of cells, in particular for hidden Markov models. Also, in the case of neural network-based models (characterized by their computational complexity (Zhang, Dai, 2018)), complexity increases with a large number of cells (Zhang, Dai, 2018). Thus, proposing a solution that supports this type of network with lower complexity (cost) constitutes another challenge.
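The growth of the transition matrix with the number of cells can be made explicit with a short calculation. The sketch below assumes that each cell is a state of the chain; the function names and the cell counts are hypothetical and do not describe a particular network.

```python
def transition_entries(num_cells: int, order: int = 1) -> int:
    """Entries in the transition table of an order-`order` Markov chain over cells:
    one row per history of `order` cells, one column per possible next cell."""
    return num_cells ** order * num_cells


def hmm_parameters(hidden_states: int, observable_symbols: int) -> int:
    """Transition matrix + emission matrix + initial distribution of a discrete HMM."""
    return (hidden_states * hidden_states
            + hidden_states * observable_symbols
            + hidden_states)


# Hypothetical deployments, from a sparse network to a dense one.
for cells in (10, 100, 1000):
    print(cells, transition_entries(cells), transition_entries(cells, order=2))
```

Going from 10 to 1000 cells multiplies the size of a first-order table by 10,000, and a second-order table by a million, which is consistent with the remark above that such models become costly in ultra-dense deployments.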

Challenge 5 -Evaluation of models
Evaluation of mobility prediction solutions can be done using simulations (Ozturka et al., 2019; Wang et al., 2019), often based on datasets containing data about individuals' movements. However, on the one hand, these datasets may already exist and the data they contain may not be adequate for the proposed solution, as is the case for context-based solutions. On the other hand, some datasets set up for evaluations (Samaan, Karmouch, 2005; Göndör et al., 2013) are not very substantial in terms of the amount of data they contain (Göndör et al., 2013). In addition, datasets must often be divided into two parts: a first part for learning and a second part for validation, which is based on the comparison between the second part of the dataset and the prediction results. For this, the datasets must be large enough to be divided; otherwise, the evaluation may not be fully correct (Göndör et al., 2013). A major challenge consists in using or creating a dataset that is as complete as possible with respect to the data necessary for the proper functioning of the proposed solution.
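The two-part evaluation described above can be sketched as follows in Python. The function names, the toy records and the baseline predictor (most frequent next location seen in the learning part) are assumptions chosen for illustration, not the protocol of any cited work.

```python
from collections import Counter, defaultdict


def chronological_split(records, validation_size):
    """Keep the oldest records for learning and the most recent ones for validation."""
    cut = len(records) - validation_size
    return records[:cut], records[cut:]


def accuracy(predict, validation_pairs):
    """Share of (current, actual_next) pairs for which predict(current) == actual_next."""
    if not validation_pairs:
        return 0.0
    hits = sum(1 for current, actual in validation_pairs if predict(current) == actual)
    return hits / len(validation_pairs)


# Hypothetical time-ordered (current, next) transitions of one user.
records = [
    ("home", "work"), ("work", "home"), ("home", "work"), ("work", "home"),
    ("home", "work"), ("work", "home"), ("home", "work"),
    ("work", "home"), ("home", "gym"), ("work", "home"),
]
train, validation = chronological_split(records, validation_size=3)

# Baseline predictor learned on the first part only.
table = defaultdict(Counter)
for current, following in train:
    table[current][following] += 1


def most_frequent_next(current):
    return table[current].most_common(1)[0][0] if table[current] else None


score = accuracy(most_frequent_next, validation)
```

With only three validation pairs, a single unusual trip (here the visit to the gym) moves the score by a third, which illustrates why a dataset that is too small to be divided properly gives an unreliable evaluation.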
Another way of evaluating consists of a real-world evaluation (Chon et al., 2013), based on the participation of people who are available for real-time interaction (feedback), which makes it possible to compare the results of mobility prediction with the individuals' actual future movements. This approach allows for a more reliable evaluation (Göndör et al., 2013). To carry out this kind of evaluation, the challenge consists in ensuring the participation of a large number of people. One way to achieve this objective can be the development of a mobile application (Göndör et al., 2013) which is used by a group of people and which allows the creation of a dataset by recording the data necessary for prediction. These same people must also participate in the evaluation of the prediction results for their movements.

Challenge 6 -Confidentiality and privacy of users' data
Mobility prediction solutions are mainly based on the historical data of individuals' movements, on their profile data, on their social relationships or even on their schedules. These data are directly linked to the privacy of individuals and require a certain degree of confidentiality. Indeed, the disclosure of personal data related to an individual's movements or schedule can harm that individual. In other words, anyone who holds such information about an individual can harm that individual's daily life, with consequences ranging from simple tracking to physical harm (theft, assault, etc.). Thus, confidentiality in the field of mobility prediction mainly concerns the historical data of individuals as well as the information that can be deduced from these data.
In (Pan et al., 2013), the authors emphasized the confidentiality of the trace data of individuals moving in a smart city and concluded that personal identity could be disclosed during the collection, publication and use of trace data. With regard to collection, localization techniques may record the user or device ID, which creates risks. Localization by GPS is more secure than by GSM, WiFi, Bluetooth or RFID because central servers do not need to know device IDs. With regard to publication, personal identity could be inferred from published locations even after it has been removed from, or anonymized in, the trace data records. In terms of usage, traces may expose private information to personalized services and applications. In such cases, an anonymizing proxy must be trusted to store, manage, and protect user locations, and to mediate communication between applications and users. The challenge is to preserve the fidelity of the data for applications while protecting privacy (Pan et al., 2013).
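The trade-off between data fidelity and privacy can be illustrated with a minimal sketch of what an anonymizing proxy might do before handing records to an application. This is an assumption-laden illustration, not the scheme of (Pan et al., 2013): device IDs are replaced by keyed pseudonyms, and coordinates are coarsened to a grid cell whose size (cell_deg) is a hypothetical tuning parameter. As noted above, identity may still be inferred from published locations, so this sketch does not by itself guarantee anonymity.

```python
import hashlib
import hmac
import math


def pseudonymize(device_id: str, secret_key: bytes) -> str:
    """Replace a device ID with a keyed pseudonym (HMAC-SHA256, truncated)."""
    digest = hmac.new(secret_key, device_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def coarsen(lat: float, lon: float, cell_deg: float = 0.01):
    """Map coordinates to integer grid indices; larger cells lose more fidelity."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def sanitize_record(record: dict, secret_key: bytes, cell_deg: float = 0.01) -> dict:
    """Return a copy of a trace record without the raw ID or exact position."""
    return {
        "pseudonym": pseudonymize(record["device_id"], secret_key),
        "cell": coarsen(record["lat"], record["lon"], cell_deg),
        "timestamp": record["timestamp"],
    }


if __name__ == "__main__":
    # Hypothetical key and record, invented for illustration only.
    key = b"proxy-secret-key"
    raw = {"device_id": "device-42", "lat": 48.8566, "lon": 2.3522,
           "timestamp": "2020-01-01T08:00:00"}
    print(sanitize_record(raw, key))
```

A keyed pseudonym keeps one individual's records linkable, which a prediction model needs, while keeping the raw ID inside the proxy; the grid size controls how much spatial precision the application receives.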

CONCLUSION
In this paper, we have presented some key points on mobility prediction and the related challenges, in order to provide readers with a guideline for new contributions. The first part of the paper was devoted to the key concepts of mobility prediction, through an overview of mobility concepts and classifications, the data required for mobility prediction, and some works dealing with mobility prediction. The second part focused on a set of challenges that are open issues in mobility prediction. These challenges were extracted from our in-depth literature study of mobility prediction works. The second part can be very useful for a large community of researchers and developers because, to the best of our knowledge, no existing work brings all these challenges together.
From this work, we have noted that, like mobility data, contextual data are very important for predicting mobility. Also, some issues such as predicting random mobility, context-based prediction, long-term prediction and the definition of appropriate datasets for model evaluation remain open to new contributions. In the works studied, mobility prediction is applied only to the MN domain. Consequently, it would be interesting to apply it to other domains (e.g., recommendation systems). Also, none of the prediction works studied has addressed the confidentiality of individual mobility data, so it would be interesting to explore this crucial point. In the future, we plan to propose a solution (model, method or algorithm) that tackles the main challenges mentioned.