VISUALIZATION AND ANALYSIS OF CELLULAR & TWITTER DATA USING QGIS

This study aims to understand individual presence and movement in the Friuli Venezia Giulia region. Such understanding is important for tourism planning, hazard management, business marketing, and implementing government policies and benefits. The aim is pursued using data made available by advanced Web 2.0 applications. Real-time, geolocated data are needed to monitor the inflow of tourists and to devise effective plans for promoting tourism, evacuation and mitigation strategies that protect social life and the environment with less infrastructure damage during hazards, and marketing plans for advertising or selling products. Despite widespread success in predicting specific aspects of human behaviour from social media information, little attention has been given to Twitter and cell phone data. Access to detailed human movements with fine spatial and temporal granularity is challenging for confidentiality and safety reasons. With the rapid development of Web 2.0 applications, people can post about events and share opinions and emotions online. Twitter data can be used to recognize short-term travellers, such as tourists, and to analyse their travel patterns. Tourist dynamics, such as arrivals and departures, number of trips, number of days and nights spent, number of unique visitors, country of residence, main destination, secondary destinations, transit pass-throughs and repeat visits, are studied using CDR (call detail records) and DDR (data detail records).


INTRODUCTION
Understanding human movement within a geographic area, from national to international scale, is crucial in various domains and applications. It plays an important role in urban planning, transportation, emergency relief, marketing strategies, etc. This is achievable thanks to improvements in technologies for fetching real-time and reliable data for research. Access to detailed human movements with fine spatial and temporal granularity is challenging for confidentiality and safety reasons. With the rapid development of Web 2.0 applications, people are able to post about events and share opinions and emotions online. Social media covers a wide variety of topics, from something as simple as products, events and services to more complex issues related to finance, culture, politics, religion, food, epidemics, famine, etc. Twitter is one of the most popular social networking websites. Twitter's speed and ease of publication have made it an important communication medium for people from all walks of life. Twitter data has also been used to recognize short-term travellers and to analyse their travel (Shamanth Kumar, 2013). Another way to study human movement is through the cell phone network. Mobile phone data represents the movement of the network user. In the context of mobile networks, roaming is described as a mobile phone being used outside the range of its home network and connecting to another cell network (Rein Ahas, 2014). For example, when subscribers travel beyond their company's transmitter range, their mobile phone automatically hops onto an alternative phone service, with the help of the subscriber identity in the visited network. Tourist dynamics, such as arrivals and departures, number of trips, number of days and nights spent, number of unique visitors, country of residence, main destination, secondary destinations, transit pass-throughs and repeat visits, are studied using CDR (call detail records) and DDR (data detail records). Some of the applications achieved by mobile positioning data are (Rein Ahas, 2014

Problem Statement:
To understand individual presence and movement in the Friuli Venezia Giulia region.

Research Questions:
How many individuals are present in a certain area at a specific time? Where do they come from? Where do they go when they move from one place to another? How do the presence and movement of individuals change over time? How does the trend change for different nationalities?

Aim and Objectives:
The aim of this study is to analyse patterns, trends and associations of human behaviour and interactions. The data was provided by a cellular network operator to academia for study purposes and is kept confidential.

Data Flattening:
Data flattening was required in order to perform the calculations and analysis needed to derive the output. All non-spatial data (Excel files) were transformed into GIS format files using open source software and projected to "WGS 84 / UTM zone 32N".
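As a rough illustration of this step, the sketch below turns a small non-spatial table into point features in GeoJSON form, which a GIS package such as QGIS can load and then reproject. The column names (`municipality`, `lon`, `lat`, `presence`), the sample rows and the function name are assumptions made for illustration, not the schema or tooling used in the study, and the reprojection to UTM zone 32N is deliberately left to the GIS software.

```python
import csv
import io
import json

# Hypothetical sample of a non-spatial table exported from a spreadsheet as CSV.
# Coordinates are approximate and for illustration only.
SAMPLE_CSV = """municipality,lon,lat,presence
Udine,13.2350,46.0640,1200
Grado,13.3870,45.6780,450
"""


def table_to_geojson(csv_text, lon_field="lon", lat_field="lat"):
    """Convert CSV rows with longitude/latitude columns into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lon = float(row.pop(lon_field))
        lat = float(row.pop(lat_field))
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": row,  # remaining columns become attributes (as text)
        })
    return {"type": "FeatureCollection", "features": features}


collection = table_to_geojson(SAMPLE_CSV)
print(json.dumps(collection, indent=2))
```

Keeping the geometry in longitude/latitude at this stage means the projection can be applied once, consistently, when the layer is loaded.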

Algorithms Used:
An algorithm is a self-contained sequence of actions to perform. It is an effective method, expressed within a finite amount of space and time and in a well-defined formal language, for calculating a function (Algorithm, n.d.). To reduce manual effort and errors, simplified algorithms were created according to the data type and the desired output. The resulting models were used to generate maps from large-scale data. They were useful in generating thematic maps of individual presence in each municipality and flow lines of individual movement from one municipality to another.

Generation of Thematic maps using algorithm:

Kernel Density-Tweet feed:
Kernel density is used to determine the density of people in a municipality. It calculates the density of features in a neighbourhood around those features. Conceptually, a smoothly curved surface is fitted over each point. The surface value is highest at the location of the point and diminishes with increasing distance from the point, reaching zero at the search radius distance from the point. Only a circular neighbourhood is possible. The volume under the surface equals the Population field value for the point, or one if NONE is specified (ESRI, n.d.). The density at each output raster cell is calculated by adding the values of all the kernel surfaces where they overlay the raster cell centre. The tool is used to visualize the results on a map and to relate them to the other outputs.
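The description above can be sketched in a few lines of plain Python. The kernel shape is an assumption: a quartic kernel is used here because it peaks at the point, falls to zero at the search radius and can be scaled so that its volume equals the point weight, but the text does not name the exact function used. The sample coordinates, the radius and the function names are hypothetical.

```python
import math


def quartic_kernel(distance, radius):
    """Quartic kernel scaled so that its volume over the search circle is 1."""
    if distance >= radius:
        return 0.0  # the surface reaches zero at the search radius
    u = distance / radius
    return 3.0 / (math.pi * radius ** 2) * (1.0 - u * u) ** 2


def kernel_density(points, cell_centres, radius):
    """Add up the kernel surfaces of all points at each raster cell centre.

    points: list of (x, y, weight) in projected units such as metres.
    cell_centres: list of (x, y) raster cell centres.
    """
    densities = []
    for cx, cy in cell_centres:
        total = 0.0
        for px, py, weight in points:
            d = math.hypot(cx - px, cy - py)
            total += weight * quartic_kernel(d, radius)
        densities.append(total)
    return densities


# Hypothetical tweet locations (x, y, weight) and three cell centres.
tweets = [(0.0, 0.0, 1.0), (100.0, 0.0, 1.0)]
centres = [(0.0, 0.0), (50.0, 0.0), (1000.0, 0.0)]
density = kernel_density(tweets, centres, radius=500.0)
print(density)
```

The cell midway between the two points receives the highest value, and the cell beyond the search radius of both points receives zero, which is the behaviour the paragraph describes.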

GRADO, ISLAND OF SUN:
Grado is a town in the north-eastern part of Italy. It is a municipality of Friuli Venezia Giulia located on an island and adjacent peninsula of the Adriatic Sea between Venice and Trieste (GRADO, Beach, n.d.). It was mainly a fishing centre but is today known as a popular tourist destination. It is commonly known as L'Isola del Sole, "The Sunny Island". It is famous as a spa town because of its thermae and spa services. It has an area of 114 km². Grado is a land of value, food products, culture and events, and a city of art and history. It has tourists throughout the year. To better understand the presence of Italians and foreigners, particular days and a time were selected based on the weather report: a very hot day (9 August 2016), a rainy day (10 August 2016) and a sunny day (11 August 2016), each at 12 pm.


To know the places people are likely to visit on a very hot, a rainy and a sunny day: thematic maps.

To know people's origin and destination points: origin destination flows.

To know the density at the beach compared with the surroundings: tweet density.
Maps are prepared separately for Italians and foreigners. Tweet density in other places is higher than in Grado because of their attractive scenery, museums, churches, sundials, hiking, shopping malls and famous landmarks.

FRIULI DOC:
Friuli Doc is an annual food and wine event held in Udine since 1995. It is organized by the city and held in the historical city centre for four days, on 8-11 September (FRIULI DOC, n.d.). The event consists of food, wine and craft stands. It is accompanied by various initiatives such as cooking classes, tastings, demonstrations and workshops, presentations and conferences focused on typical products, exhibitions, street performances, educational workshops for children, and music. To understand the presence of people in Udine, non-event and event days were selected, i.e., 3, 4, 10 and 11 September 2016, between 6 pm and 12 am, as this is the peak time of the event.


To know the number of individuals attending the event on 10 and 11 September 2016 at 6 pm and 12 am respectively. To know the number of individuals present in Friuli Venezia Giulia and Udine on 3 and 4 September 2016 at 6 pm and 12 am respectively.


On the event day, in the early morning of 11 September at 12 midnight, presence in Udine is 128% higher than the week before. This suggests that more people attend on the last day of the event because it falls on the weekend.


The above-mentioned places are famous for their landmarks, war remains, sundials, hiking, historical places, etc. On the event day, on the evening of the 10th, the presence difference of foreigners in Udine is 36%, which is less than the presence difference of Italians.


Part (b) of the map shows the presence of foreigners on 4 and 11 September in the early morning, at 12 am. On the event day, 11 September at 12 am, a presence difference of 128% is observed in Udine, which is greater than the presence difference on the day before.


It can be assumed that people who were present in Budoia, Ampezzo, etc. the week before, on 3 and 4 September, moved to the surroundings of Udine to attend the event.


The above points explain how people interact with their surroundings under two conditions (event and non-event days).

CONCLUSION
The case studies chosen for this study occur in one way or another in our day-to-day activities. These conditions and scenarios help in analysing individual presence and its association with surrounding activities. Geolocated cellular and Twitter data are used, with the help of QGIS, to determine people's presence, their movements and the number of flows between places. The conditions considered in this study were helpful in answering the research questions stated above.
Figure 1-4-1 Map of Italy

Fig. 2-3-1 shows the process flow used to generate thematic maps. The data used in the algorithm contains the presence of people in the Friuli region (Italian residents and foreigners) for six observation periods, from March to September 2016, at intervals of 6 hours, i.e., at 00:00, 06:00, 12:00, 18:00 and 24:00. The model takes a polygon file (the municipalities of the study area) and a point file (the presence of individuals) as inputs. It filters the data based on user-specified expressions and merges the queried results with the municipality shapefile for visualization on the map.
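The filter-then-merge logic of this model can be approximated outside QGIS as follows. This is only a sketch under stated assumptions: the record fields (`municipality`, `date`, `hour`, `group`, `count`), the sample values and both function names are hypothetical, and the municipality polygons are reduced to their attribute rows.

```python
from collections import defaultdict

# Hypothetical presence points: one record per municipality, date, hour and group.
presence_points = [
    {"municipality": "Grado", "date": "2016-08-09", "hour": 12, "group": "Italians", "count": 300},
    {"municipality": "Grado", "date": "2016-08-09", "hour": 12, "group": "Foreigners", "count": 120},
    {"municipality": "Udine", "date": "2016-08-09", "hour": 12, "group": "Italians", "count": 900},
    {"municipality": "Udine", "date": "2016-08-09", "hour": 18, "group": "Italians", "count": 950},
]

# Hypothetical municipality polygons, reduced to their attribute rows.
municipalities = [{"name": "Grado"}, {"name": "Udine"}, {"name": "Trieste"}]


def filter_and_sum(points, expression):
    """Keep the points matching a user-specified expression and total them per municipality."""
    totals = defaultdict(int)
    for point in points:
        if expression(point):
            totals[point["municipality"]] += point["count"]
    return dict(totals)


def join_presence(polygons, totals):
    """Attach the totals to each municipality row (0 where nothing matched)."""
    return [dict(polygon, presence=totals.get(polygon["name"], 0)) for polygon in polygons]


italians_noon = filter_and_sum(
    presence_points,
    lambda p: p["date"] == "2016-08-09" and p["hour"] == 12 and p["group"] == "Italians",
)
thematic_layer = join_presence(municipalities, italians_noon)
print(thematic_layer)
```

The `presence` attribute produced by the join is the value a thematic map would then classify and colour.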

Figure 2-3-1 Model in QGIS to generate thematic maps

2.3.2 Generation of Origin Destination flow lines using algorithm:
The model shown in Fig. 2-3-2 generates origin destination flow lines, such as the movement of people for a certain period in a particular municipality. A flow line connects each point to its respective origin and destination. The model takes the origin destination points of each municipality as input. Its output can answer one of the research questions: where individuals come from and where they go.

Figure 2-3-2 Model for OD flows
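A minimal sketch of the flow-line idea is given below: each origin destination pair is turned into a two-point line, written here as WKT text that a GIS package can load. The centroid coordinates, flow counts and the function name are assumptions for illustration and do not reproduce the QGIS model itself.

```python
# Hypothetical municipality centroids in projected coordinates (metres).
centroids = {
    "Udine": (362000.0, 5102000.0),
    "Grado": (383000.0, 5060000.0),
    "Pordenone": (318000.0, 5092000.0),
}

# Hypothetical origin destination rows: (origin, destination, flow count).
od_rows = [
    ("Udine", "Grado", 40),
    ("Pordenone", "Grado", 15),
    ("Grado", "Grado", 12),   # same origin and destination: no line is drawn
    ("Unknown", "Grado", 5),  # no centroid available: skipped
]


def build_flow_lines(rows, points):
    """Create one WKT line per origin destination pair that has two distinct known centroids."""
    lines = []
    for origin, destination, count in rows:
        if origin == destination or origin not in points or destination not in points:
            continue
        (x1, y1), (x2, y2) = points[origin], points[destination]
        lines.append({
            "origin": origin,
            "destination": destination,
            "count": count,
            "wkt": f"LINESTRING ({x1} {y1}, {x2} {y2})",
        })
    return lines


flows = build_flow_lines(od_rows, centroids)
for flow in flows:
    print(flow["origin"], "->", flow["destination"], flow["count"], flow["wkt"])
```

Carrying the flow count as an attribute of each line allows the lines to be styled by class, as in the flow maps discussed later.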

Figure 3.1.1-1 (a) Presence of Italians on August 9-10 at 12 pm (b) Presence of Italians on August 11-10 at 12 pm

Fig. 3.1.1-1a and b show the presence of Italians on 9, 10 and 11 August at 12 pm. The percentages shown in the maps are the percentage differences in the number of people present on the corresponding dates. The percentage difference in presence is calculated between 9 and 10 August to show by how much the presence changes. The afternoon hour is considered because people more commonly visit the beach at that hour than at other hours. The red zone represents the presence of people on the very sunny day (9 August), while the green zone represents the presence of people on the rainy day (10 August). On 9 August 2016 at 12 pm, a very sunny day, the percentage difference of people ranges from 12% to 51% (red polygons),


Figure 3.1.2-3 (a) & (b) OD flows for Italians on August Sunday and Monday

Fig. 3.1.2-3a and b illustrate the following points. The flow count represents the number of to-and-fro movements. The flow of Italians to Grado decreases from 776 on Sunday to 543 on Monday. Flows of 1 to 30 come from more places in the west of the Friuli region, and the flow count is low because the distance from origin to destination is greater.


Figure 3.1.4-5a, b, c Tweet density on August 9, 10, 11

Red represents the maximum tweet density and mostly related tweets in that place. Green: very few tweets and unrelated tweets. Yellow: tweets which are likely related to our scenario (Grado beach).

Tweets are filtered by date (9 August 2016) and time, and the density of tweets is calculated. On the very sunny day, tweets in Grado range from 540 to 5400, which shows that some tweeting is happening and that it can serve as a proxy for presence. Tweet density ranging from 47000 to 80000 is observed in Lignano and Trieste, and from 26000 to 46000 in Udine, Pagnacco and Comeglians. On the rainy day, 10 August, a density of 137560 is observed in Buttrio. In addition, densities ranging from 44000 to 83000 are observed in Udine (73932), Lignano (59296) and Trieste (66260).

Figure 3.2.1-1 (a) Presence of Italians on Sep 3-10 at 6 pm (b) Presence of Italians on Sep 4-11 at 12 am

Fig. 3.2.1-1a and b show the presence of Italians by municipality. 10 and 11 September are event dates. The evening of the 10th (6 pm) and the early morning of the 11th (12 am) are considered in order to understand the presence on the last day of the event. The week before the event, i.e., 3 and 4 September (6 pm and 12 am), is considered in order to compare presence on a normal day and on an event day. During the Friuli Doc event in Udine, the red zone represents the presence of people in a municipality on the event day (10 September 2016). The green zone represents the presence of people on the non-event day (3 September 2016), one week before. On the event day, 10 September at 6 pm, a presence difference of 99% is noticed in Udine, which is more than the presence on the 3rd

Figure 3.2.1-2 (a) Presence of Foreigners on Sep 3-10 at 6 pm (b) Presence of Foreigners on Sep 4-11 at 12 am

Fig. 3.2.1-2a and b show the percentage difference of foreigners in the Friuli region between the Friuli Doc event days in Udine, 10 and 11 September, and the week before, i.e., 3 and 4 September 2016.
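The text does not state the formula behind the presence difference. One plausible reading, assumed here, is the relative change of the event-day count with respect to the count one week earlier. The function name and the sample counts below are hypothetical.

```python
def presence_difference_pct(baseline, event):
    """Relative change of the event-day presence against the baseline day, in percent."""
    if baseline == 0:
        raise ValueError("baseline presence must be non-zero")
    return (event - baseline) * 100.0 / baseline


# Hypothetical counts for one municipality: one week before and on the event day.
week_before = 1000
event_day = 1990
print(presence_difference_pct(week_before, event_day))
```

A positive value means more people were present on the event day than a week before, and a negative value means fewer.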

Figure 3.2.2-3 (a) & (b) OD flows of Italians on September Sunday and Thursday

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-4/W8, 2018. FOSS4G 2018 - Academic Track, 29-31 August 2018, Dar es Salaam, Tanzania

Fig. 3.2.2-3a and b show the flow of Italians to Udine from each municipality. The database is filtered by day and month because the date and time are absent. The flow lines include all the flows that happen on that particular day. The flow count is higher on the Thursday in September 2016. The flow decreases on Sunday, perhaps because people are staying in nearby hotels or with relatives. Flow counts above 500 come from places near Udine. Flow counts from the farthest places to Udine are between 100 and 500. The Italians present at the event come from all over the Friuli region, and the flow to Udine is in the thousands. The flow of Italians in the Friuli region on Thursday and Sunday is 4321 and 2217 respectively.

Figure 3.2.3-4 (a) & (b) OD flows on September Sunday and Thursday

Fig. 3.2.3-4a and b show the destination, Udine, where the event is held, and the origins, the municipalities from which foreigners are travelling.

Fig. 3.3.1-1a & b Presence of both Italians and Foreigners on May 21 and 28 at 6 pm

Fig. 3.3.1-1a and b show the presence difference of individuals on 21 and 28 May. Udine and Pordenone are the venues of the Cantine Aperte event on 28 May at 6 pm.

Figure 3.3.2-2 (a) & (b) OD flows of Italians on May Friday and Saturday

Fig. 3.3.2-2a and b show the flow count of Italians to Pordenone. The event days, 27 and 28 May, are the Friday and Saturday of that month. The OD matrices are filtered by destination (Pordenone). The flows of people to the event come from the eastern part of the Friuli region. Flow counts ranging from 1400 to 3000 are observed from places near the event. Flows of 1 to 100 are widespread and come from almost all the municipalities. The count is low because the origins are far from the destination, and possibly because of limited transport services, etc.


The tweet density class from 12000 to 15000 observed in Udine and Pordenone is very high compared with the first day of the event.

Figure 3.3.4-4 (a) & (b) Tweet density on May 27 and 28

The above points are helpful in determining people's presence with the help of social media activity.

Background: Use of Mobile Positioning Data:
Mobile positioning data is used extensively in European countries. The literature attempts to explain the art of using cellular data in the study of various domains. It describes a wide range of applications of mobile-based data. The research applications that can be built on mobile positioning data include monitoring real-time events, business applications, transportation, and emergency and relief solutions. Many important initiatives have been made in the study of traffic and travel behaviour. All over the world, TomTom HD Traffic uses mobile positioning data as the primary input for its intelligent traffic guidance system (Rein Ahas, 2014). It mentions the importance of transparency in using such a data source. It also mentions emergency solutions based on cell phone data. There are a number of case studies which show increasing interest in the commercial value of the data in various domains [4]. The initiatives are playing an important role in urban studies, transportation, academia and the advertising market. It explains the methodology for collecting and using the data. In most cases, passive positioning data is used to describe country-level statistics. In 2005-2006, "Mobile Landscape: Graz, Austria in Real Time" was developed to track human

Italy has an area of 301,338 km² and various temperate seasonal and Mediterranean environments. It is the fourth most populous European Union member state, with 61 million inhabitants and 20 regions. Rome is the capital city. Friuli Venezia Giulia is located in the north-east of Italy. Its geographical coordinates are 46°22' N and 13°10' E. It borders the Adriatic Sea, Slovenia and Austria. It is one of the five autonomous regions with special statute. It has an area of 7858 km² with 1.2 million inhabitants (Friuli Venezia Giulia, n.d.). It has a natural opening to the sea for many central European countries. It encompasses the historical geographical region of Friuli and a small part of the historical region of Venezia Giulia. It has sharp-peaked Dolomite mountains and vineyards producing white wine. Trieste is the capital of the region.

2. METHODOLOGY

2.1 Data Used:
The study region is divided into cells of 150 × 150 m to obtain cellular data. The cell phone data provides the presence of individuals per cell and an origin destination matrix. Presence data is divided by country of origin and, for Italians, into residents, regular visitors and occasional visitors. Data is available for the study from March 2016 to September 2016. The tweet feed was downloaded and contains user ID, coordinates, tweet time and text. Spatial and non-spatial analyses were performed using the open source software QGIS and MS Office respectively.
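To make the 150 × 150 m gridding concrete, the sketch below assigns projected coordinates to a grid cell and counts points per cell. The grid origin, the sample coordinates and the function names are assumptions for illustration; the actual grid definition used by the data provider is not described in the text.

```python
import math

CELL_SIZE_M = 150.0  # cell edge length stated in the text


def cell_index(x, y, origin_x, origin_y, cell_size=CELL_SIZE_M):
    """Return the (column, row) of the grid cell containing a projected point."""
    col = math.floor((x - origin_x) / cell_size)
    row = math.floor((y - origin_y) / cell_size)
    return col, row


def count_per_cell(points, origin_x, origin_y, cell_size=CELL_SIZE_M):
    """Count how many points fall in each grid cell."""
    counts = {}
    for x, y in points:
        key = cell_index(x, y, origin_x, origin_y, cell_size)
        counts[key] = counts.get(key, 0) + 1
    return counts


# Hypothetical grid origin and point locations in UTM zone 32N metres.
ORIGIN_X, ORIGIN_Y = 300000.0, 5000000.0
sample_points = [
    (300010.0, 5000020.0),  # cell (0, 0)
    (300149.0, 5000149.0),  # cell (0, 0)
    (300150.0, 5000300.0),  # cell (1, 2)
]
cell_counts = count_per_cell(sample_points, ORIGIN_X, ORIGIN_Y)
print(cell_counts)
```

Working in a projected coordinate system with metre units is what makes a fixed 150 m cell size meaningful across the study region.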
These places are famous for shopping malls, churches, a military museum, historic sites, architecture, ancient remains, etc. Villesse has a giant mall where IKEA is located; Medea has the Ara Pacis, a famous monument recalling the fallen of all wars; Aiello has more than 150 famous sundials.