SPATIO TEMPORAL DATA CUBE APPLIED TO AIS CONTAINERSHIPS TREND ANALYSIS IN THE EARLY YEARS OF THE BELT AND ROAD INITIATIVE-FROM GLOBAL TO LOCAL SCALE

Maritime trade represents a significant part of all global import-export trade. The traffic of containerships can be monitored through the Automatic Identification System (AIS), because International Maritime Organization (IMO) regulations require AIS to be fitted aboard all ships of 300 gross tonnage and upwards engaged on international voyages. The approach proposed by the authors aims to extract value-added information from an AIS dataset, with a focus on the maritime economy. Using an AIS dataset of global positions of containerships from 01/01/2012 to 31/12/2016, the paper focuses on space-time data cube creation and analysis for a better understanding of maritime trade trends. Data cube creation has been tested with different spatio-temporal bin dimensions and on different specific topics (TEU classes, alliances, chokepoints and port areas), analysing the sensitivity of the trend results and showing that appropriate spatio-temporal bin dimensions are important to highlight relevant trends effectively. Results of the trend analysis are discussed and validated against the main data and information found for the period 2012-2016. The aim of this paper is to demonstrate the suitability of this approach applied to AIS data and to highlight its limitations. The authors conclude that the approach has proved adequate in describing the evolution of global import-export trade.


INTRODUCTION
Maritime transportation plays a crucial role in international trade. There are more than 50,000 merchant ships trading internationally on the sea (Yang et al., 2021). As required by the International Maritime Organization (IMO), all these ships need to be equipped with an Automatic Identification System (AIS) transmitter (IALA, 2004). The increasing spatial and temporal accuracy, completeness, quality and accessibility of AIS data have promoted a large body of new research in recent years (Yang et al., 2019). Terrestrial AIS consists of small transponders fitted to vessels, which use short-range VHF radio signals and GNSS technology to broadcast each vessel's position, unique IMO identity number and other relevant information (Monaco, 2017) to ground stations: the coverage is therefore limited to about 50 miles from the coast. In space-based AIS (S-AIS), the signals from ships are collected by satellites, which greatly extends the coverage at the expense of a lower data rate (Graziano et al., 2019). AIS was primarily conceived for collision avoidance, but today's applications range from maritime surveillance and safety to environmental impact assessment, traffic monitoring and forecasting, and the evaluation of commercial and trade activities. The destination port, together with the estimated time of arrival (ETA), is one of the key pieces of information stored within AIS that makes it possible to extend the use of this dataset beyond the original maritime safety field. In the trade field, accurate information regarding ships' destinations could help port operators make timely and efficient decisions (Zhang et al., 2020), and knowing other ships' movements generates competitive advantages (Prochazka et al., 2019). Maritime trade via containerships accounts for around 80% of global import-export trade. 
This sector is rapidly changing and, since 2013, has been strongly influenced by China's Belt and Road Initiative (BRI), which aims to restructure trade routes and value chains through port and logistics investments. In this perspective, the paper uses spatio-temporal data mining tools known as space-time data cubes to analyse and visualise spatio-temporal trends and hot spots of global containership trade, extracted from a historical AIS database. Visualisation tools for spatio-temporal data used in the AIS context are usually density maps: although they can be considered a valuable instrument to visualise massive vessel trajectories, the temporal evolution is not integrated in the results, as time is fixed to a defined temporal window. Spatio-temporal data mining and machine learning are the most used techniques on this kind of big data, in particular for pattern recognition and anomaly detection (Yang et al., 2019). The analyses in this paper are based on an AIS dataset of daily positions of containerships greater than 7,000 Twenty-foot Equivalent Units (TEU). The dataset covers the period from 01/01/2012 (before the Suez and Panama expansions) to 31/12/2016. A total of 597,069 vessel positions from 515 different containerships has been analysed. The study also takes into account some limitations of the AIS dataset, mainly related to the limited satellite coverage in some regions prior to November 2018. Currently, data are processed from multiple satellite AIS sources and have almost complete global coverage, but in the past access to satellite data was limited. As a result, positions are denser in areas where both terrestrial and satellite AIS were available and sparser where only one source was available. Additionally, when commencing a voyage, every ship is required to enter its destination port and ETA in the AIS. The information is entered manually by crew members, and Yang et al. (2021) have found that a considerable proportion of it is entered incorrectly, either intentionally or unintentionally: in a set of interviews with maritime industry practitioners, human error was the reason most commonly mentioned by the interviewees.

Data Cube Creation
Spatial data analysis refers to a set of techniques designed to find patterns, detect anomalies, or test hypotheses and theories, based on spatial data (Goodchild, 2008). Among spatial data analysis techniques, the creation of space-time data cubes has been considered adequate to analyse the AIS dataset. The authors used the ESRI ArcGIS Pro Space Time Pattern Mining toolbox to insert the AIS records into a netCDF data structure, by counting positions and aggregating specified attributes into space-time bins. For all bin locations, the trends for counts and summary field values are evaluated. Two-dimensional representations of the space-time data cube attributes are a powerful tool to detect trends in space and time. Additional tools are used to detect hot and cold spots and to characterize them as a function of time, based on the Getis-Ord Gi* statistic and the Mann-Kendall trend test (Hamed, 2009). From a first analysis of the whole dataset, it was immediately clear that the choice of the spatio-temporal bin dimension affects the readability and interpretability of the results. Several tests have been conducted in order to define an appropriate spatio-temporal bin dimension, balancing bins with a significant number of values, adequate spatio-temporal resolution of the analysis and readability of results. For global analysis, a data cube with a time interval of 1 month/2 weeks and a hexagonal spatial grid with a height of 100 km has been chosen as the most appropriate. In Figure 1, the result of a hot spot analysis is compared with a simple total count of the AIS positions over the spatial grid of 100 km. The differences between the two outputs are explained by the fact that the hot spot analysis takes into consideration not only the spatial but also the temporal distribution of the data: persistent, consecutive and intensifying hot spots are locations where growing trends have been detected within the time covered by the available AIS data sample. 
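The binning and per-location trend evaluation described above can be illustrated with a minimal Python sketch. It is not the ArcGIS Pro implementation used by the authors: it assumes a square latitude/longitude grid instead of the 100 km hexagonal grid, monthly time steps, and records given as hypothetical (lat, lon, date) tuples. The function names are illustrative only. The Mann-Kendall statistic is computed with the standard tie correction and continuity correction.

```python
import math
from collections import defaultdict
from datetime import date


def bin_key(lat, lon, day, cell_deg=1.0):
    """Map one AIS position to a (row, col, month_index) space-time bin.

    Assumption: a square grid in degrees stands in for the hexagonal grid.
    """
    row = math.floor((lat + 90.0) / cell_deg)
    col = math.floor((lon + 180.0) / cell_deg)
    month_index = day.year * 12 + (day.month - 1)
    return row, col, month_index


def build_cube(records, cell_deg=1.0):
    """Count positions per space-time bin; records are (lat, lon, date)."""
    cube = defaultdict(int)
    for lat, lon, day in records:
        cube[bin_key(lat, lon, day, cell_deg)] += 1
    return cube


def location_series(cube, row, col, first_month, n_months):
    """Time series of counts for one bin location (empty bins count as 0)."""
    return [cube.get((row, col, first_month + k), 0) for k in range(n_months)]


def mann_kendall(series):
    """Return (S, z) for the Mann-Kendall trend test with tie correction."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    if s == 0:
        return 0, 0.0
    ties = defaultdict(int)
    for value in series:
        ties[value] += 1
    var_s = n * (n - 1) * (2 * n + 5)
    var_s -= sum(t * (t - 1) * (2 * t + 5) for t in ties.values())
    var_s /= 18.0
    # Continuity correction: shrink |S| by one before standardising.
    z = (s - 1) / math.sqrt(var_s) if s > 0 else (s + 1) / math.sqrt(var_s)
    return s, z
```

A positive z above roughly 1.96 would indicate an upward trend at the 95% confidence level for that bin location, and a negative z below -1.96 a downward one.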
In the fast-growing sector of maritime trade, the hot spot analysis is particularly effective in identifying chokepoint locations: the correspondence between the hot spot analysis and well-known chokepoint locations confirms that the analysed AIS dataset can be considered representative of global trends. To better analyse the situation in hot spot areas or areas with a high concentration of vessel positions, a local analysis was conducted on subsets of the global dataset, with the aim of verifying how changing the spatio-temporal dimension of the bins affects the description of the dynamics in the area. At a later stage, other research questions are addressed by creating data cubes over filtered subsets. Firstly, the gigantism phenomenon was investigated in more depth at global and local scale, exploring the mean and the maximum TEU values registered in the bins. Then the global dataset was divided into 3 TEU classes and a data cube was built for each of those classes. Furthermore, the business strategies of some of the major alliances were analysed on a global scale. At chokepoint level, the analysis is further refined by splitting the subsets according to the destination of the ship, and therefore comparing the trends according to the direction followed by the containerships while transiting through the chokepoint itself. Lastly, at single port level, subsets are extracted over two Areas of Interest (AOIs), and spatial resolutions of 10 and 20 km with a temporal window of 1 month have been used to compare the local effect of data cubes with different resolutions.
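As an illustration of the hot spot statistic mentioned in this section, the following Python sketch computes the Getis-Ord Gi* z-score for the bins of a single time step. It is a simplified sketch under stated assumptions, not the toolbox implementation: weights are binary (1 for the bin itself and its listed neighbours, 0 otherwise), the neighbourhood is supplied by the caller as a hypothetical dictionary, and every neighbour id must also be present in the values.

```python
import math


def getis_ord_gi_star(values, neighbours):
    """Gi* z-score per bin, using binary weights that include the bin itself.

    values: dict bin_id -> count of AIS positions in that bin
    neighbours: dict bin_id -> iterable of neighbouring bin ids
    """
    n = len(values)
    mean = sum(values.values()) / n
    variance = sum(v * v for v in values.values()) / n - mean * mean
    std = math.sqrt(max(0.0, variance))
    scores = {}
    for bin_id in values:
        window = set(neighbours.get(bin_id, ())) | {bin_id}
        w_sum = len(window)      # sum of binary weights
        w_sq_sum = len(window)   # sum of squared binary weights
        local_sum = sum(values[j] for j in window)
        denom = std * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        scores[bin_id] = (local_sum - mean * w_sum) / denom if denom > 0 else 0.0
    return scores
```

For example, on a line of nine bins with counts 1, 1, 1, 1, 1, 1, 10, 12, 11 and adjacent bins as neighbours, the bin holding 12 receives a clearly positive score (a hot spot candidate), while bins in the low-count stretch receive negative scores. In a space-time setting the neighbourhood would also include the same location in preceding time steps, which this sketch omits.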

Destination Port Information
In this paper, the authors decided to exploit AIS destination information to analyse geographical trends in trade patterns: while assessing and correcting mistakes in the destination entered by the crew is outside the scope of this contribution, the authors concentrated on pre-processing the AIS data sample to optimize the result of an automatic geocoding process. The Maritime Safety Information World Port Index (WPI, Pub 150) has been used to create a locator, and a first geocoding process has been launched on the full AIS dataset. The process matched 26% of the unique destination values (5,989 successful matches out of 22,740 unique destination values), corresponding to about 50% of the total number of AIS positional records available (301,969 out of 597,069). The analysis of the unmatched destination values revealed a few trivial issues, such as leading blank spaces and non-relevant characters, which were easily and systematically removed from the original text. The authors then tackled less evident geocoding issues by analysing the unmatched records, and found that resolving the matching of the top 30 destinations in terms of associated positional records would achieve the goal of having at least 75% of positional records with a geographic destination assigned: this manual effort has therefore been considered affordable compared to the derived benefits in terms of completeness and accuracy of the following analysis steps. A manual analysis of those records made it possible to find the correct named port in the WPI dataset, considering that the most common issue was misspelling due to transliteration into the Latin alphabet. In one case, a new port was added to the WPI dataset: the Yangshan deep-water port for containerships in Hangzhou Bay, south of Shanghai, easily recognizable on open satellite imagery. 
After this phase of data manipulation, the geocoding process has been re-run, reaching the goal of having more than three-quarters of positional records with a geographic destination assigned (456,681 out of 597,069).
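The pre-processing steps described above (removing blank spaces and non-relevant characters, then resolving frequent misspellings by hand) can be sketched in Python as follows. This is only an illustration under assumptions: the alias table entries are invented examples rather than the 30 destinations actually corrected by the authors, the matching is an exact name lookup rather than the ArcGIS locator, and digits are treated as non-relevant characters.

```python
import re

# Hypothetical alias table: misspelt destination -> port name in the index.
ALIASES = {
    "SHANGAI": "SHANGHAI",
    "YANG SHAN": "YANGSHAN",
}


def normalise_destination(raw):
    """Upper-case, drop non-letter characters, collapse spaces, apply aliases."""
    text = raw.upper()
    text = re.sub(r"[^A-Z ]", " ", text)
    text = re.sub(r"\s+", " ", text).strip()
    return ALIASES.get(text, text)


def match_rate(destinations, port_names):
    """Share of destination values that match a known port name exactly."""
    ports = set(port_names)
    if not destinations:
        return 0.0
    matched = sum(1 for d in destinations if normalise_destination(d) in ports)
    return matched / len(destinations)
```

Applying `match_rate` to the list of destination values of all positional records, rather than to the unique values, gives the record-level share that the text reports as rising from about 50% to more than 75%.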

Global Trend Analysis
Maritime trade is the backbone of globalisation processes and the global exchange of goods. Containerships are the main carriers of processed and semi-processed goods and sail along important and more or less defined maritime routes linking the major ports of the main global economic areas. The global trend analysis of the entire dataset makes it possible to observe the spatial and temporal variations along the three main trade routes (the Trans-Pacific, the Trans-Atlantic and the East-Asia/Europe routes, shown in Figure 2), as well as the effects of certain infrastructural interventions on ports and shipping channels. The trend analysis maps in the paper adopt the legend shown in Figure 3: the purple colour scale represents an upward trend, the green colour scale a downward one. Colours vary from dark to light depending on the confidence level of the trend.

East-Asia/Europe Route.
Looking at the East-Asia/Europe route (Figure 4.A), positive trends in terms of vessel traffic along the route from Chinese ports to Mediterranean ports can be inferred. It could therefore be argued that in the years following the launch of the BRI, this route, which is the main maritime path of the project, was positively influenced by the Chinese initiative. As will be shown later, positive trends can also be noted in some of the main strategic nodes of the project, such as the Asian ports, the Suez Canal, the Persian Gulf area and the Mediterranean basin. The East-Asian region (Figure 4   A general positive trend can also be seen in the Red Sea area. There has been a significant increase in the number of ships since 2015, the year in which the canal widening works were completed, at the location of the Suez Canal exit/entry to/from the Mediterranean Sea (Figure 5). Looking at the European/Mediterranean Sea area in Figure 6, two different trends can be highlighted. There is a negative trend along the route from the Suez Canal through the Mediterranean to the North European ports, whose bins show a tendency to decrease in number since 2015. That year, along the East-Asia/Europe route, there was a 1.4% decrease in containers transported (UNCTAD, 2019).

Trans-Pacific and Trans-Atlantic Route.
Looking at the Trans-Pacific route in Figure 10, which connects Asian ports with the west coast of the United States, there is a positive trend, especially since the end of 2014 when, as can be seen in Figure 9, the route overtook the East-Asia/Europe route in number of TEUs transported.
The widening of the Panama Canal in the second half of 2016 also contributed to this upward trend. The widening allowed ships of up to 14,000 TEU to transit the canal, making it economically viable to reach US East Coast ports via the Trans-Pacific route. Figure 11 shows an abnormal spike in early 2015 that is not found in the sector studies. This may be due to an anomaly in the initial dataset. Finally, the upward trend can also be observed at two of the main Trans-Atlantic route ports (Figure 12).

Figure 12
Variation of counted AIS positions per bin in Savannah and New York port locations (2012-2016).

TEU Analysis.
Observing the trend for the maximum size of containerships, the results are positive for all routes, confirming the general increase in ship size (Figure 13).

Alliances Analysis.
This spatio-temporal analysis of containership traffic trends focuses, on a global scale, on containership alliances. Since the 1990s, when the very specialised organisation of the sector achieved world-wide coverage, mergers and acquisitions have been very frequent, with the aim of offering greater capillarity and regularity of service, especially in support of the large multinationals, the main customers of operators in the container sector (SRM, 2014). The simplification of the transport service offered aims to reduce operating costs, and this need is reflected in the development of maritime gigantism and the increasing role of alliances on the main shipping routes. Cooperation between shipowning companies has the dual purpose of reducing costs by exploiting economies of scale and of rationalising resources and the geographical scope of the service. This paper analyses the trends of some of the world's major shipping alliances. In particular, as shown in Figure 15, The Alliance, 2M and Ocean Alliance, considered the major shipping alliances, have been investigated by visualising the trends in counted AIS positions over all bin locations. The analysis shows the different trade strategies linked to the three alliances. As far as The Alliance is concerned, there is a decreasing trend in Mediterranean and Northern European ports, while there is strong growth in Asian and North American ports. An interesting increase is also noted in the Baltic Sea. A different behaviour is observed for the 2M alliance, for which positive trends are noted for the Asian-Mediterranean route. In particular, as far as the Mediterranean area is concerned, 2M is intensifying its routes in the Tyrrhenian and Adriatic Seas. The analysis also seems to show no substantial downward trend on the route to northern European ports, in contrast to the other alliances. 
Finally, the Ocean Alliance is strong on all its routes, with a preference for Asian and North American ports, while routes to Northern European ports are declining.

Chokepoints Analysis.
Maritime chokepoints are areas with restricted throughput and/or a high concentration of ships, such as straits and canals. Because sea traffic is naturally constrained in these places, chokepoints create potential vulnerabilities to the movement of containers across the entire network (Alderson, 2020). As a significant portion of global maritime trade passes through those specific locations, trade analyses focused on those areas are representative of global patterns. Focusing on smaller geographical areas made it possible to experiment with data cubes of higher spatio-temporal resolution.   Top images refer to vessels directed to European and Mediterranean ports, while the bottom ones refer to those directed to Asian ports: for this analysis, the Port Said destination has been considered as an Asian destination, as it emerged clearly that this destination is normally set by vessels intending to cross the Suez Canal from the northern access. Upward trends in the waiting areas next to both entrances of the canal are detected, more evident for vessels directed to Asian ports (bottom): this can be linked to the increasing volume of traffic and size of vessels generated by the global increase in maritime exchanges and, specifically here, to the Suez Canal expansion. Trends along the canal itself seem stable toward Europe and the Mediterranean and decreasing toward Asia; however, this analysis is strongly affected by the AIS sample used which, providing one position per day for each vessel, underestimates the numbers and volumes of underway containerships.
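The direction split used for the chokepoint analysis can be sketched in Python as a simple classification on the geocoded destination. The port sets below are deliberately tiny, hypothetical examples, not the lists used by the authors; the only rule taken from the text is that Port Said is grouped with Asia-bound traffic.

```python
from collections import defaultdict

# Hypothetical, deliberately incomplete destination sets for illustration.
EUROPE_MED_PORTS = {"ROTTERDAM", "HAMBURG", "PIRAEUS"}
# Port Said is treated as Asia-bound, following the rule stated in the text.
ASIA_PORTS = {"SINGAPORE", "SHANGHAI", "PORT SAID"}


def direction(destination):
    """Classify a destination name as 'asia', 'europe_med' or 'unknown'."""
    name = destination.strip().upper()
    if name in ASIA_PORTS:
        return "asia"
    if name in EUROPE_MED_PORTS:
        return "europe_med"
    return "unknown"


def split_by_direction(records):
    """Group records (dicts with a 'destination' key) by transit direction."""
    groups = defaultdict(list)
    for record in records:
        groups[direction(record["destination"])].append(record)
    return dict(groups)
```

Each resulting group could then be aggregated into its own data cube, so that trends toward Europe and the Mediterranean and trends toward Asia are evaluated separately.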

Chinese Ports Analysis.
Among port areas, in order to obtain a statistically significant sample, two large areas of China have been chosen. Looking at the distribution of total positions in the global grid (see Figure 18.B), the Chinese coast emerges as one of the areas with a large number of positions. Two AOIs, one over the Shanghai port and the other over the Hong Kong-Shenzhen port, have been defined, using a 200 km buffer and the territorial waters to draw them.
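The buffer-based selection of positions for an AOI can be approximated with a great-circle distance filter, as in the Python sketch below. It is a simplification under assumptions: it uses a spherical Earth with a mean radius of 6371 km, a single reference point per port, and it ignores the territorial waters boundary that the authors also used to draw the AOIs. Any port coordinates passed to it are the caller's own approximate values.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def select_aoi(positions, port_lat, port_lon, buffer_km=200.0):
    """Keep (lat, lon, ...) positions within buffer_km of the reference point."""
    return [
        p for p in positions
        if haversine_km(p[0], p[1], port_lat, port_lon) <= buffer_km
    ]
```

The share of the dataset falling in an AOI is then the length of the selected list divided by the total number of positions.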

Figure 18
Frame A shows the ports and the AOI extents used to select data; frame B shows the total count of vessel positions aggregated on the hexagonal spatial grid.
The Hong Kong-Shenzhen area comprises the ports of Hong Kong, Shenzhen, Nansha and Yantian, and represents 7% of the whole dataset. The Shanghai area comprises the ports of Shanghai, Yangshan and Ningbo, and represents 10% of the whole dataset.

Figure 19
Trend analysis on counted AIS positions over all bin locations, for the Hong Kong-Shenzhen (a) and Shanghai (b) areas. The height of the hexagonal spatial grid changes from 10 km (1) to 20 km (2). (Basemap sources: Esri, USGS, NOAA).
In Figure 19, trend analysis on the number of containerships has been conducted over the two AOIs, using 10 km and 20 km as spatial bin dimensions. The Hong Kong-Shenzhen area shows general descending trends. When changing from 10 to 20 km, the port areas confirm the descending trend, which appears more clearly visible along the path towards Shenzhen and Nansha. In addition, Nansha port no longer shows a significant trend (stable situation). Looking at the Shanghai area, trend analysis on the number of containerships shows a general growing trend, with the exception of Yangshan, where the number of vessels seems to be decreasing. The 20 km grid in this case confirms the previous analysis. Trends of the mean TEU values registered in the bins have also been analysed (Figure 20). The trend in the Shanghai area is generally positive, in particular in the Yangshan port, the world's largest automated container terminal, which was effectively built to overcome shallow water and to compete and cooperate with the nearby Ningbo port (Jia-bin, Yong-sik, 2010). In the Hong Kong-Shenzhen area the mean TEU trend is also positive and, compared with the descending trend in the number of containerships, confirms that the "gigantism" phenomenon affects this area.

Figure 20
Trend analysis on mean TEU over all bin locations, for the Hong Kong-Shenzhen (a) and Shanghai (b) areas. The height of the hexagonal spatial grid changes from 10 km (1) to 20 km (2). (Basemap sources: Esri, USGS, NOAA).

CONCLUSION
The approach proposed by the authors aimed to extract value-added information from an AIS dataset, with a focus on the maritime economy. Space-time data cubes proved to be effective in manipulating large datasets and in highlighting trends. It is also evident that changing the spatio-temporal resolution of the analysis has an impact on the capacity to detect local trends: relatively low resolutions are useful to provide global insight, whereas the potential of higher resolutions may be limited where the data are too sparse to retain statistical significance. One of the major difficulties encountered by the authors was finding adequate data to confirm the outcomes of the analysis. Correspondences with global trends reported in general reports, such as the ones produced by UNCTAD, have generally been found, together with local correspondences related to infrastructure interventions, such as the Panama expansion or the inauguration of the automated Jebel Ali port. Other local statistics, such as the ones provided by port authorities, are less easily comparable: for example, it is quite common to find statistics on the volume of goods transiting through a specific port, but this information is not easily inferable from AIS data, where only the draught parameter can be somehow linked to the effective load of each ship. AIS data has known limitations, mainly related to the fact that a significant part of the recordset is entered manually, leading to a non-negligible percentage of accidental or deliberate errors. Significant efforts are being made to recognise and correct these issues. Additionally, historical AIS data are normally collected by commercial companies managing a network of ground stations, integrated with satellite data: for this reason, the completeness of this dataset is variable and hard to quantify. 
Future developments include the study of methods for ship route and destination forecasting based on real-time AIS data, complementing existing approaches (Zhang et al., 2020; Yang et al., 2019); this information is considered very relevant both for port authorities, for optimizing operations on vessels, and for carriers, e.g. to build repositioning strategies. This methodology could also be suitable for "just in time" impact analyses, to observe trends on a general and local scale following specific events such as the blockage of the Suez Canal, quickly providing useful information on the areas/ports most affected by the events. Finally, further local analysis, such as that carried out on Chinese ports, with higher temporal frequencies and smaller spatial grids, can be a useful tool for monitoring and managing the area.