DEVELOPMENT OF GEOSPATIAL INFORMATION INTEGRATED WITH BIG DATA TO AGRICULTURAL HAZARD MONITORING IN WEST JAVA

Food security depends on three aspects: food availability, food access, and food utilization. Availability depends on food supply, which is closely tied to agricultural productivity. West Java Province is the third-largest national rice producer, with 16.6% of production, but it is also the largest rice consumer, accounting for around 21.1% of total national rice consumption. Agricultural productivity can decline due to natural hazards such as floods and droughts, so monitoring floods and droughts in paddy fields is necessary to prevent productivity losses. This study aims to monitor rice fields for flood and drought hazards every month. The hazard parameters are divided into static parameters and dynamic parameters. Dynamic parameters are observed every month, so the hazard index is generated on a monthly scale. GIS and remote sensing data are integrated to perform agricultural hazard modelling. The modelling results are then strengthened with big data that provides near-real-time information about events, accessed through an Application Program Interface (API) service. This study uses a data mining system from Drone Emprit that mines Twitter and news portals with machine learning technology (a probabilistic classifier) and natural language processing. The mining yielded around 15,000 records from January 1 to November 1, 2021, of which 37.9% could be located at the city or district level in West Java Province. It is hoped that policy-makers can use these results to identify agricultural land that requires assistance to increase productivity and to plan policies that support agriculture in West Java in the future.


INTRODUCTION
Food is a basic human right that must be fulfilled (Perum BULOG, 2022). According to Indonesian Law No. 18 of 2012, food security is the condition in which food needs are met from the state level down to individuals, reflected in the availability of food that is sufficient in quantity and quality, safe, diverse, nutritious, evenly distributed, and affordable. If food availability falls below need, the resulting imbalance can trigger a food crisis. A food crisis is widespread food scarcity among communities in a region, caused by factors such as food distribution difficulties, population explosions, and crop failures caused by climate change (Agriculture Organization of the United Nations, 2010; Government of Indonesia, 2012; IGI Global, 2014).
In developing countries, food crises are driven mainly by population growth and climate change. According to the FAO, the world's population will grow by 34%, from 6.8 billion to 9.1 billion, by 2050, with most of the increase occurring in today's developing countries (FAO, 2009). Agricultural productivity, on the other hand, grows by only 1-2% per year (Bourne, 2009). One reason is that about 33% of the world's agricultural land is degraded, leaving little room to expand agricultural areas. This condition can lead to declining food production. The IPCC has also stated that climate change will consistently affect food production in low-latitude regions (Porter et al., 2015). Extreme climate change can cause natural disasters such as floods, droughts, tropical storms, heat waves, and forest fires, which significantly reduce agricultural productivity. Drastic climate change is estimated to have the greatest impact on food in developing countries (World Bank, 2010), where millions of people still depend on agriculture and are therefore vulnerable to a food crisis. Because agricultural productivity depends so strongly on climate, the effect of climate disasters on productivity must be taken into account. This study focuses on floods and droughts, in line with the FAO report stating that floods and droughts account for 83% of the loss and damage from agricultural disasters in countries where agriculture is one of the main economic drivers (FAO, 2017). The relationship between floods, droughts, and productivity is therefore significant.
Various technologies are used to monitor agricultural productivity, one of which is big data technology. Big data technology can capture agriculture-related data from multiple sources, such as remote sensing with satellite imagery, drones, radar, or even smartphones (Islam Sarker et al., 2019). The basis of big data technology is spatial data (Wang & Yuan, 2014), that is, data that shows the geographical location of an object in the real world. Combining big data with spatial data gives rise to spatial data mining, one branch of which is geographic data mining. Geographic data mining discovers new knowledge from extensive geospatial data (Rajesh, 2011; Setiyono & Mukhlash, 2005). Besides spatial data, big data is now also widely associated with non-spatial data obtained from news and social media.
The rapid development of information technology directly affects how people use social media. According to data from Hootsuite and We Are Social, an estimated 170 million of Indonesia's 274.9 million people were social media users in January 2021, an increase of 6.3% over the previous year. YouTube, WhatsApp, Instagram, Facebook, and Twitter are the five most used social media platforms in Indonesia. Social media plays an essential role in community life as a manifestation of freedom of speech. Information about events and issues in the community can be found on social media through both official and personal accounts, which allows interaction between accounts and raises awareness of specific problems in society. Extracting data from social media, known as social media mining, is very important for stakeholders in decision-making. In this study, social media mining is used to obtain geographically referenced social media data related to food production and its problems. By associating spatial data with non-spatial data, more kinds of information can be attached to the same location, so the information obtained is more detailed.
Several previous studies are related to this research. The first, by Weerasekara et al. (2021), models the multi-hazard effects of floods, droughts, strong winds, and landslides on rice productivity with a stochastic approach. The second, by Pratiwi et al. (2020), models the multi-hazards of floods and droughts on rice fields in Central Java in 2014-2018, using statistical data drawn from historical records in official government documents. The third, by Chen et al. (2018), models the multi-hazards of floods and droughts on agricultural production with a Bayesian hierarchical approach.
This study aims to model the agricultural hazards that threaten rice fields by integrating remote sensing data. The modelling results are then validated using geographic and social media data mining through the Drone Emprit API. It is hoped that this study can serve as a solution for monitoring agricultural productivity comprehensively, in support of Sustainable Development Goal (SDG) number two: ending hunger, achieving better food security and nutrition, and supporting sustainable agriculture.

Study Area
The study area is West Java. As explained earlier, the study is particularly relevant to West Java because the province is a national rice barn, and its agriculture faces many threats.

Data
This study uses eight datasets: Digital Elevation Model (DEM), river network, precipitation, land use/land cover (LULC), soil moisture, watershed, Keetch-Byram Drought Index (KBDI), and Normalized Difference Vegetation Index (NDVI) data. The source, temporal coverage, resolution, and reference of each dataset are given in Table 1. The processing is grouped into flood hazard modelling and drought hazard modelling. The spatial data used in this study are raster data and vector data.
The raster data used are DEM, precipitation, LULC, soil moisture, KBDI, and NDVI data. DEM provides elevation information for the study area. Precipitation provides rainfall information. LULC data is used to extract rice field cover, which is the focus area of this study. Soil moisture data describes soil moisture in the study area, where higher soil moisture indicates a higher groundwater supply. KBDI is a drought index used to assess forest fire hazard. NDVI is an index that shows the greenness of vegetation. NDVI values always range from -1 to +1: values less than or equal to 0 are classified as non-vegetation, values greater than 0 are classified as vegetation, and the closer the value is to 1, the greener the vegetation. The vector data used in this study are river network and watershed data. River network data is used to determine the distance from rivers to rice fields, while watershed data is used to determine the density of river flow in a watershed. These data are used as parameters in modelling flood and drought hazards.
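The NDVI rule described above can be sketched as follows. This is a minimal illustration, assuming the standard definition NDVI = (NIR - Red) / (NIR + Red); the reflectance values and the function names are hypothetical and are not taken from the study's data.

```python
def ndvi(nir, red):
    """Standard NDVI: (NIR - Red) / (NIR + Red). Returns 0.0 if both bands are 0."""
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


def classify_ndvi(value):
    """Apply the rule from the text: <= 0 is non-vegetation, > 0 is vegetation."""
    return "vegetation" if value > 0 else "non-vegetation"


# Hypothetical (NIR, Red) reflectance pairs for three pixels.
samples = [(0.45, 0.08), (0.10, 0.30), (0.20, 0.20)]
labels = [classify_ndvi(ndvi(nir, red)) for nir, red in samples]
print(labels)
```

In practice the same rule would be applied to whole raster bands rather than to individual pixel pairs.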

Methods
The overall method is shown in Figure 2. The methods in this study are divided into three parts: flood hazard modelling, drought hazard modelling, and social media data mining using APIs from Drone Emprit. Flood and drought hazards are modelled with the Multi-Criteria Decision Analysis (MCDA) method, in which each parameter is given a score and a weight to locate flood and drought hazards in West Java. Social media data mining is carried out by collecting social media and news data throughout West Java over a specific period, using keywords related to food productivity problems.

Flood Hazard Modeling:
The flood hazard model in this study is a monthly model focused on rice fields in 2021. Modelling is done using the Multi-Criteria Decision Analysis (MCDA) method. The parameters are grouped by class and score, adapted from the studies listed in Table 2. River density is given by Equation 1, DD = L / A, where DD is the river density, L is the river length, and A is the watershed area (Sakti et al., 2022). The parameters of the flood model are divided into static parameters (DEM, slope, river distance, and river density) and a dynamic parameter (precipitation). Static and dynamic parameters have the same influence: each parameter is scored first, and then the static group and the dynamic group are each given a weight of one. The static and dynamic parameters are then integrated to form the flood hazard model (Hoque et al., 2020; Sivakumar et al., 2020; Takeuchi et al., 2015).
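The scoring and integration steps can be sketched as below. This is only an illustration under stated assumptions: river density is taken as river length divided by watershed area (following the definitions of DD, L, and A above), the class breaks are hypothetical placeholders rather than the values in Table 2, and combining the two groups as "mean static score plus mean dynamic score" is one possible reading of the equal weighting, not necessarily the exact formula used in the study.

```python
def river_density(river_length_km, watershed_area_km2):
    """DD = L / A, in km of river per square km of watershed."""
    return river_length_km / watershed_area_km2


def score(value, breaks):
    """Score 1..len(breaks)+1: one plus the number of ascending breaks reached."""
    return 1 + sum(value >= b for b in breaks)


def flood_hazard_index(static_scores, dynamic_scores, w_static=1.0, w_dynamic=1.0):
    """Average each group, then combine the groups with equal weights (assumed)."""
    static = sum(static_scores) / len(static_scores)
    dynamic = sum(dynamic_scores) / len(dynamic_scores)
    return w_static * static + w_dynamic * dynamic


# Hypothetical monthly precipitation breaks (mm); not the classes of Table 2.
PRECIP_BREAKS = [100, 200, 300, 400]

dd = river_density(120.0, 60.0)             # 2.0 km/km^2
precip_score = score(350, PRECIP_BREAKS)    # falls in the fourth class
# Hypothetical static scores: elevation, slope, river distance, river density.
index = flood_hazard_index([4, 5, 3, 4], [precip_score])
print(dd, precip_score, index)
```

Because only precipitation changes from month to month, recomputing the index with each month's precipitation score yields the monthly series while the static part stays fixed.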

Drone Emprit Geo-Data Mining:
The study used a data mining system from the Drone Emprit API that mines, stores, and analyzes data from Twitter and news portals. Near-real-time data related to food security can be accessed through the Drone Emprit API service. Drone Emprit is a platform for gaining insight into events and issues by retrieving data from the internet. Data is sourced from online media and social media such as Twitter, Facebook, and Instagram. Users can get preliminary findings and analyses for a specific topic in less than 10 minutes, after which the system continues to collect social media and news data in real time. The keywords used to collect food productivity data are divided into three groups: food availability, food access, and food utilization. Figure 3 shows the keywords used and the social media mining workflow.
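The grouping of collected posts by keyword and location can be illustrated with a simplified sketch. It does not call the Drone Emprit API, and it swaps the platform's probabilistic classifier for plain substring matching. The keyword lists, location names, and sample posts are all hypothetical placeholders, not the keywords shown in Figure 3.

```python
# Hypothetical keyword groups (Indonesian phrases chosen for illustration only).
KEYWORD_GROUPS = {
    "food availability": ["gagal panen", "banjir sawah", "kekeringan"],
    "food access": ["harga beras", "distribusi pangan"],
    "food utilization": ["gizi buruk", "stunting"],
}

# Hypothetical subset of city/regency names used for location tagging.
LOCATIONS = ["Kota Bandung", "Kota Sukabumi", "Kota Bogor", "Kabupaten Subang"]


def classify_post(text):
    """Return every keyword group with at least one keyword present in the text."""
    lowered = text.lower()
    return [group for group, words in KEYWORD_GROUPS.items()
            if any(word in lowered for word in words)]


def find_location(text):
    """Return the first city/regency name mentioned in the text, or None."""
    lowered = text.lower()
    for name in LOCATIONS:
        if name.lower() in lowered:
            return name
    return None


posts = [
    "Banjir sawah di Kabupaten Subang, petani terancam gagal panen",
    "Harga beras naik di Kota Bandung",
    "Kekeringan melanda lahan pertanian",
]

tagged = [(classify_post(p), find_location(p)) for p in posts]
located_share = 100.0 * sum(loc is not None for _, loc in tagged) / len(tagged)
print(tagged, round(located_share, 1))
```

The share of located posts computed this way corresponds to the kind of figure reported in the study (the percentage of records identified at city or district level), although the actual identification method may differ.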

Flood Hazard Model in Rice Fields
The monthly flood hazard model results can be seen in Figure 4. The months with the highest flood hazard in 2021 are January, February, March, November, and December, because rainfall was high in those months. Precipitation is influential in this model because it is the dynamic parameter, and it differs every month. The results of the 2021 annual flood model can be seen in Figure 5. In the northern part of West Java, the very high flood hazard class covers a larger share of rice fields than in the southern part.

Drought Hazard Model in Rice Fields
The monthly time series of the drought hazard model for rice fields in 2021 can be seen in Figure 6. In the monthly drought model, the lowest drought values occur in January, February, March, November, and December 2021. This is consistent with the 2021 flood hazard model, with which it is negatively correlated: as explained in the flood hazard model results, the highest flood hazard occurs in those same months. The very low drought hazard can be attributed to the dynamic parameters, namely high precipitation, low KBDI, high soil moisture, and high NDVI. The results of the 2021 annual drought model are shown in Figure 7, where a reasonably high drought hazard is found in the northeastern part of West Java.

Probability of Flood and Drought Hazards in Rice Fields
After obtaining the flood hazard and the drought hazard for rice fields in West Java, the next step is to identify rice fields that are more at risk than others. Rice fields at greater risk are those that flood during the rainy season and suffer drought during the dry season. Such rice fields need additional handling because these hazards can cause crop failure or prevent planting. Using the annual flood hazard model (Figure 5) and the annual drought hazard model (Figure 7), the probability of flood and drought hazards can be seen in Figure 8. Rice fields in West Java generally fall into the low drought hazard class with high flood hazard, which means that the potential for flooding in rice fields is greater than the potential for drought. Figure 8 shows that no rice fields combine a high or very high drought hazard with a high or very high flood hazard, so hazard management in the region can still concentrate on anticipating floods in rice fields. Figure 9 shows the percentage of rice fields in each flood and drought hazard class for every city and regency in West Java. These percentages can be used to determine how much rice field area in a city or regency is at risk and to plan policies accordingly. Figure 9 shows that the rice fields in each city and regency are dominated by the high flood hazard class, which averages 74.88% per city/regency, while drought is dominated by the low drought hazard class, which averages 76.45% per city/regency.
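The combination of the two annual models and the per-district percentages can be sketched as a simple cross-tabulation. The class labels per rice field cell below are hypothetical values for a single imaginary district, and the function names are illustrative; the study's actual processing was done on raster data.

```python
from collections import Counter

SEVERE = {"high", "very high"}


def class_percentages(classes):
    """Percentage of cells falling in each hazard class."""
    counts = Counter(classes)
    return {label: 100.0 * n / len(classes) for label, n in counts.items()}


def combined_hazard(flood_classes, drought_classes):
    """Count cells for each (flood class, drought class) pair."""
    return Counter(zip(flood_classes, drought_classes))


# Hypothetical annual hazard classes for four rice field cells in one district.
flood = ["high", "high", "high", "medium"]
drought = ["low", "low", "medium", "low"]

flood_pct = class_percentages(flood)
combined = combined_hazard(flood, drought)
# True only if some cell is severe for both flood and drought.
has_double_hazard = any(f in SEVERE and d in SEVERE for f, d in combined)
print(flood_pct, dict(combined), has_double_hazard)
```

Averaging the per-district percentage of one class across all districts would give figures comparable to the per-district averages reported above.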

Hazard Model Analysis using Drone Emprit Geodatamining
The social data mining process obtained about 15,000 records from January 1 to November 1, 2021, of which 37.9% were located at the city or district level in West Java Province. Based on Figure 10, the food availability issues identified through geodata mining are most concentrated in Bandung City with 151 cases, followed by Sukabumi City (94 cases), Bogor City (68 cases), Subang Regency (60 cases), and West Bandung (47 cases). The highest share of cases is in urban areas because agricultural land there tends to be scarce, so when agricultural disasters reduce production, local output cannot meet the area's needs.
The percentage of flood and drought hazards (Figure 8) is compared with the results from geodata mining (Figure 10). Bandung City, which has the highest number of food availability cases, has 90% of its rice fields in the high flood hazard class and 89.19% in the medium drought hazard class. In this case, the information from social data mining and geodata mining is in line with the flood and drought hazard models that were produced.

Limitation and Future Study
This study has several limitations. The flood and drought hazard models do not use weighting, so every parameter is treated as having the same weight. The models also leave out other supporting data that can substantially affect flood and drought hazards, such as soil type. Future modelling of agricultural disasters could therefore use more complex, weighted models so that the results better represent actual conditions. Model analysis could also use time series of the dynamic parameters down to the daily level, so that the investigation is more detailed and closer to actual conditions. Further development could include WEB-GIS monitoring that is shared with the community and integrates near-real-time agricultural disaster models with agricultural disaster geodata mining. This WEB-GIS is expected to be used by the public, especially farmers, to increase agricultural productivity.

CONCLUSION
Modelling flood and drought hazards on agricultural land showed an inverse relationship between the two: an area with a high flood hazard tends to have a low drought hazard, and vice versa. Integrating the agricultural hazard models with social data mining and geodata mining from the Drone Emprit API also gave results consistent with the models. In other words, this agricultural hazard model is adequate to represent agricultural hazards under actual conditions.