DEVELOPMENT OF PHENOLOGY BASED ALGORITHM FOR CROPLAND AND CROP TYPE MAPPING WITH MULTITEMPORAL LANDSAT IMAGE DATA-CASE STUDY IN THE NORTHWEST OF VIETNAM

Cropland mapping is very important for food security, policy development, land use planning, and environmental protection. Scientists have developed methods and techniques for cropland mapping with remote sensing image data. Both single date and multitemporal image data are used for generation of cropland and crop type maps. Multitemporal image data has advantages over single date image data from reliability and accuracy point of view because multitemporal image data allows to eliminate seasonality of vegetation. In this paper, the authors present new algorithm for cropland and crop type mapping with multitemporal Landsat image data. The algorithm requires for analysis of all Landsat scenes observed during one year and if needed, scenes in some years back to compensate clouds and cloud shadows. Phenology of land cover is constructed based on six bimonthly cloud free land covers that were automatically classified using the selected scenes. By grouping land covers within two months to six land covers of periods JanuaryFebruary, March-April, May-June, July-August, September-October, and November-December we create six bimonthly cloud free land covers that formulate a database for mapping cropland and crop types. By analysis of 50 Landsat scenes of path/row number 128/45 (northwest of Vietnam) observed mainly from 2017, 2016 and 2015 we success to map upland cropland and 14 crop types with area ranging from 145,143 ha to 3,373 ha per crop type. The study pointed out that phenology characterized by six bimonthly land covers is acceptable to identify cropland distribution and some specific crop types. For better results, apparently we need higher temporal resolution of image data. Due to uncertainty of the atmosphere, it is almost impossible to rely only on optical remote sensing data to achieve high temporality of data so application of multitemporal SAR data could be a way to overcome this obstacle.


INTRODUCTION
Cropland mapping is very important for food security, policy development, land use planning and environmental protection (Csillik and Belgiu, 2017;Shao et al., 2009;Xiong et al., 2017).In conventional way, annual cropland maps are established by using official statistics together with land use map and field survey (Foerster et al., 2012;Victoria et al., 2012).Remote sensing image data, due to its large area coverage, periodically repeated and multispectral observation, has been applied in a multitude of different mapping areas, ranging from agriculture, forestry, land cover, land use, including cropland and crop type mapping.Both coarse and medium spatial resolution image data are used to generate cropland maps in national wide or continental scale (Waldner et al., 2017;Xiong et al., 2017;Zhao et al., 2017;Zhong et al., 2016).
Methodology for cropland mapping using remote sensing data has been developed for years (Simonetti et al., 2015;Xiong et al., 2017;Zhong et al., 2016).In general, adopted methods and techniques include phenology based algorithms (Chen et al., 2018;Dong et al., 2016;Foerster et al., 2012;Simonetti et al., 2015;Zhao et al., 2017;Zhong et al., 2016), classification decision tree (Deng and Wu, 2013), support vector machine (Brenning et al., 2006), random forest (Tatsumi et al., 2015), and others.Data used in these methods is either single date or multitemporal.Selection of single date images for cropland mapping depends on status of the vegetation and weather conditions.Usually, single date image is required to be taken at peak growing season and not affected by clouds or cloud shadows.Cropland maps generated by using single date image, however, involve uncertainties and inaccuracy caused by vegetation seasonality (Simonetti et al., 2015(Simonetti et al., , 2014)).Multitemporal image data has many advantages over single date image data in term of elimination of seasonality of vegetation.Phenology of cropland and crop types is characterized, mostly, by using time series of normalized vegetation index (NDVI) or the enhanced vegetation index (EVI).The use of vegetation index, nevertheless, possesses errors in the time series, such as sensor problems or cloud contamination not screened out by the fitting procedure, could lead to the incorrect identification of croplands (Victoria et al., 2012).Little study applied phenology characterized by temporal profile of land cover for accurate land cover or crop mapping.
In this paper, we present a development of a new algorithm for cropland and crop type mapping with Landsat multitemporal image data.Unlike conventional approach where time series of vegetation index is used to characterize phenology of vegetation, here we use temporal profile of land cover (variations of land cover type throughout a year) to describe growing cycle of field crops and then use this profile to classify cropland and crop types.We assume that there are very few cases when cloud free Landsat images are available for both dry and rainy seasons so we need to use more Landsat scenes for every two months to composite cloud free Landsat image and then to produce bimonthly cloud free land cover.A set of six bimonthly land covers is a kernel for development of land cover profile which is basis for crop land and crop type mapping.The study was tested in northwest of Vietnam where upland shifting cultivation is dominant form of agricultural production.We success to generate upland cropland map and map of 14 crop types with area ranging from 145,143 ha to 3,373 ha per crop type.

STUDY AREA
The study area is located in the northwest of Vietnam covered by Landsat scene with path/row number 128/045 (Fig. 1).The selected Landsat scene covers parts of Yen Bai, Son La and Dien Bien provinces of Vietnam where live about 30 different ethnic groups with majority of Thai, Kinh, H'mong, Muong and Dao peoples.The area is mountainous, remote and livelihoods of ethnic groups were dependent upon shifting cultivation practices (Wezel et al., 2006).The shifting cultivation practices are characterized by the clearing of primary or secondary forest to cultivate upland crops.The cropland landscape here is highly fragmented and heterogeneous with relatively small patches.This agricultural practices leads to deforestation and increase of soil erosion.Topography of the study area is rugged with Hoang Lien Son range where locates mount Fansipan -the highest mountain of Indochina peninsula with the altitude of 3,143m above sea level.Climate here is divided to two distinct season with rainy season starting from May to November.Vegetation is dominated by evergreen forest.Upland cultivation includes major crops as maize, sugarcane, upland rice, and cassava.

Data preparation
Landsat image data was downloaded from United State Geological Service (USGS) website https://earthexplorer.usgs.gov/.Totally, 50 Landsat scenes with cloud coverage smaller than 80 % was selected.Despite the fact that baseline year for cropland mapping is 2017, we need to use Landsat scenes for the same months of 2016 and 2015 to compensate cloud cover in 2017 scenes.For very high mountain area with heavy cloudiness, we used some scenes from older years as 2006, 2009 and so to achieve cloud free dataset.This operation, in fact, does not severely influence the quality of the land cover since there are almost no changes over there due to inaccessibility.Table 1 shows list of 50 Landsat scenes used for this study.

January-February
May-June

July-August
September-Otober November-December Table 1.List of fifty Landsat scenes used to generate bimonthly cloud free land covers of 2017 All downloaded Landsat scenes are of level-1 processing and before classification of land cover, digital number DN was converted to top of atmosphere (TOA) reflectance.Figure 2 show example of complexity of cloud cover of the study area during January and February of year 2017.This complexity explains reason for usage of more Landsat scenes to compensate cloud pixels to achieve cloud free land cover.

Algorithm development
Cropland and crop type mapping is very challenge due to the of agricultural cultivation system.High temporality of remotely sensed data is one of key requirement for success of cropland and crop type mapping.Landsat image data has revisit observation of every 16 days.In cloud free atmosphere, this revisit frequency is good enough to observe different growing stage of crop.However, due to cloud contamination, this revisit time is not sufficient to obtain cloud free image data so we need more observation to compensate cloudy pixels.Our proposed algorithm is shown in Fig. 3.The algorithm is divided to two main stages.In the first stage, a dataset of six bimonthly cloud free land covers is created.In the second stage, a phenological pattern database is established with the six bimonthly cloud free land covers.The database is organized statistically to make ease of identification of unique phenological patterns.Such developed database is used further for simulation of cropland and crop type distribution using phenological patterns.In the right side of Fig. 3 we present complementally flowchart of automated classification of land cover based on application of simplified spectral pattern (SSP) concept (Duong, 2018, 2016, Duong et al., 2017, 2014).

Bimonthly cloud free land cover classification
Every selected Landsat scene will be automatically classified to obtain land cover with eight major classes: water, wetland, close forest, open forest, brown vegetation, sparse vegetation, dry bare land and moist bare land.Cloudy pixels are masked by using Band quality assessment band (BQA) which is provided in every Landsat Collection 1 level 1 data.We use cloud confidence of 75 percent to detect cloudy pixels.
We divide one-year time to six periods of two month clusters: January-February, March-April, May-June, July-August, September-October and November-December.Then we combine land covers within each period together to obtain cloud free land cover.During this combination, cloud free pixel originating from newest observation date will have higher priority than pixel from older data to be included in the final land cover (replacing cloudy pixels).Figure 4 displays six bimonthly cloud free land covers of the study area.When pixels from two land cover are cloud free at the same time, combination algorithm prefers pixel with less vegetation (bare land) to be included in final result.

Establishment of land cover database
In order to map efficiently cropland and crop types, we firstly create a table of all phenological patterns and then apply a phenological model to map cropland and crop types.Phenological model allows selection of phenological conditions following local ecological and cropping practices to classify pixels to particular crop type.For example, to map upland cropland, we select phenological patterns, which have bare land and vegetation in at least one period while no land cover class in any period is water or wetland.Phenological patterns in the database are sorted in descending order by number of pixels.The pattern with highest number of pixels is put on the first position in the sorting list.The number of phenological patterns is often very large so we focus on patterns on top of the sorting list only.These patterns are more meaningful than those with small number of pixels.

RESULT AND DISCUSSION
We used 50 Landsat scenes with path/row number 128/45 (Table 1) for cropland and crop type mapping.Basically, we prefer Landsat 8 scenes from 2017, 2016 and 2015 for mapping of year 2017.However, due to high mountains in the study area which are very often covered by clouds around the year, we need to use older Landsat images taken from previous years to compensate clouds.
To  By statistical analysis, we found that there are 648,653 phenological patterns with number of pixels per pattern ranging from 1,714,258 to one.Naturally, the patterns with fewer pixels are caused by many reasons like data noise, cloud and cloud shadow contamination, misclassification and unidentifiable land cover types produced by mixing of different crop types.Therefore, we take into consideration in mapping only phenological patterns that have reasonably high number of pixels.
To map upland cropland, we set condition that during the year the is never classified to water or wetland.At least at one time the pixel is classified as bare land and at one time as close or open forest.The idea behind requirement that at one time the pixel has to be bare land denotes harvesting time what is the main characteristics for annual crops.The existence of bare land and vegetation (forest) in the phenological pattern simply indicates growing cycle of annual crop.Figure 5 shows map of upland cropland.Upland cropland is coloured in red and displayed over background made by Landsat 8 colour composite RGB: 654.The background is blurred to highlight upland cropland.Distribution of cropland as seen in Fig. 5 is spanned on west site of Hoang Lien Son range where topography and climate condition is favourable for crop cultivation.Cropland spreads over slope land of low mountains with elevation from 200m to 1000m and follow relief of terrain.The area inside the selected window (Fig. 5) where upland crop is the most intensive experienced in recent years many natural hazards as flush flood and landslides.Upland cropland mapping, therefore, is important for environmental protection and natural hazards mitigation.could be considered as different crop types.Each potential crop is named accordingly to growing cycle with harvesting and peak growing season.For example, crop with growing (vegetative) season from May to December while from January to May the field is under harvesting, propagation, or rooting is named as BBVVVV.By this naming system we can name all phenological patterns and count their number of pixels.Due to mixture of different crop types, intensive farming practice and small size of fields, many phenological patterns are not identifiable to any crop type.We select only patterns with highest number of pixels for crop type mapping.In this study, we select 14 phenological patterns (Table 2.) with highest number of pixels ranging from 1,612,698 (145,143 ha) to 37,479 (3,373 ha) for crop identification.In this table we present list of classes that are named according to its phenological patterns and colours used for Table 2. Fourteen crop types with highest number of pixels selected for crop type mapping BBVVVV is major crop in the study area, coloured in red in Fig. 6, is interpreted as maize.Maize cultivation, however like many other crops, does not have fixed date of propagation and consequently fixed date of harvesting.There are early and late maize and time tolerance of propagation could be up to four months.This practice and trend in intercropping of maize and cassava recently result complexness in using phenological patterns for crop identification.However, we expect that phenological patterns can provide us spatial information of major crops in study area.Our study area is remote, mountainous and undeveloped region with limited access.Moreover, very high spatial resolution satellite imagery is also not available in Google Earth so validation of our cropland and crop type mapping is limited.Nevertheless, our approach is very promising and the proposed algorithm needs to be continued and tuned in future.However, visual interpretation of our map overlaid on false colour composite of Landsat images reveals that cropland distribution is quite reliable and acceptable.Due to lack of crop type distribution map we could not assess quality of crop type mapping.However, spatial analysis of distribution of crop fields demonstrates logical accordance of crop type placements with terrain characteristics and cropping practices.Finally, the study pointed out that phenology characterized by six bimonthly land covers is good enough to identify cropland distribution and crop type classification.To achieve better results, apparently we need image data with higher temporal resolution.Application of multitemporal SAR data could be a way to overcome this obstacle.

Figure 1 .
Figure 1.Study area; a) Location of study site in Vietnam; b) Digital elevation model; c) False colour composite of Landsat 8 RGB:654 of Dec. 20, 2017.

Figure 2 .
Figure 2. Landsat scenes show complexity of cloud cover in the study area for months of January and February of year 2017.Colour composite is RGB:654 for Landsat 8 and RGB:543 for Landsat 7. Observation date is Jan. 13, 2015 for a); Jan. 29, 2015 for b); Feb. 19, 2017 for c) and Jan. 28, 2006 for d).
For automated classification of land cover, we developed a database of simplified spectral patterns representing eight land cover categories: water, wetland, close forest, open forest, brown vegetation, sparse vegetation, dry bare land and moist bare land.The classification is based on comparison of SSP computed for every pixel of image to SSPs in the database and label the pixel by class of identical SSP.

Figure 4 .
Figure 4.The six bimonthly cloud free land covers of year 2017 for periods: a) January-February, b) March-April, c) May-June, d) July-August, e) September-October, and f) November-December.Dark Green: forest; Yellow green: open forest; Blue: water; Orange: bare land.

Figure 3 .
Figure 3. Block diagram of algorithm for cropland crop type mapping with multitemporal Landsat image data.Procedure for automated classification of land cover is shown in green box in the right part.
The selected Landsat scenes were automatically classified by using SSP to generate land cover with 8 classes: water, wetland, close forest, open forest, brown vegetation, sparse vegetation, dry bare land and moist bare land.Cloud and cloud shadow mask provided in Quality assessment band (BQA band) of Landsat Collection 1 Level-1dataset were used to exclude bad pixels and replace them by good pixels from land cover of other date.Good and bad pixel are defined from cloud and cloud shadow contamination point of view.When combining good pixels, higher priority is given to those that are more closed to base observation date (2017) and to bare land.Reason that we assign bare land higher priority than vegetation is to catch annual growing cycle of crop.Six cloud free land covers are presented in Figure4.By visual inspection of those land covers we can detect comprehensively seasonal changes of land covers around the year.These changes are basics for definition of phenological patterns of all crop types.

Figure 5 .
Figure 5. Cropland map of the study area and window for detail visualization.Cropland is coloured in red and displayed over blurred background made by Landsat 8 colour composite RGB: 654 class visualization.Class BBVVVV has highest number of pixels of 1,612,698 which is equivalent to areas of 145,143 ha.The class BBVVVV can be explained that in months from January to May (dry season) fields are under either harvesting, land preparation or propagation.From May to December, crops grow and the fields become covered by green vegetation.When compared to agricultural calendar of the study site we could interpret this class as maize.The class BBBVVB stands for phenology that bare land is field status of the half of the year and in November and December.Vegetation dominates from July to October.This phenology could be assigned to upland rice.The class BVVVVB has phenology where vegetation dominates in months from March to November.Harvesting is either on the beginning or on the end of the year.This phenology is typical for sugarcane cultivation.

Figure 6
Figure 6 shows crop type map and enlarged window showing details of distribution of different crops.Colours used for presentation of classes are given in Table 2.In this figure, crop types with similar phenological patterns are visualized by similar colour.Similar colour formulates small and fragmented patches showing spatial spreading of crops.We can notice quite clear spatial patterns of distribution of crops, which might reflect cropping practice.Assessment of spatial distribution of crops reveals that crop that has similar phenological patterns are

Figure 6 .
Figure 6.Map of crop types and window for detail visualization.There are 14 crop types dyed by colours in Table2.Blurred background image is made by Landsat 8 colour composite RGB: 654

Table 2 .
Blurred background image is made by Landsat 8 colour composite RGB: 654 of new algorithm for cropland and crop type mapping using multitemporal Landsat images.Our proposed algorithm can do analysis with image data of all Landsat sensors including Landsat 4/5, Landsat 7 and Landsat 8. Image data collected during one year is grouped to six clusters representing bimonthly observation.Images within each cluster are automatically classified to numbers of land cover map that are subsequently combined together to exclude clouds and cloud shadows.The six bimonthly cloud free land covers are used further for development of phenological pattern database.During the development of database, statistical computation is applied to identify all phenological patterns.These phenological patterns are main input for cropland and crop type mapping.Due to inevitability of cloud cover affecting quality of image data of the selected year, we need to use Landsat image data from years back to compensate cloud and cloud shadow and that could influence reliability of the final results, especially for mountainous region with high probability of cloud around the year.By using our proposed algorithm, we success to map cropland and crop type distribution in the study area covered by Landsat scene with path/row number 128/45.The results we achieved are created by application of logical models with phenological conditions for cropland and crop types that dominantly spread in the study area.Different phenology based land cover types such as bare land or cover of distinct vegetative status is assumed to represent different crop types.As result, we identified 14 major crop types with area ranging from 145,143 ha to 3,373 ha.Because the study area is located in mountainous area without any map of crop distribution so we could not carry out accuracy assessment.
5. CONCLUSIONCropland and crop type mapping is one of major application of multitemporal optical remote sensing.The NDVI time series of coarse remote sensor like MODIS are utilized as main data source for cropland and crop type mapping.However, for better land use management, the use of higher spatial resolution remote sensing data such as Landsat is getting in demands.In this study, we present development