EVALUATING ERROR OF LIDAR DERIVED DEM INTERPOLATION FOR VEGETATION AREA

Light Detection and Ranging or LiDAR data is a data source for deriving digital terrain model while Digital Elevation Model or DEM is usable within Geographical Information System or GIS. The aim of this study is to evaluate the accuracy of LiDAR derived DEM generated based on different interpolation methods and slope classes. Initially, the study area is divided into three slope classes: (a) slope class one (0° – 5°), (b) slope class two (6° – 10°) and (c) slope class three (11° – 15°). Secondly, each slope class is tested using three distinctive interpolation methods: (a) Kriging, (b) Inverse Distance Weighting (IDW) and (c) Spline. Next, accuracy assessment is done based on field survey tachymetry data. The finding reveals that the overall Root Mean Square Error or RMSE for Kriging provided the lowest value of 0.727 m for both 0.5 m and 1 m spatial resolutions of oil palm area, followed by Spline with values of 0.734 m for 0.5 m spatial resolution and 0.747 m for spatial resolution of 1 m. Concurrently, IDW provided the highest RMSE value of 0.784 m for both spatial resolutions of 0.5 and 1 m. For rubber area, Spline provided the lowest RMSE value of 0.746 m for 0.5 m spatial resolution and 0.760 m for 1 m spatial resolution. The highest value of RMSE for rubber area is IDW with the value of 1.061 m for both spatial resolutions. Finally, Kriging gave the RMSE value of 0.790m for both spatial resolutions.


INTRODUCTION
Light detection and ranging (LiDAR) can be a source of data for generating precise and directly georeferenced spatial information about the shape and surface characteristic of the earth.LiDAR is an established method for collecting very dense and accurate elevation data across landscapes, shallow-water areas and project sites.It is a type of an active remote sensing technique which is similar to radar but uses laser light pulses instead of radio waves for capturing 3D point clouds of the earth surface (Habib et. al., 2005).
In recent years, LiDAR has become the main data source for producing high resolution digital elevation model (DEM) or digital terrain model (DTM) (Raber et. al., 2007).Typically, a spatial resolution of 1 metre or higher can be obtained from various sources for example high density airborne LiDAR, high resolution aerial photogrammetry and high resolution satellite stereo images.On the other hand, LiDAR derived elevation has absolute accuracy of about 6 to 12 inches (15 to 30 centimetres) for older data and 4 to 8 inches (10 to 20 centimetres) for more recent data.Concurrently, relative accuracy (e.g., heights of roofs, hills, banks, and dunes) is even better.
Nowadays, airborne LiDAR is a powerful tool to survey high resolution and high accuracy DEM for large areas (Wehr and Lohr, 1999).The output of an airborne LiDAR survey is a massive point clouds that needs to be interpolated in order to provide a continuous surface for final users (Kraus and Pfeifer, 2001).The choice of interpolator and the cell size play an important role for determining the quality of LiDAR-derived DEM (Bater and Coops, 2009).Figure 1.1 shows the result of point clouds that have been collected.

Digital Elevation Model
DEM is defined as a representation of bare earth surface, void of vegetation and urban features (Wechsler, 2011).Concurrently, it is a computer representation of the earth's surface (Wechsler, 2003).Nowadays, the demand for a more accurate DEM with higher spatial resolution is increasing due to the variety of Geographic Information System (GIS) related applications such as forest management, urban planning and road design (Lim et al., 2003).For this, the use of LiDAR data in generating DEM has become a trend where the LiDAR generated DEM can be used for variety of applications (Liu, 2008).

Slope
A slope is a change of elevation due to the change in horizontal position (Jones, 1998).Slope is regularly used to describe the steepness of the ground's surface where it can be measured as the rise (the increase in elevation in some unit of measure) over the run (the horizontal distance measured in the same unit as the rise) (Stump, 2001).GIS can analyze digital elevation data such as elevation points, contour lines and DEM and subsequently, derive both slope and aspect data sets from these elevation data.
Later on, the slope and aspect data are used to describe landforms, to model surface runoff and to classify soils.

Interpolation
Interpolation according to Borrough et. al. (2001) is the procedure of predicting the value of attributes at unsampled site from measurements made at point locations within the same area or region.An interpolation made within a spatial context is known as spatial interpolation where it is the process of using the point with known values to estimate values at other points (Chang and Kang-tsung, 2006).
In general, there are two types of interpolation which are geostatistical and deterministic interpolations.Geostatistical interpolation is a method of surface creation by predicting the values between known values using the statistical approaches (Srivastava, 2006).On the other hand, deterministic interpolation creates surface based on measured points but does not take into account any spatial process occurs within (Anderson et al., 2005a).Several types of deterministic interpolations are Kriging, Inverse Distance Weighting (IDW), Spline and Natural Neighbours.
Kriging was originally developed to estimate the spatial concentrations of minerals for the mining industry and now is widely used in geography and spatial data analysis (Lee, 2004: Tang, 2005).Kriging assumes the distance or direction between the sample points that reflect a spatial correlation that can be used to explain variation in the surface.Kriging is essentially a weighted average technique but its weights depend not only on the distances between sample points and estimation locations but also on the mutual distances among sample points (Cressie, 1993;Desmet, 1997;Anderson et al., 2005a).
IDW assumes the closer a sample point is to the prediction location, the more it will influence the prediction value.
According to Myers (1994), the assigned weight of IDW depends on the distance between the data location whereas the particular location estimation on the relative location between the sample data is not considered.IDW works well for dense and evenly distributed sample points (Child, 2004).Like Kriging, IDW uses a weighted average where the outside range of maximum and minimum sample point value is not estimated.Therefore, some important topographical features such as ridges and valley cannot be generated unless they have been sampled (Lee. 2004) Finally, Spline interpolation uses a mathematical function that minimizes overall surface curvature.According to Johnston et al. (2001), Spline is an interpolator that fit a function to sampled points.Furthermore, this interpolator is able to estimate values that are below the minimum or above the maximum values within the sample data.This makes Spline method effective for predicting ridges and valleys where usually sample date is not included (Child, 2004).

PROBLEM BACKGROUND
As reported by Su et al. 2006), IDW creates DEM with less overall error compared to other methods with an average error value of 0.116 m.On the other hand, Kriging and Spline methods provided the value of 0.133 m and 0.140 m each.He concluded by mentioning the accuracy of DEM varies with the changes in terrain and land cover type (Adams and Chandler, 2002;Hodgson and Bresnahan, 2004;Hodgson et al., 2005;Su and Bork, 2006).Similarly, research by Su and Bork (2006) stated that IDW is the most accurate interpolator with RMSE of 0.02 m lower than Spline and Kriging.
In  et al., 2009).Ideally again, IDW was seen as the best interpolation method.
However, research by Clark et al. (2004) provided a different result where Kriging gave an RMSE of 2.29 m compared to IDW which was 2.47 m.It is important to note that the research was carried out within a tropical rainforest surrounding where a mixture of old-growth terra firm, swamp, secondary and selectively logged forests, as well as agro-forestry plantations, developed areas with buildings and mowed grass, and abandoned pastures with large grasses, shrubs and remnant trees was the case.
There are many opinions regarding on which interpolation method produces the highest accuracy LiDAR DEM.It can be seen that IDW and Kriging are the main competitors when it comes to producing DEM when comparing RMSEs.To date, insufficient number of studies observed this LiDAR DEM interpolation accuracy issues for vegetated area.
According to Rasib et al. (2013), the combination of Kriging and Adaptive TIN (ATIN) algorithm for filtering method provides higher accuracy of DEM with RMSE values of 0.21 m for oil palm, 0.25 m for mixed forest and 0.32 m for mangrove.However, Razak et al. (2013) mentioned that IDW is still preferred as an interpolation technique due to its faster computational duration without adding artefacts to the DTM.
In addition to interpolation technique, slope can also affect the accuracy of DEM.Study by Salleh (2014)  It can be seen that limited number of study which took into account the interpolation accuracy issue for LiDAR DEM within vegetated area are available.Furthermore, issues such as the influence of interpolation method and slope level against LiDAR DEM accuracy are taken for granted within vegetated area.Therefore, this study is carried out to address this accuracy issue and to test the most suitable interpolation method especially for rubber estate and oil palm area.

Aim and Objectives
The aim of this study is to identify the accuracy of LiDAR DEM for vegetated area using different interpolation methods and slope classification.In order to achieve the aim of the study, three objectives have been developed: 1. To review selected interpolation methods 2. To generate DEM according to different interpolation methods and at different slope classes 3. To evaluate the accuracy of DEM using specific criteria Field survey data is observed using tachymetry technique using total station and optical-levelling.In total, there are 132 points and 126 points of field survey data collected in rubber and oil palm areas, respectively.These points are collected based on the local Pahang State Cassini coordinate system, later on transformed to the old Malaysia Rectified Skew Orthomorphic (MRSO) using a general conversion method in GIS software.

Research Methodology
The methodology comprises six phases.The initial phase covers the process of literature review where the exact issue of LiDAR derived DEM was identified.Next was data acquisition.In this phase, LiDAR data that covers two vegetation areas were acquired.On top of that, aerial photo and tachymetry data were acquired.The third phase of data processing generated LiDAR derived DEM and slope map.Later on, phase four involved qualitative and quantitative evaluation through validation and accuracy assessment.Phase five discusses the results and finally, phase six concludes the study by providing some recommendations.The whole methodology is shown in Figure 4.

Ground Filtering of LiDAR data
The purpose of ground filtering is to separate ground and nonground data.The maximum building size defines the area that will have at least one hit on the ground.On the other hand, the maximum terrain angle is the maximum of steepest allowed slope.The maximum iteration angle is the maximum angle between points while the iteration distance makes sure that the iteration does not produce big jump upward when triangles are large.Finally the reduce iteration parameter helps to avoid adding unnecessary point density to the group model.The optimum parameters are selected by examining topographic changes in the study area and comparing unfiltered and filtered results as an output iteratively.For the rubber area, the raw LiDAR points are 32447 and become 2670 after filtered.Meanwhile, for the oil palm area, the raw LiDAR points are 30498 and became 2867 after filtered.Table 2 summarizes the results of raw and filtered data for both oil palm and rubber areas.

Area
Raw points Filtered points Rubber 32447 2670 Oil palm 30498 2867 Table 2. Summarization of raw and filtered points for oil palm and rubber areas

DEM Generation Using Different Interpolation Methods:
Different interpolation methods result in different accuracy of the resultant DEM.Two types of DEM are produced from this study based on different data sources.The first DEM is from filtered LiDAR data while the second one is from tachymetry data.The DEM generated from tachymetry data is used as a reference and to produce slope maps.In this study, several interpolation techniques based on geostatistical approaches are selected due to their familiarity with DEM.The interpolation techniques are Kriging, IDW and Spline.The DEM generated from the filtered LiDAR data is interpolated using 0.5 m and 1 m of spatial resolution.Basically, the determination of spatial resolution is based on density and space between points (Meng et al., 2010).This should be done according to the density of ground points after the filtering and the need of the target application.For example 1 m spatial resolution can be generated with 2 point per m 2 ground points.Certainly for 0.5 m DTM, a higher density of ground points after filteration is needed.
Interpolation is carried out using Geostatistical Analyst Tools in ArcGIS 10.2 software.Initially, the conversion of the LiDAR data from ASCII format to feature class format are carried out that precedes the interpolation step.

Slope Classification and Slope Map:
The quality of slope also contributes to the accuracy of the produced DEM.Here, the slope was generated from the DEM of field survey data.Slope tools in ArcGIS were used to produce the slope.Then the slopes are classified manually by using select by attribute of the raster value produced by slope.

Accuracy Assessment of LiDAR Derived DEM:
Corresponding bare-earth elevations are extracted from each LiDAR derived DEM at each point to assess the effect of slope category and filtering method on DEM error.Several methods are proposed for the assessment of the quality of the DEM.Mean signed error (MSE) and root mean square error (RMSE) are two commonly accepted statistical measurements used to assess DEM accuracy.Several studies used RMSE values based on high-grade in situ surveyed elevations to determine the accuracy of DEM across varying land cover and topography (Hodgson and Bresnahan, 2004;Bater and Coops, 2009;Spaete et al., 2011).Hodgson and Bresnahan (2004) and Su and Bork (2006) embedded MSE in to identify the tendency for under or over estimation of elevations relative to specific treatment classes.
In this study, RMSE is chosen for estimating the errors using the Formula 1:

DEM from Tachymetry Data:
There are two sets of tachymetry data; oil palm and rubber.Both tachymetry data are interpolated using Kriging technique for slope map creation.
Figure 6 shows the results of DEM created from both tachymetry data.
(a) (b) Figure 6.DEM generated from tachymetry data for: (a) rubber area, (b) oil palm area Kriging technique is solely used for generating the DEM from tachymetry data due to its performance.This technique performs better when compared to the other interpolation techniques in most contexts (Arun, 2013).In addition, Kriging produced better estimations of elevation especially when sampling points become sparse (Lloyd and Atkinson, 2006).

LiDAR Derived DEM for the Rubber Area:
Two spatial resolutions of 0.5 m and 1.0 m are tested using three interpolation techniques.Figure 7 shows the LiDAR derived DEM using three interpolation methods with spatial resolution of 0.5 m and 1 m for the rubber area.

Slope Map of Tachymetry Data
Slope is also one of the factors that need to be considered.Slope map was created from the DEM of tachymetry.Figure 9 show the slope map of both oil palm and rubber area.Based on Figure 8, Spline interpolator provides DEM with more corrugated surface, coarser tones and coarser texture compared to the DEM from IDW and Kriging interpolators.For Kriging and IDW techniques, there is no difference in the DEMs produced for both spatial resolutions.However, for Spline, it can be seen that there is a slight difference for both

Accuracy Assessments
The errors are computed based on interpolation methods and slope classes.The results are shown in subsequent tables and graphs.In this section, there are overall RMSE and RMSE based on slope classes.spatial resolutions.The discussion of this section is for two distinctive spatial resolutions, 0.5 m and 1.0 m, both for oil palm and rubber areas.For both spatial resolutions, three interpolations are tested.The calculation of RMSE based on slope class was done manually by classifying the raster value according to the slope range as mentioned in Table 3.

Rubber Area:
The result of RMSE based on three interpolation methods and slope classes for both spatial resolutions of 0.5 m and 1 m is shown in  7. RMSE according to slope classes of 0.5 m and 1m spatial resolutions for oil palm area From previous tables and figures within this section, it can be seen that different spatial interpolation methods have distinctive effects to the LiDAR derived DEM output.In general, Kriging provides the highest accuracy when compared against IDW and Spline for both study areas.This is due to the ability of Kriging where it examines specific sample points to obtain a value for spatial autocorrelation that (a) (b) Figure 13.Graphs of RMSE versus slope class for spatial resolution of 0.5m and 1.0 m for oil pam area is only used for estimating surrounding points rather than assigning a universal distance power value.Furthermore, Kriging allows for interpolated cells to exceed the boundaries of the sample range (David et al. ,2005).
In relation to slope class, it can be seen that slope class 11° -15° give the lowest accuracy for both study areas and spatial resolutions.Accuracy assessment using spatial resolution indicates that, the lower the spatial resolution is, the higher accuracy can be achieved.This is similar with Spaete et al., (2010) where he indicates that RMSE for slope greater than 10 degree provides greater values compared to slope class less than ten degree.

CONCLUSION
Both, interpolation method and slope class have effects towards the accuracy of DEM derived from LiDAR.From this study, IDW method give the lowest accuracy to the DEM while Kriging method can provide high accuracy of DEM produced in both areas.However, spline interpolator give the lowest value of RMSE for rubber area.For class slope, class slope 1 which are in the range of 1 -5 degree give lowest accuracy for both area using different methods.

Figure 1 .
Figure 1.Example of point clouds captured by LiDAR.The yellow points show the treetop while the purple points indicate ground points.
Figure 2 (a) and 3 (b) show the study area for this study while Figure 2 (b) and 3 (b) show the raw LiDAR data.The study area located at Simpang Pelangai, in district of Bentong, Pahang.The areas covered in this study are oil palm (150 m x 100 m) and rubber (200 m x 100 m).The LiDAR data were collected on January 2009 using a REIGL laser scanner mounted on a British Nomad aircraft.The data were delivered in the classified LAS format of threedimensional point cloud.The average LiDAR data sampling density across the area is about 2.2 points per m 2 .
elevation derived from the LiDAR points Z REF = elevation derived from tachymetry data n = total number of points Geostatistical Analyst Tool was again used to compare the results of LiDAR derived DEM against the tachymetry data derived DEM.The prediction of elevation from LiDAR derived DEM is automatically generated in an attribute table with the resultant error.Figure5shows the error generated from LiDAR derived DEM.The calculation of RMSEs was later on done using Microsoft Excel software.

Figure 5 .
Figure 5. Error generated from LiDAR derived DEM.Elevation values are from the reference data while predicted values are elevation from LiDAR derived DEM

Based on Figure 7 ,
Kriging produced the smoothest surface compared to IDW and Spline techniques.Spline technique generates corrugated surface compared to IDW for both 0.5 m and 1 m of spatial resolutions.It is also obvious that the highest elevation is from Spline interpolator at 85.5262 m.4.1.3LiDAR Derived DEM for the Oil Palm Area:Similarly, two spatial resolutions of 0.5 m and 1 m are used for generating DEM of the oil palm area On top of that, three interpolation techniques are again tested which are Kriging, IDW and Spline.Figure8depicts the results of LiDAR derived DEM for 0.5 m and 1.0 m spatial resolutions for oil palm area.

Figure 7 .
Figure 7. DEM generated for spatial resolutions of 0.5 m and 1 m for rubber area using Kriging, IDW and Spline interpolator their research, Liu et al. (2009)obtained RMSEs of 0.165 m by using Kriging, 0.174 m for IDW and 0.150 m using Local Polynomial (LP) on a flat terrain.However, on a complex terrain, the results were significantly different where IDW's RMSE was 0.294 m, Kriging's RMSE was 0.358 m and 0.25 m RMSE for Local Polynomial (LP) for site one which is in flat terrain.The fact that LP provided better results for the study was not significant it is a moderate quick interpolator compared to the IDW (Liu

Table 1 .
Value for Each Parameter in TerraScan The entire non-ground points that include vegetation and building are removed.Filtering of ground point is performed using Adaptive TIN (ATIN) approach that is embedded within TerraScan software.Suitable values of parameters are necessary in order to obtain a better filter.The Table6 and Figure 12.Table7and Figure13show the result of RMSE based on three interpolation methods and slope classes for both, 0.5 m and 1.0 m spatial resolutions.
(b) Figure 12.Graphs of RMSE versus slope class for spatial resolution of 0.5m and 1.0 m for rubber area 4.3.2.2 Oil Palm Area: