International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Volume XLII-3/W10
Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLII-3/W10, 581–584, 2020
https://doi.org/10.5194/isprs-archives-XLII-3-W10-581-2020
© Author(s) 2020. This work is distributed under
the Creative Commons Attribution 4.0 License.

07 Feb 2020

AN EFFICIENT CLUSTERING METHOD FOR DBSCAN GEOGRAPHIC SPATIO-TEMPORAL LARGE DATA WITH IMPROVED PARAMETER OPTIMIZATION

J. W. Li1,2, X. Q. Han1,2, J. W. Jiang1,2, Y. Hu1,2, and L. Liu1,2
  • 1Guangxi Key Laboratory of Spatial Information and Geomatics, Guilin University of Technology, Guilin 541004, China
  • 2Guilin University of Technology, Guilin 541004, China

Keywords: Data Mining, Clustering Analysis, DBSCAN Density Clustering

Abstract. Establishing an effective method for analysing large geographic spatio-temporal data, and quickly and accurately uncovering the value hidden in geographic information, has become a current research focus. Clustering analysis methods from the data mining field are well suited to mining the knowledge and information hidden in complex and massive spatio-temporal data, and density-based clustering is one of the most important of these methods. However, the traditional DBSCAN clustering algorithm has drawbacks in parameter selection that are difficult to overcome. In particular, its two key parameters, the neighborhood radius Eps and the density threshold MinPts, must be set manually. The guiding principles for parameter setting in the traditional DBSCAN algorithm do not reliably lead to suitable values, and with unsuitable values the algorithm cannot produce accurate clustering results. To address the misclassification and density-sparsity problems caused by unreasonable parameter selection in DBSCAN, this paper proposes an efficient DBSCAN-based density clustering method with improved parameter optimization. An evaluation index function (Optimal Distance) is computed by running k-clustering for successive values of k, and the optimal solution is selected. The resulting optimal k is used to cluster the samples. Appropriate values of Eps and MinPts are then determined through mathematical and physical analysis, and the final clustering results are obtained by running DBSCAN with these parameters. Experiments show that the method selects reasonable parameters for DBSCAN clustering, which supports the advantage of the proposed method.
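
The sensitivity of DBSCAN to Eps and MinPts that motivates this work can be illustrated with a minimal sketch. The code below is a textbook DBSCAN written in plain Python, not the parameter-optimization method proposed in the paper; the toy point set and the names `dbscan`, `region_query`, and `count_clusters` are assumptions introduced purely for illustration.

```python
from math import dist

NOISE = -1  # label used for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Textbook DBSCAN: returns one cluster label per point, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


def count_clusters(labels):
    """Number of distinct clusters, ignoring noise."""
    return len(set(labels) - {NOISE})


# Hypothetical toy data: two compact groups of four points and one outlier.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (5.0, 5.0), (5.0, 6.0), (6.0, 5.0), (6.0, 6.0),
    (20.0, 20.0),
]

# The same data with MinPts fixed at 3 and three hand-picked Eps values.
for eps in (0.5, 1.5, 6.0):
    labels = dbscan(points, eps, min_pts=3)
    print(eps, count_clusters(labels), labels)
# eps=0.5 -> 0 clusters (every point is noise)
# eps=1.5 -> 2 clusters plus one noise point
# eps=6.0 -> 1 cluster (the two groups merge) plus one noise point
```

On this toy set, too small an Eps marks everything as noise and too large an Eps merges the two groups, which is the kind of misclassification that a data-driven choice of Eps and MinPts is meant to avoid.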