The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XL-4/W3, 63–69, 2013
https://doi.org/10.5194/isprsarchives-XL-4-W3-63-2013

  13 Nov 2013

Key Spatial Relations-based Focused Crawling (KSRs-FC) for Borderlands Situation Analysis

D. Y. Hou1,2, H. Wu2, J. Chen2, and R. Li2
  • 1School of Environment Science and Spatial Informatics, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
  • 2National Geomatics Center of China, 28 Lianhuachi West Road, Beijing 100830, China

Keywords: Focused Crawling, Place Names, Web Information Collection, Borderlands Situation Analysis, Relevance Calculation, Spatial Relations

Abstract. Place names play an important role in borderlands situation topics, yet current focused crawling methods treat them in the same way as ordinary keywords, which may lead to the omission of many useful web pages. In this paper, place names in web pages and their spatial relations were first discussed. Then, a focused crawling method named KSRs-FC was proposed for collecting situation information about borderlands. In this method, place names and common keywords were represented separately, and the spatial relations relevant to web page crawling were used in calculating the relevance between the given topic and web pages. Furthermore, an information collection system for borderlands situation analysis was developed based on KSRs-FC. Finally, the F-Score was adopted to quantitatively evaluate the method against a traditional method. Experimental results showed that the F-Score of the proposed method was 11% higher than that of the traditional method on the same sample data, indicating that KSRs-FC can effectively reduce the misjudgement of relevant web pages.
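
For readers unfamiliar with the evaluation measure, the following is a minimal sketch of how an F-Score is commonly computed from page classification counts. It assumes the standard F-beta definition (the harmonic mean of precision and recall when beta is 1); the abstract does not state which variant was used. The function name `f_score` and the example counts are hypothetical and are not taken from the paper.

```python
def f_score(true_positives, false_positives, false_negatives, beta=1.0):
    """Standard F-beta score from classification counts (assumed definition).

    true_positives:  relevant pages the crawler judged relevant
    false_positives: irrelevant pages the crawler judged relevant
    false_negatives: relevant pages the crawler judged irrelevant
    """
    if true_positives == 0:
        # Precision and recall are both zero (or undefined); report 0.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only, not the paper's data.
baseline = f_score(true_positives=6, false_positives=2, false_negatives=6)
improved = f_score(true_positives=8, false_positives=2, false_negatives=2)
print(f"baseline F1 = {baseline:.2f}, improved F1 = {improved:.2f}")
```

Under this definition, fewer relevant pages misjudged as irrelevant means higher recall, which raises the score as long as precision does not fall by more.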