AN INFORMATION SERVICE MODEL FOR REMOTE SENSING EMERGENCY SERVICES

This paper presents a method for building a semantic access environment, which solves the problem of identifying the correct natural disaster emergency knowledge and returning it to the users who request it. The study data is a set of natural disaster knowledge texts. First, based on the remote sensing emergency knowledge database, we use a semantic network to extract the key words from the input document set. Then, using semantic analysis based on word segmentation and PLSA, we establish the semantic access environment to identify users' requirements and match them against the emergency knowledge in the database. Finally, a user preference model is established, which helps the system return the corresponding information to different users. The results indicate that semantic analysis can process natural disaster knowledge effectively, which realizes a diversified information service, enhances the precision of information retrieval and satisfies users' requirements.


INTRODUCTION
China is one of the countries with the most frequent and severe natural disasters in the world. Due to the enormous data size, valid data cannot be retrieved accurately, which hampers disaster analysis.
In recent years, many studies have focused on semantic analysis. This paper applies semantic analysis to the retrieval and recommendation of emergency data in order to enhance accuracy and efficiency. The main research contents include three parts: a remote sensing emergency semantic network, a semantic analysis model, and an information service model based on user preference.

Corresponding author: Dr. Shuhe Zhao, associate professor, E-mail: zhaosh@nju.edu.cn

DATA
The test data mainly includes emergency knowledge documents covering fields such as electric power, earthquakes and terrorist activities. The expert knowledge mainly consists of a disaster emergency database supplied by the Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences. The former is the basic data, and the latter is used to establish the semantic network and extract the key words from the emergency knowledge dataset.

METHOD
The research method is as follows. First, in order to segment the source documents, we combine the word segmentation system with the emergency knowledge database to extract the key words from those unprocessed documents.
Second, to achieve accurate matching of remote sensing emergency knowledge, this paper establishes a semantic analysis model. Considering documents, topics and words together, the model realizes accurate emergency services over multivariate information of time and space and provides precise information to the users.
Third, aiming to recommend information according to users' preferences, our study uses machine learning to establish an information service model based on user preference.
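The first step above, dictionary-assisted keyword extraction, can be sketched as follows. This is a minimal illustration only: the actual system uses ICTCLAS together with the full emergency knowledge database, whereas the tiny dictionary and the forward-maximum-matching scan below are simplified assumptions.

```python
# Sketch only: forward maximum matching against a small, hypothetical
# emergency-knowledge dictionary (the paper's system instead uses ICTCLAS
# plus a full domain database).
EMERGENCY_DICT = {"earthquake", "electric power", "terrorist activities",
                  "emergency", "flood"}

def extract_keywords(text, dictionary, max_len=3):
    """Greedy longest-match scan over whitespace tokens."""
    tokens = text.lower().split()
    keywords, i = [], 0
    while i < len(tokens):
        match = None
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in dictionary:
                match = phrase
                i += n
                break
        if match:
            keywords.append(match)
        else:
            i += 1
    return keywords

print(extract_keywords(
    "Earthquake response requires electric power restoration",
    EMERGENCY_DICT))  # -> ['earthquake', 'electric power']
```

Longest-match-first ordering ensures that multi-word domain terms such as "electric power" are kept intact rather than split into generic single words.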

Here we utilize Probabilistic Latent Semantic Analysis (PLSA), proposed by T. Hofmann, to establish the mapping relation between documents, topics and words.
(The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W7, 2017, ISPRS Geospatial Week 2017, 18-22 September 2017, Wuhan, China.)

Standardization of semantic weight:
Our research measures the frequency of the key words extracted in Section 3.1.2 in every document, so that we obtain an I × J "document-word" matrix N(d, w). This matrix reflects the main content of the documents to some degree, but many high-frequency words do not reflect the correct topics of the documents. Therefore, we standardize the "document-word" matrix in order to emphasize the key words that imply the topics of the articles.
This paper utilizes the "tf-idf" formula to standardize the matrix:

n(di, wj) = freq(di, wj) × log(numDocs / docFreq(wj))

where freq = the frequency of the key word in the document, docFreq = the number of documents which contain the key word, and numDocs = the total number of documents. As a result, we obtain the joint distribution over pairs (di, wj):

P(di, wj) = P(di) Σk P(zk|di) P(wj|zk)
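The tf-idf standardization of the "document-word" matrix can be sketched in a few lines. This is an illustrative implementation under the weighting freq × log(numDocs / docFreq), not the paper's code; the exact normalization used in the paper may differ.

```python
import math

def tfidf(counts):
    """Weight a "document-word" count matrix with tf-idf.

    counts: list of dicts, one per document, mapping word -> raw frequency.
    Returns the same structure with weights freq * log(numDocs / docFreq).
    """
    num_docs = len(counts)
    # docFreq: in how many documents each word appears.
    doc_freq = {}
    for doc in counts:
        for word in doc:
            doc_freq[word] = doc_freq.get(word, 0) + 1
    return [{w: f * math.log(num_docs / doc_freq[w]) for w, f in doc.items()}
            for doc in counts]

docs = [{"flood": 3, "the": 10}, {"earthquake": 2, "the": 8}]
weighted = tfidf(docs)
# "the" appears in every document, so its idf factor log(2/2) is 0 and its
# high raw frequency no longer dominates the matrix.
```

This is exactly the effect described above: high-frequency but topic-neutral words are suppressed, while distinctive key words keep positive weight.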

PLSA:
PLSA uses the Expectation Maximization (EM) algorithm, an iterative method for finding maximum likelihood estimates, to estimate P(zk|di) and P(wj|zk). The detailed procedure is as follows: a. E step: use the current parameter estimates to compute the posterior probability of the latent variable.

User preference model
The model reserves the machine-recognizable information and generates a database. Then a filter function selects valid information from the database to establish the user sample database for the information service model based on user preference.
In order to cluster the information to be pushed, this paper uses the "documents-topics" matrix from the semantic analysis to calculate the probability center vector of the different documents. The included angle between document vectors expresses the discrepancy between documents:

cos θ = (Va · Vb) / (|Va| |Vb|)

where Va and Vb are the topic-probability vectors of two documents.
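The included-angle comparison between document topic vectors is the standard cosine measure; a minimal sketch (not the paper's code) follows.

```python
import math

def angle_cosine(u, v):
    """Cosine of the included angle between two document topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Documents with similar topic distributions have cosine near 1,
# i.e. a small included angle and thus a small discrepancy.
d1 = [0.7, 0.2, 0.1]
d2 = [0.6, 0.3, 0.1]
d3 = [0.1, 0.1, 0.8]
print(round(angle_cosine(d1, d1), 3))  # identical vectors -> 1.0
```

For clustering, documents whose pairwise cosine exceeds a chosen threshold would be grouped together; the threshold itself is a tuning parameter not specified in the paper.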

Results

Conclusions
This paper selected the typical places prone to natural disasters as application demonstration.
First, we use ICTCLAS combined with the emergency knowledge database to segment the source documents so that we can extract the key words from those documents. Then, considering the levels of vocabulary, topic and text, the system uses PLSA to establish the semantic analysis model. Finally, we establish the information service model based on user preference, which can cluster and recommend the preferred information accurately and efficiently.
According to the results, the system is robust enough to meet the emergency requirements of typical multi-disaster areas. Meanwhile, the system can analyze the requirements of the users accurately and recommend the relevant information, realizing a rapid response to emergencies.
For example, Moro et al. utilized Wikipedia to analyze the ontological relations between semantic information. Speer et al. established the semantic network named "ConceptNet" in 2013, which mainly describes the hierarchical relations between different vocabularies. Harrington et al. measured the distance between vocabulary vectors in a semantic space, where the distance represents the similarity of different words.
P(zk|di, wj) = P(zk|di) P(wj|zk) / Σl P(zl|di) P(wj|zl)

b. M step: maximize the expected complete-data log-likelihood obtained in the E step to re-estimate the parameters. This paper uses formulas (6) and (7) to re-estimate the model:

P(wj|zk) = Σi n(di, wj) P(zk|di, wj) / Σm Σi n(di, wm) P(zk|di, wm)   (6)

P(zk|di) = Σj n(di, wj) P(zk|di, wj) / n(di),  where n(di) = Σj n(di, wj)   (7)

The parameters estimated in the M step are used in the next E step; the E and M steps iterate until the result converges. This paper uses the formula below to check the convergence condition:

E[L] = Σi Σj n(di, wj) Σk P(zk|di, wj) log[P(wj|zk) P(zk|di)]

Finally, we use the converged P(zk|di) and P(wj|zk) to construct two matrices: U = (P(zk|di))K,I and V = (P(wj|zk))J,K. U represents the probability distribution of the latent semantic relations (topics) in the documents, and V represents the probability distribution of the words over the latent semantic relations.
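The E and M steps above can be sketched compactly with NumPy. This is an illustrative implementation under simplified assumptions (random initialization and a fixed iteration count in place of an explicit convergence check on the expected log-likelihood), not the paper's code.

```python
import numpy as np

def plsa(N, K, iters=100, seed=0):
    """Minimal PLSA via EM on a document-word count matrix N (I x J).

    Returns U = P(z|d), shape (I, K), and V = P(w|z), shape (K, J).
    Sketch only: no smoothing, fixed iteration count.
    """
    rng = np.random.default_rng(seed)
    I, J = N.shape
    U = rng.random((I, K)); U /= U.sum(axis=1, keepdims=True)   # P(z|d)
    V = rng.random((K, J)); V /= V.sum(axis=1, keepdims=True)   # P(w|z)
    for _ in range(iters):
        # E step: P(z|d,w) proportional to P(z|d) P(w|z); shape (I, J, K).
        Pz = U[:, None, :] * V.T[None, :, :]
        Pz /= Pz.sum(axis=2, keepdims=True) + 1e-12
        # M step: re-estimate P(w|z) and P(z|d) from expected counts.
        Nz = N[:, :, None] * Pz                 # n(d,w) * P(z|d,w)
        V = Nz.sum(axis=0).T                    # shape (K, J)
        V /= V.sum(axis=1, keepdims=True) + 1e-12
        U = Nz.sum(axis=1)                      # shape (I, K)
        U /= U.sum(axis=1, keepdims=True) + 1e-12
    return U, V

# Toy corpus: documents 0-1 use words 0-1, documents 2-3 use words 2-3,
# so two latent topics should separate them cleanly.
N = np.array([[10, 8, 0, 0],
              [9, 10, 0, 0],
              [0, 0, 10, 9],
              [0, 0, 8, 10]], dtype=float)
U, V = plsa(N, K=2)
```

Each row of U is the topic distribution of one document, which is exactly the "documents-topics" matrix used later for the probability center vectors.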
P(Z|ci) = (P(z1|ci), P(z2|ci), …, P(zK|ci))   (10)

where t = unclassified texts, c = type of the texts, and P(Z|ci) = the probability center vector. To handle the fact that the information users prefer adjusts in real time, our research establishes the information service model on two aspects: selecting the training samples and building the preference model from those samples. First, our study uses a weighted time-decay function to filter the samples and select the most representative data from the recent period. Then, a BP neural network algorithm is trained on the filtered sample data to mine the latent information representing recent user preferences.
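The weighted time-decay filtering can be sketched as below. The exponential form and the half-life parameter are assumptions made for illustration; the paper does not give the exact decay function.

```python
import math

def time_decay_weight(sample_age_days, half_life_days=30.0):
    """Exponential time-decay weight for a training sample.

    Recent samples count more when training the preference model; the
    half-life of 30 days is an assumed value, not from the paper.
    """
    lam = math.log(2) / half_life_days
    return math.exp(-lam * sample_age_days)

# A sample from today keeps full weight; one a half-life old keeps half.
print(round(time_decay_weight(0), 3), round(time_decay_weight(30), 3))
```

Samples whose weight falls below a chosen cutoff would be dropped before the BP neural network is trained, so the model tracks recent preferences.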

Table 1. Parts of the "topics-words" matrix

The experiment texts we utilized are mainly provided by China. In order to evaluate the result of the PLSA, this paper introduces precision and recall as the evaluation indicators.
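Precision and recall for a retrieval result can be computed as below; this is a generic sketch with made-up document IDs, not the paper's evaluation code.

```python
def precision_recall(retrieved, relevant):
    """Precision = |retrieved AND relevant| / |retrieved|;
    recall    = |retrieved AND relevant| / |relevant|."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    return hits / len(retrieved), hits / len(relevant)

p, r = precision_recall(retrieved=["d1", "d2", "d3", "d4"],
                        relevant=["d1", "d2", "d5"])
# 2 of 4 retrieved documents are relevant -> precision 0.5;
# 2 of 3 relevant documents were retrieved -> recall ~ 0.667
```

Precision rewards returning only relevant documents, while recall rewards returning all of them; reporting both, as in Table 2, shows the trade-off.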

Table 2. Precision and recall ratio