A method and system for extracting feature words from a document set based on location information
The technology, a location-information-based extraction method applied to feature word extraction from document sets, addresses low extraction accuracy and the resulting need for manual correction. It improves extraction accuracy and reduces the cost of manual correction.
Embodiment Construction
[0042] The present invention provides a method and system for extracting feature words from a document set based on location information. To make the purpose, technical solution, and effect of the present invention clearer and more explicit, the invention is described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described here serve only to explain the present invention, not to limit it.
[0043] Term frequency (TF) is the frequency with which a given word appears in a document.
[0044]
[0045] Inverse document frequency (IDF) is a measure of the general importance of a word. The IDF of a specific word is obtained by dividing the total number of documents |D| by the number of documents containing the word, |{j : t_i ∈ d_j}|, and taking the logarithm of the quotient:
[0046] $$\mathrm{idf}_i = \log \frac{|D|}{|\{j : t_i \in d_j\}|}$$
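The IDF computation described in [0045] can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function name `inverse_document_frequency`, the toy documents, the use of the natural logarithm (the text does not state a base), and the handling of words that occur in no document are all assumptions.

```python
import math


def inverse_document_frequency(term, documents):
    """Return log(|D| / |{j : term in d_j}|) for a list of tokenized documents."""
    total_docs = len(documents)
    # Count the documents that contain the term at least once.
    docs_with_term = sum(1 for doc in documents if term in doc)
    if docs_with_term == 0:
        # The quotient is undefined here; raising is a choice made for this
        # sketch, not something the source text specifies.
        raise ValueError(f"term {term!r} does not occur in any document")
    # Natural logarithm is an assumption; the text only says "the logarithm".
    return math.log(total_docs / docs_with_term)


# Hypothetical toy corpus: each document is a list of words.
documents = [
    ["feature", "word", "extraction"],
    ["word", "location", "information"],
    ["document", "set", "word"],
    ["feature", "location"],
]

idf_word = inverse_document_frequency("word", documents)          # log(4 / 3)
idf_document = inverse_document_frequency("document", documents)  # log(4 / 1)
```

A word that appears in most documents ("word") receives a lower IDF than one confined to a single document ("document"), which is the sense in which IDF measures general importance.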
[0047] The TF-IDF weight is:
[0048] tf·idf...
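As a rough illustration of how the two quantities combine into a weight, the sketch below multiplies a term frequency by an inverse document frequency for every word of every document. Several details are assumptions rather than statements from the text: TF is taken as the word's count divided by the total number of words in the document, the logarithm is natural, and the function name `tf_idf_weights` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf_weights(documents):
    """Return one {word: tf * idf} dictionary per tokenized document."""
    total_docs = len(documents)

    # Document frequency: number of documents in which each word occurs.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    weights = []
    for doc in documents:
        counts = Counter(doc)
        total_terms = len(doc)
        weights.append({
            # Assumed TF: count normalized by document length.
            # IDF: log of (total documents / documents containing the word).
            term: (count / total_terms) * math.log(total_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus: each document is a list of words.
documents = [
    ["location", "feature", "word", "word"],
    ["word", "extraction"],
    ["location", "information"],
]

weights = tf_idf_weights(documents)
top_word_doc0 = max(weights[0], key=weights[0].get)
```

In this toy corpus "word" is the most frequent word of the first document, yet "feature" gets the highest weight there because it occurs in no other document. Under these assumptions, a word present in every document would receive a weight of zero.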