Method for extracting key phrases based on lexical chain
A key phrase extraction technique based on lexical chains, applied in special data processing applications, instruments, and electronic digital data processing. It addresses two problems: existing keyword extraction methods do not accurately reflect topic information, and coverage of accurate document topic information is low. Its effects are increased speed, reduced dimensionality, and avoidance of redundancy.
Examples
Specific Embodiment 1
[0029] Specific Embodiment 1: This embodiment is described with reference to Figure 1. It is a method for extracting key phrases based on lexical chains, implemented on a computer in which the "HowNet" dictionary is installed. The specific steps of the method are:
[0030] Step 1: Take the document to be processed as the extraction object, and obtain the meanings of the words in the document;
[0031] Step 2: Use the dictionary "HowNet" to disambiguate words, and filter out the abstract sememes in "HowNet";
[0032] Step 3: Construct a lexical chain for the disambiguated words, obtain a set L of lexical chains, and obtain multiple strong chains;
[0033] Step 4: Select a core word from each strong chain, and use these core words to form the core word set of the document;
[0034] Step 5: Calculate the co-occurrence rate between different core words in the core word set, and select the core words whose co-occurrence rate is greater than the extraction threshold ...
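Step 5 can be illustrated with a minimal sketch. This excerpt does not define the co-occurrence rate, so the sketch assumes a sentence-level measure: the number of sentences containing both words divided by the number containing either. The function names `cooccurrence_rate` and `select_pairs`, the sample sentences, and the threshold value are all hypothetical and are not taken from the patent.

```python
from itertools import combinations


def cooccurrence_rate(w1, w2, sentences):
    """Assumed measure: sentences with both words / sentences with either word."""
    both = sum(1 for s in sentences if w1 in s and w2 in s)
    either = sum(1 for s in sentences if w1 in s or w2 in s)
    return both / either if either else 0.0


def select_pairs(core_words, sentences, threshold):
    """Return core-word pairs whose co-occurrence rate exceeds the threshold."""
    selected = []
    for w1, w2 in combinations(sorted(core_words), 2):
        rate = cooccurrence_rate(w1, w2, sentences)
        if rate > threshold:
            selected.append((w1, w2, rate))
    return selected


# Hypothetical data: each sentence is a set of already-segmented words.
sentences = [
    {"lexical", "chain", "keyword"},
    {"lexical", "chain", "document"},
    {"keyword", "document"},
    {"lexical", "chain"},
]
core_words = {"lexical", "chain", "keyword"}
pairs = select_pairs(core_words, sentences, threshold=0.5)
print(pairs)
```

With this sample data only the pair ("chain", "lexical") passes the threshold, since the two words appear together in every sentence that contains either of them. A different definition of the co-occurrence rate would change which pairs are kept.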
Specific Embodiment 2
[0037] Specific Embodiment 2: This embodiment further describes Step 1 of the lexical-chain-based key phrase extraction method described in Specific Embodiment 1. The steps for obtaining word meanings in Step 1 are:
[0038] Step A: Perform word segmentation and stop word filtering on the document to obtain the word space WordSet of the document;
[0039] Step B: Sequentially scan the word space WordSet to obtain the meaning of each word in the word space WordSet one by one. The process of obtaining the meaning of each word is:
[0040] Step B1: Let the word sequence in the document be M1, M2, M, M3, M4, where M is the word whose meaning is currently to be determined and M1, M2, M3, M4 are the context of M, as shown in Figure 2. The vertices in Figure 2 represent the sense classes corresponding to each word, and the edges between vertices represent the degree of association between sense classes;
[0041] Step B2: From f...
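Steps A and B1 can be sketched as follows. This is an illustration only: a whitespace split stands in for a real Chinese word segmenter, and the stop word list, the helper names `build_word_set` and `context_windows`, and the sample sentence are hypothetical. The window of two words on each side mirrors the sequence M1, M2, M, M3, M4; shortening the window near the document boundaries is an assumption of this sketch.

```python
STOP_WORDS = {"the", "a", "of", "is", "in"}  # hypothetical stop word list


def build_word_set(text):
    """Step A: segment the text and drop stop words, keeping word order."""
    tokens = text.lower().split()  # stand-in for a real word segmenter
    return [t for t in tokens if t not in STOP_WORDS]


def context_windows(words, size=2):
    """Step B1: pair each word M with up to `size` neighbours on each side."""
    windows = []
    for i, word in enumerate(words):
        left = words[max(0, i - size):i]
        right = words[i + 1:i + 1 + size]
        windows.append((word, left + right))
    return windows


word_set = build_word_set("The bank of the river is in flood season today")
windows = context_windows(word_set)
for target, context in windows:
    print(target, context)
```

Each printed line gives a target word M followed by its context words, which is the input a disambiguation step would then score against candidate senses.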
Specific Embodiment 3
[0043] Specific Embodiment 3: This embodiment further describes Step 2 of the lexical-chain-based key phrase extraction method described in Specific Embodiment 1. The "HowNet" dictionary used in Step 2 is a word database stored on the computer's hard drive.