Method and device for extracting keywords from text

A keyword extraction technology, applied in the field of extracting keywords from text, that addresses the large network scale and low efficiency of existing methods. It reduces the network scale and the amount of calculation, thereby improving efficiency.

Active Publication Date: 2016-08-03
HUAWEI TECH CO LTD +1

AI Technical Summary

Problems solved by technology

[0004] The existing graph-based keyword extraction method treats each word in the text as a node, so the resulting network is large and keyword extraction requires a large amount of calculation, resulting in low efficiency.


Examples

Specific example

[0040] Words are assigned semantic class numbers using the numbering scheme shown in Table 1. A concrete example of how words are organized in the synonym dictionary is as follows:

[0041] Ba01 computer#computer#PC

[0042] Ba02 mobile phone#mobile phone

[0043] In the synonym dictionary, each entry starts with a 4-character semantic class number, followed by multiple synonyms separated by the symbol "#". In the example above, the entries indicate that the words corresponding to semantic class number Ba01 include computer, computer, and PC, and that the words corresponding to semantic class number Ba02 include mobile phone and mobile phone.
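
The entry format described in this example can be parsed in a few lines. The following is a minimal sketch, assuming each entry is a single line made of a 4-character class number followed by "#"-separated synonyms. The names `parse_entry` and `word_to_class` are hypothetical and do not come from the patent.

```python
def parse_entry(line):
    """Split one synonym-dictionary entry into (class_number, synonyms).

    Assumes the first 4 characters are the semantic class number and the
    remainder is a '#'-separated list of synonyms.
    """
    line = line.strip()
    code, rest = line[:4], line[4:]
    # Strip stray spaces around each synonym and drop empty fragments.
    words = [w.strip() for w in rest.split("#") if w.strip()]
    return code, words


# The two entries from the example, exactly as printed.
entries = ["Ba01computer#computer#PC", "Ba02 mobile phone #mobile phone"]

# Reverse index: word -> semantic class number.
word_to_class = {}
for entry in entries:
    code, words = parse_entry(entry)
    for word in words:
        word_to_class[word] = code
```

Words that appear twice in the translated example simply map to the same class number, so the reverse index ends up with three keys.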

[0044] After step 202, a synonym network is formed with the semantic class numbers as nodes. Each semantic class number serves as one node, so because synonyms share the same semantic class number, multiple words that are synonyms of one another correspond to the same node in the syn...
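
The effect of using semantic class numbers as nodes can be illustrated with a small sketch. It assumes a word-to-class mapping like the one implied by the dictionary example and a text that has already been segmented. The function name `to_nodes`, the sample token list, and the choice to skip words missing from the dictionary are all assumptions made for illustration, not details taken from the patent.

```python
def to_nodes(words, word_to_class):
    """Replace each segmented word by its semantic class number.

    Words not found in the dictionary are skipped (an assumption; the
    visible text does not say how such words are handled).
    """
    return [word_to_class[w] for w in words if w in word_to_class]


# Hypothetical mapping and segmented text for illustration.
word_to_class = {"computer": "Ba01", "PC": "Ba01", "mobile phone": "Ba02"}
segmented = ["computer", "PC", "mobile phone", "PC", "computer"]

labels = to_nodes(segmented, word_to_class)

word_nodes = set(segmented)   # one node per distinct word
class_nodes = set(labels)     # one node per distinct semantic class number

print(len(word_nodes), len(class_nodes))  # prints "3 2"
```

Here three distinct words collapse into two nodes, which is the kind of reduction in network scale the passage describes.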

Abstract

The invention provides a method and device for extracting keywords from text. The method includes: performing word segmentation on the text; labeling the words obtained by word segmentation with semantic class numbers; forming a synonym network with the semantic class numbers as nodes; and selecting nodes from the synonym network as keywords. The method and device provided by the invention can improve the efficiency of keyword extraction.

Description

Technical field

[0001] The invention relates to network technology, and in particular to a method and device for extracting keywords from text.

Background art

[0002] When displaying webpages to users, a website needs to extract keywords from text and determine the content displayed on the webpage according to those keywords.

[0003] At present, graph-based keyword extraction methods are used: the words in the text serve as nodes, and the relationships between words serve as edges that connect them into an unweighted network graph. Keywords are then found by mining special nodes in the network. For example, in one graph-based keyword extraction method, words are used as nodes and are connected into an unweighted network graph according to their co-occurrence within a certain window; it is proved that the network has small-world characteristics, and that the words and fundamental concepts that have an ...
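
The word-level graph described in the background can be sketched as follows. This is a minimal illustration, assuming the "window" means a maximum distance in token positions and that "special nodes" are approximated by ranking on node degree. Both choices, along with the name `cooccurrence_graph` and the sample tokens, are assumptions and not the method of any cited work.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=2):
    """Build an unweighted, undirected word graph.

    Nodes are tokens; two different tokens are linked if they occur at
    most `window` positions apart.
    """
    graph = defaultdict(set)
    for i, a in enumerate(tokens):
        for b in tokens[i + 1 : i + 1 + window]:
            if a != b:
                graph[a].add(b)
                graph[b].add(a)
    return graph


tokens = ["website", "extracts", "keywords", "from", "text"]
g = cooccurrence_graph(tokens, window=2)

# Rank nodes by degree as a simple stand-in for "special nodes".
by_degree = sorted(g, key=lambda node: len(g[node]), reverse=True)
```

Because every distinct word becomes its own node, the graph grows with the vocabulary of the text, which is the scale problem the invention sets out to reduce.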


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/27
Inventor: 刘建毅, 刘正阳, 谭银燕
Owner: HUAWEI TECH CO LTD