Method for extracting keywords from an article

A keyword extraction technology for articles, applied in the fields of unstructured text data retrieval, text database clustering/classification, and special data processing applications, with the effect of improving precision and recall.

Status: Inactive. Publication Date: 2014-02-05
广东利为网络科技有限公司
AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to overcome the deficiencies of the prior art by providing a method for extracting keywords from articles. It addresses the problem that prior-art keyword extraction algorithms consume substantial resources and extract keywords with low accuracy, thereby reducing system resource occupancy and improving the accuracy of keyword extraction.

Examples

Embodiment 1

[0015] Figure 1 is a flow chart of the method of the first embodiment of the present invention. As shown in Figure 1, the method includes:

[0016] S101. Preprocess the article to obtain a word set of the text, where the preprocessing includes removing stop words, performing part-of-speech filtering, and constructing a synonym chain;

[0017] It should be noted that the entity executing the method of the present invention may be a computer or a terminal; this embodiment does not limit it.

[0018] Stop words are words, mostly function words, that do not reflect the theme of a document. They also interfere with keyword extraction, so they need to be filtered out. Stop words usually include function words, some content words, and punctuation marks. For example, when scanning the text and performing word frequency statistics, some content words, function words or punctuation that have no substantial eff...
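
The preprocessing in step S101 can be illustrated with a minimal sketch. It assumes the article has already been split into tokens, and it covers only stop-word removal and synonym-chain construction; part-of-speech filtering is omitted because it needs a tagger. The names `STOP_WORDS`, `SYNONYMS`, `preprocess` and `build_synonym_chains` are hypothetical and do not come from the patent, and the word lists are toy data.

```python
from collections import Counter

# Hypothetical stop-word list and synonym table, for illustration only.
STOP_WORDS = {"the", "a", "of", "and", "is", "to", ",", "."}
SYNONYMS = {"automobile": "car", "auto": "car"}  # variant -> representative word


def preprocess(tokens):
    """Drop stop words, then replace each word by its chain representative."""
    kept = [t.lower() for t in tokens if t.lower() not in STOP_WORDS]
    return [SYNONYMS.get(t, t) for t in kept]


def build_synonym_chains(tokens):
    """Group the surviving surface forms under their representative word."""
    chains = {}
    for t in tokens:
        t = t.lower()
        if t in STOP_WORDS:
            continue
        chains.setdefault(SYNONYMS.get(t, t), set()).add(t)
    return chains


tokens = ["The", "car", "and", "the", "automobile", "is", "a", "vehicle", "."]
words = preprocess(tokens)
counts = Counter(words)           # word frequencies after preprocessing
chains = build_synonym_chains(tokens)
```

Counting frequencies on the representative word rather than on each surface form means that synonyms contribute to a single total, which is presumably the reason for building the chain before the statistics are computed.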

Abstract

The invention provides a method for extracting keywords from an article. The method comprises the following steps: preprocessing the article to obtain a word set of the text, wherein the preprocessing comprises removing stop words, performing part-of-speech filtering and constructing a synonym chain; selecting one representative word from the synonym chain and calculating, according to certain rules, the word frequency variable value, the regional position variable value and the word-segmentation (participle) distance sequence variable value of the word; and calculating a weighted value of the word from its word frequency value, regional position value and distance sequence value, then judging from the weighted value whether the word is taken as a keyword of the article. The method solves the problem that prior-art keyword extraction algorithms occupy many resources and have a low extraction accuracy: it lowers the occupancy of system resources and improves the accuracy of keyword extraction.
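
The weighting step can be sketched as follows. The abstract does not state the rules for the three variables, the weighting coefficients or the decision threshold, so everything below is an assumption made for illustration: frequency is a relative count, regional position is 1.0 when the word first occurs in a title region and 0.5 otherwise, and the distance variable is the normalised span between the first and last occurrence. The function names, the coefficients `(0.5, 0.3, 0.2)` and the threshold `0.3` are hypothetical.

```python
def word_features(words, word, title_len):
    """Return (frequency, position, spread) for `word` under assumed rules."""
    positions = [i for i, w in enumerate(words) if w == word]
    if not positions:
        return 0.0, 0.0, 0.0
    freq = len(positions) / len(words)
    # Assumed rule: words first seen in the title region count more.
    pos = 1.0 if positions[0] < title_len else 0.5
    # Assumed rule: span between first and last occurrence, scaled to [0, 1].
    spread = (positions[-1] - positions[0]) / max(len(words) - 1, 1)
    return freq, pos, spread


def weight(features, coeffs=(0.5, 0.3, 0.2)):
    """Weighted sum of the three variable values (coefficients are assumed)."""
    return sum(c * f for c, f in zip(coeffs, features))


def extract_keywords(words, title_len, threshold=0.3):
    """Keep the words whose weighted value reaches the assumed threshold."""
    scores = {w: weight(word_features(words, w, title_len)) for w in set(words)}
    chosen = [w for w, s in scores.items() if s >= threshold]
    return sorted(chosen, key=lambda w: -scores[w])


words = ["keyword", "extraction", "method", "keyword", "text", "keyword"]
keywords = extract_keywords(words, title_len=2)
```

In this toy input the first two words stand in for the title, so "keyword" and "extraction" pass the threshold while "method" and "text" do not. A real implementation would substitute the patent's own rules for each variable.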

Description

Technical field

[0001] The invention relates to the field of computer technology, and in particular to a method for extracting keywords from articles.

Background art

[0002] The Internet has accumulated a large amount of text information, and how to retrieve it efficiently has become a technical problem that urgently needs to be solved. Text information processing includes text classification, text clustering, text mining and approximate query processing. Keyword extraction is widely used in all of these areas and is an important foundational task. Research on automatic keyword indexing of English text started relatively early, and several related systems have been developed. A prominent one is the GenEx system, implemented by Turney on the basis of the C4.5 decision tree algorithm. The system uses a genetic algorithm to train a keyword extractor; the extractor then takes documents as input and outputs keywords after processing; Frank e...

Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/3335, G06F16/35
Inventor: 徐波
Owner: 广东利为网络科技有限公司