A keyword extraction method and device

A keyword extraction method and device, applied in the computer field, that addresses two problems of existing approaches: they do not consider the structural relationship of words within the text, and they do not reflect the influence of adjacent words well.

Active Publication Date: 2019-01-22
POTEVIO INFORMATION TECH

AI Technical Summary

Problems solved by technology

However, the classic TextRank algorithm, which does not depend on any training corpus and builds a graph model for keyword extraction, does not consider the structural relationship between words within the text, so it cannot reflect the influence of adjacent words well.
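For orientation, the classic baseline referred to above can be sketched as follows. This is a minimal illustration, not the patent's code: the function names `build_cooccurrence_graph` and `classic_textrank`, the window size, and the damping factor 0.85 are assumptions chosen for the example. It shows the two properties the text criticizes: edges are treated as unweighted, and every word starts from the same initial score.

```python
from collections import defaultdict


def build_cooccurrence_graph(words, window=2):
    """Link every pair of distinct words that occur within `window` positions.

    Returns {word: {neighbor: co-occurrence count}} (undirected, symmetric).
    """
    graph = defaultdict(lambda: defaultdict(int))
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + window + 1, len(words))):
            v = words[j]
            if v != w:
                graph[w][v] += 1
                graph[v][w] += 1
    return {w: dict(nbrs) for w, nbrs in graph.items()}


def classic_textrank(graph, d=0.85, iters=50):
    """Unweighted TextRank: uniform initial score, edge counts ignored."""
    scores = {w: 1.0 for w in graph}  # every word starts equal
    for _ in range(iters):
        new_scores = {}
        for w in graph:
            # Each neighbor v shares its score equally among its own neighbors.
            rank = sum(scores[v] / len(graph[v]) for v in graph[w])
            new_scores[w] = (1 - d) + d * rank
        scores = new_scores
    return scores
```

Calling `classic_textrank(build_cooccurrence_graph(tokens))` on a list of pre-processed tokens returns one score per distinct word; the co-occurrence counts stored in the graph are not used by this baseline.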

Embodiment Construction

[0097] In order to make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly below in conjunction with the drawings in the embodiments of the present invention. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

[0098] The embodiment of the present invention provides a keyword extraction method. FIG. 1 is a schematic flow diagram of the keyword extraction method in the embodiment of the present invention. As shown in FIG. 1, the method comprises:

[0099] Step S101: obtain webpage text information and pre-process the webpage text information to obtain the sequence of candidate k...

Abstract

The embodiment of the invention provides a keyword extraction method and device. The method comprises the following steps: acquiring web page text information and pre-processing it to obtain a sequence of candidate keywords; constructing a candidate keyword graph from the sequence of candidate keywords, computing on that graph a similarity value between each candidate keyword and the other candidate keywords in the sequence, and using the similarity value as the initial weight value of each candidate keyword; and, according to the initial weight value of each candidate keyword, calculating the convergence weight value corresponding to each candidate keyword, sorting the candidate keywords by convergence weight value, and extracting the target keywords of the web page text information from the candidate keywords according to that ordering. The embodiment of the invention improves the initial weight algorithm of the TextRank algorithm and realizes more efficient extraction of keywords from web page text information.
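The weighting and ranking steps of the abstract can be sketched roughly as below. The excerpt does not say how the similarity value is computed, so `initial_weights` is a hypothetical stand-in that uses each word's share of the total edge weight; `weighted_textrank`, `top_keywords`, the damping factor, and the tolerance are likewise assumptions for illustration. The graph is taken to be a symmetric dictionary of the form `{word: {neighbor: weight}}`.

```python
def initial_weights(graph):
    """Hypothetical initial weights: each word's share of total edge weight.

    Stands in for the similarity value, whose formula is not given in the
    excerpt. Scaled so that the weights average to 1.
    """
    strength = {w: sum(nbrs.values()) for w, nbrs in graph.items()}
    total = sum(strength.values()) or 1.0
    n = len(graph)
    return {w: n * s / total for w, s in strength.items()}


def weighted_textrank(graph, init, d=0.85, tol=1e-6, max_iter=200):
    """Iterate the weighted TextRank update from `init` until scores settle."""
    if not graph:
        return {}
    scores = dict(init)
    out_sum = {w: sum(nbrs.values()) for w, nbrs in graph.items()}
    for _ in range(max_iter):
        new_scores = {}
        for w, nbrs in graph.items():
            # Neighbor v passes on its score in proportion to the edge weight.
            rank = sum(wt / out_sum[v] * scores[v] for v, wt in nbrs.items())
            new_scores[w] = (1 - d) + d * rank
        delta = max(abs(new_scores[w] - scores[w]) for w in graph)
        scores = new_scores
        if delta < tol:  # converged
            break
    return scores


def top_keywords(scores, k):
    """Sort candidates by converged weight, largest first, and keep k."""
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

One caveat on this sketch: for a damped update of this form with d < 1, the converged scores do not depend on the starting vector, so here the initial weights mainly influence how many iterations are needed. How the patented method uses the initial weights beyond this is not visible in the excerpt.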

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a keyword extraction method and device.

Background

[0002] The purpose of text keyword extraction is to condense the subject of a text so that its core content can be obtained quickly. Keyword extraction plays an important role in automatic summarization of news and academic papers, social tagging, text topic extraction and other fields.

[0003] Keyword extraction can be divided into supervised and unsupervised approaches, depending on whether the corpus is labeled. A typical supervised approach treats keyword extraction as a binary classification problem: for each word in a text, a binary judgment is made as to whether it is a keyword or a non-keyword. This method requires manually marking keywords in advance on the document set corpus, carrying out classification model ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F16/953
CPC: G06F40/211, G06F40/284, Y02D10/00
Inventor: 张春荣
Owner: POTEVIO INFORMATION TECH