Method and device for extracting keyword based on graph model

A keyword extraction method and technology, applied in the field of keyword extraction, that addresses the problems of being unable to determine which words are important, extracting meaningless words, and failing to extract keywords accurately.

Active Publication Date: 2017-09-01
BEIJING QIYI CENTURY SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, existing graph-model-based keyword extraction methods rely only on the current text. If the current text is short, every word in it appears only once, so it is impossible to determine which word is important. Any word may therefore be extracted, and keywords cannot be extracted accurately. If the current text is long, some words that appear many times (such as "because of", "probably", etc.) effectively vote for themselves, which inflates their vote counts and makes them rank as important. Extracting such words is meaningless, so the accuracy of keyword extraction is low.
In short, when keywords are extracted from the current text alone, whether that text is long or short, some words are extracted as keywords merely because the semantics are scattered or because the words occur frequently, and the accuracy of keyword extraction suffers.

Embodiment Construction

[0076] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0077] Keyword extraction based on a graphical model is an effective way to extract keywords. "Graphical model" is a general term for a class of techniques that use graphs to represent probability distributions. A text can be mapped to a network graph in which words are the nodes and the relationships between words are the edges. Keyword extraction based on the graph model is an important piece of basic work, which plays a key rol...

Abstract

The embodiment of the invention provides a method and a device for extracting keywords based on a graph model. The method comprises: acquiring a to-be-processed text and segmenting it into words to obtain the candidate keywords of the text; looking up the word vectors of the candidate keywords in a word vector model that contains those vectors; constructing a word similarity matrix of the candidate keywords from the word vectors; acquiring a language database (corpus) corresponding to the to-be-processed text and calculating global information of the candidate keywords in that database to obtain a global weight for each candidate keyword, which is taken as the candidate keyword's initial weight, where the global information represents how important the candidate keyword is in the language database, and the language database includes at least a search log and network documents; and ranking the candidate keywords according to their initial weights and the word similarity matrix, then extracting the keywords of the to-be-processed text. The embodiment effectively improves the accuracy of keyword extraction.
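The ranking step in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the damping factor, the use of cosine similarity, the clamping of negative similarities to zero, and the reuse of the global weight as the restart distribution of a PageRank-style iteration are all assumptions made for illustration. The abstract only states that the global weight serves as the initial weight. The word vectors and global weights are assumed to be precomputed and passed in as plain dictionaries.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(words, vectors):
    """Word similarity matrix; the diagonal is zero so a word cannot vote for itself."""
    n = len(words)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Assumption: negative similarities are clamped to zero.
                sim[i][j] = max(0.0, cosine(vectors[words[i]], vectors[words[j]]))
    return sim


def rank_keywords(words, vectors, global_weight, damping=0.85, iterations=50):
    """Rank candidate keywords from global weights and pairwise similarity."""
    n = len(words)
    sim = similarity_matrix(words, vectors)
    total = sum(global_weight[w] for w in words)
    # Normalised global weights: used as the initial scores and, as an
    # assumption of this sketch, as the restart distribution as well.
    prior = [global_weight[w] / total for w in words]
    score = prior[:]
    out_sum = [sum(row) for row in sim]
    for _ in range(iterations):
        new_score = []
        for i in range(n):
            # Each word j passes its score to i in proportion to sim[j][i].
            vote = sum(
                sim[j][i] / out_sum[j] * score[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_score.append((1 - damping) * prior[i] + damping * vote)
        score = new_score
    return sorted(zip(words, score), key=lambda pair: pair[1], reverse=True)
```

Calling `rank_keywords(words, vectors, global_weight)` returns `(word, score)` pairs in descending order of score. Because the diagonal of the similarity matrix is zero, a frequent but uninformative word gains nothing from its own repetitions, and a low global weight further lowers its rank.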

Description

Technical field

[0001] The invention relates to the technical field of keyword extraction, in particular to a method and device for extracting keywords based on a graphical model.

Background

[0002] At present, there are various keyword extraction methods, such as semantic-based keyword extraction methods, webpage-based keyword extraction methods, etc. The graph-model-based method is simpler and more direct, requires no training, and gives good results, so it has been widely used.

[0003] Existing keyword extraction methods based on graphical models divide the text into several component units (words, sentences), build a graphical model, use a voting mechanism to rank the units in the text, and then select the top-ranked units as keywords. Specifically, the given text is first divided into complete sentences. Word segmentation and part-of-speech tagging are then performed on each sentence to obtain words and part-of-speech tags corresponding to ...
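The voting mechanism described for existing methods resembles the widely known TextRank scheme, and a minimal sketch of that scheme is given here for orientation. It is an assumption that the background refers to this scheme. The co-occurrence window, damping factor, and function names are hypothetical, and part-of-speech filtering is omitted. The input is assumed to be an already segmented list of tokens.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=2):
    """Undirected graph linking words that occur within `window` positions."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def textrank(tokens, window=2, damping=0.85, iterations=50):
    """Score words by iterated voting over the co-occurrence graph."""
    graph = cooccurrence_graph(tokens, window)
    score = {word: 1.0 for word in graph}
    for _ in range(iterations):
        # Each neighbour splits its current score evenly among its own neighbours.
        score = {
            word: (1 - damping)
            + damping * sum(score[n] / len(graph[n]) for n in graph[word])
            for word in graph
        }
    return sorted(score.items(), key=lambda pair: pair[1], reverse=True)
```

In this sketch a word's score depends only on the current text: a word that recurs next to many different neighbours collects votes from all of them, whether or not it carries meaning.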

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27, G06N3/08
CPC: G06F16/3344, G06F40/284, G06N3/084
Inventor: 王亮
Owner: BEIJING QIYI CENTURY SCI & TECH CO LTD