Deep learning-based text keyword extraction method

A deep learning-based keyword extraction technology, applied in neural learning methods, special data processing applications, and related fields, that addresses the difficulty and time cost of manually reading and summarizing text.

Status: Inactive; Publication Date: 2016-11-09
杭州量知数据科技有限公司
4 Cites, 47 Cited by

AI Technical Summary

Problems solved by technology

Reading a text and summarizing its specific content by hand is difficult and time-consuming.



Examples


Embodiment Construction

[0028] The present invention is further described below in conjunction with the accompanying drawings and specific embodiments, so that those skilled in the art can better understand it.

[0029] A text keyword extraction method based on deep learning, including the following steps:

[0030] 1) Obtain a text keyword database containing a large number of "text-keyword" pairs. Each "text-keyword" pair consists of a text and its corresponding keywords, and a text can have multiple keywords. Specifically, a large amount of text can be crawled from the Internet and then labeled with keywords using traditional keyword extraction techniques, or, where there are specific requirements, labeled through crowdsourcing.

[0031] 2) Use word2vec to convert the "text-keyword" pairs into word vectors. To avoid problems such as vanishing gradients during training, this embodiment uses an LSTM network. The text and keywords are converted into d-dimensional word ...
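Step 1) mentions labeling crawled text with a traditional keyword extraction technique. The source does not say which technique is used, so the following is only a minimal sketch that assumes TF-IDF as the traditional method. The names `tokenize`, `tfidf_keywords`, and the toy `corpus` are hypothetical and chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of ASCII letters (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(texts, top_k=2):
    """Label each text with its top_k TF-IDF terms, returning (text, keywords) pairs.

    TF-IDF is an assumed stand-in for the unspecified "traditional" technique.
    """
    tokenized = [tokenize(t) for t in texts]
    n_docs = len(tokenized)

    # Document frequency: the number of texts in which each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    pairs = []
    for text, tokens in zip(texts, tokenized):
        term_freq = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        pairs.append((text, ranked[:top_k]))
    return pairs


# Toy corpus standing in for text crawled from the Internet.
corpus = [
    "the lstm network reads the text",
    "the search engine ranks the text",
    "the database engine stores records",
]
pairs = tfidf_keywords(corpus, top_k=2)
```

Each element of `pairs` is one "text-keyword" pair. Terms that occur in every text, such as "the", receive a score of zero and are therefore ranked last.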
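Step 2) feeds d-dimensional word vectors into an LSTM network, but the text is truncated before it gives the network details. The following sketch therefore shows only a generic, textbook LSTM step in pure Python, not the architecture of this embodiment. Random vectors stand in for trained word2vec embeddings, and all function names, sizes, and seeds are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_embeddings(vocab, d, seed=0):
    """Assumed stand-in for word2vec: one random d-dimensional vector per word."""
    rng = random.Random(seed)
    return {w: [rng.uniform(-0.1, 0.1) for _ in range(d)] for w in vocab}


def make_lstm_params(d, h, seed=1):
    """One weight matrix (h rows, d + h columns) and one bias vector per gate."""
    rng = random.Random(seed)
    return {
        gate: {
            "W": [[rng.uniform(-0.1, 0.1) for _ in range(d + h)] for _ in range(h)],
            "b": [0.0] * h,
        }
        for gate in ("input", "forget", "output", "candidate")
    }


def lstm_step(params, x, h_prev, c_prev):
    """Apply one LSTM time step and return the new hidden and cell states."""
    z = x + h_prev  # list concatenation: [x; h_prev]

    def affine(gate):
        p = params[gate]
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(p["W"], p["b"])]

    i = [sigmoid(a) for a in affine("input")]
    f = [sigmoid(a) for a in affine("forget")]
    o = [sigmoid(a) for a in affine("output")]
    g = [math.tanh(a) for a in affine("candidate")]
    # The additive cell update is what mitigates vanishing gradients.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def encode(tokens, embeddings, params, h_size):
    """Run the LSTM over a token sequence and return the final hidden state."""
    h, c = [0.0] * h_size, [0.0] * h_size
    for token in tokens:
        h, c = lstm_step(params, embeddings[token], h, c)
    return h


tokens = ["deep", "learning", "keyword", "extraction"]
embeddings = make_embeddings(tokens, d=4)
params = make_lstm_params(d=4, h=3)
encoding = encode(tokens, embeddings, params, h_size=3)
```

Here `encoding` is a fixed-size summary of the input text. In a real system the embeddings would come from a trained word2vec model and the weights would be learned rather than sampled at random.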


Abstract

The invention discloses a deep learning-based text keyword extraction method. A recurrent neural network model is trained first; the training data comprise a large number of texts and their keywords, and the training objective is to maximize the conditional probability of the keywords given the text. Each text and its keywords are converted into word vectors, which are fed into the recurrent neural network model, and the network parameters are updated by stochastic gradient descent. After training, a text whose keywords are to be extracted is converted into word vectors and fed into the trained model, which generates the keywords of that text. The method thus learns an end-to-end, data-driven model for keyword extraction. Compared with traditional methods based on statistics and linguistics, it is more adaptable: different training data yield different models, so keywords can be extracted according to the requirements of a specific field.
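The training objective described in the abstract can be written out as follows. This is a sketch under assumed notation, since the source gives no formula: D is the set of "text-keyword" pairs, x a text, y = (y_1, ..., y_T) its keyword sequence, θ the network parameters, and η the learning rate. The token-by-token factorization is the usual one for recurrent sequence generation and is assumed here rather than taken from the source.

```latex
\theta^{*} = \arg\max_{\theta} \sum_{(x, y) \in D} \log p(y \mid x; \theta),
\qquad
\log p(y \mid x; \theta) = \sum_{t=1}^{T} \log p(y_t \mid y_{<t}, x; \theta)

% One stochastic gradient step on a sampled pair (x, y):
\theta \leftarrow \theta + \eta \, \nabla_{\theta} \log p(y \mid x; \theta)
```

Ascending the log-likelihood in this way is equivalent to stochastic gradient descent on the negative log-likelihood.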

Description

Technical field

[0001] The present invention relates to text keyword extraction technology and deep learning technology, in particular to a text keyword extraction method based on deep learning.

Background

[0002] Text keywords are commonly used in search engines and database engines, and can be used to determine the similarity of two texts. Reading a text and summarizing its specific content by hand is difficult and time-consuming, and in today's era of information explosion and massive text volumes it is almost impossible. In most cases, therefore, keyword extraction is automated. Because the present invention relates to text keyword extraction, the relevant techniques are briefly reviewed below.

[0003] Keywords are the representative words of a piece of text. On the one hand, extracting keywords lets users quickly grasp the general theme of the text. On the other hand, keywords can also be used for si...


Application Information

IPC(8): G06F17/27, G06N3/08
CPC: G06N3/08, G06F40/279
Inventor: 凌立刚, 朱海鹏
Owner: 杭州量知数据科技有限公司