Keyword extraction method integrating theme information and bidirectional LSTM

A keyword extraction technique that uses topic information, applied in the field of text processing. It addresses the failure to fully capture contextual semantic information and achieves a good recall rate.

Pending Publication Date: 2019-06-25
BEIJING INFORMATION SCI & TECH UNIV

AI Technical Summary

Problems solved by technology

In keyword recognition tasks, the traditional LSTM sequence model cannot fully capture contextual semantic information with topic-distinguishing properties.

Embodiment Construction

[0037] To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described further below with reference to the accompanying drawings and specific embodiments. The specific embodiments described here serve only to explain the present invention, not to limit it. All other embodiments that persons of ordinary skill in the art obtain from these embodiments without creative effort fall within the scope of protection of the present invention.

[0038] A keyword extraction method that combines topic information and a bidirectional LSTM. First, LDA and the Skip-gram model are combined to learn a topic word vector representation of each word. The topic word vectors are then used as the input of the bidirectional LSTM model, making full use of the temporal memory characteristics of the bidirectional LSTM while simultaneously modeling ...
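The visible text does not say how the LDA and Skip-gram outputs are combined into one vector. As a minimal sketch of one plausible reading, the example below assumes simple concatenation of a word's Skip-gram vector with the vector of the topic that LDA assigns to it. The toy vectors, the topic assignments, and the names `word_vectors`, `topic_vectors`, `word_topic`, and `topic_word_vector` are all hypothetical and stand in for trained models.

```python
# Hypothetical toy data: in practice the word vectors would come from a trained
# Skip-gram model and the topic assignments from a trained LDA model.
word_vectors = {
    "neural": [0.2, 0.7, -0.1],
    "network": [0.3, 0.6, 0.0],
    "retrieval": [-0.5, 0.1, 0.4],
}
topic_vectors = {
    0: [1.0, 0.0],  # vector for hypothetical topic 0
    1: [0.0, 1.0],  # vector for hypothetical topic 1
}
word_topic = {"neural": 0, "network": 0, "retrieval": 1}


def topic_word_vector(word):
    """Concatenate the word vector with the vector of the word's topic.

    Concatenation is an assumption made for illustration; the source does not
    state the combination rule.
    """
    return word_vectors[word] + topic_vectors[word_topic[word]]


def sentence_to_inputs(tokens):
    """Map a token sequence to the sequence of vectors fed to the tagger."""
    return [topic_word_vector(token) for token in tokens]


inputs = sentence_to_inputs(["neural", "network", "retrieval"])
```

Each element of `inputs` has five dimensions here (three from the word vector, two from the topic vector), so the downstream sequence model would be built with an input size of five.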

Abstract

The invention relates to a keyword extraction method that integrates topic information and a bidirectional LSTM. The method comprises the following steps: first, LDA and the Skip-gram model are combined to learn a topic word vector representation of each word; the topic word vector of each word is then used as the input of a bidirectional LSTM model, so that the temporal memory characteristics of the bidirectional LSTM are fully used while the topic semantics of the words are modeled at the same time; finally, the label prediction probability of each word is output through a softmax function. The method can make full use of contextual semantic information at different distances to predict keywords. The resulting precision, recall, and F value are all good, the keyword recognition performance clearly surpasses the prior art, and the requirements of practical application can be well met.
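The tagging stage described above can be sketched in plain Python as follows. This is an illustrative forward pass only, with randomly initialised, untrained weights; it is not the patent's implementation. The layer sizes, the two-label set (keyword or not), and the names `LSTMLayer` and `bilstm_tag_probabilities` are assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


class LSTMLayer:
    """Single-direction LSTM with random (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One matrix per gate (input, forget, output, candidate), each applied
        # to the concatenation of the current input and the previous hidden state.
        self.weights = {g: rand_matrix(hidden_size, input_size + hidden_size, rng)
                        for g in "ifoc"}
        self.biases = {g: [0.0] * hidden_size for g in "ifoc"}

    def run(self, inputs):
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        outputs = []
        for x in inputs:
            z = x + h  # list concatenation of input and previous hidden state
            pre = {g: [a + b for a, b in zip(matvec(self.weights[g], z), self.biases[g])]
                   for g in "ifoc"}
            i = [sigmoid(v) for v in pre["i"]]
            f = [sigmoid(v) for v in pre["f"]]
            o = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["c"]]
            c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, cand)]
            h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
            outputs.append(h)
        return outputs


def bilstm_tag_probabilities(inputs, forward, backward, out_weights, out_bias):
    """Return one label probability distribution per input position."""
    fwd = forward.run(inputs)
    # Run the second layer right to left, then restore the original order.
    bwd = backward.run(inputs[::-1])[::-1]
    probs = []
    for hf, hb in zip(fwd, bwd):
        scores = [s + b for s, b in zip(matvec(out_weights, hf + hb), out_bias)]
        probs.append(softmax(scores))
    return probs


rng = random.Random(0)
INPUT_SIZE, HIDDEN_SIZE, NUM_LABELS = 5, 4, 2  # assumed sizes; labels: non-keyword, keyword
forward = LSTMLayer(INPUT_SIZE, HIDDEN_SIZE, rng)
backward = LSTMLayer(INPUT_SIZE, HIDDEN_SIZE, rng)
out_weights = rand_matrix(NUM_LABELS, 2 * HIDDEN_SIZE, rng)
out_bias = [0.0] * NUM_LABELS

# Stand-in for a three-word sentence of topic word vectors.
sentence = [[rng.uniform(-1.0, 1.0) for _ in range(INPUT_SIZE)] for _ in range(3)]
probs = bilstm_tag_probabilities(sentence, forward, backward, out_weights, out_bias)
```

Because the hidden state at each position concatenates a left-to-right pass and a right-to-left pass, the prediction for a word can draw on context on both sides of it, which is the property the abstract attributes to the bidirectional model. In a trained system the weights would be learned from labelled text rather than drawn at random.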

Description

Technical field

[0001] The invention belongs to the technical field of text processing, and in particular relates to a keyword extraction method that integrates topic information and a bidirectional LSTM.

Background

[0002] With the advent of the big data era, network information has grown explosively, and quickly obtaining valuable key information from massive literature resources is of great significance to information retrieval and knowledge discovery. In the field of NLP, automatic keyword extraction is the basis for natural language processing tasks such as natural language understanding, automatic summarization, text classification and clustering, and machine translation. Traditional keyword extraction methods often rely on complex hand-crafted features, and their recognition performance is poor. In recent years, with the popularity and development of deep learning theory, deep neural networks have been widely used in various fields such as images,...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F16/35
Inventor: 吕学强, 董志安, 游新冬
Owner: BEIJING INFORMATION SCI & TECH UNIV