An LSTM neural network training method and device

A neural network training technology, applied in the field of LSTM neural network training methods and devices. It addresses the structural complexity and the high training and computation overhead of Transformer models, and achieves the effects of improving the training data's capacity to carry feature information, reducing computational load, and delivering strong performance.

Active Publication Date: 2022-04-05
CHENGDU SEFON SOFTWARE CO LTD

AI Technical Summary

Problems solved by technology

[0006] The object of the present invention is to provide an LSTM neural network training method and device that solve the problem that, in natural language processing, the attention-based Transformer model is structurally complex and its training and computation costs are enormous.


Examples


Embodiment 1

[0041] An LSTM neural network training method, in which training data are generated from unlabeled text: keywords in the unlabeled text are extracted, the training data are weighted according to those keywords to improve their ability to carry feature information, and the weighted training data are then used to train the LSTM neural network. The invention draws on the physiological observation that humans focus on key positions or words when acquiring information and, combined with the long short-term memory network (LSTM), proposes a training method that obtains better training results by changing the weight of key information in the training data without changing the model structure.
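As a minimal sketch of this weighting idea (not the patent's exact procedure; the embedding size and the pre-computed vector lookups are illustrative assumptions), key positions can be up-weighted by adding the keyword's vector onto the matching word vector:

```python
import numpy as np

EMB_DIM = 128  # assumed embedding size

# Hypothetical pre-computed lookups: token -> word vector, keyword -> keyword vector.
rng = np.random.default_rng(0)
word_vectors = {t: rng.normal(size=EMB_DIM) for t in ["the", "bank", "raised", "rates"]}
keyword_vectors = {"bank": word_vectors["bank"]}  # keyword encoded the same way as words

def weight_sequence(tokens):
    """Return one vector per token, with keyword positions boosted by addition."""
    return np.stack([
        word_vectors[t] + keyword_vectors.get(t, 0.0)  # the weighting step
        for t in tokens
    ])

seq = weight_sequence(["the", "bank", "raised", "rates"])  # shape (4, EMB_DIM)
```

Since a single-word keyword is encoded the same way as the word itself, the addition effectively doubles that position's vector, raising its weight relative to non-key tokens.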

Embodiment 2

[0043] This embodiment differs from Embodiment 1 in that the method of generating training data from unlabeled text, extracting keywords from the unlabeled text, weighting the training data according to those keywords to improve its capacity to carry feature information, and using the weighted training data for LSTM neural network training comprises the following steps:

[0044] S1. Take the unlabeled text as the training text and preprocess it;

[0045] S2. Analyze the preprocessed training text and generate its keywords;

[0046] S3. Encode the words in the training text to obtain continuous word vectors in a high-dimensional space, and encode the keywords in the same way to obtain keyword vectors;

[0047] S4. Add each keyword vector to its corresponding word vector to weight that word vector, yielding the final training data;

[0048] S5. Input the final training data into the LSTM neural network for training (a sketch of the whole pipeline follows below).
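The steps above map onto a short training loop. The following is a minimal sketch under assumed choices (PyTorch, a toy vocabulary, random embeddings, and an `extract_keywords` stub standing in for step S2); it is illustrative, not the patent's implementation:

```python
import torch
import torch.nn as nn

# Toy sizes and vocabulary; all of these are illustrative assumptions.
EMB_DIM, HIDDEN, NUM_CLASSES = 64, 128, 2
vocab = {"the": 0, "bank": 1, "raised": 2, "rates": 3}

embedding = nn.Embedding(len(vocab), EMB_DIM)      # S3: word/keyword encoder
lstm = nn.LSTM(EMB_DIM, HIDDEN, batch_first=True)  # S5: the LSTM network
head = nn.Linear(HIDDEN, NUM_CLASSES)

def extract_keywords(tokens):
    """S2 stub: stands in for entity/relation/event keyword generation."""
    return {"bank"}

def build_training_data(tokens):
    """S3 + S4: encode tokens, then add keyword vectors onto matching word vectors."""
    keywords = extract_keywords(tokens)
    rows = []
    for t in tokens:
        vec = embedding(torch.tensor(vocab[t]))     # S3: word vector
        if t in keywords:
            # S4: a single-word keyword's vector equals its word vector,
            # so the addition doubles (up-weights) that position.
            vec = vec + embedding(torch.tensor(vocab[t]))
        rows.append(vec)
    return torch.stack(rows).unsqueeze(0)           # (batch=1, seq_len, EMB_DIM)

params = list(embedding.parameters()) + list(lstm.parameters()) + list(head.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

x = build_training_data(["the", "bank", "raised", "rates"])  # S1 assumed done upstream
_, (h, _) = lstm(x)                                 # S5: forward pass
loss = nn.functional.cross_entropy(head(h[-1]), torch.tensor([1]))
loss.backward()
optimizer.step()
```

Note that the network itself is an ordinary LSTM classifier; only the input representation changes, which is the point of the method.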

Embodiment 3

[0050] This embodiment differs from Embodiment 2 in that the method for preprocessing the training text in step S1 includes at least one of cleaning, word segmentation, and stop-word removal.
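For illustration, a minimal preprocessing pass might look like the following (Chinese input text and the jieba segmenter are assumed choices; the cleaning regex and the stop-word list are placeholders):

```python
import re
import jieba  # a common Chinese word-segmentation library (assumed choice)

STOP_WORDS = {"的", "了", "和"}  # placeholder stop-word list

def preprocess(text: str) -> list[str]:
    """Cleaning, word segmentation, and stop-word removal (step S1)."""
    text = re.sub(r"[^\u4e00-\u9fa5A-Za-z0-9]", " ", text)  # cleaning: drop punctuation/symbols
    tokens = jieba.cut(text)                                # word segmentation
    return [t.strip() for t in tokens if t.strip() and t.strip() not in STOP_WORDS]

print(preprocess("今天的天气很好!"))
```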

[0051] Further, the keywords in step S2 include entity keywords, relationship keywords, and event keywords. Named entity recognition is performed on the preprocessed training text to obtain common named entities such as names, addresses, organizations, times, currencies, and quantities, from which entity keywords are established. Entity relationships are then extracted from the preprocessed training text: if a relationship exists between entities, it is judged whether it belongs to common types such as Component-Whole, Instrument-Agency, Member-Collection, Cause-Effect, Entity-Destination, Content-Container, Message-Topic, Product-Producer, or Entity-Origin, and relationship keywords are formed. Event extraction is performed on the preprocessed training text; if an event is present, event keywords are established.
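As an illustrative sketch of the entity-keyword step (spaCy and its `en_core_web_sm` model are assumed stand-ins; the patent does not prescribe a specific NER tool, and the relation and event extractors are reduced to stubs):

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed NER model; any NER tool would do

def entity_keywords(text: str) -> set[str]:
    """Collect named entities (persons, orgs, places, times, money, quantities)."""
    doc = nlp(text)
    wanted = {"PERSON", "ORG", "GPE", "LOC", "TIME", "DATE", "MONEY", "QUANTITY"}
    return {ent.text for ent in doc.ents if ent.label_ in wanted}

def relation_keywords(text: str) -> set[str]:
    """Stub: a relation classifier over types like Cause-Effect would go here."""
    return set()

def event_keywords(text: str) -> set[str]:
    """Stub: an event-extraction step would go here."""
    return set()

print(entity_keywords("Apple opened an office in Chengdu in 2022."))
# e.g. {'Apple', 'Chengdu', '2022'}
```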


Abstract

The invention discloses an LSTM neural network training method and device. The purpose of the invention is to provide a long short-term memory network training method based on a text-aware focusing mechanism. The invention borrows the mechanism by which humans focus on key information when perceiving things and gives that key information greater attention weight during neural network model training: a word vector model is applied to key information in the text, such as entity relationships and events; entity vectors and event vectors are calculated; and entity enhancement, relationship enhancement, and event enhancement are performed on the training data. The proportion of key information in the training data is thereby increased without changing the network structure, so as to obtain network parameters better suited to the training data and improve the performance of the LSTM neural network.

Description

technical field

[0001] The invention relates to the fields of natural language processing and artificial intelligence, and in particular to an LSTM neural network training method and device.

Background technique

[0002] As a representative of the "connectionism" school of artificial intelligence, deep learning technology has made remarkable achievements in speech, vision, and natural language processing in recent years, and has been deployed in the Internet, security, education, medical, industrial manufacturing, and other industries.

[0003] Human-generated data contains a large number of time series, such as voice signals, audio signals, text, financial data, and equipment logs; these data have contextual relationships in the time dimension. The recurrent neural network (RNN) was invented for this reason: it "memorizes" previous information by passing the hidden state of each moment to the next moment, and thereby obtains the ability to model such temporal context...
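To make the hidden-state mechanism concrete, here is a minimal sketch (PyTorch is an assumed choice) of a recurrent cell carrying its hidden state from one time step to the next:

```python
import torch
import torch.nn as nn

rnn = nn.RNNCell(input_size=8, hidden_size=16)

x = torch.randn(5, 1, 8)   # a 5-step sequence, batch of 1
h = torch.zeros(1, 16)     # initial hidden state
for t in range(5):
    h = rnn(x[t], h)       # the hidden state "memorizes" earlier steps
```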


Application Information

Patent Type & Authority: Patent (China)
IPC(8): G06N3/04, G06N3/08, G06F40/289, G06F40/295
CPC: G06N3/08, G06N3/045
Inventors: 曾理, 王纯斌, 蓝科
Owner: CHENGDU SEFON SOFTWARE CO LTD