LSTM neural network training method and device
A neural network training technology, applied in the field of LSTM neural network training methods and devices. It addresses the problems of high training computation cost and the structural complexity of Transformer models, and achieves the effects of improving the feature-carrying capacity of the training data, reducing computation, and improving data quality.
Example Embodiment
[0040] Example 1
[0041] An LSTM neural network training method uses training data generated from unlabeled text. After keywords are extracted from the unlabeled text, the training data is weighted according to those keywords to improve its ability to carry feature information, and the weighted training data is then used to train the LSTM neural network. The present invention draws on the physiological basis of human attention, the tendency to focus on key positions or words when acquiring information, and combines it with the long short-term memory (LSTM) network. It proposes a training method that, without changing the model structure, changes the weight of key information in the training data to obtain a better-performing trained model.
Example Embodiment
[0042] Example 2
[0043] This embodiment differs from Embodiment 1 in that generating training data from unlabeled text, extracting keywords from that text, weighting the training data according to the keywords to improve its ability to carry feature information, and using the weighted training data for LSTM neural network training comprise the following steps:
[0044] S1. Use unlabeled text as training text and preprocess the training text;
[0045] S2. Recognize the preprocessed training text and generate keywords for it;
[0046] S3. Encode the words in the training text to obtain continuous word vectors in a high-dimensional space, and encode the keywords in the same way to obtain keyword vectors;
[0047] S4. Add each keyword vector to the corresponding word vector, thereby weighting the word vector, to obtain the final training data;
[0048] S5. Input the final training data into the LSTM neural network f...
Example Embodiment
[0049] Example 3
[0050] This embodiment differs from Embodiment 2 in that the preprocessing of the training text in step S1 includes at least one of cleaning, word segmentation, and stop-word removal.
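A minimal sketch of the S1 preprocessing described above, assuming English whitespace tokenization and a toy stop-word list (both hypothetical; the patent does not specify a tokenizer, and segmenting Chinese text would need a proper segmenter):

```python
import re

STOP_WORDS = {"the", "a", "in", "of"}    # toy stop-word list (assumed)

def preprocess(text):
    """Cleaning, word segmentation, and stop-word removal (step S1)."""
    text = re.sub(r"[^\w\s]", " ", text.lower())       # cleaning
    tokens = text.split()                              # segmentation
    return [t for t in tokens if t not in STOP_WORDS]  # stop words

print(preprocess("Acme acquired WidgetCo, in 2020."))
# → ['acme', 'acquired', 'widgetco', '2020']
```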
[0051] Further, the keywords in step S2 include entity keywords, relation keywords, and event keywords. Named entity recognition is performed on the preprocessed training text to obtain common named entities such as person names, addresses, organizations, times, currencies, and quantities, which form the entity keywords. Entity relations are then extracted from the preprocessed training text: if a relation exists between entities, it is checked whether the relation belongs to one of the common types component-whole, tool-usage, member-collection, cause-effect, entity-destination, content-container, message-topic, producer-product, or entity-origin, and relation keywords are formed accordingly. Event extraction is performed on the preprocessi...
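The keyword construction in S2 can be sketched as follows. The NER tuples and relation triples below are hypothetical stand-ins for the output of real NER and relation-extraction models, which the patent assumes but does not specify:

```python
# Hypothetical extractor outputs (a real system would produce these
# with an NER model and a relation classifier).
ner_output = [("Acme", "ORG"), ("2020", "TIME"), ("WidgetCo", "ORG")]
relations = [("Acme", "Producer-Product", "WidgetCo")]

# Common relation types listed in the description.
RELATION_TYPES = {
    "Component-Whole", "Tool-Usage", "Member-Collection",
    "Cause-Effect", "Entity-Destination", "Content-Container",
    "Message-Topic", "Producer-Product", "Entity-Origin",
}

# Entity keywords: every recognized named entity.
entity_keywords = {text for text, label in ner_output}

# Relation keywords: entity pairs whose relation is a known type.
relation_keywords = {
    (head, tail) for head, rel, tail in relations if rel in RELATION_TYPES
}

print(sorted(entity_keywords))           # ['2020', 'Acme', 'WidgetCo']
```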