Recurrent neural network language model training method, device, equipment and medium
A neural network and language model technology, applied in the field of artificial intelligence, intended to solve problems such as those hindering practical applications.
Embodiment 1
[0026] Figure 1 is a flow chart of the recurrent neural network language model training method provided in Embodiment 1 of the present invention. This embodiment is applicable to training a recurrent neural network language model used for language text recognition. The method can be performed by a recurrent neural network language model training device, and specifically includes the following steps:
[0027] S110. Input the language text in the corpus into the trained high-rank recurrent neural network language model (RNNLM) and the lightweight RNNLM to be trained, respectively.
[0028] In this embodiment, the corpus includes the Penn Treebank (PTB) corpus and/or the Wall Street Journal (WSJ) corpus. The PTB corpus contains a total of 24 parts; its vocabulary size is limited to 10,000, and a dedicated label indicates out-of-vocabulary words. Part or all of the text in the PTB corpus is selected as the training set, and the language text in the training set is input into the two models as described in S110.
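To make step S110's data preparation concrete, the following is a minimal sketch (not the patent's code) of PTB-style preprocessing: the vocabulary is capped at 10,000 words, out-of-vocabulary words map to a dedicated label, and the same encoded batch is then fed to both models. The `<unk>` symbol name and the model objects in the usage comment are illustrative assumptions.

```python
# Minimal sketch of S110-style data preparation (illustrative, not the
# patent's code). PTB conventionally uses "<unk>" as the OOV label;
# this symbol name is an assumption here.
from collections import Counter

OOV = "<unk>"  # assumed out-of-vocabulary label

def build_vocab(sentences, max_size=10_000):
    """Keep the most frequent words, reserving index 0 for the OOV label."""
    counts = Counter(w for s in sentences for w in s.split())
    vocab = {OOV: 0}
    for word, _ in counts.most_common():
        if word == OOV:
            continue  # already reserved at index 0
        if len(vocab) == max_size:
            break
        vocab[word] = len(vocab)
    return vocab

def encode(sentence, vocab):
    """Map each word to its index, falling back to the OOV label."""
    return [vocab.get(w, vocab[OOV]) for w in sentence.split()]

# Usage: the same encoded batch goes to the trained teacher and the
# student under training (hypothetical model objects):
# batch = [encode(s, vocab) for s in training_sentences]
# teacher_logits = high_rank_rnnlm(batch)     # frozen, trained teacher
# student_logits = lightweight_rnnlm(batch)   # student being trained
```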
Embodiment 2
[0042] During model training, it was found that the training process of the student model still has the following two defects. First, in a language model, each training-data label vector represents a degenerate data distribution: it gives the likelihood of the corresponding language text on a single category. Compared with the probability distribution produced by the teacher model over all labels, that is, the probability that the corresponding language text falls on each label, the degenerate data distribution is noisier and more localized. Second, unlike previous experimental results for knowledge distillation in acoustic modeling and image recognition, the language text recognition experiments of this embodiment found that when the cross-entropy loss and the KL divergence have fixed weights, minimizing their weighted sum yields a student model inferior to the one obtained by minimizing the KL ...
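To make the objective concrete, below is a minimal PyTorch sketch of the weighted sum of cross-entropy and KL divergence discussed above. The weight `alpha` and the direction of the KL term are illustrative assumptions; the embodiment's exact weighting scheme (which the excerpt truncates) is not reproduced here.

```python
# Minimal sketch of a distillation objective: weighted sum of
# cross-entropy against the one-hot training label and KL divergence
# against the teacher's output distribution. Illustrative only.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, alpha):
    """alpha weights the hard-label CE term; (1 - alpha) weights the KL term."""
    ce = F.cross_entropy(student_logits, labels)
    # F.kl_div(log_q, p) computes KL(p || q); with the teacher as target
    # this is the standard distillation direction. Whether the patent
    # uses this exact direction is not confirmed by the excerpt.
    kl = F.kl_div(
        F.log_softmax(student_logits, dim=-1),
        F.softmax(teacher_logits, dim=-1),
        reduction="batchmean",
    )
    return alpha * ce + (1 - alpha) * kl
```

Note that with `alpha = 0` this reduces to minimizing the KL divergence alone, which, per the observation above, outperformed any fixed-weight combination in the embodiment's experiments.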
Embodiment 3
[0098] Figure 3 is a schematic structural diagram of a recurrent neural network language model training device provided in Embodiment 3 of the present invention. As shown in Figure 3, the device includes an input module 31 and a minimization module 32.
[0099] The input module 31 is used to input the language text in the corpus into the trained high-rank recurrent neural network language model (RNNLM) and the lightweight RNNLM to be trained, respectively;
[0100] The minimization module 32 is used to iterate the parameters of the lightweight RNNLM to minimize the weighted sum of the cross-entropy loss and the Kullback-Leibler divergence, so as to complete the training of the lightweight RNNLM;
[0101] Here, the cross-entropy loss is the cross-entropy loss of the output vector of the lightweight RNNLM relative to the training data label vector of the language text, and the Kullback-Leibler divergence is the Kullback-Leibler divergence of the output vector of the lightweight RNNLM relative to the output vector of the high-rank RNNLM.
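How the two modules of Figure 3 might cooperate in one training iteration can be illustrated with a short sketch; the function and variable names are assumptions, and `distillation_loss` refers to the sketch given under Embodiment 2.

```python
# Illustrative training step combining the two modules of Figure 3
# (not the patent's implementation).
import torch

def training_step(high_rank_rnnlm, lightweight_rnnlm, optimizer,
                  batch, labels, alpha):
    # Input module (31): feed the same text to teacher and student.
    with torch.no_grad():
        teacher_logits = high_rank_rnnlm(batch)   # frozen teacher
    student_logits = lightweight_rnnlm(batch)

    # Minimization module (32): update the student's parameters to
    # minimize the weighted sum of cross-entropy and KL divergence.
    loss = distillation_loss(student_logits, teacher_logits, labels, alpha)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```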