Chinese phonetic symbol keyword retrieving method based on feed forward neural network language model

A neural network language model technique, applied in speech analysis, speech recognition, and related fields, that improves recognition performance and speed

Active Publication Date: 2017-06-16
INST OF ACOUSTICS CHINESE ACAD OF SCI +1

AI Technical Summary

Problems solved by technology

[0007] The purpose of the present invention is to overcome the above-mentioned defects of current methods for reducing the computational complexity of the output layer. By modifying the training criterion of the feed-forward neural network language model, the probability regularizatio



Examples


Embodiment Construction

[0039] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments.

[0040] As shown in Figure 1, the Chinese speech keyword retrieval method based on a feed-forward neural network language model comprises:

[0041] Step 1) Train the feed-forward neural network language model on the training samples using the NCE (noise-contrastive estimation) criterion. Specifically:

[0042] Step 101) When the N training samples are input, the unigram probabilities estimated from the target-word statistics of the training samples are input at the same time;

[0043] As shown in Figure 2, the feed-forward neural network language model of this embodiment comprises an input layer, a mapping layer, two hidden layers, and an output layer;
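The layer stack described above can be sketched as a plain forward pass. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, layer sizes, `tanh` activations, the omission of bias terms, and the full softmax at the output are all choices made here for concreteness and do not come from the source.

```python
import math
import random

def make_matrix(rows, cols, rng, scale=0.1):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def affine(vec, matrix):
    """Compute the row-vector product vec @ matrix (matrix is a list of rows)."""
    return [sum(v * row[j] for v, row in zip(vec, matrix))
            for j in range(len(matrix[0]))]

def init_model(vocab_size, embed_dim, hidden_dim, history_len, seed=0):
    """Hypothetical parameter layout: one matrix per layer transition."""
    rng = random.Random(seed)
    return {
        "mapping": make_matrix(vocab_size, embed_dim, rng),
        "hidden1": make_matrix(history_len * embed_dim, hidden_dim, rng),
        "hidden2": make_matrix(hidden_dim, hidden_dim, rng),
        "output": make_matrix(hidden_dim, vocab_size, rng),
    }

def forward(model, history_ids):
    """Return (last hidden activation, word probabilities) for one word history."""
    # Input + mapping layer: look up each history word and concatenate the vectors.
    x = [value for word_id in history_ids for value in model["mapping"][word_id]]
    # Two hidden layers (tanh is an assumption; the source does not name the activation).
    h1 = [math.tanh(a) for a in affine(x, model["hidden1"])]
    h2 = [math.tanh(a) for a in affine(h1, model["hidden2"])]
    # Output layer: one score per vocabulary word, normalized with a softmax.
    scores = affine(h2, model["output"])
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return h2, [e / total for e in exps]

model = init_model(vocab_size=6, embed_dim=3, hidden_dim=4, history_len=2)
last_hidden, probs = forward(model, [1, 4])
```

The softmax sums over the whole vocabulary, which is the output-layer cost that the text says the invention sets out to reduce; it appears here only to make the sketch self-contained.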

[0044] The training samples consist of input samples and target words. The N input samples are u_i (1 ≤ i ≤ N); each u_i is composed of a history of n-1 words v_ij (1 ≤ j ≤ n-1): u_i = (v_i1, v_i2, ... v_i...
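To make the sample layout concrete, the sketch below slices a token sequence into (history, target) pairs with n-1 history words each and estimates unigram probabilities from the target-word counts. The function names, the `<s>` padding token, and the toy sentence are hypothetical; the source does not describe how the samples are extracted from text.

```python
from collections import Counter

def build_samples(tokens, n):
    """Return (history, target) pairs; each history holds n-1 consecutive words."""
    return [(tuple(tokens[i:i + n - 1]), tokens[i + n - 1])
            for i in range(len(tokens) - n + 1)]

def unigram_probabilities(samples):
    """Relative frequency of each target word across the training samples."""
    counts = Counter(target for _, target in samples)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

# Toy data (assumed): two "<s>" padding tokens give the first word a full history.
tokens = ["<s>", "<s>", "we", "search", "speech", "we", "search", "text"]
samples = build_samples(tokens, n=3)
unigram = unigram_probabilities(samples)
```

With n = 3 the toy sequence yields six samples, and the targets "we" and "search" each receive probability 1/3 while "speech" and "text" each receive 1/6.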



Abstract

The invention provides a Chinese speech keyword retrieval method based on a feed-forward neural network language model. The method comprises: (1) input samples, each consisting of a word history and a target word, are fed into the feed-forward neural network model; for each target word wi, a number of noise words drawn from the probability distribution q(wi) are added, the activation output of the last hidden layer is propagated only to the output nodes of the target word and the noise words, and the transformation matrices between all layers are computed according to an objective function; the error between the output of the output layer and the target word is calculated, and all transformation matrices are updated until training of the feed-forward neural network model is complete; (2) the trained feed-forward neural network model is used to calculate the probability of a target word given an input word history; and (3) the target word probability is applied in a decoder, and speech decoding with the decoder yields word lattices containing multiple candidate recognition results; the word lattices are converted into confusion networks, from which an inverted index is generated; keywords are then searched for in the inverted index, and each matched keyword is returned together with its time of occurrence.
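Step (1) can be illustrated with the standard noise-contrastive estimation objective for a single training sample. This is a sketch under assumptions rather than the objective function of the patent, whose exact form is not given in this excerpt: `score` stands for a hypothetical function returning the unnormalized output-layer score of a word for the current history, and the objective treats the target as data and the k noise words as samples from q.

```python
import math
import random

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def sample_noise_words(noise_dist, k, rng):
    """Draw k noise words from the noise distribution q (a dict word -> probability)."""
    words = list(noise_dist)
    weights = [noise_dist[w] for w in words]
    return rng.choices(words, weights=weights, k=k)

def nce_loss(score, target, noise_words, noise_dist):
    """Negative NCE objective for one (history, target) sample.

    Only the target node and the noise-word nodes are evaluated, so the cost
    depends on k rather than on the vocabulary size.
    """
    k = len(noise_words)
    # The target word should be classified as coming from the data.
    delta = score(target) - math.log(k * noise_dist[target])
    objective = log_sigmoid(delta)
    # Each noise word should be classified as coming from q:
    # log(1 - sigmoid(delta)) equals log_sigmoid(-delta).
    for word in noise_words:
        delta = score(word) - math.log(k * noise_dist[word])
        objective += log_sigmoid(-delta)
    return -objective

# Toy usage with an assumed three-word vocabulary and made-up scores.
noise_dist = {"a": 0.5, "b": 0.3, "c": 0.2}
scores = {"a": 2.0, "b": -2.0, "c": -2.0}
noise_words = sample_noise_words(noise_dist, k=2, rng=random.Random(0))
loss = nce_loss(scores.__getitem__, "a", noise_words, noise_dist)
```

Minimizing this loss by gradient descent pushes the target score up and the noise scores down, which is what the update of the transformation matrices in step (1) would do in a full training loop.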

Description

Technical field

[0001] The invention belongs to the field of speech recognition, and in particular relates to a Chinese speech keyword retrieval method based on a feed-forward neural network language model.

Background

[0002] In speech keyword retrieval systems, the most commonly used language model is the N-gram language model. However, even when N is small, such as 3 or 4, the N-gram language model still suffers from severe data sparsity, and many smoothing algorithms are used to alleviate this problem. Even so, the model's estimates for events that do not appear in the training set remain far less reliable than those for well-trained data, so uncommon words and the common words around them are often misrecognized, which in turn degrades keyword retrieval performance. The feed-forward neural network language model (FFNNLM) maps each word in the dictionary to a continuous space, so it can provide better pred...


Application Information

IPC(8): G10L15/16
CPC: G10L15/16
Inventor: 张鹏远, 王旭阳, 潘接林, 颜永红
Owner: INST OF ACOUSTICS CHINESE ACAD OF SCI