Automatic speech recognition method based on a random-depth time-delay neural network model

An automatic speech recognition and neural network model technology, applied in the fields of speech recognition, speech analysis, and instruments. It addresses the problems of limited neural network learning ability, growth in model parameters, and vanishing gradients. The stated effects are stronger modeling ability and relief of overfitting and vanishing gradients.
CN109065033A · Active · Publication Date: 2018-12-21 · SOUTH CHINA UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
SOUTH CHINA UNIV OF TECH
Publication Date
2018-12-21


Abstract

The invention belongs to the field of automatic speech recognition and relates to an automatic speech recognition method based on a random-depth time-delay neural network model. The method comprises: preparing training data; extracting acoustic features from the training speech audio data; training a traditional GMM-HMM model and using the trained GMM-HMM model to force-align the training speech audio data, which yields the corresponding frame-level training labels; training a random-depth time-delay neural network model in a supervised manner on the training speech audio data and the corresponding frame-level training labels, and combining it with a hidden Markov model to obtain an acoustic model; training a language model on the corresponding text annotation data or on text from other data sets; and constructing an automatic speech recognition decoder from the trained language model and acoustic model. The modeling ability of the model is thereby strengthened and the problems of overfitting and vanishing gradients during training are addressed, so that speech recognition accuracy is improved.
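The abstract names a "random-depth" time-delay neural network without giving details in the text available here. As a minimal sketch only, the following assumes that "random depth" refers to the commonly used stochastic depth technique: during training, a residual layer is skipped entirely with probability 1 - survive_prob, and at inference its output is scaled by survive_prob. The function names, the context offsets, and the clamped edge handling are all illustrative assumptions, not the patent's own definitions.

```python
import random


def splice(frames, offsets):
    """Time-delay splicing: for each time step t, concatenate the frames at
    t + offset for every offset. Indices past either edge are clamped
    (an assumption made for this sketch)."""
    last = len(frames) - 1
    spliced = []
    for t in range(len(frames)):
        row = []
        for offset in offsets:
            idx = min(max(t + offset, 0), last)
            row.extend(frames[idx])
        spliced.append(row)
    return spliced


def affine_relu(x, weights, bias):
    """One affine transform followed by ReLU, in plain Python."""
    return [
        max(0.0, sum(w * v for w, v in zip(w_row, x)) + b)
        for w_row, b in zip(weights, bias)
    ]


def stochastic_depth_tdnn_layer(frames, weights, bias, offsets,
                                survive_prob, training, rng):
    """Residual TDNN layer with stochastic depth (hypothetical helper).

    Training: with probability 1 - survive_prob the layer is dropped and
    the input passes through unchanged; otherwise output = x + f(x).
    Inference: output = x + survive_prob * f(x).
    The output dimension of `weights` must equal the frame dimension so
    that the residual sum is well defined.
    """
    if training:
        if rng.random() >= survive_prob:
            return [list(frame) for frame in frames]  # layer skipped
        scale = 1.0
    else:
        scale = survive_prob
    spliced = splice(frames, offsets)
    return [
        [x + scale * h for x, h in zip(frame, affine_relu(ctx, weights, bias))]
        for frame, ctx in zip(frames, spliced)
    ]
```

Because a dropped layer reduces to the identity, gradients can flow through the skip connection unchanged, which is the usual argument for why this kind of scheme eases vanishing gradients and acts as a regularizer against overfitting.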

Description

Technical field

[0001] The invention belongs to the technical field of automatic speech recognition, and relates to an automatic speech recognition method based on a random-depth time-delay neural network model.

Background technique

[0002] With the continuous development of deep learning technology, automatic speech recognition is being applied ever more widely in practice, for example in Apple Siri and Amazon Alexa, and it continues to penetrate into people's work, study and life. There is therefore a growing demand for models that are more robust and have stronger modeling ability.

[0003] The main task of automatic speech recognition is to reach the same recognition rate as human beings while effectively handling different environmental factors (such as speakers, channels, etc.). After acoustic features are extracted, the corresponding text is obtained by decoding with the acoustic model and the language model. Traditional acoustic modeling uses Gaussian mix...

Claims
