Acoustic model training method and device

An acoustic model training technology, applied in speech analysis, speech recognition, instruments, and related fields. It addresses problems such as second-order optimization being less robust than first-order optimization and failing to match the performance of the SGD algorithm, with the effects of reducing deviation, improving performance, and shortening training time.

Active Publication Date: 2014-11-12
TENCENT CLOUD COMPUTING BEIJING CO LTD

AI Technical Summary

Problems solved by technology

However, on big data, second-order parameter optimization methods often require extensive fine-tuning. In the absence of prior knowledge, second-order optimization is often less robust than first-order optimization.
Specifically, for DNN modeling in speech recognition, second-order optimization does not achieve the good performance of the SGD algorithm.



Examples


Embodiment Construction

[0025] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings.

[0026] The embodiments of the present invention focus on the training of the acoustic model, which is the core step of the speech recognition technology.

[0027] Speech recognition is a sequence classification problem whose purpose is to convert a series of collected speech signals into a series of text outputs. Speech signals are temporally correlated: the speech data at a given moment is related to the speech data at several previous moments. To model the mechanism by which speech data is generated, the Markov model was introduced into the field of speech recognition. To further reduce the complexity of the model, each current state of the Markov model depends only on the state at the previous moment.
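
The first-order assumption in [0027] means the probability of a state sequence factors into an initial probability times a product of one-step transition probabilities. The following is a minimal sketch of that factorization; the state names (`sil`, `ah`) and all probability values are hypothetical and chosen only for illustration, not taken from the patent.

```python
import math

def sequence_log_prob(states, initial, transition):
    """Log-probability of a state sequence under a first-order Markov chain.

    Each state depends only on the state immediately before it, so the
    joint probability is initial[s0] * product of transition[prev][cur].
    """
    logp = math.log(initial[states[0]])
    for prev, cur in zip(states, states[1:]):
        logp += math.log(transition[prev][cur])
    return logp

# Hypothetical two-state chain (illustrative values only).
initial = {"sil": 0.6, "ah": 0.4}
transition = {
    "sil": {"sil": 0.7, "ah": 0.3},
    "ah": {"sil": 0.2, "ah": 0.8},
}

if __name__ == "__main__":
    print(sequence_log_prob(["sil", "ah", "ah"], initial, transition))
```

Working in log space avoids numerical underflow when sequences are long, which is the usual situation for frame-level speech data.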

[0028] For...


Abstract

The embodiment of the invention provides an acoustic model training method and device. The method includes the steps of: establishing an initial deep neural network model; dividing the speech training data into N non-intersecting data subsets and, for each data subset, updating the initial deep neural network model with a stochastic gradient descent algorithm to obtain N deep neural network sub-models, where N is an integer greater than or equal to 2; and merging the N deep neural network sub-models to obtain an intermediate deep neural network model, which is determined to be the trained acoustic model when it meets a preset convergence condition. The acoustic model training method and device improve the training efficiency of the acoustic model without reducing speech recognition performance.
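
The steps in the abstract can be sketched as a small training loop. This is an illustration under stated assumptions, not the patented implementation: a linear least-squares model stands in for the deep neural network, the sub-models are merged by plain element-wise averaging (the text above only says they are merged), and the convergence condition is a small change in training loss. All function names and default values are hypothetical.

```python
import random

def predict(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))

def sgd_pass(weights, subset, lr):
    """One SGD pass over a subset, starting from a copy of the shared model."""
    w = list(weights)
    for x, y in subset:
        err = predict(w, x) - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def split_disjoint(data, n):
    """Divide data into n non-intersecting subsets (round-robin)."""
    return [data[i::n] for i in range(n)]

def merge(sub_models):
    """Merge sub-models by element-wise averaging (assumed merge rule)."""
    n = len(sub_models)
    return [sum(ws) / n for ws in zip(*sub_models)]

def mean_squared_error(weights, data):
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)

def train(data, dim, n_subsets=4, lr=0.05, tol=1e-8, max_rounds=200, seed=0):
    rng = random.Random(seed)
    data = list(data)
    model = [0.0] * dim                      # initial model
    prev_loss = mean_squared_error(model, data)
    for _ in range(max_rounds):
        rng.shuffle(data)
        subsets = split_disjoint(data, n_subsets)
        # Each sub-model starts from the same current model; these passes
        # are independent and could run in parallel on separate workers.
        sub_models = [sgd_pass(model, s, lr) for s in subsets]
        model = merge(sub_models)            # intermediate model
        loss = mean_squared_error(model, data)
        if abs(prev_loss - loss) < tol:      # assumed convergence condition
            break
        prev_loss = loss
    return model
```

Because each sub-model reads only its own subset and the shared starting weights, the N passes need no communication until the merge step, which is where the training-time saving would come from.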

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of speech recognition, and more specifically, to an acoustic model training method and device.

Background art

[0002] Speech recognition is a technology that converts speech signals into text. It is a convenient means of human-computer interaction and is now widely used in the mobile Internet and other fields. Speech recognition is a sequence classification problem whose purpose is to convert a series of collected speech signals into a series of text outputs. The fields involved in speech recognition technology include signal processing, pattern recognition, probability theory and information theory, speech production and auditory mechanisms, artificial intelligence, and so on.

[0003] Traditional speech recognition systems are generally divided into three modules, namely: acoustic model, such as the model described by the HMM-GMM system framework; language model, such as ...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G10L15/06
Inventor: 王尔玉, 卢鲤, 张翔, 刘海波, 饶丰, 李露, 岳帅, 陈波
Owner: TENCENT CLOUD COMPUTING BEIJING CO LTD