
Acoustic model combination method and device, and voice identification method and system

An acoustic model and speech recognition technology, applied in speech recognition, speech analysis, instruments, etc.

Active Publication Date: 2014-11-26
CANON KK

AI Technical Summary

Problems solved by technology

[0018] The present invention aims to solve the problems described above.



Examples


First embodiment

[0059] A first embodiment of the present invention is described in detail below with reference to Figure 2.

[0060] Figure 2 shows, by way of example, a flow chart of an acoustic model merging method according to an embodiment of the present invention.

[0061] In this embodiment, the merging operation is performed on a first acoustic model and a second acoustic model. Both models are trained on the speech data in a training database, using, for example, a training method based on the maximum likelihood (ML) criterion or a discriminative training (DT) method. The speech data in the training database is usually provided by one or more native speakers. For example, the first acoustic model (also called the universal acoustic model, UAM), which can serve as the main acoustic model, can be configured to recognize speech input in multiple languages (for ...

Second embodiment

[0094] In the first embodiment, the distances of the pairs of model constituent elements formed from one category of model constituent elements of the acoustic models are weighted using distribution information of the modeling units, such as state occupancy probabilities, as the weights. The main difference in the second embodiment is that the distances of the pairs formed from each of at least two categories of model constituent elements can be weighted using the distribution information of the modeling units as the weights.

[0095] The second embodiment of the present invention is described in detail below with reference to Figure 3.

[0096] Figure 3 shows, by way of example, a flow chart of an acoustic model merging method according to the second embodiment of the present invention.

[0097] First, in step S301, similar to step S201 in the first embodiment, distribution information of the modeling units is obtained....
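The excerpt does not show how the distribution information is obtained in steps S201 and S301. As a rough illustration only, the sketch below assumes it is a state occupancy probability estimated by counting frame-level labels. The function name `state_occupancy` and the input `aligned_units` (one modeling-unit label per speech frame) are hypothetical and do not come from the patent text.

```python
from collections import Counter


def state_occupancy(aligned_units):
    """Estimate occupancy probabilities from frame-level unit labels.

    `aligned_units` is a hypothetical input: one modeling-unit label per
    speech frame, for example taken from an alignment of speech data in
    the language to be recognized.
    """
    counts = Counter(aligned_units)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalize the counts so that the weights sum to 1.
    return {unit: n / total for unit, n in counts.items()}


# Toy alignment: the unit "a" covers half of the frames.
frames = ["a", "a", "a", "sh", "i", "i", "a", "sh"]
occupancy = state_occupancy(frames)
```

Under this assumption, a unit that occupies more frames receives a larger weight, which is one simple way to express its importance in the language to be recognized.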

Third embodiment

[0123] In the first and second embodiments, the first acoustic model is combined with the second acoustic model to obtain a bundled acoustic model. The main difference in the third embodiment is that more than two acoustic models can be combined to obtain a bundled acoustic model.

[0124] The third embodiment of the present invention is described in detail below with reference to Figure 4.

[0125] Figure 4 shows, by way of example, a flow chart of an acoustic model merging method according to the third embodiment of the present invention.

[0126] Steps S401-S405 in Figure 4 may be similar to steps S201-S205 in the first embodiment, or to steps S301-S305 in the second embodiment.

[0127] In step S406, further acoustic models other than the first and second acoustic models can also be merged. The merging with these other acoustic models can adopt the method described above in the first embodiment or...
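One way to picture merging more than two acoustic models is to apply a pairwise merge repeatedly, folding each further model into the result so far. The sketch below is only an illustration of that idea: `merge_pair` is a hypothetical placeholder standing in for the pairwise procedure of the first or second embodiment, and representing a model as a dictionary of named parameters is an assumption made for brevity.

```python
from functools import reduce


def merge_pair(merged, other):
    """Hypothetical stand-in for the pairwise merge of the embodiments.

    Here it simply keeps existing entries and adds entries that are new;
    the real pairwise procedure would rank element pairs by weighted distance.
    """
    result = dict(merged)
    for name, params in other.items():
        result.setdefault(name, params)
    return result


def merge_all(models):
    """Fold a sequence of models into one by repeated pairwise merging."""
    if not models:
        raise ValueError("at least one acoustic model is required")
    return reduce(merge_pair, models)


# Toy models: the entry "a" of the first model wins over later duplicates.
models = [{"a": 1}, {"a": 2, "b": 3}, {"c": 4}]
bundled = merge_all(models)
```

The fold keeps the control flow independent of the pairwise procedure, so either embodiment's merge could be substituted for the placeholder.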


Abstract

The invention relates to an acoustic model combination method and device, and a voice identification method and system. The acoustic model combination method combines a plurality of acoustic models including a first acoustic model and a second acoustic model, and comprises the following steps: a distribution information obtaining step of obtaining distribution information of the modeling units of at least one of the first acoustic model and the second acoustic model, the distribution information reflecting the importance of each modeling unit in the language to be recognized; a distance calculation step of calculating, for one category of model constituent elements, the distances of the pairs formed by elements of that category from the first acoustic model and the second acoustic model; a weighting step of weighting the distances of the corresponding pairs using the distribution information; a sorting step of sorting the pairs based on the weighted distances; and a combination step of combining the first acoustic model and the second acoustic model according to the sorting result to obtain a combined acoustic model.
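The distance calculation, weighting, sorting, and combination steps can be sketched roughly as follows. This is a minimal illustration under several assumptions that the abstract does not state: each model constituent element is summarized by a mean vector, the distance is Euclidean, the weight multiplies the distance, the weight is looked up by the second model's modeling unit, and combination means tying the `n_tie` lowest-ranked pairs so that the second model's element shares the first model's element. The names `Element`, `rank_pairs`, `combine`, and `n_tie` are hypothetical.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Element:
    unit: str    # modeling unit the element belongs to
    mean: tuple  # assumption: the element is summarized by a mean vector


def rank_pairs(first, second, occupancy):
    """Return (weighted_distance, i, j) for every pair, in ascending order."""
    scored = []
    for i, e1 in enumerate(first):
        for j, e2 in enumerate(second):
            distance = dist(e1.mean, e2.mean)        # distance calculation step
            weight = occupancy.get(e2.unit, 0.0)     # distribution information
            scored.append((weight * distance, i, j)) # weighting step
    scored.sort()                                    # sorting step
    return scored


def combine(first, second, occupancy, n_tie):
    """Tie up to n_tie second-model elements to first-model elements."""
    tied = {}  # index in `second` -> index in `first`
    for _, i, j in rank_pairs(first, second, occupancy):
        if len(tied) >= n_tie:
            break
        tied.setdefault(j, i)  # keep only the best-ranked partner for j
    kept = [e for j, e in enumerate(second) if j not in tied]
    return list(first) + kept, tied


first = [Element("a", (0.0, 0.0)), Element("b", (10.0, 0.0))]
second = [Element("x", (1.0, 0.0)), Element("y", (9.0, 0.0))]
occupancy = {"x": 0.9, "y": 0.1}
combined, tied = combine(first, second, occupancy, n_tie=1)
```

With these toy values the pair formed by "y" and "b" has the smallest weighted distance, so "y" is tied to "b" while the more heavily weighted unit "x" keeps its own element. Whether important units should be tied later or earlier is a design choice that the abstract leaves open.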

Description

Technical field

[0001] The present invention generally relates to a method for merging acoustic models for automatic speech recognition (ASR), a device for merging acoustic models for automatic speech recognition, a speech recognition method and a speech recognition system, and in particular to a method and apparatus for merging multiple acoustic models, and a speech recognition method and system utilizing the merged acoustic model.

Background art

[0002] Acoustic models are one of the most important parts of a speech recognition system. To ensure recognition accuracy, a speech recognition system usually needs to use multiple acoustic models (AMs): for example, different AMs for different modeling units (such as phonemes, words, characters, initials, finals, etc.), different AMs for different languages, different AMs for different environments (e.g., AMs obtained in quiet environments, AMs obtained in noisy environme...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/14, G10L15/187
Inventor: 刘贺飞, 郭莉莉
Owner: CANON KK