A Speech Recognition Method Based on Domain Invariant Features

A domain-invariant speech recognition technology, applied in speech recognition, speech analysis, instruments, and related fields, that achieves good noise robustness, reduced dimensionality, and a reduced need for labeled data.

Active Publication Date: 2021-10-22
WUHAN UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0006] There is currently no method for applying a speech domain-invariant feature extraction model to an end-to-end speech recognition model.



Examples


Embodiment Construction

[0050] To illustrate the purpose, technical solution, advantages, and feasibility of the present invention, the invention is further described below with reference to the accompanying drawings and embodiments. The specific examples described here serve only to explain the invention, not to limit it. In addition, the technical features of the various embodiments described below may be combined with one another as long as they do not conflict.

[0051] As shown in Figure 1, a speech recognition method based on domain-invariant features comprises the following steps:

[0052] Step 1: construct the training data set. This comprises two main sub-steps, collecting speech data under different noise environments and annotating the text content corresponding to the speech, as follows:

[0053] (1.1) Collect...
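
The data set described in Step 1 can be pictured as a list of records, each holding a recording, the noise environment it was collected in, and an optional text annotation. The following is a minimal sketch only: the `Utterance` type, its field names, the file paths, and the environment labels are all hypothetical and do not come from the patent text.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Utterance:
    """One recording in the training set (all field names are hypothetical)."""
    audio_path: str            # illustrative path to the recording
    noise_env: str             # illustrative label for the noise environment
    transcript: Optional[str]  # None when the speech has no text annotation


def split_by_annotation(
    utterances: List[Utterance],
) -> Tuple[List[Utterance], List[Utterance]]:
    """Separate text-annotated utterances from unannotated ones."""
    labeled = [u for u in utterances if u.transcript is not None]
    unlabeled = [u for u in utterances if u.transcript is None]
    return labeled, unlabeled


def count_by_environment(utterances: List[Utterance]) -> Counter:
    """Count how many recordings were collected in each noise environment."""
    return Counter(u.noise_env for u in utterances)


# Illustrative corpus: every path, label, and transcript here is made up.
corpus = [
    Utterance("audio/street_0001.wav", "street", "turn on the light"),
    Utterance("audio/cafe_0001.wav", "cafe", None),
    Utterance("audio/street_0002.wav", "street", None),
    Utterance("audio/car_0001.wav", "car", "open the window"),
]

labeled, unlabeled = split_by_annotation(corpus)
print(len(labeled), len(unlabeled))
print(count_by_environment(corpus))
```

Keeping the transcript optional lets one corpus hold both annotated and unannotated speech, and recording the environment per utterance makes it easy to check that each noise condition is represented.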


Abstract

The invention proposes a speech recognition method based on domain-invariant features that applies a speech domain-invariant feature extraction model to an end-to-end speech recognition model. The feature extraction model targets the robustness problem: training it on more types of speech data yields better parameters and therefore a better domain-invariant feature extraction model. The method uses unlabeled pure speech data to train the feature extraction model and a small amount of text-annotated speech to train the end-to-end acoustic model, providing important technical support for improving the robustness of the end-to-end acoustic model. Compared with the prior art, the invention achieves higher recognition accuracy in different noise environments, a smaller speech annotation workload, and faster model training and testing.
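
The two-stage data flow described in the abstract (fit a feature extractor on unlabeled speech, then fit a recognizer on a small annotated set using the frozen features) might be sketched as follows. This is a toy illustration under strong assumptions, not the patented models: per-dimension standardization stands in for the feature extraction model, a nearest-centroid classifier stands in for the end-to-end acoustic model, and the class names and two-dimensional vectors are hypothetical.

```python
from statistics import mean, pstdev


class FeatureExtractor:
    """Stand-in for the feature extraction model (assumption: standardization)."""

    def fit(self, unlabeled):
        # Stage 1: learn statistics from feature vectors that carry no transcripts.
        columns = list(zip(*unlabeled))
        self.means = [mean(c) for c in columns]
        # Fall back to 1.0 when a dimension has zero spread.
        self.stds = [pstdev(c) or 1.0 for c in columns]
        return self

    def transform(self, x):
        return [(v - m) / s for v, m, s in zip(x, self.means, self.stds)]


class Recognizer:
    """Stand-in for the acoustic model (assumption: nearest class centroid)."""

    def __init__(self, extractor):
        self.extractor = extractor  # frozen: not refitted in stage 2

    def fit(self, labeled):
        # Stage 2: learn from a small set of (vector, label) pairs.
        groups = {}
        for x, label in labeled:
            groups.setdefault(label, []).append(self.extractor.transform(x))
        self.centroids = {
            label: [mean(c) for c in zip(*vectors)]
            for label, vectors in groups.items()
        }
        return self

    def predict(self, x):
        z = self.extractor.transform(x)

        def squared_distance(label):
            return sum((a - b) ** 2 for a, b in zip(z, self.centroids[label]))

        return min(self.centroids, key=squared_distance)


# Made-up data: six unlabeled vectors, two labeled ones.
unlabeled = [[0.0, 1.0], [1.0, 0.0], [0.2, 0.9], [0.9, 0.1], [0.1, 0.8], [0.8, 0.2]]
labeled = [([0.0, 1.0], "yes"), ([1.0, 0.0], "no")]

extractor = FeatureExtractor().fit(unlabeled)
recognizer = Recognizer(extractor).fit(labeled)
print(recognizer.predict([0.1, 0.9]))
```

The point of the sketch is the separation of concerns: the extractor never sees a label, so adding recordings from a new environment to the first stage requires no additional annotation.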

Description

Technical field

[0001] The invention belongs to the field of speech recognition and relates to a robust speech recognition method for real noise environments, specifically a speech recognition method based on domain-invariant features that can be quickly and conveniently extended to new noise environments.

Background technique

[0002] In recent years, end-to-end speech recognition models based on deep learning and sequence-to-sequence computing frameworks have become increasingly widespread. In actual use, however, such models inevitably encounter a variety of noise environments, in which recognition accuracy drops sharply. Noise robustness refers to the ability of a speech recognition model to maintain its original recognition accuracy in a noisy environment.

[0003] At present, the common methods for improving the noise robustness of a speech recognition model are: (1) adding a feature enhancement model for sp...
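
The definition of noise robustness given in the background (keeping the original recognition accuracy in a noisy environment) can be made concrete by comparing an error measure on clean and noisy recordings of the same sentence. The sketch below assumes word error rate (WER) as that measure, which the source text does not specify; the function names and example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def wer_increase(clean_wer: float, noisy_wer: float) -> float:
    """Absolute rise in WER under noise; a robust model keeps this near zero."""
    return noisy_wer - clean_wer


# Made-up transcripts for one sentence recognized in two conditions.
reference = "turn on the light"
clean_wer = word_error_rate(reference, "turn on the light")
noisy_wer = word_error_rate(reference, "turn of the night")
print(clean_wer, noisy_wer, wer_increase(clean_wer, noisy_wer))
```

Here the noisy hypothesis has two substituted words out of four, so its WER is 0.5 while the clean WER is 0.0.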


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L15/20, G10L15/06, G10L15/16
CPC: G10L15/063, G10L15/16, G10L15/20
Inventor: 熊盛武李梦林泽华徐珊李小其董元杰路雄博刁月月
Owner: WUHAN UNIV OF TECH