Voice adaptive recognition method, system and device and storage medium

A recognition method and adaptive technique, applied in speech recognition, speech analysis, and related fields, that improves the recognition rate.

Pending Publication Date: 2022-03-15
UNIV OF SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

Adaptation of end-to-end models has to mitigate mismatch in both acoustic and linguistic conditions, which makes it more challenging than adaptation of hybrid models.



Examples


Embodiment 1

[0036] As shown in Figure 2, a speech adaptive recognition method mainly includes the following steps:

[0037] Step 1. Use the acoustic model to encode the input speech sequence into a deep feature sequence.

[0038] In the embodiment of the present invention, the acoustic model is a CTC-based end-to-end model.

[0039] Step 2. Apply the CTC criterion to the deep feature sequence, converting it into a probability distribution sequence. During the conversion, each deep feature is passed through several hidden layers of the acoustic model. For at least one of these hidden layers, the corresponding attention-based gated scaling adaptation layer generates a scaling transformation vector for the corresponding deep feature, and that vector is used to reweight the activation output of the hidden layer.
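As a rough illustration of what a "probability distribution sequence" looks like for a CTC-based model, the following minimal sketch applies a softmax to each frame's output scores and then reads out a label sequence with greedy CTC decoding (merge repeated labels, then drop blanks). The blank index, the toy scores, and the use of greedy decoding are assumptions made for this example; the text above does not say how the probability sequence is decoded.

```python
import math

BLANK = 0  # assumed index of the CTC blank symbol


def softmax(logits):
    """Turn one frame's output scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ctc_greedy_decode(posteriors, blank=BLANK):
    """Pick the best label per frame, merge repeats, then drop blanks."""
    best = [max(range(len(p)), key=p.__getitem__) for p in posteriors]
    decoded, prev = [], None
    for label in best:
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded


# Toy scores: 6 frames, 3 output units (0 = blank, 1 and 2 = labels).
frame_logits = [
    [0.1, 2.0, 0.3],  # label 1
    [0.2, 1.5, 0.1],  # label 1 again (repeat, merged)
    [3.0, 0.1, 0.2],  # blank
    [0.1, 2.2, 0.0],  # label 1 (kept, because a blank separates it)
    [0.0, 0.3, 1.9],  # label 2
    [2.5, 0.2, 0.1],  # blank
]
posteriors = [softmax(f) for f in frame_logits]
hypothesis = ctc_greedy_decode(posteriors)
print(hypothesis)  # [1, 1, 2]
```

The blank symbol is what lets CTC emit the same label twice in a row: the two occurrences of label 1 survive only because a blank frame sits between them.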

[0040] In the embodiment of the present invention, the attention-based gated scaling ...
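The reweighting described in Step 2 can be sketched as follows. Because the detailed description of the adaptation layer is cut off in this record, the exact formulation below is an assumption rather than the patented design: scaled dot-product self-attention over the frames of one hidden layer summarizes context, a sigmoid gate maps that context to a scaling vector, and the vector multiplies the hidden activation element-wise. The names `gated_scaling`, `gate_weights`, and `gate_bias` are hypothetical.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def self_attention(frames):
    """Scaled dot-product self-attention with queries = keys = values = frames."""
    dim = len(frames[0])
    contexts = []
    for q in frames:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in frames]
        weights = softmax(scores)
        contexts.append(
            [sum(w * v[d] for w, v in zip(weights, frames)) for d in range(dim)]
        )
    return contexts


def gated_scaling(hidden, gate_weights, gate_bias):
    """Reweight each hidden activation with a per-frame scaling vector.

    hidden:       T x D activations from one hidden layer
    gate_weights: D x D matrix (hypothetical adaptation-layer parameters)
    gate_bias:    length-D vector (hypothetical)
    """
    contexts = self_attention(hidden)
    adapted = []
    for h, c in zip(hidden, contexts):
        # Assumed gate: sigmoid of a linear map of the attention context.
        scale = [
            sigmoid(sum(w * x for w, x in zip(row, c)) + b)
            for row, b in zip(gate_weights, gate_bias)
        ]
        adapted.append([s * a for s, a in zip(scale, h)])
    return adapted


# Toy input: 3 frames, 2 hidden units.
hidden = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]
zero_w = [[0.0, 0.0], [0.0, 0.0]]
adapted = gated_scaling(hidden, zero_w, [0.0, 0.0])
```

With all-zero gate parameters every gate equals 0.5, so each activation is simply halved; a large positive bias drives the gate toward 1 and leaves the hidden layer nearly unchanged. In this sketch only the gate parameters would be learned for adaptation, which keeps the acoustic model's own weights untouched.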

Embodiment 2

[0096] The present invention also provides a speech recognition system, which mainly includes an acoustic model and an attention-based gated scaling adaptation network composed of a plurality of attention-based gated scaling adaptation layers. Using the acoustic model and this network, speech recognition is carried out according to the solutions introduced in the foregoing method embodiment. The specific recognition process has been described in detail in that embodiment and is not repeated here.

Embodiment 3

[0098] The present invention also provides a processing device which, as shown in Figure 6, mainly includes: one or more processors; and a memory for storing one or more programs; wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the methods provided in the foregoing embodiments.

[0099] Further, the processing device further includes at least one input device and at least one output device; in the processing device, the processor, memory, input device, and output device are connected through a bus.

[0100] In the embodiment of the present invention, the specific types of the memory, input device and output device are not limited; for example:

[0101] The input device can be a touch screen, an image acquisition device, a physical button or a mouse, etc.;

[0102] The output device can be a display terminal;

[0103] The memory may be random access memory (Random Access Memory, RAM), or non-volatile memory ...


Abstract

The invention discloses a voice adaptive recognition method, system, device and storage medium that address online speaker adaptation in acoustic modeling for speech recognition. Adaptive learning of the main network (the acoustic model) is realized by introducing an auxiliary network (a network formed by a plurality of attention-based gated scaling adaptation layers). The auxiliary network uses a self-attention mechanism to learn more distinctive speaker-specific features, which improves the recognition rate.

Description

Technical field

[0001] The invention relates to the technical field of speech signal processing, in particular to a speech recognition method, system, device and storage medium.

Background

[0002] In recent years, with the widespread application of neural networks in the field of speech recognition, the performance of speech recognition systems has been significantly improved. There are currently two mainstream speech recognition systems: one is the HMM-based speech recognition system (Graves A, Fernández S, Gomez F, et al. Connectionist temporal classification: labeling unsegmented sequence data with recurrent neural networks[C]//Proceedings of the 23rd international conference on Machine learning. ACM, 2006: 369-376.), and the other is the end-to-end speech recognition system (Maas A, Xie Z, Jurafsky D, et al. Lexicon-free conversational speech recognition with neural networks[C]//Proceedings of the 2015 Conference of the North American Chapter of the Association for Computati...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/06, G10L15/07, G10L15/16
CPC: G10L15/063, G10L15/07, G10L15/16, G10L2015/0631, G10L2015/0633, G10L2015/0635
Inventors: 郭武, 丁枫林
Owner: UNIV OF SCI & TECH OF CHINA