Optimization method and system for single-channel speech recognition model

An optimization technique for single-channel speech recognition models, applied in speech recognition, speech analysis, and neural learning methods. It addresses the problems of complex models, a cumbersome training process, and poor performance, and yields a simpler model with improved performance.

Active Publication Date: 2021-06-22

AISPEECH CO LTD



Problems solved by technology

[0005] The invention aims to solve at least the following problems in the prior art: traditional models are complex, the training process is cumbersome, the training results are unsatisfactory, and performance is poor.


Embodiment approach

[0062] In one implementation, determining the joint error from the knowledge distillation loss and the direct loss includes:

[0063] Computing a weighted sum of the knowledge distillation loss and the direct loss according to a preset training mode to obtain the joint error.

[0064] To meet different recognition requirements, different training modes can be set for different usage environments during training. Each mode applies a different weighting ratio, so that speech recognition models suited to different needs are trained.

[0065] As this embodiment shows, setting different training modes lets the joint error of the knowledge distillation loss and the direct loss be determined with different weight ratios during training. This meets the requirements of different recognition environments and thereby improves the recognition performance of the speech recognition model.
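
The weighted sum in [0063] can be sketched as follows. This is only an illustration: the mode names, the weight values, and the choice of a convex combination (weights summing to 1) are assumptions, since the text does not specify them.

```python
# Hypothetical training modes mapped to the weight placed on the
# knowledge distillation loss. Names and values are illustrative only.
TRAINING_MODES = {
    "distillation_heavy": 0.8,
    "balanced": 0.5,
    "label_heavy": 0.2,
}


def joint_error(kd_loss: float, direct_loss: float, mode: str) -> float:
    """Weighted sum of the knowledge distillation loss and the direct loss.

    Assumes a convex combination: weight w on the distillation loss and
    (1 - w) on the direct loss, with w chosen by the preset training mode.
    """
    w = TRAINING_MODES[mode]
    return w * kd_loss + (1.0 - w) * direct_loss
```

Under this assumption, switching the mode changes only the weight, so the same training loop can serve different usage environments.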

[0066] For further specific implementation...


Abstract

An embodiment of the present invention provides an optimization method for a single-channel speech recognition model. The method includes: receiving single-speaker utterances, each with a ground-truth label vector, together with multi-speaker mixed speech; inputting the speech features extracted from each single-speaker utterance to a target teacher model to obtain a target soft label vector for each utterance; inputting the multi-speaker mixed speech to an end-to-end student model and determining the output arrangement; determining a knowledge distillation loss and a direct loss from the output label vector of each speaker in the mixed speech under the determined output arrangement; and optimizing the end-to-end student model according to the joint error determined from the knowledge distillation loss and the direct loss. An embodiment of the present invention also provides an optimization system for a single-channel speech recognition model. With these embodiments, good parameters are easier to learn and the model is comparatively simple, and the better parameters give the trained student model better performance.
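
The loss computation described in the abstract can be sketched as follows. This is a toy illustration under stated assumptions, not the patented procedure: each speaker's output is reduced to a single probability vector, the output arrangement is chosen by trying every permutation and keeping the one with the lowest cross-entropy against the ground-truth labels, and both losses are plain cross-entropies. All function names are hypothetical.

```python
import math
from itertools import permutations


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a target distribution and a predicted one."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


def best_arrangement(outputs, references):
    """Assumed rule: try every assignment of student outputs to speakers
    and keep the one with the lowest total cross-entropy."""
    n = len(references)

    def total(perm):
        return sum(cross_entropy(references[s], outputs[perm[s]]) for s in range(n))

    return min(permutations(range(n)), key=total)


def distillation_and_direct_loss(outputs, hard_labels, soft_labels):
    """Return (arrangement, knowledge distillation loss, direct loss).

    outputs:     student output distributions, one per output branch
    hard_labels: one-hot ground-truth label vectors, one per speaker
    soft_labels: teacher soft label vectors, one per speaker
    """
    perm = best_arrangement(outputs, hard_labels)
    n = len(hard_labels)
    kd = sum(cross_entropy(soft_labels[s], outputs[perm[s]]) for s in range(n)) / n
    direct = sum(cross_entropy(hard_labels[s], outputs[perm[s]]) for s in range(n)) / n
    return perm, kd, direct
```

The two returned losses are the inputs to the joint error, which in turn drives the update of the student model.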

Description

Technical field

[0001] The invention relates to the field of speech recognition, and in particular to an optimization method and system for a single-channel speech recognition model.

Background

[0002] With the development of intelligent voice technology, more and more devices offer speech recognition. Depending on their usage scenarios, some devices are equipped with a single microphone and others with multiple microphones. These are referred to as single-channel and multi-channel devices, respectively. Because a single-channel device has only one microphone, it recognizes speech poorly in conversations, such as at banquets, where several people speak at once and their voices are mixed together. For this purpose, a knowledge distillation method for single-channel multi-speaker speech recognition based on a bidirectional long short-term memory recurrent neural network, or an end-to-end sing...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G10L15/06, G10L15/16, G06N3/08, G06N3/02

CPC: G06N3/02, G06N3/08, G10L15/063, G10L15/16

Inventor: 钱彦旻, 张王优, 常煊恺

Owner: AISPEECH CO LTD
