Method and device for multi-accent speech recognition

A speech recognition and accent technology, applied in the field of model training, that addresses the large scale of the experts, redundant parameters, and the inability to adjust the model quickly, so as to improve speech enhancement performance and reduce the word error rate.

Active Publication Date: 2021-11-02

AISPEECH CO LTD

Problems solved by technology

[0004] In the course of implementing this application, the inventors found the following defects in existing technical solutions: in a multi-expert system, each expert is large and carries redundant parameters, and the model cannot be adjusted quickly according to how difficult the accents are to distinguish.

In addition, every accent must have its own expert system to focus on the information relevant to that accent, so the model carries a large amount of data.

Embodiment Construction

[0017] To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0018] Please refer to Figure 1, which shows a flow chart of an embodiment of the method for multi-accent speech recognition of the present application. In the method of this embodiment, for a single speech recognition system, adaptive layers are used to learn feature information related to accents, inc...

Abstract

The invention discloses a method and a device for multi-accent speech recognition. The method comprises: for a single speech recognition system, adding an adaptive layer in the encoding stage to learn accent-related feature information; feeding an accent representation vector into the adaptive layer as guidance information for each encoder block, where it guides the transformation function in the adaptive layer, one encoder having a plurality of encoder blocks connected in series; feeding accent-independent features into the adaptive layer at the same time; and mixing the accent-independent features with the accent representation vector to form accent-related features. The embodiments of the invention further examine the injection position of the adaptive layer, the number of accent bases, and different types of accent bases, so that better accent adaptation is achieved.
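The data flow in the abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the patented implementation: the abstract says the accent representation vector guides a transformation function but does not give its form, so the scale-and-shift form `h * scale(z) + shift(z)` is assumed here. The helper names (`linear`, `make_linear`, `encoder_block`, `adaptive_layer`, `encode`, `build_blocks`), the dimensions, and the placement of the adaptive layer after each block are likewise hypothetical.

```python
import math
import random


def linear(weights, bias, vec):
    """Return weights @ vec + bias using plain lists."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def make_linear(rng, out_dim, in_dim, bias_value=0.0):
    """Hypothetical helper: small random weights and a constant bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [bias_value] * out_dim


def adaptive_layer(h, z, scale_params, shift_params):
    """Mix an accent-independent feature h with the accent vector z.

    Assumed form: h * scale(z) + shift(z). The source only states that z
    guides a transformation function; this affine form is illustrative.
    """
    scale = linear(*scale_params, z)
    shift = linear(*shift_params, z)
    return [hi * si + bi for hi, si, bi in zip(h, scale, shift)]


def encoder_block(h, block_params):
    """Stand-in for a real encoder block: one linear map followed by tanh."""
    return [math.tanh(x) for x in linear(*block_params, h)]


def encode(features, z, blocks):
    """Run the encoder blocks in series; every block receives the same z."""
    h = features
    for block_params, scale_params, shift_params in blocks:
        h = encoder_block(h, block_params)                    # accent-independent feature
        h = adaptive_layer(h, z, scale_params, shift_params)  # accent-related feature
    return h


def build_blocks(num_blocks=3, feat_dim=4, accent_dim=2, seed=0):
    """Create per-block parameters; the scale bias is 1.0 so scale starts near 1."""
    rng = random.Random(seed)
    return [(make_linear(rng, feat_dim, feat_dim),
             make_linear(rng, feat_dim, accent_dim, bias_value=1.0),
             make_linear(rng, feat_dim, accent_dim))
            for _ in range(num_blocks)]


blocks = build_blocks()
features = [0.5, -0.2, 0.1, 0.9]
out_a = encode(features, [1.0, 0.0], blocks)  # accent vector A
out_b = encode(features, [0.0, 1.0], blocks)  # accent vector B
```

The same input features produce different encoder outputs for the two accent vectors, while all encoder block weights are shared. That sharing is what the single-system approach relies on, in contrast to keeping one expert per accent.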

Description

Technical field

[0001] The invention belongs to the technical field of model training, and in particular relates to a method and device for multi-accent speech recognition.

Background technique

[0002] In related technologies, the end-to-end (E2E) Automatic Speech Recognition (ASR) model directly optimizes the probability of the output sequence given the input acoustic features, and has made great progress on various speech corpora. One of the most pressing needs of ASR today is to support multiple accents in a single system, which the literature often calls multi-accent speech recognition. The difficulties of recognizing accented speech, such as variations in phonetics and grammar, pose serious challenges to current ASR systems. A simple approach is to build a single ASR model from mixed data (accented data from non-native speakers and standard data from native speakers). However, such models often suffer from severe performance degradation due to...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L15/07, G10L15/16, G10L15/22, G10L15/26

CPC: G10L15/07, G10L15/16, G10L15/22, G10L15/26, Y02T10/40

Inventor: 钱彦旻, 龚勋, 卢怡宙, 周之恺

Owner: AISPEECH CO LTD
