Speech recognition method and device, electronic equipment and storage medium

A speech recognition method and speech recognition model, applied in speech recognition, speech analysis, neural network learning methods, etc., that addresses problems such as the difficulty of recording large amounts of target-language speech data, small corpus size, and poor coverage and balance of pronunciation phenomena.

Active Publication Date: 2021-12-21

TENCENT TECH (SHENZHEN) CO LTD


AI Technical Summary

Problems solved by technology

However, speech recognition for target languages with few data resources (such as Tibetan) is limited by the available data resources and by how widely the language is spoken, and there has been little research on speech recognition for such languages.

[0004] In the traditional modeling method, modules such as the acoustic model, pronunciation dictionary, and language model need to be constructed separately. Because corpus resources in the target language are scarce, it is difficult to record a large amount of speech data in the target language. The result is a small corpus, a pronunciation dictionary that is difficult to construct, and related research that is confined to a single dialect.

In addition, the coverage and balance of pronunciation phenomena in such a corpus are low, so the recognition rate of an acoustic model trained on it is also low.

Detailed Description of Embodiments

[0075] To make the purpose, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions of the application are described clearly and completely below with reference to the accompanying drawings in the embodiments of the application. The described embodiments are some, but not all, of the embodiments of the technical solutions. All other embodiments obtained by persons of ordinary skill in the art from the embodiments described in the application documents, without creative effort, fall within the protection scope of the technical solutions of the present application.

[0076] Some concepts involved in the embodiments of the present application are introduced below.

[0077] Transfer learning (rendered as "migration learning" in the machine translation): refers to transferring the parameters of a trained model (pre-trained model) to a new model to help train the new model. Considering that most of the data or tasks are related, through migration le...
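The parameter transfer described in [0077] can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: models are represented as plain dictionaries mapping parameter names to lists of floats, and the function `transfer_parameters` and the names `encoder.w` and `output.w` are hypothetical.

```python
import copy

def transfer_parameters(pretrained, new_model):
    """Copy parameters from a pre-trained model into a copy of a new model.

    Hypothetical helper: a parameter is transferred only when the new model
    has a parameter with the same name and the same length. Everything else
    keeps the new model's own initial values.
    """
    result = copy.deepcopy(new_model)
    transferred = []
    for name, value in pretrained.items():
        if name in result and len(result[name]) == len(value):
            result[name] = list(value)  # reuse the trained weights
            transferred.append(name)
    return result, transferred

# Assumed example: the encoder has the same shape in both models, while the
# output layer differs because the target language has a different vocabulary.
pretrained = {"encoder.w": [0.1, 0.2, 0.3], "output.w": [0.5, 0.5]}
new_model = {"encoder.w": [0.0, 0.0, 0.0], "output.w": [0.0, 0.0, 0.0, 0.0]}

initialized, transferred = transfer_parameters(pretrained, new_model)
```

In this sketch the new model would then be trained on the target-language data starting from `initialized` rather than from random values.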

Abstract

The invention relates to the technical field of speech recognition, in particular to a speech recognition method and device, electronic equipment, and a storage medium. It can be applied to scenarios such as cloud technology, artificial intelligence, intelligent traffic, and assisted driving, and is used to recognize speech in a target language with multiple dialects efficiently and accurately. The method comprises the following steps: acquiring to-be-recognized speech data of a target language; extracting the speech acoustic features corresponding to each frame of the to-be-recognized speech data; performing deep feature extraction on the speech acoustic features to obtain corresponding dialect embedding features; encoding the speech acoustic features to obtain corresponding acoustic encoding features; and, based on the dialect embedding features and the acoustic encoding features, performing dialect speech recognition on the to-be-recognized speech data to obtain the target text information and the target dialect category corresponding to that data. Because the method learns from the dialect embedding features and the acoustic encoding features jointly, it can recognize speech in various dialects efficiently and accurately.
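The steps listed in the abstract can be sketched as a toy pipeline. This is only an illustration under loud assumptions, not the patented model: the per-frame features (log energy and zero-crossing count), the mean-pooled dialect embedding, the single-layer encoder, the random untrained weights, and all names (`frame_features`, `recognize`, `params`, `vocab`, `dialects`) are hypothetical choices made so the data flow is visible.

```python
import math
import random

def frame_features(samples, frame_len=4):
    """Split a waveform into frames and return [log energy, zero crossings] per frame."""
    feats = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
        feats.append([math.log(energy + 1e-8), float(crossings)])
    return feats

def linear(x, weights):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def rand_matrix(rows, cols, rng):
    """Random matrix standing in for trained weights (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def recognize(samples, params, vocab, dialects):
    feats = frame_features(samples)  # per-frame acoustic features
    if not feats:
        raise ValueError("input is shorter than one frame")
    # Dialect embedding: project every frame, then average over time.
    projected = [[math.tanh(v) for v in linear(f, params["dialect"])] for f in feats]
    dialect_emb = [sum(col) / len(projected) for col in zip(*projected)]
    # Acoustic encoding: one encoded vector per frame.
    encoded = [[math.tanh(v) for v in linear(f, params["encoder"])] for f in feats]
    # Joint decoding: each frame's token is scored from encoding + dialect embedding.
    tokens = []
    for enc in encoded:
        scores = linear(enc + dialect_emb, params["token_head"])
        tokens.append(vocab[scores.index(max(scores))])
    dialect_scores = linear(dialect_emb, params["dialect_head"])
    dialect = dialects[dialect_scores.index(max(dialect_scores))]
    return tokens, dialect

rng = random.Random(0)
vocab = ["a", "b", "c"]                      # placeholder output symbols
dialects = ["dialect_1", "dialect_2"]        # placeholder dialect categories
params = {
    "dialect": rand_matrix(3, 2, rng),       # 2 features -> 3-dim dialect embedding
    "encoder": rand_matrix(4, 2, rng),       # 2 features -> 4-dim acoustic encoding
    "token_head": rand_matrix(len(vocab), 7, rng),      # 4 + 3 -> token scores
    "dialect_head": rand_matrix(len(dialects), 3, rng), # 3 -> dialect scores
}
samples = [math.sin(0.3 * i) for i in range(32)]  # synthetic stand-in for audio
tokens, dialect = recognize(samples, params, vocab, dialects)
```

Because the weights are random, the outputs carry no meaning. The sketch only shows how one input yields both a text sequence and a dialect category from the combined dialect embedding and acoustic encoding.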

Description

Technical field

[0001] The present application relates to the technical field of speech recognition, and in particular to a speech recognition method, device, electronic equipment and storage medium.

Background

[0002] With the rapid development of science and technology, services based on speech recognition technology, such as smart speakers and in-vehicle systems, have been widely used in people's daily life and work.

[0003] In related technologies, research on speech recognition is mainly concentrated on common languages with relatively rich data resources, for which the amount of speech data has grown past tens of thousands or even hundreds of thousands of hours. However, speech recognition for target languages with few data resources (such as Tibetan) is limited by data resources and the scope of language dissemination, and there has been little research on speech recognition for such languages.

[0004] In the traditional modeling meth...

Application Information

Patent Type & Authority: Applications (China)

IPC (8): G10L15/00, G10L15/02, G10L15/06, G10L15/16, G06N3/08

CPC: G10L15/005, G10L15/02, G10L15/063, G10L15/16, G06N3/084

Inventor: 颜京豪

Owner: TENCENT TECH (SHENZHEN) CO LTD
