Speech recognition method and system

A speech recognition method and speech recognition model, applied in speech recognition, speech analysis, instruments, etc. It addresses problems such as high data procurement costs, frameworks that are difficult to modify, and restrictions on integrating advanced models, with the effect of reducing purchase costs, ensuring consistency, and improving accuracy.

Pending Publication Date: 2022-04-15

北京恒天瑞讯科技有限公司

Problems solved by technology

Of the technologies described above, the former treats the model as a black box: it is trained on speech paired with text annotations and produces text directly from input speech. Its disadvantage is that training requires very large amounts of speech data and annotations, which entails extremely high data procurement costs.

The latter technique requires training data and parameters to be supplied strictly according to the framework's rules, and its intermediate stages are very tightly coupled. As a result, the framework is difficult to modify, which limits the integration of other advanced models.

Embodiment

[0057] As shown in Figure 1, in a first aspect, an embodiment of the present invention provides a speech recognition method comprising the following steps:

[0058] S1. Construct a speech recognition model based on the Transformer model and the WFST model;

[0059] S2. Obtain an audio signal to be recognized;

Abstract

The invention discloses a voice recognition method and system, and relates to the technical field of voice recognition. The method comprises the following steps: constructing a speech recognition model based on a Transformer model and a WFST model; acquiring an audio signal to be recognized; detecting the voice signal in the audio signal to be recognized to obtain a target voice signal; performing transformation processing on the target voice signal to obtain a voice feature vector sequence; performing transformation processing on the voice feature vector sequence to obtain a target voice feature sequence; inputting the target voice feature sequence into the speech recognition model; recognizing the target voice feature sequence through the Transformer model in the speech recognition model and outputting a phoneme sequence; and inputting the phoneme sequence into the WFST model in the speech recognition model and outputting a Chinese character sequence to complete speech recognition. According to the invention, the data purchase cost can be effectively reduced while the accuracy of speech recognition is ensured.
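The steps listed in the abstract can be sketched as a pipeline skeleton. This is a minimal illustration only, not the patented implementation: the energy-based speech detector, the two-dimensional frame features, and the mean normalization are assumptions chosen to keep the example small, and the Transformer and WFST stages are represented by hypothetical stand-ins (a stub callable and a greedy lexicon lookup). All function names are invented for this sketch.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Frame = List[float]


def detect_speech(samples: Sequence[float], frame_len: int = 4,
                  threshold: float = 0.01) -> List[Frame]:
    """Toy speech detection (assumption): keep frames with enough mean energy."""
    frames = [list(samples[i:i + frame_len])
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    return [f for f in frames if sum(x * x for x in f) / len(f) > threshold]


def extract_features(frames: List[Frame]) -> List[List[float]]:
    """First transformation (assumption): mean energy and zero-crossing count."""
    feats = []
    for f in frames:
        energy = sum(x * x for x in f) / len(f)
        crossings = sum(1 for a, b in zip(f, f[1:]) if a * b < 0)
        feats.append([energy, float(crossings)])
    return feats


def normalize(features: List[List[float]]) -> List[List[float]]:
    """Second transformation (assumption): subtract the per-dimension mean."""
    if not features:
        return []
    dims = len(features[0])
    means = [sum(f[d] for f in features) / len(features) for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in features]


def lexicon_decoder(lexicon: Dict[Tuple[str, ...], str]) -> Callable[[List[str]], str]:
    """Stand-in for the WFST stage: greedy longest-match phoneme-to-text lookup."""
    max_len = max(len(key) for key in lexicon)

    def decode(phonemes: List[str]) -> str:
        out, i = [], 0
        while i < len(phonemes):
            for n in range(min(max_len, len(phonemes) - i), 0, -1):
                key = tuple(phonemes[i:i + n])
                if key in lexicon:
                    out.append(lexicon[key])
                    i += n
                    break
            else:
                i += 1  # no entry starts here; skip this phoneme
        return "".join(out)

    return decode


def recognize(samples: Sequence[float],
              acoustic_model: Callable[[List[List[float]]], List[str]],
              decoder: Callable[[List[str]], str]) -> str:
    """Run the stages in the order given in the abstract."""
    target_speech = detect_speech(samples)
    target_features = normalize(extract_features(target_speech))
    phonemes = acoustic_model(target_features)  # Transformer stage in the patent
    return decoder(phonemes)                    # WFST stage in the patent


# Hypothetical stand-ins for trained models.
stub_acoustic_model = lambda feats: ["ni3", "hao3"] if feats else []
stub_decoder = lexicon_decoder({("ni3", "hao3"): "你好"})
```

Because the acoustic model and the decoder are passed in as separate callables, either one can be replaced without touching the other, which mirrors the loose coupling the document describes as a goal.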

Description

Technical field

[0001] The present invention relates to the technical field of voice recognition, and in particular to a voice recognition method and system.

Background technique

[0002] Speech recognition is essentially the process of converting an audio sequence into a text sequence, that is, finding the text sequence with the highest probability given a speech input. By Bayes' rule, the speech recognition problem can be decomposed into the conditional probability of the speech given a text sequence and the prior probability of the text sequence. The model obtained by modeling the conditional probability is the acoustic model. The model obtained by modeling the prior probability of the text sequence is the language model.

[0003] An acoustic model converts speech into an acoustic representation, that is, it finds the probability that a given speech segment originates from a given acoustic symbol. For acoustic sym...
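The Bayesian decomposition described in paragraph [0002] can be written out as follows. The symbols are assumptions introduced here for illustration: $X$ denotes the input speech and $W$ a candidate text sequence.

```latex
W^{*} = \arg\max_{W} P(W \mid X)
      = \arg\max_{W} \frac{P(X \mid W)\, P(W)}{P(X)}
      = \arg\max_{W} \underbrace{P(X \mid W)}_{\text{acoustic model}} \; \underbrace{P(W)}_{\text{language model}}
```

The denominator $P(X)$ drops out because it does not depend on $W$, so it cannot change which text sequence attains the maximum.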

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L15/16, G10L15/02, G10L25/78

Inventor: 姜松

Owner: 北京恒天瑞讯科技有限公司
