Speech recognition model training method and device, equipment and medium

A speech recognition model training technique, applied in speech recognition, speech analysis, instruments, and related fields. It addresses the problems of heavy computation, inefficient text output, and long translation lag, and achieves the effect of enhancing audio information and saving labor costs.

Pending Publication Date: 2021-12-31

PING AN TECH (SHENZHEN) CO LTD

Cites: 0; Cited by: 6


AI Technical Summary

Problems solved by technology

[0003] Online speech translation generally involves two steps. The first is speech recognition, which converts the first-language speech signal input by the user into text. The second is online translation of that text by a machine translation device to obtain text in the second language, which is finally provided to the user as second-language text or voice information. However, speech recognition models in existing schemes are usually trained on a large number of manually labeled speech samples, and manual labeling is inefficient. The trained models also have complex structures and require a large amount of calculation, so text is output slowly and the translation lags noticeably. The result is poor real-time performance for online voice translation and low user satisfaction.



Embodiment Construction

[0029] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are some of the embodiments of the present invention, but not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0030] The speech recognition model training method provided by the present invention can be applied in the application environment shown in Figure 1, where the client (computer device or terminal) communicates with the server through the network. Clients (computer devices or terminals) include but are not limited to personal computers, notebook computers, smart phones, tablet computers and portable wearable devices. The server ...


Abstract

The invention relates to the field of artificial intelligence and provides a speech recognition model training method, device, equipment and medium. The method comprises the following steps: obtaining a speech sample set containing speech samples; inputting a speech sample into an initial recognition model; obtaining a to-be-processed audio clip through audio enhancement processing; performing teacher acoustic feature extraction through a teacher network in the initial recognition model to obtain a first feature vector, and performing student acoustic feature extraction through a student network in the initial recognition model to obtain a second feature vector; performing alignment comparison processing in combination with a dynamic queue in the teacher network to obtain a loss value; and, when the loss value does not reach a preset convergence condition, iteratively updating the model until convergence to obtain a trained speech recognition model. The invention thereby performs speech recognition jointly through the teacher network and the student network and improves training efficiency. The method is suitable for the field of artificial intelligence and can further promote the construction of smart cities.
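The alignment comparison step can be illustrated with a small sketch. The abstract does not give the loss formula, so the InfoNCE-style contrastive loss, the cosine similarity, the `temperature` value, the queue size, and all function names below are assumptions chosen for illustration, not the patent's stated method. The sketch treats the teacher vector for the current clip as the positive and earlier teacher vectors held in a fixed-size queue as negatives.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def queue_contrastive_loss(student_vec, teacher_vec, queue, temperature=0.1):
    """InfoNCE-style loss (an assumption, not the patent's stated formula).

    The teacher vector for the same clip is the positive; teacher vectors
    from earlier clips held in `queue` act as negatives.
    """
    logits = [cosine(student_vec, teacher_vec) / temperature]
    logits += [cosine(student_vec, neg) / temperature for neg in queue]
    # Numerically stable log-sum-exp over positive and negative logits.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[0]


# Fixed-size FIFO queue: appending beyond maxlen drops the oldest entry.
queue = deque(maxlen=4)
for old_teacher_vec in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]):
    queue.append(old_teacher_vec)

teacher_vec = [0.0, 0.0, 1.0]  # hypothetical first feature vector
aligned_loss = queue_contrastive_loss([0.0, 0.1, 0.9], teacher_vec, queue)
misaligned_loss = queue_contrastive_loss([0.9, 0.1, 0.0], teacher_vec, queue)
queue.append(teacher_vec)  # enqueue the newest teacher vector for later steps
```

In this toy setup a student vector that points in nearly the same direction as the teacher vector yields a much smaller loss than one that resembles a queued negative, which is the behavior a training loop would exploit when iterating until the convergence condition is met.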

Description

Technical field

[0001] The invention relates to the field of speech recognition in artificial intelligence, and in particular to a speech recognition model training method, device, computer equipment and storage medium.

Background technique

[0002] Speech translation is the process of converting one natural language (the source language) into another natural language (the target language). Unlike traditional machine translation, speech translation takes speech directly as input and produces text as output. As international communication increases, people communicate across different languages more and more frequently. To overcome language barriers, client-based online voice translation has been widely used.

[0003] Online speech translation generally involves two steps. The first is to perform speech recognition, that is, to convert the speech signal of the first language input by the user into text; the second is to translate th...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L15/06, G10L15/02

CPC: G10L15/063, G10L15/02

Inventor: 李泽远, 王健宗

Owner: PING AN TECH (SHENZHEN) CO LTD
