Methods and devices for training voice recognition model and recognizing voice

A technique for training speech recognition models and recognizing speech, applicable to speech recognition and speech analysis. It addresses the confined space inside a vehicle, the poor generalization ability of speech recognition models, and the difficulty of improving their accuracy.

Inactive Publication Date: 2020-02-07

BEIJING DIDI INFINITY TECH & DEV

6 Cites, 5 Cited by


AI Technical Summary

Problems solved by technology

[0003] However, because the space inside a car is small, in-car noise is complex. In addition to human voices, there is engine roar, friction noise between the vehicle and the road while driving, and noise from running on-board equipment. Moreover, differences in vehicle structure, cabin size, and the installation position and configuration of on-board equipment make in-car noise vary considerably from one vehicle to another.

Therefore, if the training samples for a speech recognition model are not carefully selected to cover different vehicle models and different external conditions, the resulting model generalizes poorly, which leads to inaccurate recognition of in-car speech.

In practice, however, it is difficult to obtain comprehensive training samples covering different vehicle models and different external conditions, which makes it hard to improve the accuracy of the speech recognition model.


Examples


Embodiment 1

[0101] Figure 1 is a schematic flow chart of a method for training a speech recognition model provided in an embodiment of the present application. The specific steps are as follows:

[0102] S101: Obtain multiple pieces of basic voice information.

[0103] S102: Expand the basic voice information based on noise speech information in various environments and/or in-vehicle impulse response information corresponding to different types of vehicles, to obtain sample speech information.

[0104] S103: Train a speech recognition model based on the sample speech information and an actual speech recognition result corresponding to the sample speech information; the speech recognition model is used to perform speech recognition on the speech to be recognized.
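The visible text does not say how the expansion in S102 is carried out. As a minimal sketch, assuming the common augmentation approach of convolving clean speech with an impulse response and then mixing in noise at a chosen signal-to-noise ratio, the step could look like the following. The function names, the list-of-floats waveform representation, and the 10 dB default are all hypothetical and not taken from the patent.

```python
import math


def mean_power(signal):
    """Average squared amplitude of a waveform given as a list of floats."""
    return sum(x * x for x in signal) / len(signal)


def add_noise(speech, noise, snr_db):
    """Mix noise into speech at the requested SNR in dB (assumes non-silent noise)."""
    # Repeat the noise clip so that it covers the whole utterance.
    tiled = [noise[i % len(noise)] for i in range(len(speech))]
    # Pick the gain so that speech power / scaled noise power == 10 ** (snr_db / 10).
    gain = math.sqrt(mean_power(speech) / (mean_power(tiled) * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, tiled)]


def apply_impulse_response(speech, impulse_response):
    """Convolve speech with an impulse response, truncated to the original length."""
    out = [0.0] * len(speech)
    for i, s in enumerate(speech):
        for j, h in enumerate(impulse_response):
            if i + j < len(out):
                out[i + j] += s * h
    return out


def expand(basic_utterances, noises, impulse_responses, snr_db=10.0):
    """Produce one sample per (utterance, impulse response, noise) combination."""
    samples = []
    for speech in basic_utterances:
        for ir in impulse_responses:
            reverberant = apply_impulse_response(speech, ir)
            for noise in noises:
                samples.append(add_noise(reverberant, noise, snr_db))
    return samples
```

Under this assumption, each expanded sample would keep the transcript of the basic utterance it came from, so the training labels used in S103 need no re-annotation.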

[0105] The above S101 to S103 will be described in detail below respectively.

[0106] I: In the above S101, the basic voice information refers to the voice information including the voice of a pe...

Embodiment 2

[0164] As shown in Figure 4, an embodiment of the present application also provides a method for recognizing speech, the method comprising:

[0165] S401. Acquire a trained speech recognition model.

[0166] The speech recognition model is trained based on sample speech information and the actual speech recognition results corresponding to the sample speech information. The sample speech information is obtained by expanding basic voice information based on noise speech information in various environments and/or in-vehicle impulse response information corresponding to different types of vehicles.

[0167] For the specific training method of the speech recognition model, refer to the above-mentioned first embodiment, which will not be repeated here.

[0168] S402. After receiving the speech information to be recognized, input the speech information to be recognized into the speech recognition model, and obtain a speech recognition result correspond...

Embodiment 3

[0171] An embodiment of the present application provides a device for training a speech recognition model. Figure 5 is a schematic diagram of the architecture of this device, which includes a first acquisition module 501, an expansion processing module 502, and a training module 503. Specifically:

[0172] The first acquisition module 501 is configured to obtain a plurality of pieces of basic voice information;

[0173] The expansion processing module 502 is configured to expand the basic speech information based on noise speech information in various environments and/or in-vehicle impulse response information corresponding to different types of vehicles, to obtain sample speech information;

[0174] The training module 503 is configured to train a speech recognition model based on the sample speech information and an actual speech recognition result...
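The three modules can be read as stages of a simple pipeline. The sketch below is one hypothetical way to wire them together, not the device's actual design: the class name `TrainingDevice`, the `Utterance` representation, and the callable signatures are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical representation: (waveform, transcript).
Utterance = Tuple[List[float], str]


@dataclass
class TrainingDevice:
    # Stands in for the first acquisition module 501.
    acquire: Callable[[], Sequence[Utterance]]
    # Stands in for the expansion processing module 502.
    expand: Callable[[Sequence[Utterance]], Sequence[Utterance]]
    # Stands in for the training module 503; returns the trained model.
    train: Callable[[Sequence[Utterance]], object]

    def run(self) -> object:
        """Run acquisition, expansion, and training in order."""
        basic = self.acquire()
        samples = self.expand(basic)
        return self.train(samples)
```

Keeping each module as a plain callable makes it easy to swap in a different expansion strategy or training routine without touching the others.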


Abstract

The invention provides methods and devices for training a voice recognition model and recognizing voice. The method for training the voice recognition model comprises the following steps: acquiring a plurality of basic voice messages; expanding the basic voice messages based on noise voice information under various environments and/or in-vehicle impulse response information corresponding to different types of vehicles to obtain sample voice information; and training the voice recognition model based on the sample voice information and an actual voice recognition result corresponding to the sample voice information, wherein the voice recognition model is used for carrying out voice recognition on voice to be recognized. By means of the methods and devices in the embodiment, the voice recognition model gains stronger generalization capability, its accuracy is improved, and the accuracy of voice recognition is improved in turn.

Description

Technical field

[0001] The present application relates to the technical field of machine learning and, in particular, to a method and device for training a speech recognition model and recognizing speech.

Background

[0002] In recent years, with the continuous promotion of voice products, voice input has been accepted by more and more people as an important means of human-computer interaction. For example, in the field of online car-hailing, it is often necessary to capture the voice of the service provider or service requester in the vehicle through on-board equipment and to recognize the captured voice with a speech recognition model.

[0003] However, because the space inside a car is small, in-car noise is complex. In addition to human voices, there is engine roar, friction noise between the vehicle and the road while driving, and noise from running on-board equipment. The structure of the car is diffe...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L15/06, G10L15/20, G10L15/26, G10L21/0208

CPC: G10L15/063, G10L15/20, G10L15/26, G10L21/0208

Inventor: 赵帅江, 赵茜, 罗讷

Owner: BEIJING DIDI INFINITY TECH & DEV
