Speech recognition method and device

A speech recognition and phoneme technology, applied in the computer field. It addresses problems such as dependence on labeled data, neglect of speech phase information, and a deficient ability to model complex speech characteristics, so as to reduce the cost and time of manual labeling and improve recognition performance.

Pending Publication Date: 2022-05-27

JD DIGITS HAIYI INFORMATION TECHNOLOGY CO LTD

Cites: 0. Cited by: 3.



Problems solved by technology

[0005] In view of this, the embodiments of the present invention provide a speech recognition method and device that address the data dependence and speech representation problems of speech recognition across business fields and application scenarios. The method makes effective use of the large amount of unlabeled audio data in existing speech recognition products to improve recognition performance, reduce the cost and time of manual labeling, and improve labeling accuracy. It is suitable for very large-scale speech recognition training, and it addresses two shortcomings of existing technologies: ignoring speech phase information and a deficient ability to model complex speech characteristics.

Embodiment Construction

[0032] Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, which include various details of the embodiments of the present invention to facilitate understanding and should be considered as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted from the following description for clarity and conciseness.

[0033] Figure 1 is a schematic diagram of the main steps of a speech recognition method according to an embodiment of the present invention. As shown in Figure 1, the method mainly includes the following steps S101 to S102:

[0034] Step S101: Extract the pre-training feature corresponding to the unl...

Abstract

The invention discloses a speech recognition method and device, and relates to the technical field of computers. A specific embodiment of the method comprises: extracting pre-training features corresponding to an unlabeled first audio data sample through a feature extraction network; obtaining, based on the pre-training features, a normalized weight vector over the phonemes of the first audio data sample through a feature mapping network; training a speech recognition model with the normalized weight vector as the training target for the first audio data sample and the label of a labeled second audio data sample as the training target for the second audio data sample; and performing speech recognition with the trained model. This embodiment addresses the data dependence and speech representation problems of speech recognition, improves recognition performance by making effective use of unlabeled audio data in speech recognition products, reduces the cost of manual labeling, and addresses the prior art's neglect of speech phase information and its deficient ability to model complex speech characteristics.
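The training setup described in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: the linear feature mapping, the softmax normalization, the cross-entropy loss, and every name in it (`soft_targets`, `mapping_weights`, `combined_loss`) are assumptions chosen only to show how a normalized phoneme weight vector could act as a soft training target for unlabeled audio alongside ordinary labels for labeled audio.

```python
import numpy as np

def softmax(logits):
    """Normalize each row into a weight vector that sums to 1."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def soft_targets(pretrain_features, mapping_weights):
    """Hypothetical feature mapping network: a single linear layer plus softmax.

    pretrain_features: (frames, feature_dim) output of a feature extraction network.
    Returns a (frames, num_phonemes) array of normalized phoneme weights.
    """
    return softmax(pretrain_features @ mapping_weights)

def one_hot(labels, num_phonemes):
    """Turn integer phoneme labels into one-hot target rows."""
    return np.eye(num_phonemes)[labels]

def cross_entropy(pred_probs, targets, eps=1e-12):
    """Mean cross-entropy; accepts soft or one-hot targets."""
    return float(-(targets * np.log(pred_probs + eps)).sum(axis=-1).mean())

def combined_loss(logits_unlabeled, targets_unlabeled, logits_labeled, labels):
    """Sum of the soft-target loss (unlabeled) and the label loss (labeled)."""
    num_phonemes = logits_labeled.shape[-1]
    loss_soft = cross_entropy(softmax(logits_unlabeled), targets_unlabeled)
    loss_hard = cross_entropy(softmax(logits_labeled), one_hot(labels, num_phonemes))
    return loss_soft + loss_hard

# Toy data: 5 unlabeled frames, 3 labeled frames, 8-dim features, 4 phonemes.
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 8))          # stand-in for pre-training features
mapping_weights = rng.normal(size=(8, 4))   # stand-in for the mapping network
targets = soft_targets(features, mapping_weights)

loss = combined_loss(
    logits_unlabeled=rng.normal(size=(5, 4)),  # recognizer outputs, unlabeled batch
    targets_unlabeled=targets,
    logits_labeled=rng.normal(size=(3, 4)),    # recognizer outputs, labeled batch
    labels=np.array([0, 2, 1]),
)
```

In this sketch the two losses are simply added with equal weight; how the two kinds of training target are actually balanced is not stated in the text above.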

Description

Technical field

[0001] The present invention relates to the field of computer technology, and in particular, to a speech recognition method and device.

Background

[0002] Speech recognition technology addresses the conversion of speech audio signals into text. Building on the results of speech recognition, the integration of natural language understanding, multi-modal fusion, and other technologies enables human-computer interaction. Current speech recognition systems usually adopt a supervised training scheme: the collected audio data is annotated manually, and a classifier for speech recognition is trained from the original audio data and its features, with the text annotation as the final target. Two types of speech recognition technology are in common use today. One is a hybrid framework based on a hidden Markov model and a deep neural network (HMM-DNN), which is divided into two modules: acoustic model and ...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L15/02, G10L15/06, G10L25/30

CPC: G10L15/02, G10L15/063, G10L25/30, G10L2015/025, G10L2015/0631

Inventor: 雪巍, 范璐, 丁国宏

Owner: JD DIGITS HAIYI INFORMATION TECHNOLOGY CO LTD
