Speech recognition method and system based on contrastive predictive coding

A speech recognition technology based on predictive coding, applied in speech analysis and related instruments. It addresses category-specific features that are missing or distorted and that therefore degrade classification results.

Active Publication Date: 2022-07-22

北京信工博特智能科技有限公司



AI Technical Summary

Problems solved by technology

For classification, it matters whether the data in each category is sufficient and representative. If the data is insufficient or atypical, features specific to that category will be missing or distorted, which degrades the final classification result.



Embodiment Construction

[0067] To further explain the content, characteristics and effects of the present invention, the following embodiments are described in detail with reference to the accompanying drawings.

[0068] Referring to Figure 1 to Figure 4, a speech recognition method based on contrastive predictive coding comprises:

[0069] S1. Collect A voice files for each voice category and preprocess each voice file to obtain PCM-encoded voice time series data, where A is a natural number greater than 1;
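A minimal sketch of the preprocessing in S1, assuming the collected voice files are 16-bit mono WAV files (the source does not state the container format or bit depth). The function name load_pcm_series and the scaling to [-1.0, 1.0) are illustrative assumptions, not part of the patent text.

```python
import array
import wave


def load_pcm_series(path):
    """Read a 16-bit mono PCM WAV file and return its samples as floats.

    Hypothetical helper: the patent only says each file is preprocessed
    into PCM-encoded time series data; the format checks and the
    normalization below are assumptions made for this sketch.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        raw = wav.readframes(wav.getnframes())

    # Interpret the raw bytes as signed 16-bit integers.
    samples = array.array("h")
    samples.frombytes(raw)

    # Scale integer samples to floats in [-1.0, 1.0).
    return [s / 32768.0 for s in samples]
```

Calling load_pcm_series once per collected file would give one time series per voice file, which can then be grouped by voice category for the pairing step.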

[0070] S2. Construct a paired data set from the voice time series data. The paired data set contains N triples (X1, X2, Y), where X1 is the first voice time series data of the triple, X2 is the second voice time series data of the triple, and the label Y is 0 for a same-category pair and 1 for a different-category pair; each item in the same-category pairing set and each item in the different-category pairing set is composed o...
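The triple construction in S2 can be sketched as follows. Because the paragraph is truncated, how the N triples are actually sampled is not known; this sketch assumes every possible pair is enumerated, shuffled, and cut to N. The names build_pairs, series_by_category, and n_pairs are hypothetical.

```python
import itertools
import random


def build_pairs(series_by_category, n_pairs, seed=0):
    """Build up to n_pairs triples (X1, X2, Y) from categorized series.

    Y is 0 when both series come from the same category and 1 when they
    come from different categories, matching the labels in S2. The
    enumerate-then-shuffle sampling is an assumption of this sketch.
    """
    rng = random.Random(seed)
    categories = sorted(series_by_category)
    triples = []

    # Same-category pairs: every unordered pair inside one category.
    for cat in categories:
        for x1, x2 in itertools.combinations(series_by_category[cat], 2):
            triples.append((x1, x2, 0))

    # Different-category pairs: every pair across two distinct categories.
    for cat_a, cat_b in itertools.combinations(categories, 2):
        for x1 in series_by_category[cat_a]:
            for x2 in series_by_category[cat_b]:
                triples.append((x1, x2, 1))

    rng.shuffle(triples)
    return triples[:n_pairs]
```

With A files per category, exhaustive pairing yields far more different-category pairs than same-category pairs, so a real pipeline would probably balance the two sets; the visible text does not say how.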


Abstract

The invention discloses a voice recognition method and system based on contrastive predictive coding, belonging to the technical field of voiceprint recognition. The method comprises the following steps: S1, collecting A voice files for each voice category and preprocessing each voice file to obtain PCM-encoded voice time series data; S2, constructing a paired data set from the voice time series data; S3, constructing a paired fragment data set; S4, constructing an artificial neural network; S5, training a speech recognition network formed by the first converter, the second converter and the one-dimensional convolutional neural network; and S6, performing voice recognition with the trained speech recognition network. The method makes full use of the large amount of otherwise insufficient voice data collected in the background, treats the voice data as time series data, and performs end-to-end conversion directly, so no separate extraction of features from the voice time series data is needed.
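The text shown here does not state which loss function step S5 uses. One common choice for pairs labeled 0 (same) and 1 (different) is the margin-based contrastive loss, sketched below purely as an assumption. The function contrastive_loss, the embeddings z1 and z2, and the margin parameter are illustrative and do not come from the patent.

```python
import math


def contrastive_loss(z1, z2, y, margin=1.0):
    """Margin-based contrastive loss for one embedded pair.

    z1, z2: embedding vectors (sequences of floats) of the two inputs.
    y: 0 for a same-category pair, 1 for a different-category pair.
    Assumed loss, not confirmed by the source: same-category pairs are
    pulled together, different-category pairs are pushed apart until
    their distance reaches the margin.
    """
    distance = math.dist(z1, z2)  # Euclidean distance, Python 3.8+
    if y == 0:
        return 0.5 * distance ** 2
    return 0.5 * max(0.0, margin - distance) ** 2
```

Under this assumed loss, a different-category pair whose embeddings are already farther apart than the margin contributes nothing, so training concentrates on pairs the network still confuses.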

Description

Technical field

[0001] The invention belongs to the technical field of voiceprint recognition, and in particular relates to a speech recognition method and system based on contrastive predictive coding.

Background

[0002] Speech recognition usually requires collecting a large amount of speech data: in every background environment, there must be enough samples for each semantic variant (including different voices and dialects) of the speech to be recognized. If not enough data can be collected for speech produced in a particular dialect (or with particular text semantics) in a particular context, a speech recognition model used under that condition may fail, for example through reduced detection accuracy or an inability to recognize the speech at all. Traditional approaches to this problem mostly apply feature extraction methods such as MFCC feature extraction, and then perform cla...


Application Information


IPC(8): G10L17/02, G10L17/18

CPC: G10L17/02, G10L17/18

Inventor: 戴亦斌

Owner: 北京信工博特智能科技有限公司
