Speech recognition method and system based on contrastive predictive coding

A speech recognition technique based on predictive coding, applied in speech analysis, instruments, etc. It addresses problems such as missing and distorted category features, which affect classification results.

Active Publication Date: 2022-07-22
北京信工博特智能科技有限公司

AI Technical Summary

Problems solved by technology

It is very important that the data in each category is sufficient and representative. If the data is insufficient or atypical, some features specific to a category will be missing or distorted, which affects the final classification result.

Embodiment Construction

[0067] To aid understanding of the content, features, and effects of the present invention, the following embodiments are described in detail with reference to the accompanying drawings.

[0068] Referring to Figures 1 to 4, a speech recognition method based on contrastive predictive coding comprises:

[0069] S1. Collect A voice files for each voice category, and preprocess each voice file to obtain PCM-encoded voice time series data, where A is a natural number greater than 1;

[0070] S2. Construct a paired data set from the voice time series data. The paired data set includes N triples (X1, X2, Y), where X1 is the first voice time series data of the triple and X2 is the second voice time series data of the triple. The label Y is 0 when the two belong to the same category (a same-category pair) and 1 when they belong to different categories (a heterogeneous pair); each item in the same-category pairing set and each item in the heterogeneous pairing set is composed o...
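The pairing rule in S2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented procedure: the function name `build_pairs`, its parameters, and the half same-category / half heterogeneous sampling are all hypothetical choices, since the visible text does not say how the N triples are selected. The only part taken from the text is the labelling: Y = 0 for a same-category pair and Y = 1 for a heterogeneous pair.

```python
import itertools
import random


def build_pairs(sequences_by_category, n_pairs, seed=0):
    """Return up to n_pairs triples (x1, x2, y).

    y is 0 when x1 and x2 come from the same category and 1 otherwise,
    following the labelling described in S2. Everything else here
    (names, balanced sampling, seeding) is an assumption for illustration.
    """
    rng = random.Random(seed)

    # Flatten to (category, sequence) so every unordered pair can be enumerated.
    labelled = [
        (category, seq)
        for category, seqs in sequences_by_category.items()
        for seq in seqs
    ]

    same, different = [], []
    for (cat_a, seq_a), (cat_b, seq_b) in itertools.combinations(labelled, 2):
        if cat_a == cat_b:
            same.append((seq_a, seq_b, 0))
        else:
            different.append((seq_a, seq_b, 1))

    # Assumption: draw roughly half of the triples from each pairing set.
    half = n_pairs // 2
    pairs = rng.sample(same, min(half, len(same)))
    pairs += rng.sample(different, min(n_pairs - half, len(different)))
    rng.shuffle(pairs)
    return pairs


if __name__ == "__main__":
    # Toy stand-ins for PCM sample sequences; real data would come from step S1.
    toy = {
        "category_a": [(0, 12, -7), (3, 9, -2)],
        "category_b": [(100, -90, 80), (95, -85, 70)],
    }
    for x1, x2, y in build_pairs(toy, n_pairs=4):
        print(x1, x2, y)
```

Enumerating all pairs is quadratic in the number of recordings, so a larger collection would more likely sample pairs directly rather than materialise both pairing sets first.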

Abstract

The invention discloses a speech recognition method and system based on contrastive predictive coding, belonging to the technical field of voiceprint recognition. The method comprises the following steps: S1, collecting A voice files of each voice category and preprocessing each voice file to obtain PCM-encoded voice time series data; S2, constructing a paired data set of the voice time series data; S3, constructing a paired fragment data set; S4, constructing an artificial neural network; S5, training a speech recognition network formed by the first converter, the second converter and the one-dimensional convolutional neural network; and S6, performing speech recognition through the speech recognition network. The method makes full use of a large amount of insufficient voice data collected in the background, treats the voice data as time series data, and performs end-to-end conversion directly, so no extraction of features from the voice time series data is needed.

Description

Technical field

[0001] The invention belongs to the technical field of voiceprint recognition, and in particular relates to a speech recognition method and system based on contrastive predictive coding.

Background

[0002] Speech recognition usually requires collecting a large amount of speech data: for each background environment and for each kind of semantic content to be recognized (including different sounds and dialects), the number of samples must be sufficient. If enough data cannot be collected for speech produced in a particular dialect (or with particular text semantics) in a particular context, the speech recognition model may fail when used under that condition, for example through decreased detection accuracy or an inability to recognize the speech. Traditional approaches to this problem mostly rely on feature extraction methods such as MFCC feature extraction, and then perform cla...

Application Information

IPC(8): G10L17/02; G10L17/18
CPC: G10L17/02; G10L17/18
Inventor 戴亦斌
Owner 北京信工博特智能科技有限公司