Parallel Feature Extraction System and Method for General Specific Speech in Speech Signal

A speech signal feature extraction technology, applied in speech analysis, speech recognition, instruments, etc. It addresses the problems of low recognition accuracy, low integration accuracy, and emotion recognition task models that are not universal, and thereby improves recognition accuracy.

Active Publication Date: 2022-05-06
DALIAN NEUSOFT UNIV OF INFORMATION

AI Technical Summary

Problems solved by technology

[0009] However, the above three recognition tasks have many defects or deficiencies in practical application or design.
For example: the voiceprint recognition, speech recognition, and emotion recognition task models are not universal; input forms are not uniform; there is no universal solution; the integration accuracy is not high; and the recognition accuracy of a single emotion recognition task is not high.

Examples

Embodiment Construction

[0092] The present invention is further described below with reference to the accompanying drawings:

[0093] The model of the present invention mainly includes a voice signal, an emotion recognition model, a voiceprint recognition model and a speech recognition model.

[0094] The spectrogram is the displayed image of the Fourier analysis of the speech signal. It is a three-dimensional spectrum that shows how the speech spectrum changes over time: the vertical axis is frequency and the horizontal axis is time, and the intensity of any given frequency component at a given moment is represented by the shade of gray or the tone at the corresponding point. It is obtained as follows: a speech signal x(t) is first divided into frames, giving x(m,n) (n is the frame length, m is the number of frames); a fast Fourier transform is applied to obtain X(m,n); the periodogram Y(m,n) is computed as Y(m,n) = X(m,n)*X(m,n)'; take 10*log1...
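The acquisition steps in [0094] can be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: the frame length, hop size and Hamming window are hypothetical choices, and because the source text is cut off at the logarithm step, the final conversion is assumed here to be the usual 10*log10 of the periodogram. The function name spectrogram_db is also invented for this example.

```python
import numpy as np

def spectrogram_db(x, frame_len=400, hop=160, eps=1e-10):
    """Log-power spectrogram; rows are frames (m), columns are frequency bins.

    frame_len, hop, the Hamming window and eps are illustrative assumptions,
    not values taken from the patent.
    """
    x = np.asarray(x, dtype=float)
    if x.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (x.size - frame_len) // hop
    # Framing: x(t) -> x(m, n), one row per frame
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hamming(frame_len)
    # Fast Fourier transform of each frame: X(m, n)
    X = np.fft.rfft(frames, axis=1)
    # Periodogram: Y(m, n) = X(m, n) * conj(X(m, n))
    Y = (X * np.conj(X)).real
    # Assumed completion of the truncated step: power in decibels
    return 10.0 * np.log10(Y + eps)

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    S = spectrogram_db(np.sin(2 * np.pi * 1000 * t))
    print(S.shape)
```

Plotting the transposed array as a grayscale image gives the time-frequency picture described above, with time on the horizontal axis and frequency on the vertical axis.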

Abstract

The invention discloses a parallel feature extraction system and method for general specific speech in speech signals. The model mainly includes a speech signal, an emotion recognition model, a voiceprint recognition model and a speech recognition model. For emotion recognition, the output features of the LLD channel are combined with the output features of the spectrogram and TEO channel to obtain 1*1024-dimensional emotional features. For voiceprint recognition, the spectrogram is fed into a convolutional neural network (CNN) as input, the spectrogram and MFCC are fed into Seq2Seq, and the models of the two channels are fused with an attention mechanism to form the voiceprint recognition model. For speech recognition, 42-dimensional MFCC is used as input and output through a BIMLSTM and Seq2Seq channel, the spectrogram is output through a Seq2Seq channel, and the models of the two channels are fused with an attention mechanism to form the speech recognition model. The invention has the advantages of high accuracy, a high degree of integration, and free choice of scheme.
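As a small illustration of one input named in the abstract, the sketch below assumes that "TEO" refers to the Teager energy operator, whose standard discrete form is psi[x(n)] = x(n)^2 - x(n-1)*x(n+1). The abstract does not define the term or say how the TEO channel is built, so the function name teager_energy and its use here are hypothetical.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns an array two samples shorter than the input, because the
    first and last samples have no neighbour on one side.
    """
    x = np.asarray(x, dtype=float)
    if x.size < 3:
        raise ValueError("need at least three samples")
    return x[1:-1] ** 2 - x[:-2] * x[2:]

if __name__ == "__main__":
    n = np.arange(200)
    tone = 2.0 * np.cos(0.3 * n + 0.7)
    # For a pure tone A*cos(w*n + phi) the operator is constant: A^2 * sin(w)^2
    print(teager_energy(tone)[:3])
```

For a pure tone the result is constant and depends on both amplitude and frequency, which is why the operator is often used as an energy-tracking feature for speech.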

Description

Technical field

[0001] The invention relates to the field of signal processing and extraction, in particular to a feature extraction system for speech tasks.

Background technique

[0002] Speech is the most effective, natural and important form of communication for human beings. Communication between humans and machines through voice requires machines that are intelligent enough to recognize human voices. With the development of machine learning, neural network and deep learning theory, performance on speech recognition related tasks is gradually improving, which greatly helps computers understand the content of speech. Currently, speech recognition tasks mainly involve the following three recognition tasks:

[0003] 1. Voiceprint recognition

[0004] Voiceprint recognition, also known as speaker recognition, is a form of biometric recognition. It analyzes and processes the continuous speech signal of the speaker to extract discrete speech features, a...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L25/63, G10L25/24, G10L25/30, G10L17/18, G10L17/04, G10L17/02, G10L15/02, G10L15/16
CPC: G10L25/63, G10L25/24, G10L25/30, G10L17/18, G10L17/04, G10L17/02, G10L15/02, G10L15/16
Inventor: 郑纯军, 贾宁, 陈明华, 周伊佳, 张轶
Owner: DALIAN NEUSOFT UNIV OF INFORMATION