Parallel feature extraction system and method for general specific voice in voice signal

A speech signal feature extraction technology, applied in speech analysis, speech recognition, instruments, etc., that addresses problems such as the lack of a universal solution, low recognition accuracy, and task models that are not shared across recognition tasks.

Active Publication Date: 2020-04-10

DALIAN NEUSOFT UNIV OF INFORMATION

4 Cites, 20 Cited by


AI Technical Summary

Problems solved by technology

[0009] However, the above three recognition tasks have many defects or deficiencies in actual application and design.

For example: the task models for voiceprint recognition, speech recognition, and emotion recognition are not shared, their input forms are not uniform, there is no universal solution, the accuracy after integration is not high, and the recognition accuracy of a single emotion recognition task is not high.

Embodiment Construction

[0092] The present invention is further described below with reference to the accompanying drawings:

[0093] The model of the present invention mainly comprises a speech signal, an emotion recognition model, a voiceprint recognition model and a speech recognition model;

[0094] The spectrogram is a visual display of the Fourier analysis of the speech signal. It is a three-dimensional spectrum that shows how the speech spectrum changes over time: the vertical axis is frequency, the horizontal axis is time, and the intensity of any given frequency component at a given moment is represented by the gray level or color shade at the corresponding point. It is obtained as follows: a speech signal x(t) is first divided into frames, giving x(m,n) (n is the frame length, m is the number of frames); a fast Fourier transform is applied to obtain X(m,n); the periodogram Y(m,n) is computed as Y(m,n) = X(m,n) * X(m,n)'; then take 10*log10(Y(m,n)...
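
The acquisition steps in [0094] can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the patent's implementation: the function name spectrogram_db, the frame length of 256, the hop of 128 samples, and the small eps added before the logarithm are all illustrative choices that do not appear in the source, and the prime in X(m,n)' is read here as the complex conjugate.

```python
import numpy as np

def spectrogram_db(x, frame_len=256, hop=128, eps=1e-12):
    """Log-power spectrogram: frame -> FFT -> periodogram -> 10*log10.

    frame_len, hop and eps are illustrative assumptions, not values
    taken from the patent text.
    """
    x = np.asarray(x, dtype=float)
    if x.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    num_frames = 1 + (x.size - frame_len) // hop
    # x(m, n): row m is one frame, column n is a sample inside the frame
    idx = np.arange(frame_len)[None, :] + hop * np.arange(num_frames)[:, None]
    frames = x[idx]
    # X(m, n): FFT of each frame (one-sided, since the input is real)
    X = np.fft.rfft(frames, axis=1)
    # Y(m, n) = X(m, n) * conj(X(m, n)), i.e. the squared magnitude
    Y = (X * np.conj(X)).real
    # Convert power to decibels; eps avoids log10(0) on silent bins
    return 10.0 * np.log10(Y + eps)
```

Each row of the returned array corresponds to one frame (time) and each column to one frequency bin, so plotting its transpose as a grayscale image gives the time-frequency picture described above.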


Abstract

The invention discloses a parallel feature extraction system and method for general specific voice in a voice signal. The model mainly comprises a speech signal, an emotion recognition model, a voiceprint recognition model and a speech recognition model. For emotion recognition, the output features of an LLD channel are combined with the output features of a spectrogram and TEO channel to obtain 1*1024-dimensional emotion features. For voiceprint recognition, the spectrogram is fed into a convolutional neural network (CNN) as input, the spectrogram and MFCC are fed into Seq2Seq, and the models of the two channels are fused with an added attention mechanism to form the voiceprint recognition model. For speech recognition, 42-dimensional MFCC is used as input and passed through a BIMLSTM channel combined with a Seq2Seq channel, the spectrogram is passed through a Seq2Seq channel, and the models of the two channels are fused with an added attention mechanism to form the speech recognition model. The method has the advantages of high accuracy, a high integration level, and free selection among schemes.
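
The abstract names a TEO channel without expanding the acronym. Assuming TEO refers to the Teager energy operator, which is a common front-end for emotion features, its standard discrete form psi[n] = x[n]^2 - x[n-1]*x[n+1] can be sketched as follows. The function name teager_energy is hypothetical, and the source does not say how the operator's output is combined with the spectrogram.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Assumes "TEO" in the abstract means the Teager energy operator;
    the patent text shown here does not confirm this.
    """
    x = np.asarray(x, dtype=float)
    if x.size < 3:
        raise ValueError("need at least 3 samples")
    # The output has two fewer samples than the input (no edge padding)
    return x[1:-1] ** 2 - x[:-2] * x[2:]
```

For a pure tone A*cos(w*n + phi) the operator returns the constant A^2 * sin(w)^2, so it tracks both the amplitude and the frequency of the signal.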

Description

Technical field

[0001] The invention relates to the field of signal processing and extraction, in particular to a feature extraction system for speech tasks.

Background technique

[0002] Speech is the most effective, natural and important form of human communication. Communication between humans and machines through speech requires machines that are intelligent enough to recognize human voices. With the development of machine learning, neural networks and deep learning theory, performance on speech-recognition-related tasks is gradually improving, which greatly helps computers understand the content of speech. At present, speech recognition tasks mainly involve the following three recognition tasks:

[0003] 1. Voiceprint recognition

[0004] Voiceprint recognition, also known as speaker recognition, is a form of biometric recognition. It analyzes and processes the continuous speech signal of the speaker to extract discrete speech...


Application Information


IPC(8): G10L25/63, G10L25/24, G10L25/30, G10L17/18, G10L17/04, G10L17/02, G10L15/02, G10L15/16

CPC: G10L25/63, G10L25/24, G10L25/30, G10L17/18, G10L17/04, G10L17/02, G10L15/02, G10L15/16

Inventor: 郑纯军, 贾宁, 陈明华, 周伊佳, 张轶

Owner DALIAN NEUSOFT UNIV OF INFORMATION
