System and method for tandem feature extraction for general speech tasks in speech signals

A tandem (series) feature extraction technology for speech signals, applied in voice analysis, voice recognition, instruments, etc. It addresses the problems that emotion recognition task models are not universal, that there is no universal solution, and that integration accuracy is low. The stated effects are that it meets the need for rich and diverse emotions and provides high-quality recorded voice.

Active Publication Date: 2022-02-01
DALIAN NEUSOFT UNIV OF INFORMATION

AI Technical Summary

Problems solved by technology

[0009] However, the above three recognition tasks have many defects or deficiencies in practical application and design. For example, the voiceprint recognition, speech recognition, and emotion recognition task models are not universal, their input forms are not uniform, there is no universal solution, the integration accuracy is low, and the recognition accuracy of a single emotion recognition task is low.


Examples


Embodiment Construction

[0052] The present invention is further described below with reference to the accompanying drawings:

[0053] The system of the present invention mainly comprises an emotional corpus, a speech preprocessing model, and a speech feature extraction model. The emotional corpus is a collection of real emotions constructed in natural-language form; when the database is set up, a statistical method is introduced on top of the traditional greedy algorithm. Speech preprocessing is performed after the emotional corpus is established.
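The text does not spell out how the greedy algorithm and the statistical method are combined when the database is built. One possible reading, offered only as a sketch, is a greedy coverage selection in which each candidate recording is scored by the units it would newly cover, with rarer units weighted more heavily. The function name `greedy_select`, the notion of "units" (for example emotion labels or phonetic units), and the inverse-frequency weighting are all assumptions made for illustration and do not come from the patent.

```python
from collections import Counter
from typing import Dict, List, Set


def greedy_select(candidates: Dict[str, Set[str]], budget: int) -> List[str]:
    """Pick up to `budget` candidates by weighted greedy coverage.

    `candidates` maps a (hypothetical) recording id to the set of units it
    contains. At each step the candidate with the largest weighted gain in
    newly covered units is chosen.
    """
    # Assumed statistical step: a unit that is rare in the pool gets a
    # larger weight (inverse document frequency style).
    counts = Counter(u for units in candidates.values() for u in units)
    weight = {u: 1.0 / c for u, c in counts.items()}

    covered: Set[str] = set()
    chosen: List[str] = []
    remaining = dict(candidates)

    def gain(name: str) -> float:
        return sum(weight[u] for u in remaining[name] - covered)

    while remaining and len(chosen) < budget:
        # Sorting the keys makes tie-breaking deterministic.
        best = max(sorted(remaining), key=gain)
        if gain(best) == 0:
            break  # nothing new can be covered
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen
```

With the pool `{"a": {"x", "y"}, "b": {"y"}, "c": {"z"}}`, the sketch picks `"a"` first, then `"c"`, and stops because `"b"` adds no uncovered unit.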

[0054] The emotional corpus is characterized by a large data scale, accurate emotional expression, and high-quality recorded voice. It covers multiple age groups and high-level speech emotions, and the speech involved is dense and highly recognizable, which meets the need for rich and diverse emotions to a certain extent. Emotional speech can be divided into natural speech, induced speech, and performed speech according to the collection method. This databa...


Abstract

The invention discloses a tandem feature extraction system and method for general speech tasks in speech signals. After the emotional corpus is established, speech preprocessing is performed, and the preprocessed audio file is fed as input to the speech feature extraction model. In this model, the voiceprint recognition layer obtains the speaker's basic information and salient features, which are fed together into the speech recognition layer to obtain speaker information and text information. On the basis of keyword extraction, these are sent to the emotion recognition layer together with low-level descriptors, and personalized voice features and emotional features are used to perform emotion recognition and obtain the related emotional features. Alternatively, the speech feature extraction model can obtain the speaker's basic information and salient features through the voiceprint recognition layer, feed them into the speech recognition layer together with the speech recognition features, and send the speaker information and low-level descriptors to the emotion recognition layer, which outputs the emotional features. The invention has the advantages of a large data scale, accurate emotion expression, and a high degree of integration.
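The data flow in the abstract can be sketched as three chained stages. This is a minimal structural illustration only: every function below is a hypothetical placeholder (the patent's actual models are not given in this excerpt), frames are assumed to be lists of floats, the transcript is a fixed dummy string, and the "low-level descriptors" are reduced to two simple energy statistics. The `use_text` flag is an assumed way to switch between the two variants the abstract describes.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class SpeakerInfo:
    speaker_id: str
    salient_features: List[float]


def voiceprint_layer(frames: List[List[float]]) -> SpeakerInfo:
    # Placeholder: mean of each feature dimension stands in for a speaker embedding.
    n, dims = len(frames), len(frames[0])
    emb = [sum(f[d] for f in frames) / n for d in range(dims)]
    return SpeakerInfo(speaker_id="spk0", salient_features=emb)


def speech_recognition_layer(frames: List[List[float]], speaker: SpeakerInfo) -> str:
    # Placeholder: a real layer would decode the frames, conditioned on the speaker.
    return "i am so happy today"


def extract_keywords(text: str, vocabulary: Set[str]) -> List[str]:
    return [w for w in text.split() if w in vocabulary]


def low_level_descriptors(frames: List[List[float]]) -> Dict[str, float]:
    # Placeholder LLDs: per-frame mean squared amplitude, summarised over time.
    energies = [sum(x * x for x in f) / len(f) for f in frames]
    return {"mean_energy": sum(energies) / len(energies), "max_energy": max(energies)}


def tandem_extract(frames: List[List[float]], vocabulary: Set[str],
                   use_text: bool = True) -> Dict[str, object]:
    """Chain voiceprint -> speech recognition -> emotion-layer input."""
    speaker = voiceprint_layer(frames)
    text = speech_recognition_layer(frames, speaker)
    # Variant 1 passes keywords on; variant 2 passes only speaker info and LLDs.
    keywords = extract_keywords(text, vocabulary) if use_text else []
    features: Dict[str, object] = {
        "speaker_embedding": speaker.salient_features,
        "keywords": keywords,
    }
    features.update(low_level_descriptors(frames))
    return features  # what the emotion recognition layer would consume
```

Calling `tandem_extract([[1.0, 2.0], [3.0, 4.0]], {"happy", "sad"})` returns a dictionary holding the speaker embedding, the matched keywords, and the energy statistics; passing `use_text=False` leaves the keyword list empty, mirroring the second variant.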

Description

Technical field

[0001] The invention relates to the field of signal processing and extraction, in particular to a feature extraction model of speech tasks.

Background technique

[0002] Speech is the most effective, natural and important form of communication for human beings. Realizing communication between humans and machines through voice requires machines to be intelligent enough to recognize human voices. With the development of machine learning, neural network and deep learning theory, the completion of speech recognition related tasks is gradually improving, which is of great help to the computer to understand the content of speech. Currently, speech recognition tasks mainly involve the following three recognition tasks:

[0003] 1. Voiceprint recognition

[0004] Voiceprint recognition, also known as speaker recognition, is a form of biometric recognition. It analyzes and processes the continuous speech signal of the speaker to extract discrete speech features, and...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L17/02, G10L17/18, G10L15/02, G10L15/16, G10L25/63, G10L25/30, G10L25/03
CPC: G10L17/02, G10L15/02, G10L25/63, G10L25/03, G10L25/30, G10L17/18, G10L15/16
Inventor: 贾宁, 郑纯军, 褚娜, 周慧, 孙风栋, 李绪成, 张轶
Owner: DALIAN NEUSOFT UNIV OF INFORMATION