Series feature extraction system and method aiming at general voice task in voice signal

A technology relating to speech signals and series features, applied in speech analysis, speech recognition, instruments, etc. It addresses the problems that emotion recognition task models are not universal, that there is no universal solution, and that recognition accuracy is low, so as to meet the need for rich and diverse emotions. Its effects include high-quality recorded voice and high recognition

Active Publication Date: 2019-12-31

DALIAN NEUSOFT UNIV OF INFORMATION

14 Cites, 15 Cited by

Problems solved by technology

[0009] However, the above three recognition tasks have many defects and deficiencies in practical application and design.

For example, the models for voiceprint recognition, speech recognition, and emotion recognition are not universal, their input forms are not uniform, there is no universal solution, accuracy after integration is low, and the recognition accuracy of a single emotion recognition task is low.

Embodiment Construction

[0052] The present invention is further described below with reference to the accompanying drawings:

[0053] The system of the present invention mainly comprises an emotional corpus, a speech preprocessing model, and a speech feature extraction model. The emotional corpus is a collection of real emotions constructed in natural-language form; when the database is built, statistical methods are introduced on top of a traditional greedy algorithm. Once the emotional corpus is established, speech preprocessing is performed.

[0054] The emotional corpus is characterized by large data scale, accurate emotional expression, and high-quality recorded voice. It covers multiple age groups and high-level speech emotions, and the speech involved is dense and highly recognizable, which meets the need for rich and diverse emotions to a certain extent. Depending on the collection method, emotional speech can be divided into natural speech, induced speech, and performance speech. This database is a ...

Abstract

The invention discloses a series feature extraction system and method aimed at general voice tasks in a voice signal. After an emotional corpus is built, voice preprocessing is performed, and the preprocessed audio file is fed as input to a voice feature extraction model. In one path, the voice feature extraction model obtains the basic information and prominent features of a speaker through a voiceprint recognition layer; these are fed together into a voice recognition layer to obtain the speaker information and the text information; after keywords are extracted, this information and the low-level descriptors are passed together into an emotion recognition layer, where personalized voice features and emotional features are used for emotion recognition to obtain emotion-related features. In another path, the voice feature extraction model likewise obtains the basic information and prominent features of the speaker through the voiceprint recognition layer; these are fed into the voice recognition layer together with voice recognition features; the speaker information and the low-level descriptors are then passed together into the emotion recognition layer, which outputs the emotional features. The series feature extraction system and method have the advantages of large data scale, accurate emotion expression, a high level of integration, and the like.
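The series (cascaded) wiring described in the abstract can be illustrated with a minimal Python sketch of the keyword-based path. This is not the patented implementation. Every function name, the toy features (energy, zero-crossing rate, peak amplitude), the caller-supplied transcript, and the emotion vocabulary are assumptions made for illustration, and a real system would use trained models in each layer. The sketch only shows how each layer's output is passed on, together with additional inputs, to the next layer.

```python
from typing import Dict, List, Set


def voiceprint_layer(samples: List[float]) -> Dict[str, float]:
    """Toy stand-in for the voiceprint layer (assumes at least two samples)."""
    energy = sum(s * s for s in samples) / len(samples)
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)
    return {"energy": energy, "zero_crossing_rate": crossings / (len(samples) - 1)}


def speech_recognition_layer(
    samples: List[float], speaker: Dict[str, float], transcript_stub: str
) -> Dict[str, object]:
    """Toy stand-in for the recognition layer.

    No real recognizer is run: the transcript is supplied by the caller,
    and the speaker features are simply carried forward with the text.
    """
    return {"speaker": speaker, "text": transcript_stub}


def extract_keywords(text: str, vocabulary: Set[str]) -> List[str]:
    """Keep only the words found in a (hypothetical) emotion vocabulary."""
    return [word for word in text.lower().split() if word in vocabulary]


def low_level_descriptors(samples: List[float]) -> Dict[str, float]:
    """Toy low-level descriptors: peak and mean absolute amplitude."""
    magnitudes = [abs(s) for s in samples]
    return {"peak": max(magnitudes), "mean_abs": sum(magnitudes) / len(magnitudes)}


def emotion_recognition_layer(
    recognized: Dict[str, object], keywords: List[str], llds: Dict[str, float]
) -> Dict[str, float]:
    """Merge upstream outputs into one emotion-related feature dictionary."""
    features = dict(llds)
    for name, value in recognized["speaker"].items():
        features[f"speaker_{name}"] = value
    features["keyword_count"] = float(len(keywords))
    return features


def series_pipeline(
    samples: List[float], transcript_stub: str, vocabulary: Set[str]
) -> Dict[str, float]:
    """Chain the layers in series: voiceprint, then recognition, then emotion."""
    speaker = voiceprint_layer(samples)
    recognized = speech_recognition_layer(samples, speaker, transcript_stub)
    keywords = extract_keywords(recognized["text"], vocabulary)
    llds = low_level_descriptors(samples)
    return emotion_recognition_layer(recognized, keywords, llds)


if __name__ == "__main__":
    result = series_pipeline(
        [0.5, -0.5, 0.25, -0.25], "I am so happy today", {"happy", "sad"}
    )
    print(result)
```

The second path in the abstract would differ only in what is forwarded: speech recognition features would accompany the speaker features into the recognition layer, and the keyword step would be skipped.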

Description

Technical field

[0001] The invention relates to the field of signal processing and extraction, in particular to a feature extraction model for speech tasks.

Background

[0002] Speech is the most effective, natural, and important form of human communication. Realizing communication between humans and machines through voice requires machines to be intelligent enough to recognize human voices. With the development of machine learning, neural networks, and deep learning theory, performance on speech-recognition-related tasks is gradually improving, which greatly helps computers understand the content of speech. Currently, speech recognition mainly involves the following three recognition tasks:

[0003] 1. Voiceprint recognition

[0004] Voiceprint recognition, also known as speaker recognition, is a form of biometric recognition. It analyzes and processes the speaker's continuous speech signal to extract discrete speech features, and...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L17/02, G10L17/18, G10L15/02, G10L15/16, G10L25/63, G10L25/30, G10L25/03

CPC: G10L17/02, G10L15/02, G10L25/63, G10L25/03, G10L25/30, G10L17/18, G10L15/16

Inventor: 贾宁, 郑纯军, 褚娜, 周慧, 孙风栋, 李绪成, 张轶

Owner: DALIAN NEUSOFT UNIV OF INFORMATION
