A multidimensional voice message identification system based on a progressive neural network and a method thereof

The technology combines neural networks with speech information processing and applies to speech recognition, speech analysis, instruments, and related fields. It addresses the problem that the correlation between single-task speech information is not fully used, and it prevents the forgetting effect while improving recognition accuracy and efficiency.

Active Publication Date: 2018-12-07

NANJING UNIV OF POSTS & TELECOMM



Problems solved by technology

However, this work does not take full advantage of the correlation between single-task speech information.


Embodiment Construction

[0029] The ProgNets-based multidimensional speech information recognition method proposed by the present invention is described in detail below with reference to the accompanying drawings and an embodiment.

[0030] The corpus used in this embodiment is KSU-Emotions. The corpus has two stages, and this embodiment uses the second stage. In it, 14 speakers (7 male and 7 female) simulate five emotions (neutral, sad, happy, surprised, and angry). Each emotion has 336 sentences, for a total of 1680 sentences, and the second-stage corpus is about 2 hours and 21 minutes long.
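As a quick consistency check on the corpus figures in paragraph [0030], the arithmetic can be sketched as follows. The even split of sentences across speakers is an assumption that the text does not state, and the mean sentence length is only approximate because the total duration is itself given as "about" 2 hours and 21 minutes.

```python
# Corpus figures quoted in paragraph [0030] (KSU-Emotions, second stage).
NUM_SPEAKERS = 14                      # 7 male + 7 female
EMOTIONS = ["neutral", "sad", "happy", "surprised", "angry"]
SENTENCES_PER_EMOTION = 336
TOTAL_DURATION_S = (2 * 60 + 21) * 60  # "about 2 hours and 21 minutes", in seconds

# Total number of sentences: 5 emotions x 336 sentences each.
total_sentences = len(EMOTIONS) * SENTENCES_PER_EMOTION

# Assumption (not stated in the text): sentences are split evenly across speakers.
per_speaker_per_emotion = SENTENCES_PER_EMOTION // NUM_SPEAKERS

# Approximate mean sentence length; the total duration is only approximate.
mean_sentence_s = TOTAL_DURATION_S / total_sentences

print(total_sentences, per_speaker_per_emotion, round(mean_sentence_s, 2))
```

The computed total matches the 1680 sentences stated in the paragraph.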

[0031] To better evaluate the recognition of multidimensional speaker information, this embodiment extracts features with an adaptive method based on the i-vector, on the basis of Mel-frequency cepstral coefficient (MFCC) features, combined with a Gaussian mixture model (GMM), and the Universal Background Model (UBM) is traine...
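The paragraph above is cut off, so the following does not reproduce the patent's pipeline. It is a minimal sketch of one standard step that precedes i-vector estimation in general: collecting zeroth-order and centred first-order Baum-Welch statistics of feature frames against a diagonal-covariance GMM-UBM. The toy UBM parameters, the 2-dimensional frames standing in for MFCCs, and all function names are hypothetical. The MFCC front end and the total-variability projection that turns these statistics into an i-vector are omitted.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def posteriors(x, weights, means, variances):
    """Posterior probability of each UBM component given one frame."""
    logs = [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the max before exponentiating, for stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

def baum_welch_stats(frames, weights, means, variances):
    """Zeroth-order (N) and centred first-order (F) statistics per component."""
    n_comp, dim = len(weights), len(means[0])
    N = [0.0] * n_comp
    F = [[0.0] * dim for _ in range(n_comp)]
    for x in frames:
        for c, gamma in enumerate(posteriors(x, weights, means, variances)):
            N[c] += gamma
            for d in range(dim):
                F[c][d] += gamma * (x[d] - means[c][d])
    return N, F

# Hypothetical 2-component UBM and made-up 2-dimensional frames.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
frames = [[0.1, -0.2], [2.9, 3.2], [0.0, 0.3], [3.1, 2.8]]

N, F = baum_welch_stats(frames, weights, means, variances)
print([round(n, 3) for n in N])
```

Because each frame's posteriors sum to one, the zeroth-order statistics sum to the number of frames, which is a convenient sanity check on any implementation of this step.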


Abstract

The invention provides a multidimensional voice message identification system based on a progressive neural network and a method thereof. A progressive neural network is introduced on top of a baseline system. The baseline system takes the i-vector feature vector as input and includes three SNN identification models, for gender identification, emotional information identification, and identity information identification. Building on gender identification, the progressive neural network combines the SNN identification model for gender-related emotional information identification with the SNN identification model for gender-related identity information identification, and information transfers between the two to construct the identification system.
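The idea of building a later task on an earlier one through a progressive neural network can be illustrated with a minimal forward-pass sketch. It assumes a generic two-column layout: a first column for gender whose weights stay frozen, and a second column for emotion that adds a lateral adapter fed by the first column's hidden layer. The layer sizes, the ReLU activation, the random weights, and the class names are all hypothetical and are not taken from the patent. The sketch shows only one transfer direction and no training.

```python
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, a) for a in v]

def add(a, b):
    return [p + q for p, q in zip(a, b)]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

class Column:
    """One small feed-forward column: input -> hidden -> output scores."""
    def __init__(self, n_in, n_hidden, n_out, rng):
        self.W1 = rand_matrix(n_hidden, n_in, rng)
        self.W2 = rand_matrix(n_out, n_hidden, rng)

    def hidden(self, x):
        return relu(matvec(self.W1, x))

class ProgressiveColumn(Column):
    """Later column with a lateral adapter U fed by an earlier column's hidden layer."""
    def __init__(self, n_in, n_hidden, n_out, n_hidden_prev, rng):
        super().__init__(n_in, n_hidden, n_out, rng)
        self.U = rand_matrix(n_out, n_hidden_prev, rng)

    def forward(self, x, frozen):
        h_prev = frozen.hidden(x)  # features of the earlier column, read only
        h_own = self.hidden(x)     # this column's own hidden features
        return add(matvec(self.W2, h_own), matvec(self.U, h_prev))

rng = random.Random(0)
IVECTOR_DIM = 8  # hypothetical; real i-vectors are much larger
gender_col = Column(IVECTOR_DIM, 6, 2, rng)                 # 2 gender classes
emotion_col = ProgressiveColumn(IVECTOR_DIM, 6, 5, 6, rng)  # 5 emotion classes

ivector = [rng.gauss(0.0, 1.0) for _ in range(IVECTOR_DIM)]
scores = emotion_col.forward(ivector, gender_col)
print(len(scores))
```

In this layout the earlier column is only read, never written, while the later column is used, which is how progressive networks in general avoid overwriting what an earlier task learned.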

Description

Technical field

[0001] The invention belongs to the technical field of multidimensional speech information recognition. In particular, it relates to a progressive neural network-based multidimensional speech information recognition system and to a method for recognizing several kinds of speech information, specifically gender, emotion, and speaker identity.

Background

[0002] The speech signal is the main medium for information transmission and communication between people. In everyday situations, a speaker's voice conveys not only semantic information but also information such as the speaker's emotional state, identity, geographical location, and gender. The speech signal we collect is therefore a mixture of many kinds of information. However, current speech recognition research focuses mainly on identifying a single kind of information, which is not conducive to understanding the true meaning of speech. Simultaneous recognition of mu...


Application Information


IPC(8): G10L15/22, G10L15/16, G10L15/06, G10L25/63

CPC: G10L15/063, G10L15/16, G10L15/22, G10L25/63

Inventor: 陈海霞, 杨震

Owner: NANJING UNIV OF POSTS & TELECOMM
