Emotional speech processing

An emotional speech processing technology, applied in speech analysis, speech recognition, instruments, etc., that addresses problems such as the poor performance of statistical sound recognition models on emotional speech and the ambiguity and indistinguishability of emotion classes.

Active Publication Date: 2016-05-11

SONY COMPUTER ENTERTAINMENT INC


Problems solved by technology

Processing emotional speech is very challenging.

For example, the characteristics of emotional speech differ significantly from those of spoken / conversational speech, so statistical sound recognition models trained on spoken speech do not perform well when applied to emotional speech.

In addition, emotion recognition is difficult because different speakers express their emotions in different ways, which makes the emotion classes ambiguous and hard to distinguish.

Embodiment

[0017] According to aspects of the present disclosure, the emotion clustering method may be based on Probabilistic Linear Discriminant Analysis (PLDA). For example, each emotional utterance can be modeled as a Gaussian Mixture Model (GMM) mean supervector. FIG. 1 shows an example of generating a GMM supervector (GMMSV). Initially, one or more speech signals 101 are received. Each speech signal 101 may be any segment of human speech. By way of example and not limitation, the signal 101 may comprise single syllables, words, sentences, or any combination of these. By way of example and not limitation, the speech signal 101 may be captured with a local microphone or received over a network, recorded, digitized and/or stored in computer memory or other non-transitory storage medium. Afterwards, the speech signal 101 can be used for PLDA model training and/or for emotion clustering or emotion classification. In some embodiments, the speech signal used for PLDA model trai...
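
As a rough illustration of modeling an utterance as a GMM mean supervector, the sketch below adapts the component means of a pre-trained diagonal-covariance GMM to the frames of one utterance and concatenates the adapted means. The adaptation rule (relevance MAP), the function names, the `relevance` parameter, and the toy background model are all assumptions made for illustration; the text shown here does not specify how the supervector is computed.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gmm_mean_supervector(frames, weights, means, variances, relevance=16.0):
    """MAP-adapt the GMM means to `frames` and stack them into one vector.

    `relevance` is a hypothetical smoothing factor: larger values keep the
    adapted means closer to the prior (background model) means.
    """
    n_comp, dim = len(means), len(means[0])
    occupancy = [0.0] * n_comp                          # soft frame count per component
    first_order = [[0.0] * dim for _ in range(n_comp)]  # posterior-weighted frame sums

    for x in frames:
        log_p = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        peak = max(log_p)  # subtract the max before exponentiating for stability
        unnorm = [math.exp(lp - peak) for lp in log_p]
        total = sum(unnorm)
        for c in range(n_comp):
            post = unnorm[c] / total
            occupancy[c] += post
            for d in range(dim):
                first_order[c][d] += post * x[d]

    supervector = []
    for c in range(n_comp):
        denom = occupancy[c] + relevance
        alpha = occupancy[c] / denom if denom > 0 else 0.0
        for d in range(dim):
            if occupancy[c] > 0:
                data_mean = first_order[c][d] / occupancy[c]
            else:
                data_mean = means[c][d]  # no evidence: fall back to the prior mean
            supervector.append(alpha * data_mean + (1.0 - alpha) * means[c][d])
    return supervector


# Toy two-component background model over 2-dimensional features (made up).
ubm_weights = [0.5, 0.5]
ubm_means = [[0.0, 0.0], [5.0, 5.0]]
ubm_vars = [[1.0, 1.0], [1.0, 1.0]]
utterance_frames = [[1.0, 1.0]] * 16
gmmsv = gmm_mean_supervector(utterance_frames, ubm_weights, ubm_means, ubm_vars)
```

The resulting vector has one entry per component per feature dimension (here 2 x 2 = 4), so utterances of different lengths map to vectors of the same size, which is what makes them usable as fixed-length inputs to a model such as PLDA.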


Abstract

The invention relates to emotional speech processing. A method for emotion or speaking style recognition and/or clustering comprises receiving one or more speech samples, generating a set of training data by extracting one or more acoustic features from every frame of the one or more speech samples, and generating a model from the set of training data, wherein the model identifies emotion or speaking style dependent information in the set of training data. The method may further comprise receiving one or more test speech samples, generating a set of test data by extracting one or more acoustic features from every frame of the one or more test speech samples, transforming the set of test data using the model to better represent emotion/speaking style dependent information, and using the transformed data for clustering and/or classification to discover speech with similar emotion or speaking style. It is emphasized that this abstract is provided to comply with the rules requiring an abstract that will allow a searcher or other reader to quickly ascertain the subject matter of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.
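
The per-frame feature extraction step can be sketched as follows. The abstract does not name the acoustic features, so log energy and zero-crossing rate are used here purely as stand-ins; the frame length, hop size, and the `frame_features` helper are likewise hypothetical choices, not values taken from the patent.

```python
import math


def frame_features(samples, frame_len=400, hop=160):
    """Return one [log_energy, zero_crossing_rate] vector per frame.

    With 16 kHz audio, the assumed defaults correspond to 25 ms frames
    taken every 10 ms.
    """
    features = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        log_energy = math.log(energy + 1e-10)  # small floor avoids log(0) on silence
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        features.append([log_energy, crossings / (frame_len - 1)])
    return features


# Toy input: 0.1 s of a 200 Hz sine sampled at 16 kHz (a made-up test signal).
rate = 16000
signal = [math.sin(2 * math.pi * 200 * n / rate) for n in range(1600)]
train_data = frame_features(signal)
```

The same function would be applied to both the training samples and the test samples, so that the model built from the training data and the transformation applied to the test data operate on the same feature space.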

Description

Related applications

[0002] This application claims priority to commonly assigned US Provisional Patent Application No. 62/030,013, filed July 28, 2014, the entire disclosure of which is incorporated herein by reference. This application also claims priority to commonly assigned US Patent Application Serial No. 14/743,673, filed June 18, 2015, the entire disclosure of which is incorporated herein by reference.

Technical field

[0003] The present disclosure relates to speech processing, and more particularly to emotional speech processing.

Background

[0004] Emotional speech processing is important for many applications including user interfaces, games and more. However, processing emotional speech is very challenging. For example, emotional speech characteristics are significantly different from spoken / conversational speech, and thus statistical sound recognition models trained with spoken speech do not perform well when encountered with emotional spee...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L15/06, G10L15/18, G10L25/63, G10L25/27

CPC: G10L15/07, G10L17/26, G10L25/63, G10L15/063

Inventor: O. Kalinli-Akbacak, Ruxin Chen

Owner: SONY COMPUTER ENTERTAINMENT INC
