
PCNN spectrogram feature integration based emotion voice recognition system

A speech-recognition and feature-fusion technology, applied in speech analysis, character and pattern recognition, instruments, and related fields, addressing the scarcity of research on combining the time-domain and frequency-domain correlation of speech signals.

Inactive Publication Date: 2018-03-27
TAIYUAN UNIV OF TECH


Problems solved by technology

Among these features, the time-domain characteristics and frequency-domain characteristics of the speech signal play an important role, but there are relatively few studies on the combination of the time-domain and frequency-domain correlation of the speech signal.



Examples


Embodiment Construction

[0087] The present invention uses the Windows 7 system as the program development environment, MATLAB R2010a as the development platform, and the German Berlin emotional speech database as the experimental data. The database was recorded by 10 different speakers, 5 men and 5 women, and covers 7 emotions: calm (neutral), fear, disgust, joy, boredom, sadness, and anger, for a total of 800 sentences. In this work, 494 sentences are selected to form the experimental database. The sentences of 5 speakers are used as the training set; from the remaining sentences, 30 per emotion are selected, giving a test set of 210 sentences.
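The speaker-based train/test split described above can be sketched as follows. This is a minimal illustration, not the patent's code: the corpus representation, emotion labels, and random sampling are assumptions, since the excerpt does not say how the 30 test sentences per emotion are chosen.

```python
import random

# Hypothetical corpus layout: (speaker_id, emotion, utterance) tuples standing
# in for the 494 Berlin-database sentences used in the experiments.
EMOTIONS = ["neutral", "fear", "disgust", "joy", "boredom", "sadness", "anger"]

def split_corpus(corpus, train_speakers, per_emotion=30, seed=0):
    """All sentences from `train_speakers` form the training set; up to
    `per_emotion` sentences per emotion are drawn from the rest as the test set."""
    rng = random.Random(seed)
    train = [u for u in corpus if u[0] in train_speakers]
    rest = [u for u in corpus if u[0] not in train_speakers]
    test = []
    for emo in EMOTIONS:
        pool = [u for u in rest if u[1] == emo]
        test.extend(rng.sample(pool, min(per_emotion, len(pool))))
    return train, test
```

With 5 of 10 speakers held out for training and 30 sentences per emotion sampled from the rest, the test set has 7 × 30 = 210 sentences, matching the figure in the text.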

[0088] The speech signal s(n) is windowed; the window function adopted by the present invention is the Hamming window w(n):

[0089] w(n) = 0.54 − 0.46·cos(2πn/(N−1)), 0 ≤ n ≤ N−1, where N is the window length

[0090] The speech signal s(n) is multiplied by the window function w(n) to form the windowed speech signal x(n):

[0091] x(n) = s(n)·w(n)

[0092] The win...
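The windowing and framing steps above, followed by the DFT that produces the spectrogram, can be sketched in a few lines. This is a minimal sketch, not the patent's implementation: the frame length of 256 samples and 50% overlap are assumptions, since the excerpt does not state the frame parameters.

```python
import numpy as np

def frame_and_window(s, frame_len=256, hop=128):
    """Split signal s(n) into overlapping frames and multiply each frame by
    the Hamming window w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1))."""
    n = np.arange(frame_len)
    w = 0.54 - 0.46 * np.cos(2 * np.pi * n / (frame_len - 1))
    num_frames = 1 + (len(s) - frame_len) // hop
    frames = np.stack([s[i * hop : i * hop + frame_len] for i in range(num_frames)])
    return frames * w  # x(n) = s(n)·w(n), applied per frame

def magnitude_spectrogram(s, frame_len=256, hop=128):
    """DFT magnitude of each windowed frame (one column of the spectrogram)."""
    return np.abs(np.fft.rfft(frame_and_window(s, frame_len, hop), axis=1))
```

Each row of the returned array is the magnitude spectrum of one frame; stacking them over time gives the spectrogram image that the PCNN then processes.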



Abstract

The invention relates to the technical field of voice recognition and provides a PCNN spectrogram feature integration based emotion voice recognition system. Windowing and framing are performed on the voice signal, a discrete Fourier transform is applied, and a spectrogram of the signal is drawn. A PCNN model is constructed and the spectrogram is processed through the pulse coupled neural network. The PCNN map is convolved with 5-scale, 8-direction Gabor wavelets and Gabor amplitude features are extracted, yielding 40 Gabor spectrograms. Uniform-mode LBP features are extracted from each Gabor spectrogram, and a feature vector is obtained by cascading the histograms of the 40 Gabor spectrograms. A local binary pattern feature and a local Hu moment feature are thus extracted; after the two features are fused, a support vector machine is used for classification and recognition.
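The Gabor-plus-uniform-LBP stage of the pipeline above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the patent's code: the Gabor frequency/bandwidth schedule, kernel size, and input image size are all hypothetical, and the PCNN step is omitted (the input is treated as an already-processed spectrogram image).

```python
import numpy as np

def conv2_same(img, ker):
    """'Same'-size 2-D convolution via FFT; output kept complex for Gabor magnitude."""
    H, W = img.shape
    kh, kw = ker.shape
    F = np.fft.fft2(img, s=(H + kh - 1, W + kw - 1)) * \
        np.fft.fft2(ker, s=(H + kh - 1, W + kw - 1))
    full = np.fft.ifft2(F)
    r0, c0 = (kh - 1) // 2, (kw - 1) // 2
    return full[r0:r0 + H, c0:c0 + W]

def gabor_kernel(scale, theta, size=15):
    """Complex Gabor kernel; the frequency and sigma schedules are illustrative."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    f = 0.25 / (np.sqrt(2) ** scale)          # centre frequency per scale (assumed)
    sigma = 2.0 * (np.sqrt(2) ** scale)       # Gaussian envelope width (assumed)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.exp(1j * 2 * np.pi * f * xr)

def uniform_lbp_hist(img):
    """8-neighbour LBP with uniform-pattern binning: 58 uniform codes + 1 catch-all."""
    def transitions(p):
        bits = [(p >> i) & 1 for i in range(8)]
        return sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    lut = np.full(256, 58)                    # non-uniform patterns -> last bin
    for i, p in enumerate(q for q in range(256) if transitions(q) <= 2):
        lut[p] = i
    c = img[1:-1, 1:-1]
    code = np.zeros(c.shape, dtype=int)
    for k, (dy, dx) in enumerate([(-1,-1), (-1,0), (-1,1), (0,1),
                                  (1,1), (1,0), (1,-1), (0,-1)]):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= (nb >= c).astype(int) << k
    hist = np.bincount(lut[code].ravel(), minlength=59).astype(float)
    return hist / hist.sum()

def gabor_lbp_features(spectrogram):
    """Convolve with a 5-scale, 8-orientation Gabor bank, take amplitudes,
    and cascade the 40 uniform-LBP histograms (40 * 59 = 2360 dimensions)."""
    feats = []
    for s in range(5):
        for o in range(8):
            mag = np.abs(conv2_same(spectrogram.astype(complex),
                                    gabor_kernel(s, o * np.pi / 8)))
            feats.append(uniform_lbp_hist(mag))
    return np.concatenate(feats)
```

The 5 scales × 8 orientations give the 40 Gabor spectrograms named in the abstract, and cascading the per-image histograms yields the final feature vector that is fused with the Hu moment features and fed to the SVM.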

Description

Technical Field

[0001] The invention relates to the technical field of speech recognition.

Background

[0002] With the rapid development of information technology, more and more attention has been paid to human-computer interaction. Emotional speech recognition, as a key technology of human-computer interaction, has become the focus of research in this field. Emotional speech recognition is a speech recognition technology in which a computer extracts and analyzes the emotional information in the human voice to make judgments about a person's emotional state. It is widely used in many fields such as business, medical care, and education.

[0003] At present, the acoustic features used for speech emotion recognition can be roughly grouped into prosodic features, spectrum-based correlation features, and timbre features. Prosodic features distinguish speech emotion through attributes such as duration, fundamental frequency, and energy, and their emotion recognition ability has bee...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G10L25/63, G06K9/62
CPC: G10L25/63, G06F18/2135, G06F18/2411, G06F18/253
Inventors: 白静, 郭倩岩, 闫建政
Owner: TAIYUAN UNIV OF TECH