Real-time speech emotion recognition method and device thereof
A real-time speech emotion recognition technology, applied in the field of signal processing, that addresses problems such as heavy computation, inability to run in real time, and low system generalization ability, and achieves the effects of mitigating the over-fitting problem and requiring only a small amount of calculation.
Examples
Embodiment 1
[0071] Referring to Figure 1, this embodiment of the present invention provides a real-time speech emotion recognition method comprising the following steps:
[0072] Step 1: Extraction of Mel Spectrum
[0073] After the original speech signal is pre-emphasized, it is processed with a sliding Hamming window of 25 ms (recommended value) and a step size of 15 ms. Each sample frame is passed through a Fast Fourier Transform (FFT) and a Mel filter bank, and the Mel frequency of each sample frame is obtained by the following conversion formula between Mel frequency and Hertz-scale frequency:
[0074]
[0075] where m is the Mel frequency and f is the Hertz scale frequency.
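The conversion formula itself did not survive in this copy of the text. As an illustration only, the sketch below uses the widely used convention m = 2595 · log10(1 + f / 700) and its inverse. Whether the patent uses exactly these constants is an assumption, and the function names hz_to_mel and mel_to_hz are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Convert a Hertz-scale frequency f to a Mel frequency m.

    Assumed convention (not confirmed by the source text):
    m = 2595 * log10(1 + f / 700).
    """
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m_mel):
    """Inverse of hz_to_mel under the same assumed convention."""
    return 700.0 * (10.0 ** (m_mel / 2595.0) - 1.0)


if __name__ == "__main__":
    # Under this convention, 1000 Hz maps to roughly 1000 Mel.
    for f in (100.0, 1000.0, 4000.0):
        print(f, hz_to_mel(f), mel_to_hz(hz_to_mel(f)))
```

Under this convention the mapping is close to linear at low frequencies and logarithmic at higher ones, which is why equally spaced points on the Mel scale give narrower filters at low Hertz frequencies.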
[0076] The center frequency of the Mel filter bank can be expressed as:
[0077]
[0078] where f(l) denotes the center frequency of the Mel filter bank l on the Hertz scale, m_l is the lower bound of the Mel filter bank l on the Mel scale, m_{l+1} is the lower bound of the Mel filter b...
Embodiment 2
[0141] Referring to Figure 2, the present invention also provides a real-time speech emotion recognition device, mainly comprising the following modules:
[0142] Mel spectrum extraction module 1, formant extraction and registration module 2, syllable segmentation and statistics module 3, syllable emotion classification module 4, and sentence-level confidence aggregation module 5;
[0143] In some embodiments, the Mel spectrum extraction module 1 specifically includes:
[0144] A pre-emphasis module, used to perform pre-emphasis processing on the original speech signal to obtain a pre-emphasized signal;
[0145] A frame-based windowing and Fourier transform module, used to perform frame-based windowing and Fourier transform processing on the pre-emphasized signal to obtain a transformed signal;
[0146] A Mel filtering module, used to process the transformed signal through a Mel filter bank to obtain the Mel frequency of each sampling frame;
[0147] The adjacent frame conn...
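The pre-emphasis module of [0144] and the frame-based windowing and Fourier transform module of [0145] can be sketched as plain functions, as below. This is a minimal illustration, not the patented implementation: the 25 ms window and 15 ms step come from the text, while the pre-emphasis coefficient 0.97, the naive O(N^2) DFT standing in for an FFT, and all function names are assumptions made for the example.

```python
import cmath
import math


def pre_emphasize(signal, alpha=0.97):
    """First-order pre-emphasis y[n] = x[n] - alpha * x[n-1].

    alpha = 0.97 is an assumed, commonly used value; the source gives none.
    """
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def frame_and_window(signal, sample_rate, win_ms=25.0, step_ms=15.0):
    """Split the signal into overlapping frames and apply a Hamming window."""
    win = int(round(sample_rate * win_ms / 1000.0))
    step = int(round(sample_rate * step_ms / 1000.0))
    if win < 2 or step < 1:
        raise ValueError("window and step must span at least 2 and 1 samples")
    hamming = [
        0.54 - 0.46 * math.cos(2.0 * math.pi * n / (win - 1)) for n in range(win)
    ]
    frames = []
    # Only full frames are kept; a trailing partial frame is dropped.
    for start in range(0, len(signal) - win + 1, step):
        frames.append([signal[start + n] * hamming[n] for n in range(win)])
    return frames


def power_spectrum(frame):
    """One-sided power spectrum via a naive DFT (stand-in for an FFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


if __name__ == "__main__":
    sample_rate = 8000  # assumed sample rate for the demo
    tone = [math.sin(2.0 * math.pi * 1000.0 * n / sample_rate)
            for n in range(sample_rate)]
    frames = frame_and_window(pre_emphasize(tone), sample_rate)
    spec = power_spectrum(frames[1])
    print(len(frames), len(frames[0]), spec.index(max(spec)))
```

The output of power_spectrum is what a Mel filtering module such as the one in [0146] would consume. In practice an FFT routine from a numerical library would replace the naive DFT, which is used here only to keep the sketch dependency-free.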