Interaction-oriented speech corpus processing method and device

A corpus processing method, applied in speech analysis, instruments, etc., that addresses the problems of a limited corpus, the limited recognition ability of speech emotion recognition models, and the difficulty of corpus collection, with the effect of increasing the quantity of corpus data and improving recognition ability.

Status: Inactive | Publication Date: 2018-03-30
HEFEI UNIV OF TECH +1
Cites: 2 | Cited by: 12

AI Technical Summary

Problems solved by technology

[0005] However, in the process of implementing the embodiments of the present invention, the inventors found that the recognition ability of existing speech emotion recognition models is limited, because the corpus available for training is small and appropriate corpus data is difficult to collect.



Examples


Detailed Description of the Embodiments

[0042] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, but not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.

[0043] In a first aspect, an embodiment of the present invention provides an interaction-oriented speech corpus processing method which, as shown in Figure 1, includes:

[0044] S101. Perform a short-time Fourier transform on the speech segment, sliding a preset window function over it, to obtain a time-frequency diagram of the speech segment;

[0045] That is to say, first perform short-time Fourier tran...
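Step S101 can be illustrated with a minimal sketch, which is not the patent's implementation. It assumes a Hann window as the "preset window function" and uses hypothetical values for the frame length, hop size, and sample rate, none of which are given in the text above. It uses a naive DFT from the standard library for self-containment; a real system would normally use an FFT library.

```python
import cmath
import math


def hann(n):
    """Hann window of length n (assumed choice of 'preset window function')."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return a time-frequency magnitude matrix.

    Rows are successive frames (time); columns are frequency bins
    0 .. frame_len // 2. frame_len and hop are illustrative defaults.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    spectrogram = []
    # Slide the window along the signal, one hop at a time.
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT of the windowed frame at frequency bin k.
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            row.append(abs(acc))
        spectrogram.append(row)
    return spectrogram


# Usage: a 1 kHz tone sampled at 8 kHz (hypothetical test signal).
# Bin width is 8000 / 64 = 125 Hz, so the energy should peak at bin 8.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = stft_magnitude(tone)
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
```

Each row of `spec` is one windowed frame, so the matrix as a whole plays the role of the time-frequency diagram that later steps consume.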



Abstract

Embodiments of the invention provide an interaction-oriented speech corpus processing method and device. The method includes the following steps: a speech segment is converted into a time-frequency graph; two convolution layers learn the features of the time-frequency graph to obtain a feature map matrix; a max pooling layer compresses the feature map matrix, and the compressed matrix is converted into a vector; two LSTM layers then learn from this vector, and the learned feature vector is used as the input corpus of an SVM. In this way, the amount of effective corpus data can be increased, which facilitates training of a speech emotion recognition model and improves its recognition ability.
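The front half of this pipeline (two convolution layers, max pooling, conversion to a vector) can be sketched as follows. This is a shape-level illustration under stated assumptions, not the patented model: the kernels are hypothetical fixed values rather than learned weights, ReLU is an assumed activation, only a single channel is shown, and the LSTM and SVM stages are omitted.

```python
def conv2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation followed by ReLU (ReLU is an assumption)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            max(0.0, sum(image[r + i][c + j] * kernel[i][j]
                         for i in range(kh) for j in range(kw)))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h, out_w = len(feature_map) // size, len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def flatten(matrix):
    """Convert a matrix to a vector by concatenating its rows."""
    return [v for row in matrix for v in row]


# Toy 8x8 "time-frequency graph" and hypothetical 3x3 kernels
# (a trained model would learn the kernel values).
tf_graph = [[float((r * c) % 5) for c in range(8)] for r in range(8)]
kernel_a = [[0.0, 1.0, 0.0], [1.0, 1.0, 1.0], [0.0, 1.0, 0.0]]
kernel_b = [[1.0 / 9.0] * 3 for _ in range(3)]

features = conv2d_valid(conv2d_valid(tf_graph, kernel_a), kernel_b)  # 8x8 -> 6x6 -> 4x4
pooled = max_pool(features)  # 4x4 -> 2x2
vector = flatten(pooled)     # vector of length 4
```

In the method as summarized, a vector like `vector` would then be passed through the two LSTM layers, and the resulting feature vector would be used as SVM input.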

Description

Technical field

[0001] The present invention relates to the technical field of software, and in particular to an interaction-oriented speech corpus processing method and device.

Background art

[0002] Speech emotion recognition shares characteristics with text and image recognition: its learning methods fall into three types, namely supervised, semi-supervised, and unsupervised learning. Most current speech emotion recognition methods first use traditional techniques to extract features such as speech rate, prosodic intensity, and Mel-frequency cepstral coefficients, and then classify them with classifiers such as support vector machines, Gaussian models, and hidden Markov models.

[0003] Such emotion recognition methods have achieved many results, but some shortcomings remain. In current speech emotion recognition, it is not clear which features have a greater impact on the classification of emotion...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L25/63, G10L25/30, G10L25/27
CPC: G10L25/27, G10L25/30, G10L25/63
Inventor: 孙晓, 曹馨月, 丁帅, 杨善林, 赵大平, 屈炎伟, 丁彬彬
Owner: HEFEI UNIV OF TECH