
Method for improving overall classification accuracy of long-tail distribution speech based on transfer learning

A transfer learning technique applied in the field of deep learning network training. It addresses the insufficient representation ability of tail categories, unbalanced data distribution, and insufficient tail-category data, with the aim of improving overall classification accuracy.

Active Publication Date: 2021-04-09
TIANJIN UNIV

AI Technical Summary

Problems solved by technology

Long-tail distribution data is difficult to handle for two reasons: the data distribution is imbalanced, and the tail categories are poorly represented because they contain too little data.



Examples


Embodiment

[0017] Feature extraction is performed on the raw speech data in the long-tail distribution speech data set to obtain the logarithmic Mel (log-Mel) features of each utterance.
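The feature extraction step can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption rather than something the source states: the HTK-style mel formula, the Hann window, the frame length, hop size, number of mel bands, and the helper names `mel_filterbank`, `power_spectrum`, and `log_mel`. A real pipeline would normally use an FFT-based audio library; the naive DFT below is only there to keep the example self-contained.

```python
import cmath
import math

def hz_to_mel(hz):
    # HTK-style mel scale (assumed; the source does not name a variant).
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters over the n_fft // 2 + 1 DFT bins."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge points equally spaced on the mel scale, mapped back to Hz.
    edges_hz = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank

def power_spectrum(frame):
    """Naive DFT power spectrum of one Hann-windowed frame."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def log_mel(signal, sample_rate, n_fft=64, hop=32, n_mels=8, eps=1e-10):
    """Return one list of n_mels log-energies per frame."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    features = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        power = power_spectrum(signal[start:start + n_fft])
        # eps keeps the logarithm finite for empty bands.
        features.append([math.log(sum(w * p for w, p in zip(row, power)) + eps)
                         for row in bank])
    return features

# Synthetic 440 Hz tone standing in for a speech recording.
sample_rate = 8000
tone = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate) for t in range(256)]
feats = log_mel(tone, sample_rate)
print(len(feats), len(feats[0]))
```

With 256 samples, a 64-sample frame, and a 32-sample hop, the sketch yields 7 frames of 8 log-Mel values each. That frames-by-bands matrix is the kind of input a CNN would consume.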

[0018] The upper part of Figure 1 shows the CNN network being fitted to the log-Mel features: the log-Mel features are fed to the CNN network, which extracts speech features, and training yields the network parameters θ_n = (w_n, b_n), where n denotes the CNN layer number, w_n is the weight, b_n is the bias, and θ_n denotes the parameters of that layer. The data used in this first model training follows a long-tail distribution. The extracted features are then used for temporal modeling, which analyzes their timing information.

[0019] The lower part of Figure 1 shows the transfer learning process: through the first model training, the model parameters θ of the C...
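The transfer step summarized in the abstract (fix the shallow CNN parameters obtained in the first training and carry them into the second) can be sketched in a few lines. This is a toy illustration, not the patented implementation: the layer names, the choice of `cnn1` as the only "shallow" layer, the flattened parameter lists, and the stand-in gradients are all hypothetical.

```python
import copy

def init_stage_two(stage_one_params, frozen_layers):
    """Copy stage-one parameters and mark the shallow layers as frozen."""
    params = copy.deepcopy(stage_one_params)
    trainable = {name: name not in frozen_layers for name in params}
    return params, trainable

def sgd_step(params, grads, trainable, lr=0.1):
    """Plain SGD update that skips every frozen layer."""
    for name, values in params.items():
        if not trainable[name]:
            continue  # frozen: keep the transferred stage-one values
        params[name] = [v - lr * g for v, g in zip(values, grads[name])]
    return params

# Hypothetical stage-one parameters theta_n = (w_n, b_n), flattened per layer.
stage_one = {
    "cnn1": [0.5, -0.2],        # shallow CNN layer (assumed frozen in stage two)
    "cnn2": [0.1, 0.3],         # deeper CNN layer
    "rnn": [0.7, 0.0],          # temporal model
    "classifier": [-0.4, 0.9],  # output layer
}

params, trainable = init_stage_two(stage_one, frozen_layers={"cnn1"})
grads = {name: [1.0, 1.0] for name in params}  # stand-in gradients
params = sgd_step(params, grads, trainable)
```

After the update, `params["cnn1"]` still holds the stage-one values while every other layer has moved, which is the behaviour "fixing and migrating" implies. In a deep learning framework the same effect is usually achieved by excluding the frozen parameters from the optimizer or disabling their gradients.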



Abstract

The invention discloses a method for improving the overall classification accuracy of long-tail distribution speech based on transfer learning. First, an R-CNN model composed of a CNN network and an RNN network is built by training on a data set that follows a long-tail distribution. The CNN network extracts speech features, and the RNN network performs temporal modeling of the features extracted by the CNN network, further mining the speech information and extracting inter-class separable features for subsequent speech classification. The R-CNN model is then trained twice. In the first training, long-tail distributed data is used and preliminary model parameters are obtained. In the second training, balanced data is used, and the shallow CNN parameters obtained in the first training are fixed and transferred to the second training. Speech classification prediction is carried out with the model obtained from the second training, which improves the overall classification performance of the speech classification model.
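The abstract does not say how the balanced data for the second training is obtained. One common option, shown here purely as an assumption, is to undersample every class to the size of the rarest class. The function name `balanced_subset`, the class names, and the counts are hypothetical.

```python
import random
from collections import Counter, defaultdict

def balanced_subset(samples, labels, seed=0):
    """Undersample every class to the size of the rarest class."""
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    per_class = min(len(items) for items in by_class.values())
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    subset = []
    for label in sorted(by_class):
        for sample in rng.sample(by_class[label], per_class):
            subset.append((sample, label))
    rng.shuffle(subset)
    return subset

# Hypothetical long-tail label counts: one head class and two tail classes.
labels = ["head"] * 100 + ["tail_a"] * 10 + ["tail_b"] * 5
samples = list(range(len(labels)))  # stand-ins for utterance IDs

stage_two_data = balanced_subset(samples, labels)
balanced_counts = Counter(label for _, label in stage_two_data)
```

Here each class ends up with 5 examples, so the second training sees a uniform class distribution. Undersampling discards head-class data; oversampling the tail classes or class-weighted sampling are alternatives with different trade-offs.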

Description

Technical field

[0001] The invention belongs to the technical field of deep learning network training, and in particular relates to a method for improving the overall classification accuracy of long-tail distribution speech based on transfer learning.

Background

[0002] Speech classification is an important area of deep learning and has wide commercial application value. However, most of the data sets currently used for model training follow a long-tailed distribution, a special asymmetric distribution: some categories, called head categories, contain a large amount of data, while the remaining categories, called tail categories, contain very little. Because the tail categories contain less data than the head categories, the classification result is biased towards the head categories, which skews the overall classification result. Often the information contained in the tail category has ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/65, G06K9/62, G06N3/04, G06N3/08
CPC: G06F16/65, G06N3/08, G06N3/045, G06F18/2415, G06F18/214, Y02T10/40
Inventor: 谢宗霞, 王艳清
Owner: TIANJIN UNIV