Method for improving overall classification accuracy of long-tail distribution speech based on transfer learning

A transfer learning technique, applied in the field of deep learning network training, that addresses the unbalanced data distribution of long-tail data sets and the insufficient representation ability of tail categories caused by their small amount of data, thereby improving overall classification accuracy.

Active Publication Date: 2021-04-09

TIANJIN UNIV

Problems solved by technology

The difficulty of handling long-tail distributed data lies in two aspects: the imbalance of the data distribution, and the insufficient representation ability of tail categories caused by their small data volume.

Embodiment

[0017] Feature extraction is performed on the original speech data in the long-tail distribution speech data set to obtain the logarithmic Mel features corresponding to the speech data.
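The feature-extraction step in [0017] can be illustrated with a minimal, standard-library-only sketch of logarithmic Mel features. The patent text does not give the sample rate, frame length, hop size, or number of Mel bands, so every default below (`sample_rate`, `frame_len`, `hop`, `n_mels`) and every function name is an assumption chosen for illustration, and the naive DFT is only meant for short toy signals.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum of one real-valued frame (bins 0..N/2)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the Mel scale."""
    top = hz_to_mel(sample_rate / 2)
    hz_points = [mel_to_hz(i * top / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_freqs = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = hz_points[m - 1], hz_points[m], hz_points[m + 1]
        bank.append([max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
                     for f in bin_freqs])
    return bank

def log_mel(signal, sample_rate=8000, frame_len=64, hop=32, n_mels=8, eps=1e-10):
    """Return one list of n_mels log-Mel energies per frame (all sizes are assumptions)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_len) for t in range(frame_len)]
    bank = mel_filterbank(n_mels, frame_len, sample_rate)
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(frame_len)]
        spec = power_spectrum(frame)
        # eps keeps the logarithm finite when a band carries no energy.
        feats.append([math.log(sum(w * p for w, p in zip(row, spec)) + eps) for row in bank])
    return feats

# Toy input: 0.1 s of a 1 kHz tone standing in for a speech recording.
signal = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(800)]
feats = log_mel(signal)
```

Each row of `feats` is one time frame, so the result is a frames-by-bands matrix of the kind a CNN could take as input. In practice an audio library would normally replace the hand-written DFT and filterbank.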

[0018] The upper part of Figure 1 shows the CNN being fitted to the logarithmic Mel features: the features are fed into the CNN to extract speech features, and training yields the network parameters $\theta_n = (w_n, b_n)$, where $n$ indexes the layers of the CNN, $w_n$ is the weight, $b_n$ is the bias, and $\theta_n$ denotes the parameters of layer $n$. The data used in this first model training follows a long-tail distribution. Further processing: the features obtained above are used for time-sequence modeling, which analyzes the timing information of the features;
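To make the per-layer parameters in [0018] concrete, the following is a rough sketch of a stack of convolution layers, each holding its own (w, b) pair, followed by a simple recurrent pass over the resulting sequence. The source does not specify layer types, kernel sizes, channel counts, or the kind of RNN, so the convolution over time with ReLU, the tanh recurrent cell, all shapes, and all names here are assumptions; the weights are random and untrained.

```python
import math
import random

def init_matrix(rng, *shape):
    """Nested lists of small random values with the given shape."""
    if len(shape) == 1:
        return [rng.uniform(-0.1, 0.1) for _ in range(shape[0])]
    return [init_matrix(rng, *shape[1:]) for _ in range(shape[0])]

def conv_layer(frames, layer_params):
    """One layer with parameters (w, b): convolution over time, then ReLU.
    frames: T x F, w: K x F x C, b: C  ->  output: (T - K + 1) x C"""
    w, b = layer_params
    k_size = len(w)
    out = []
    for t in range(len(frames) - k_size + 1):
        row = []
        for c in range(len(b)):
            s = b[c]
            for k in range(k_size):
                s += sum(w[k][f][c] * x for f, x in enumerate(frames[t + k]))
            row.append(max(0.0, s))
        out.append(row)
    return out

def rnn_final_state(seq, w_in, w_rec, b):
    """Simple tanh recurrent cell run over the sequence; returns the last hidden state."""
    h = [0.0] * len(b)
    for x in seq:
        h = [math.tanh(b[j]
                       + sum(w_in[i][j] * x[i] for i in range(len(x)))
                       + sum(w_rec[i][j] * h[i] for i in range(len(h))))
             for j in range(len(b))]
    return h

rng = random.Random(0)
n_mels, hidden = 8, 5  # assumed sizes

# theta[n] holds (w_n, b_n) for CNN layer n; two layers is an arbitrary choice.
theta = [
    (init_matrix(rng, 3, n_mels, 6), [0.0] * 6),
    (init_matrix(rng, 3, 6, 4), [0.0] * 4),
]

log_mel_frames = init_matrix(rng, 20, n_mels)  # placeholder for 20 frames of real features
features = log_mel_frames
for layer_params in theta:
    features = conv_layer(features, layer_params)

state = rnn_final_state(features,
                        init_matrix(rng, 4, hidden),
                        init_matrix(rng, hidden, hidden),
                        [0.0] * hidden)
```

Keeping the CNN parameters as a list of per-layer pairs makes it straightforward to single out the shallow layers later, which is what the transfer step relies on.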

[0019] The lower part of Figure 1 shows the transfer learning process: through the first model training, the model parameters θ of the C...

Abstract

The invention discloses a method for improving the overall classification accuracy of long-tail distribution speech based on transfer learning. First, an R-CNN model composed of a CNN and an RNN is built and trained on a data set with a long-tail distribution: the CNN extracts speech features, and the RNN performs time-sequence modeling of the features extracted by the CNN, further mining the speech information and extracting inter-class separable features for subsequent speech classification. The R-CNN model is then trained twice. In the first training, long-tail distributed data is used to obtain preliminary model parameters. In the second training, balanced data is used, and the shallow CNN parameters obtained in the first training are fixed and transferred to the second training. Speech classification prediction is then carried out with the model after the second training, improving the overall classification performance of the speech classification model.
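The two-stage schedule described in the abstract can be sketched on a toy problem. This is not the patented R-CNN: a tiny two-layer classifier on synthetic 2-D points stands in for it, with the first layer playing the role of the shallow CNN parameters. The data generator, layer sizes, learning rate, epoch count, and the choice to carry the remaining parameters over from the first stage and keep training them are all assumptions made for illustration.

```python
import copy
import math
import random

def forward(params, x):
    """Stand-in model: hidden = tanh(W1 x + b1), probs = softmax(W2 hidden + b2)."""
    (w1, b1), (w2, b2) = params["shallow"], params["head"]
    h = [math.tanh(b + sum(wi * xi for wi, xi in zip(row, x))) for row, b in zip(w1, b1)]
    logits = [b + sum(wi * hi for wi, hi in zip(row, h)) for row, b in zip(w2, b2)]
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return h, [e / total for e in exps]

def sgd_step(params, x, y, lr, frozen):
    """One cross-entropy SGD step; layers named in `frozen` are left untouched."""
    h, probs = forward(params, x)
    (w1, b1), (w2, b2) = params["shallow"], params["head"]
    d_logits = [p - (1.0 if k == y else 0.0) for k, p in enumerate(probs)]
    # Gradient w.r.t. the hidden layer, computed before the head is updated.
    d_h = [sum(d_logits[k] * w2[k][j] for k in range(len(w2))) for j in range(len(h))]
    if "head" not in frozen:
        for k, g in enumerate(d_logits):
            b2[k] -= lr * g
            for j in range(len(h)):
                w2[k][j] -= lr * g * h[j]
    if "shallow" not in frozen:
        for j, g in enumerate(d_h):
            g_pre = g * (1.0 - h[j] ** 2)  # derivative of tanh
            b1[j] -= lr * g_pre
            for i in range(len(x)):
                w1[j][i] -= lr * g_pre * x[i]

def train(params, data, epochs=20, lr=0.1, frozen=()):
    for _ in range(epochs):
        for x, y in data:
            sgd_step(params, x, y, lr, frozen)
    return params

def make_data(counts, rng):
    """Synthetic 2-D points around one centre per class; counts[k] samples of class k."""
    centres = [(0.0, 0.0), (2.0, 2.0), (-2.0, 2.0)]
    data = [([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)], k)
            for k, (cx, cy) in enumerate(centres) for _ in range(counts[k])]
    rng.shuffle(data)
    return data

def init_params(rng, n_in=2, n_hidden=4, n_classes=3):
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {"shallow": (mat(n_hidden, n_in), [0.0] * n_hidden),
            "head": (mat(n_classes, n_hidden), [0.0] * n_classes)}

rng = random.Random(0)
long_tail = make_data([200, 20, 5], rng)   # head class dominates
balanced = make_data([20, 20, 20], rng)    # equal counts per class

# First training: all parameters are learned on the long-tail data.
stage1 = train(init_params(rng), long_tail)
shallow_before = copy.deepcopy(stage1["shallow"])

# Second training: the shallow parameters are copied over and frozen; only the rest is updated.
stage2 = train(copy.deepcopy(stage1), balanced, frozen=("shallow",))
```

After the second stage, `stage2["shallow"]` is still identical to the parameters learned on the long-tail data, while `stage2["head"]` has been re-fitted on balanced data. In a deep learning framework the same effect is usually obtained by excluding the frozen layers from the optimizer or disabling their gradients.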

Description

Technical field

[0001] The invention belongs to the technical field of deep learning network training, and in particular relates to a method for improving the overall classification accuracy of long-tail distribution speech based on transfer learning.

Background

[0002] Speech classification is an important area of current deep learning research and has a wide range of commercial applications. However, most of the data sets currently used for model training present a long-tailed distribution, which is a special asymmetric distribution. Some categories, called head categories, contain a large amount of data, while the remaining categories, called tail categories, contain very little. Since the tail categories contain less data than the head categories, the classification result is biased towards the head categories, which skews the overall classification result. Often the information contained in the tail category has ...
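The head and tail categories described in the background can be identified by counting samples per class. The sketch below does that and then draws an equal number of samples per class as one possible way to obtain balanced data. The source does not say how head and tail are separated or how the balanced data is obtained, so the count `threshold`, the resampling strategy (undersampling head classes, oversampling tail classes with replacement), and the class names are all hypothetical.

```python
import random
from collections import Counter

def split_head_tail(labels, threshold):
    """Classes with at least `threshold` samples count as head, the rest as tail."""
    counts = Counter(labels)
    head = sorted(c for c, n in counts.items() if n >= threshold)
    tail = sorted(c for c, n in counts.items() if n < threshold)
    return counts, head, tail

def balanced_indices(labels, per_class, seed=0):
    """Pick `per_class` sample indices for every class."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    chosen = []
    for y, idx in sorted(by_class.items()):
        if len(idx) >= per_class:
            chosen += rng.sample(idx, per_class)                   # undersample without replacement
        else:
            chosen += [rng.choice(idx) for _ in range(per_class)]  # oversample with replacement
    return chosen

# Hypothetical long-tail label list: two large classes and two small ones.
labels = ["class_a"] * 50 + ["class_b"] * 30 + ["class_c"] * 4 + ["class_d"] * 2

counts, head, tail = split_head_tail(labels, threshold=10)
imbalance_ratio = max(counts.values()) / min(counts.values())

subset = balanced_indices(labels, per_class=5)
subset_counts = Counter(labels[i] for i in subset)
```

Here `imbalance_ratio` is the largest class count divided by the smallest, a common single-number summary of how long the tail is. Oversampling with replacement repeats tail samples, so it balances the counts without adding new information about those classes.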

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06F16/65, G06K9/62, G06N3/04, G06N3/08

CPC: G06F16/65, G06N3/08, G06N3/045, G06F18/2415, G06F18/214, Y02T10/40

Inventors: 谢宗霞, 王艳清

Owner: TIANJIN UNIV
