Voice data enhancement method

A speech data and spectrogram technology, applied in speech analysis, speech recognition, instruments, etc., to achieve the effect of expanding speech data sets.

Active Publication Date: 2019-02-15
UNIV OF ELECTRONIC SCI & TECH OF CHINA
6 Cites, 13 Cited by

AI Technical Summary

Problems solved by technology

[0008] The purpose of the present invention is to provide, in view of the above-mentioned problems, a data enhancement method for speech-related machine learning models. The method ensures that the machine learning model can train on and analyze the spectrogram of speech. New data is synthesized from the original training data, so that both the amount and the form of the data are expanded on the basis of the original training data. With more data available, the machine learning model can be fully trained, and the user can try more complex machine learning models to fit speech-related problems without being constrained by the relationship between the amount of data and the number of model parameters.

Embodiment Construction

[0027] In order to make the purpose, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the implementation methods and accompanying drawings.

[0028] In today's machine learning tasks, problems that take speech as the modeling object are common, with applications in related fields such as speech recognition, speech sentiment analysis, and speaker recognition. A structured representation of the speech (most commonly its spectrogram) is used as input, and the machine learning model is trained so that input speech is mapped to the output the task requires. Common applications include search, smartphones, and web browsing. Training the learning model is therefore often the most important step in speech-related machine learning tasks, which also means that the performance of the machine learning model often has a strong relationship with the qua...
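As a rough illustration of the spectrogram input mentioned in paragraph [0028], the sketch below computes a log-magnitude spectrogram with a short-time Fourier transform in NumPy. The function name `log_spectrogram`, the frame length, the hop size, the Hann window, and the synthetic 440 Hz test tone are all assumptions made for this example; the patent text shown here does not specify how its spectrograms are computed.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram via a short-time Fourier transform.

    frame_len and hop are illustrative defaults, not values from the patent.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Transpose so rows are frequency bins and columns are time frames.
    return np.log1p(magnitude).T

# Synthetic stand-in for a speech recording: one second of a 440 Hz tone.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

spec = log_spectrogram(tone)
peak_bin = int(np.argmax(spec.mean(axis=1)))
print(spec.shape, peak_bin)
```

The resulting 2-D array (frequency by time) is the kind of input a speech model would consume; real pipelines often apply a mel filter bank on top of it, which is omitted here.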

Abstract

The invention discloses a voice data enhancement method and belongs to the technical field of voice data enhancement during machine learning processing. In the method, multiple autoencoders are trained on the spectrograms of the voice data in a training set: each autoencoder is trained separately on the spectrograms of the to-be-enhanced voice data set, so that N autoencoders based on the to-be-enhanced voice data are obtained. These structurally different autoencoders are used to encode and represent the original data, so that once a spectrogram of the to-be-enhanced voice data is input, multiple sets of structurally differentiated output spectrograms are obtained. Finally, the output spectrograms are merged to obtain newly generated voice spectrogram data that can be used for training. Compared with the input data, the new data stays consistent in its main structure but differs in some structural features. The advantage of the method is that it improves the performance of machine learning models based on voice data.
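The pipeline described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say what the autoencoder structures are or how the outputs are merged, so the example assumes one-hidden-layer autoencoders that differ only in hidden size, and assumes element-wise averaging as the merge rule. The names `train_autoencoder`, `reconstruct`, and `augment` are hypothetical, and random arrays stand in for flattened spectrograms.

```python
import numpy as np

def train_autoencoder(X, hidden, epochs=200, lr=0.01, seed=0):
    """Fit a one-hidden-layer autoencoder (tanh encoder, linear decoder)
    to the rows of X by plain gradient descent on mean squared error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden))   # encoder weights
    W2 = rng.normal(0.0, 0.1, (hidden, d))   # decoder weights
    for _ in range(epochs):
        H = np.tanh(X @ W1)                  # encoded representation
        err = H @ W2 - X                     # reconstruction error
        grad_W2 = H.T @ err / n
        grad_H = (err @ W2.T) * (1.0 - H ** 2)
        grad_W1 = X.T @ grad_H / n
        W1 -= lr * grad_W1
        W2 -= lr * grad_W2
    return W1, W2

def reconstruct(X, params):
    """Pass X through a trained autoencoder."""
    W1, W2 = params
    return np.tanh(X @ W1) @ W2

def augment(X, hidden_sizes=(8, 16, 32)):
    """Train N structurally different autoencoders on the same data and
    merge their outputs. Averaging is an assumed merge rule."""
    models = [train_autoencoder(X, h, seed=i) for i, h in enumerate(hidden_sizes)]
    outputs = [reconstruct(X, m) for m in models]
    return np.mean(outputs, axis=0)

# Toy stand-in for a to-be-enhanced set: 20 flattened spectrograms of 64 values.
X = np.random.default_rng(0).random((20, 64))
X_new = augment(X)
print(X_new.shape)
```

Each row of `X_new` is a synthesized sample with the same shape as the corresponding input row; in the setting the abstract describes, such rows would be added to the original training set rather than replace it.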

Description

Technical field

[0001] The invention relates to the technical field of machine learning, in particular to the technical field of speech data enhancement during machine learning processing.

Background

[0002] One of the biggest difficulties in today's machine learning tasks is that models are hard to train on small data sets. However, because of the particularities of some non-natural scenarios (business scenarios are highly time-sensitive, category labeling logic is difficult, and data label settings are highly subjective), collecting and labeling data is also very difficult work. For more common deep learning tasks such as image and text analysis, data collection is relatively feasible: millions of pictures and documents can be downloaded from the Internet, processed, and labeled. However, for speech data, not only is it difficult to collect and preprocess it, but also for a randomly collected speech, it is resource-intensive a...

Application Information

IPC(8): G10L15/06, G10L15/08, G10L19/00, G10L19/02, G10L19/24, G10L25/27
CPC: G10L15/063, G10L15/083, G10L19/02, G10L19/24, G10L25/27, G10L19/00
Inventor: 王锐, 罗光春, 田玲, 张栗粽, 陈琢
Owner: UNIV OF ELECTRONIC SCI & TECH OF CHINA