Voice sample screening method based on improved dynamic time warping algorithm

A voice sample screening technology based on dynamic time warping, applied in the field of data processing, that addresses the heavy workload, low efficiency, and high cost of manual screening.

Active Publication Date: 2020-05-19
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0005] Both corpus construction and deep neural network training require reasonable and correct speech samples. Checking by manual audition whether speech samples belong to the same text causes a huge workload and low efficiency. Especially for low-resource languages, such as the various dialects of Chinese, manual screening of such speech samples is difficult and costly.

Examples

Embodiment

[0066] Figure 1 shows a flow chart of a voice sample screening method based on an improved dynamic time warping algorithm. The method comprises the following steps:

[0067] (1) Record multiple speech samples based on the same text, remove the background sound, mark the vowels and consonants of the speech samples, and construct the speech feature sequence expression of each sample.
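A minimal sketch of how a labeled feature sequence of this kind might be represented, assuming the background sound has already been removed and vowel/consonant labels are available per short-time frame. The names `Frame` and `build_feature_sequence`, the frame length, the hop size, and the use of per-frame peak amplitude as the feature are all illustrative assumptions; this excerpt does not specify which features the method uses.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Frame:
    max_amplitude: float  # peak absolute amplitude in the frame (assumed feature)
    label: str            # "V" (vowel) or "C" (consonant), supplied by annotation


def build_feature_sequence(signal: Sequence[float],
                           labels: Sequence[str],
                           frame_len: int = 4,
                           hop: int = 2) -> List[Frame]:
    """Split a signal into overlapping short-time frames and pair each
    frame with its annotation label (one label per frame)."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    if len(labels) < len(starts):
        raise ValueError("need one label per frame")
    frames = []
    for idx, start in enumerate(starts):
        chunk = signal[start:start + frame_len]
        frames.append(Frame(max(abs(x) for x in chunk), labels[idx]))
    return frames


# Hypothetical toy signal with three frames labeled C, V, V.
toy_signal = [0.0, 0.1, -0.5, 0.2, 0.9, -0.3, 0.1, 0.0]
toy_sequence = build_feature_sequence(toy_signal, ["C", "V", "V"])
```

Keeping the label next to the feature value lets later steps treat frames differently depending on their annotation type.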

[0068] In this embodiment, recorded Jiangxi Hakka speech samples are used as the data set. The data set includes 115 speakers in total; each speaker records 672 sentences constructed from keywords, and each sentence is recorded once per speaker. 10 speakers are selected. Keywords are used as the detection targets, and the speech samples corresponding to sentences containing 10 keywords are selected to construct a speech sample set based on similar texts. The set is divided into a test set and a training set at a ratio of 3:7. After preprocessing the samples of each type of sample s...
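The 3:7 test/training division can be sketched as follows. The function name `split_test_train`, the shuffling step, and the fixed seed are assumptions made for illustration; the excerpt does not say how samples are assigned to each set.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_test_train(samples: Sequence[T],
                     test_ratio: float = 0.3,
                     seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of the samples and split it into (test, train)
    so that roughly test_ratio of the samples land in the test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[:n_test], shuffled[n_test:]


# Hypothetical usage with ten placeholder sample identifiers.
test_set, train_set = split_test_train(list(range(10)))
```

If the split has to hold within each keyword class, the same function could be applied to each class separately.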

Abstract

The invention discloses a voice sample screening method based on an improved dynamic time warping (DTW) algorithm. The method comprises the following steps: recording a plurality of voice samples based on the same text, removing background sounds, marking the vowels and consonants of the voice samples, and constructing a voice feature sequence expression for each sample; according to the vowel and consonant annotation types of the short-time voice frames after background sound removal, determining transition sounds from the change in the maximum signal amplitude within the short-time frames, and annotating the transition sounds; applying weighted calculation to the local distance and to the overall distance of the improved DTW algorithm to obtain the distance between every two samples, and constructing a distance matrix of all the samples; and screening the voice samples according to the distance matrix. The method solves the problem of screening voice samples of the same text when the sample data volume is large and sample quality cannot be guaranteed, reduces the screening cost, and provides more reliable sample data for subsequent processing (such as corpus construction and deep neural network learning).
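The distance and screening steps can be illustrated with a rough sketch. It is not the patented algorithm: the abstract does not give the weights, so `local_weight` (a penalty when frame labels differ), the length normalisation used as the overall weighting, and the mean-distance threshold rule in `screen` are all hypothetical choices. Frames are assumed to be `(feature, label)` pairs with labels `"V"`, `"C"`, or `"T"` (transition).

```python
from typing import List, Sequence, Tuple

Frame = Tuple[float, str]  # (feature value, label)


def local_weight(a: str, b: str) -> float:
    # Hypothetical weights: mismatched labels cost twice as much.
    return 1.0 if a == b else 2.0


def weighted_dtw(x: Sequence[Frame], y: Sequence[Frame]) -> float:
    """Classic DTW recursion with a label-weighted local distance,
    normalised by the combined length (assumed overall weighting)."""
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            (fx, lx), (fy, ly) = x[i - 1], y[j - 1]
            d = local_weight(lx, ly) * abs(fx - fy)
            cost[i][j] = d + min(cost[i - 1][j],      # step in x only
                                 cost[i][j - 1],      # step in y only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m] / (n + m)


def distance_matrix(samples: Sequence[Sequence[Frame]]) -> List[List[float]]:
    """Symmetric matrix of pairwise weighted DTW distances."""
    k = len(samples)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = weighted_dtw(samples[i], samples[j])
    return dist


def screen(dist: List[List[float]], threshold: float) -> List[int]:
    """Keep indices whose mean distance to the other samples is at most
    the threshold (an assumed screening rule)."""
    k = len(dist)
    if k < 2:
        return list(range(k))
    return [i for i in range(k) if sum(dist[i]) / (k - 1) <= threshold]


# Hypothetical toy data: b is a time-stretched copy of a, c differs.
a = [(0.1, "C"), (0.9, "V"), (0.2, "C")]
b = [(0.1, "C"), (0.9, "V"), (0.9, "V"), (0.2, "C")]
c = [(0.9, "V"), (0.1, "C"), (0.9, "V")]
kept = screen(distance_matrix([a, b, c]), threshold=0.3)
```

In this toy data the stretched copy aligns with the original at zero cost, while the sample with a different label pattern sits far from both and is the one the threshold rule drops.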

Description

Technical field

[0001] The invention relates to the technical field of data processing, and in particular to a voice sample screening method based on an improved dynamic time warping algorithm.

Background

[0002] With the rapid development of mobile portable devices and the Internet, voice samples can be obtained through multiple channels, and their volume grows daily. This large body of speech data provides a foundation for building corpora in various languages. At the same time, with the development of artificial intelligence, deep neural networks (DNN) have achieved remarkable results in the field of speech processing. In 2012, Hinton made new breakthroughs in speech recognition by using deep neural networks. Later research produced more network structures suited to the temporal nature of speech sequences, such as the recurrent neural network (RNN) and the long short-term memory recurrent neural network (Long Shor...

Application Information

IPC(8): G10L15/06, G10L15/05, G10L15/02, G10L15/08, G10L15/26, G06F40/117
CPC: G10L15/063, G10L15/05, G10L15/02, G10L15/08, G10L15/26, G10L2015/0631, G10L2015/022
Inventors: 贺前华, 詹俊瑶, 严海康, 苏健彬
Owner: SOUTH CHINA UNIV OF TECH