An automatic speech overlap detection method based on a deep neural network

Deep neural network and automatic speech technology, applied in speech analysis, instruments, etc. It addresses the lack of deep neural network applications in automatic speech overlap detection and achieves accurate detection results.

Active Publication Date: 2017-05-03
INST OF ACOUSTICS CHINESE ACAD OF SCI +1

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to overcome the lack of deep neural network applications in the field of auto...

Embodiment Construction

[0030] The present invention is further described below with reference to the accompanying drawings.

[0031] The automatic speech overlap detection method of the present invention introduces a DNN model and combines it with the Viterbi algorithm to determine whether a segmented speech segment contains overlapping speech and the time points at which the overlap occurs.
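As a rough illustration of how per-frame DNN outputs and Viterbi decoding could fit together, the sketch below decodes a sequence of three-state frame probabilities into a smoothed state path and then reads off the time spans where more than one person is speaking. It is a minimal sketch under stated assumptions, not the patent's procedure: the function names, the 10 ms frame shift, the self-transition probability of 0.9 and the uniform switching probabilities are all hypothetical values chosen for the example.

```python
import math

# State order is an assumption made for this sketch.
NON_SPEECH, SINGLE, OVERLAP = 0, 1, 2


def viterbi(posteriors, stay_prob=0.9):
    """Return the most likely state index per frame.

    posteriors: one [p_non_speech, p_single, p_overlap] list per frame.
    stay_prob: assumed self-transition probability (illustrative only).
    """
    n = 3
    switch = (1.0 - stay_prob) / (n - 1)
    log_trans = [[math.log(stay_prob if i == j else switch) for j in range(n)]
                 for i in range(n)]
    eps = 1e-12  # guards against log(0)
    score = [math.log(p + eps) for p in posteriors[0]]
    back = []  # back[t][j] = best predecessor of state j at frame t + 1
    for frame in posteriors[1:]:
        best_prev, new_score = [], []
        for j in range(n):
            i_best = max(range(n), key=lambda i: score[i] + log_trans[i][j])
            best_prev.append(i_best)
            new_score.append(score[i_best] + log_trans[i_best][j]
                             + math.log(frame[j] + eps))
        back.append(best_prev)
        score = new_score
    # Trace back from the best final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for best_prev in reversed(back):
        state = best_prev[state]
        path.append(state)
    path.reverse()
    return path


def overlap_segments(path, frame_shift_ms=10):
    """Convert a state path into (start_ms, end_ms) spans labelled OVERLAP."""
    segments, start = [], None
    for i, state in enumerate(path + [None]):  # sentinel closes a trailing span
        if state == OVERLAP and start is None:
            start = i
        elif state != OVERLAP and start is not None:
            segments.append((start * frame_shift_ms, i * frame_shift_ms))
            start = None
    return segments


# Toy input: frame 5 is noisy, but the transition penalty keeps it in OVERLAP.
toy = ([[0.05, 0.90, 0.05]] * 3 + [[0.05, 0.05, 0.90]] * 2
       + [[0.05, 0.55, 0.40]] + [[0.05, 0.05, 0.90]] * 2
       + [[0.05, 0.90, 0.05]] * 2)
print(overlap_segments(viterbi(toy)))  # [(30, 80)]
```

The transition penalty is what lets the decoder ignore an isolated noisy frame instead of splitting one overlap region into two; in practice these probabilities would need to be tuned or estimated from data.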

[0032] Referring to Figure 1, the method mainly includes the following steps:

[0033] Step 1): train the deep neural network model (DNN model) used for overlap detection.

[0034] This step can include:

[0035] Step 1-1): collect a certain amount of speech data as training data, and set the corresponding frame-level state target values;

[0036] In the overlap detection method, the frame-level state target values set for each speech frame are: overlapping speech, single-speaker speech, and non-speech. These three target values reflect the three possible states of a speech frame.
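One way to picture the frame-level targets is to count how many annotated speaker segments cover each frame. The sketch below does that under assumptions that are not in the source: segment times are given in integer milliseconds, the frame shift is 10 ms, and the label ids 0, 1 and 2 as well as the function name `frame_targets` are hypothetical.

```python
# Hypothetical label ids; the source names the three states but not their encoding.
NON_SPEECH, SINGLE, OVERLAP = 0, 1, 2


def frame_targets(segments, num_frames, frame_shift_ms=10):
    """Assign one state label per frame from speaker segments.

    segments: (start_ms, end_ms) pairs, one per annotated speaker turn.
    A frame is labelled by how many segments cover its start time:
    none -> NON_SPEECH, one -> SINGLE, two or more -> OVERLAP.
    """
    targets = []
    for i in range(num_frames):
        t = i * frame_shift_ms
        active = sum(1 for start, end in segments if start <= t < end)
        if active == 0:
            targets.append(NON_SPEECH)
        elif active == 1:
            targets.append(SINGLE)
        else:
            targets.append(OVERLAP)
    return targets


# Two speaker turns that overlap between 30 ms and 50 ms.
print(frame_targets([(0, 50), (30, 80)], num_frames=10))
# [1, 1, 1, 2, 2, 1, 1, 1, 0, 0]
```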

[...

Abstract

The invention relates to an automatic speech overlap detection method based on a deep neural network. The method comprises training a deep neural network model for overlap detection, wherein the input layer of the model takes feature information of the speech and the output layer gives probability values for three states: overlapping speech, single-speaker speech, and non-speech. The deep neural network model is then used to perform overlap detection on the speech.
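The input/output shape of such a model can be sketched as a small feed-forward network whose last layer is normalised into three probabilities. This is only an illustration: the sigmoid hidden activation, the softmax output, the layer sizes, the state order and the toy weights are all assumptions, since the text above does not specify them.

```python
import math

# Assumed output order for this sketch.
STATES = ("overlapping speech", "single-speaker speech", "non-speech")


def dense(x, weights, bias):
    """Affine layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(features, layers):
    """Map one frame's feature vector to three state probabilities.

    layers: list of (weights, bias) pairs; the last pair must have 3 rows.
    """
    h = features
    for weights, bias in layers[:-1]:
        h = sigmoid(dense(h, weights, bias))
    weights, bias = layers[-1]
    return softmax(dense(h, weights, bias))


# Toy network: 2 input features, 2 hidden units, 3 output states.
toy_layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], [0.0, 0.0, 0.0]),
]
probs = forward([1.0, 2.0], toy_layers)
print(STATES[probs.index(max(probs))])
```

A trained model would use real acoustic features and learned weights; the point here is only that each frame yields a probability for each of the three states.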

Description

Technical field

[0001] The invention relates to a speech detection method, in particular to an automatic speech overlap detection method based on a deep neural network.

Background

[0002] Automatic speech overlap detection automatically finds the positions in a recording where multiple people speak at the same time and marks them. Overlapping speech degrades the performance of speech signal processing. In speaker classification, overlap is one of the main causes of classification errors: traditional speaker classification can only decide whether a segmented speech segment belongs to a certain speaker, so when an overlapping segment appears, assigning it to any single speaker is incorrect. In speech recognition, the speech to be recognized in an overlapping region is interfered with by the other speakers' voices, and the corresponding recognition performan...

Application Information

IPC(8): G10L25/78, G10L25/30
CPC: G10L25/30, G10L25/78
Inventor: 颜永红, 陈梦喆, 潘接林, 刘建
Owner INST OF ACOUSTICS CHINESE ACAD OF SCI