An automatic overlapped-speech detection method based on a deep neural network

Deep neural network and automatic speech technology, applied in speech analysis, instruments, etc. It addresses the lack of deep neural network applications in overlapped-speech detection and aims for more accurate detection results.

Active Publication Date: 2017-05-03

INST OF ACOUSTICS CHINESE ACAD OF SCI +1

Cites: 8; cited by: 13

Problems solved by technology

[0005] The purpose of the present invention is to overcome the current lack of deep neural network applications in the field of auto...

Embodiment Construction

[0030] The present invention is now described further with reference to the accompanying drawings.

[0031] The automatic overlapped-speech detection method of the present invention introduces a DNN model and combines it with the Viterbi algorithm to determine whether a segmented speech segment contains overlapped speech and, if so, the time points at which the overlap occurs.

[0032] Referring to Figure 1, the method mainly includes the following steps:

[0033] Step 1): train the deep neural network model (DNN model) used for overlapped-speech detection.

[0034] This step can include:

[0035] Step 1-1): collect a certain amount of speech data as training data, and assign each frame a corresponding frame-level state target value;

[0036] In the overlapped-speech detection method, the frame-level state target values assigned to a speech frame are: overlapped speech, single-speaker speech, and non-speech. These three target values reflect the three possible states of a speech frame.
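
One way to derive the three frame-level target values from annotated training data is sketched below. It assumes a hypothetical annotation format (one `(start_sec, end_sec)` interval per speaker turn, with turns of the same speaker not overlapping each other) and a 10 ms frame shift; neither is stated in the source, and the names `FrameState` and `frame_targets` are illustrative only.

```python
from enum import IntEnum


class FrameState(IntEnum):
    NON_SPEECH = 0
    SINGLE_SPEAKER = 1
    OVERLAP = 2


def frame_targets(segments, num_frames, frame_shift=0.01):
    """Label each frame by how many speaker turns cover it.

    segments: list of (start_sec, end_sec) speaker turns (assumed format).
    A frame covered by no turn is non-speech, by one turn single-speaker,
    and by two or more turns overlapped speech.
    """
    counts = [0] * num_frames
    for start, end in segments:
        first = max(0, int(round(start / frame_shift)))
        last = min(num_frames, int(round(end / frame_shift)))
        for t in range(first, last):
            counts[t] += 1
    return [FrameState(min(c, 2)) for c in counts]
```

For example, two turns covering 0.02 to 0.06 s and 0.04 to 0.08 s produce overlap targets only for the frames between 0.04 and 0.06 s.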

[...

Abstract

The invention relates to an automatic overlapped-speech detection method based on a deep neural network. The method comprises a step of training a deep neural network model for overlapped-speech detection, wherein the input layer of the model receives feature information of the speech and the output layer gives probability values for three states: overlapped speech, single-speaker speech, and non-speech. The deep neural network model is then used to perform automatic overlapped-speech detection.
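
The input and output layers described in the abstract can be illustrated with a small feed-forward pass. This is only a shape sketch under stated assumptions: the feature dimension (13), the hidden layer sizes, the sigmoid activation, and the random untrained weights are all hypothetical, since the text here specifies only that the input is speech feature information and the output is a probability for each of the three states.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax, so the outputs sum to 1."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, bias):
    """Affine layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    """Random, untrained weights for illustration only."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(features, layers):
    """Sigmoid hidden layers followed by a 3-way softmax output layer."""
    h = features
    for weights, bias in layers[:-1]:
        h = [1.0 / (1.0 + math.exp(-v)) for v in dense(h, weights, bias)]
    weights, bias = layers[-1]
    return softmax(dense(h, weights, bias))


rng = random.Random(0)
sizes = [13, 16, 16, 3]  # assumed: 13 input features, 3 output states
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
posteriors = forward([0.0] * 13, layers)  # one probability per state
```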

Description

Technical field

[0001] The invention relates to a speech detection method, in particular to an automatic overlapped-speech detection method based on a deep neural network.

Background

[0002] Automatic overlapped-speech detection automatically finds the positions in a recording where several people speak at the same time, and marks these positions. Overlapped speech degrades the performance of speech signal processing. In the field of speaker classification, it is one of the main causes of classification errors: traditional speaker classification can only decide whether a segmented speech segment belongs to a particular speaker, so when an overlapping segment appears, assigning it to any single speaker is clearly incorrect. In the field of speech recognition, the other voices in an overlapping segment interfere with the voice to be recognized, and the corresponding recognition performan...

Application Information

IPC(8): G10L25/78, G10L25/30

CPC: G10L25/30, G10L25/78

Inventors: 颜永红, 陈梦喆, 潘接林, 刘建

Owner: INST OF ACOUSTICS CHINESE ACAD OF SCI
