Voice activity detection method and system

A voice activity detection and speech segmentation technology, applied in voice analysis, instruments, and related fields, that addresses the performance degradation caused by noise affecting the detection result and achieves good detection performance and good adaptivity.

Active Publication Date: 2016-07-27

SPREADTRUM COMM (SHANGHAI) CO LTD

8 Cites, 14 Cited by


AI Technical Summary

Problems solved by technology

In the background environment with low signal-to-noise ratio or unstable noise, the detection result will be affected by the noise feature quantity, resulting in performance degradation


Embodiment Construction

[0034] In order to make the above objects, features and advantages of the present invention more comprehensible, specific embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0035] Refer to Figure 1, which illustrates the voice activity detection method 100 according to an embodiment of the present invention. The method includes the following steps.

[0036] S101. Calculate the spectral density of the current frame of the audio signal.

[0037] In method 100, voice activity detection is performed frame by frame. Specifically, the audio signal is divided into multiple frames, and each frame is then examined separately to determine the speech segments and non-speech segments of the audio signal. The length of each frame may be set between 10 ms and 30 ms. Therefore, the current frame of the audio signal is the current frame for which voice activity detection ne...
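Step S101 and the framing described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the non-overlapping framing, and the use of a plain periodogram as the spectral density estimate are all assumptions, since the text does not specify the estimator.

```python
import cmath
import math


def split_into_frames(samples, sample_rate, frame_ms=20):
    """Split a sample sequence into consecutive, non-overlapping frames.

    frame_ms is restricted to the 10-30 ms range mentioned in the text.
    """
    if not 10 <= frame_ms <= 30:
        raise ValueError("frame_ms should lie between 10 and 30 ms")
    frame_len = int(sample_rate * frame_ms / 1000)
    # Drop the trailing partial frame so every frame has the same length.
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def power_spectral_density(frame):
    """Periodogram estimate |X[k]|^2 / N for bins 0..N/2 (assumed estimator).

    Uses a direct O(N^2) DFT to stay dependency-free; an FFT would be used
    in practice.
    """
    n = len(frame)
    psd = []
    for k in range(n // 2 + 1):
        x_k = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        psd.append(abs(x_k) ** 2 / n)
    return psd


# Usage: one second of a 500 Hz tone sampled at 8 kHz, cut into 20 ms frames.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 500 * t / sample_rate) for t in range(sample_rate)]
frames = split_into_frames(tone, sample_rate, frame_ms=20)
psd = power_spectral_density(frames[0])
peak_bin = max(range(len(psd)), key=lambda k: psd[k])
```

With 160-sample frames at 8 kHz, each bin spans 50 Hz, so the 500 Hz tone concentrates its power in bin 10.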


Abstract

The invention provides a voice activity detection method and system. The method comprises the following steps: calculating the spectral density of the current frame of an audio signal; estimating the expected value of the spectral density of the noise; calculating the signal-to-noise ratio of the current frame from the spectral density of the current frame and the spectral density of the noise; and generating a voice activity detection result from the signal-to-noise ratio of the current frame and a preset threshold. The detection result is thus tied to the statistical distribution of the noise, which reduces the influence of noise on the result. In addition, the preset threshold is a dynamic threshold that follows changes in the noise, so the detection result adapts to the noise environment of the current frame.
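The four steps in the abstract can be sketched as a small frame-wise detector. This is only an illustration under stated assumptions: the class name `SimpleVad`, the exponential averaging used for the noise estimate, the whole-frame power ratio used as the signal-to-noise ratio, and the threshold rule (a base value raised by the observed fluctuation of the noise level) are all hypothetical choices, because the abstract does not give the patent's formulas.

```python
import math


class SimpleVad:
    """Illustrative frame-wise VAD over per-frame power spectra (lists of floats)."""

    def __init__(self, base_threshold_db=6.0, k=2.0, alpha=0.9, init_frames=5):
        self.base_threshold_db = base_threshold_db  # hypothetical base threshold
        self.k = k                      # weight of noise fluctuation in the threshold
        self.alpha = alpha              # smoothing factor for the running averages
        self.init_frames = init_frames  # leading frames assumed to contain only noise
        self.noise_psd = None           # running estimate of the expected noise PSD
        self.level_var = 0.0            # smoothed squared deviation of noise level (dB^2)
        self.seen = 0                   # number of frames processed so far

    def process(self, psd):
        """Return True if the frame is classified as speech."""
        if self.noise_psd is None:
            # Bootstrap the noise estimate from the first frame.
            self.noise_psd = list(psd)
            self.seen = 1
            return False
        eps = 1e-12
        frame_power = max(sum(psd), eps)
        noise_power = max(sum(self.noise_psd), eps)
        # Ratio of frame power to estimated noise power, in dB (assumed SNR measure).
        snr_db = 10.0 * math.log10(frame_power / noise_power)
        # Dynamic threshold: grows when the noise level has been fluctuating.
        threshold_db = self.base_threshold_db + self.k * math.sqrt(self.level_var)
        is_speech = self.seen >= self.init_frames and snr_db > threshold_db
        if not is_speech:
            # Update the noise statistics only on frames judged to be noise.
            a = self.alpha
            self.noise_psd = [a * n + (1 - a) * p for n, p in zip(self.noise_psd, psd)]
            self.level_var = a * self.level_var + (1 - a) * snr_db ** 2
        self.seen += 1
        return is_speech


# Usage: ten flat noise frames, three louder frames, then two noise frames.
vad = SimpleVad()
spectra = [[1.0] * 4] * 10 + [[100.0] * 4] * 3 + [[1.0] * 4] * 2
decisions = [vad.process(p) for p in spectra]
```

Freezing the noise estimate during frames classified as speech keeps loud speech from inflating the noise floor; the trade-off is that a sudden, lasting rise in noise can be misread as speech until the estimate catches up.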

Description

Technical field

[0001] The invention relates to voice recognition technology, and in particular to a voice activity detection method and system.

Background technique

[0002] Voice activity detection (VAD), also known as voice detection, is used in voice processing to detect the presence or absence of voice, thereby separating speech segments from non-speech segments in a signal. VAD can be used for echo cancellation, noise suppression, speaker recognition, speech recognition, and so on.

[0003] Traditional VAD algorithms often base their decision on features of the audio signal such as short-term energy, spectral energy, and zero-crossing rate. They therefore perform well in clean-speech and high-SNR environments. However, in a background environment with a low signal-to-noise ratio or unstable noise, the detection result is affected by the noise feature quantities, and performance degrades. [00...
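For reference, two of the traditional features named in the background section, short-term energy and zero-crossing rate, can be computed per frame as in the sketch below. The function names and the exact normalizations are assumptions chosen for illustration; published definitions vary.

```python
def short_term_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Usage: an alternating signal crosses zero at every step; a constant one never does.
alternating = [1.0, -1.0, 1.0, -1.0]
constant = [0.5, 0.5, 0.5, 0.5]
features = [
    (short_term_energy(alternating), zero_crossing_rate(alternating)),
    (short_term_energy(constant), zero_crossing_rate(constant)),
]
```

Both features depend directly on the raw waveform, so additive noise shifts them along with the speech, which is the weakness the background section describes for low-SNR conditions.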


Application Information


IPC(8): G10L21/0208, G10L25/84

Inventor: 孙廷玮, 林福辉

Owner: SPREADTRUM COMM (SHANGHAI) CO LTD
