Multi-channel voice activity detection method based on support vector machine framework

A multi-channel voice activity detection technology, applied in the field of Internet information. It addresses the inability of existing methods to meet real-time requirements, achieves good discrimination between speech and non-speech, and removes the dependence on a threshold.

Inactive Publication Date: 2017-12-01
NANJING UNIV OF POSTS & TELECOMM
Cites: 4 | Cited by: 11

AI Technical Summary

Problems solved by technology

The first type of detection method uses two test hypotheses, H1 and H0, which represent the presence and the absence of speech respectively. An appropriate statistical model is used to calculate the likelihood ratio, and a decision is made by comparing this value with the corresponding threshold. Methods based on a statistical model can model the acoustics well, but the noise model needs to be trained, so they cannot meet real-time requirements.
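The likelihood-ratio decision described above can be sketched as follows. This is only an illustration under stated assumptions, not the statistical model of any particular prior method: samples are assumed to be independent zero-mean Gaussians, and the names `noise_var`, `speech_var`, and `threshold` are hypothetical.

```python
import math


def log_likelihood_ratio(frame, noise_var, speech_var):
    """Return log p(frame | H1) - log p(frame | H0).

    Assumption: samples are i.i.d. zero-mean Gaussian.
    H0 (no speech): variance = noise_var.
    H1 (speech present): variance = noise_var + speech_var.
    """
    var0 = noise_var
    var1 = noise_var + speech_var
    energy = sum(x * x for x in frame)
    n = len(frame)
    return 0.5 * n * math.log(var0 / var1) + 0.5 * energy * (1.0 / var0 - 1.0 / var1)


def is_speech(frame, noise_var, speech_var, threshold=0.0):
    """Decide H1 when the log likelihood ratio exceeds the threshold."""
    return log_likelihood_ratio(frame, noise_var, speech_var) > threshold


# Hypothetical frames: one quiet, one loud.
quiet_frame = [0.01] * 100
loud_frame = [1.0, -1.0] * 50
quiet_is_speech = is_speech(quiet_frame, noise_var=0.01, speech_var=1.0)
loud_is_speech = is_speech(loud_frame, noise_var=0.01, speech_var=1.0)
```

Both `noise_var` and `threshold` must be known before the decision can be made, which is the dependence on a trained noise model and a tuned threshold that the text identifies as the drawback.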

Examples

Embodiment Construction

[0034] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0035] As shown in Figure 1, this embodiment is a multi-channel voice activity detection algorithm based on a support vector machine (SVM) framework. Beamforming across the multiple channels enhances the signal-to-noise ratio of the speech signal. Mel-frequency cepstral coefficients (MFCCs), which are close to the perceptual characteristics of the human ear, are used as features, with the first 12 coefficients forming the feature vector. A support vector machine (SVM) model with good classification ability then performs the classification. The specific steps are described in detail below.
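The front end of this pipeline (beamforming followed by 12 MFCCs per frame) can be sketched as below. The sketch rests on several assumptions that the text does not confirm: a delay-and-sum beamformer with known integer sample delays, a Hamming window, 26 mel filters, and taking DCT outputs 0 to 11 as "the first 12 coefficients". All function names are hypothetical, and the patent's actual beamformer and MFCC parameters may differ.

```python
import cmath
import math


def delay_and_sum(channels, delays):
    """Advance each channel by its integer sample delay, then average."""
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [sum(ch[i + d] for ch, d in zip(channels, delays)) / len(channels)
            for i in range(n)]


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-windowed power spectrum via a naive DFT (fine for short frames)."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(win))
        spec.append(abs(s) ** 2 / n)
    return spec


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to Nyquist."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, n_filters=26, n_coeffs=12):
    """Log mel filterbank energies followed by a DCT-II; keep n_coeffs outputs."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]


# Hypothetical two-microphone input: the second channel hears the same
# 440 Hz tone three samples later.
sample_rate = 8000
clean = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate) for t in range(300)]
ch0 = clean
ch1 = [0.0] * 3 + clean
enhanced = delay_and_sum([ch0, ch1], [0, 3])
features = mfcc(enhanced[:256], sample_rate)  # one 12-dimensional feature vector
```

Each frame's 12-dimensional vector would then be passed to the SVM for the speech / non-speech decision. In practice an FFT routine would replace the naive DFT; it is kept here only so the sketch has no external dependencies.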

[...

Abstract

The invention discloses a multi-channel voice activity detection method based on a support vector machine (SVM) framework. Traditional voice activity detection methods tend to introduce large noise, and their thresholds are difficult to adjust automatically as the environment changes. In the technical scheme of the invention, the spatio-temporal information of the voice signals is fused through a microphone array and combined with Mel-frequency cepstral coefficients (MFCCs), which are close to the perceptual characteristics of the human ear. A support vector machine (SVM) with good classification capability classifies voice and non-voice, and a model for voice and non-voice is established, so that voice activity detection can be carried out accurately. The problems of the traditional voice activity detection algorithms can thus be effectively solved.
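The classification stage can be illustrated with a small linear SVM trained by subgradient descent on the hinge loss. This is a minimal sketch under assumptions: the abstract does not state the kernel, solver, or hyperparameters, so the linear kernel, the learning rate, and the toy 2-D feature vectors below are all hypothetical stand-ins for the 12-dimensional MFCC vectors.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.05, epochs=100):
    """Fit w, b by subgradient descent on hinge loss with L2 regularization.

    labels must be +1 (voice) or -1 (non-voice).
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Shrink the weights (regularization term).
            w = [wi * (1.0 - lr * lam) for wi in w]
            # Push the boundary away from samples inside the margin.
            if margin < 1.0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 for voice, -1 for non-voice."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0.0 else -1


# Toy, hand-made feature vectors (not real MFCCs).
voice = [[2.0, 1.5], [1.8, 2.2], [2.5, 1.9]]
non_voice = [[-1.0, -0.5], [-1.5, -1.2], [-0.8, -1.6]]
w, b = train_linear_svm(voice + non_voice, [1, 1, 1, -1, -1, -1])
voice_label = predict(w, b, [2.1, 1.7])
non_voice_label = predict(w, b, [-1.2, -0.9])
```

Because the decision boundary is learned from labeled voice and non-voice frames, no hand-tuned energy threshold is involved, which matches the motivation given in the abstract.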

Description

Technical field

[0001] The invention relates to the field of Internet information technology, in particular to a multi-channel voice activity detection method based on a support vector machine framework.

Background

[0002] In the exchange of human information, a considerable part is expressed in acoustic form, and voice is the most direct and effective form of information transmission. Speech is the combination of sound and meaning, and it is the fastest way to communicate and exchange information.

[0003] Voice activity detection (VAD) refers to a detection technology that accurately and intelligently determines the start and end points of speech in a signal containing speech. This detection technology is independent of the specific content of the speech, and it has become an indispensable link in the application fields of speech coding, speech enhancement and speech recognition. Usually, voice activity detection with a low accuracy rate is prone to adverse effects...
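The start and end points mentioned in [0003] can be derived from per-frame voice / non-voice decisions. The helper below is a hypothetical sketch that assumes a fixed hop between frames, expressed in integer milliseconds; it is not taken from the patent.

```python
def speech_segments(frame_labels, hop_ms):
    """Convert per-frame decisions (truthy = voice) into (start_ms, end_ms) pairs."""
    segments = []
    start = None
    for i, active in enumerate(frame_labels):
        if active and start is None:
            start = i  # a voice segment begins at this frame
        elif not active and start is not None:
            segments.append((start * hop_ms, i * hop_ms))
            start = None
    if start is not None:
        # The signal ended while voice was still active.
        segments.append((start * hop_ms, len(frame_labels) * hop_ms))
    return segments


segments = speech_segments([0, 1, 1, 0, 0, 1], hop_ms=10)
```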

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L25/24, G10L25/45, G10L21/0216, G10L21/0208, G10L15/14, G10L15/10
CPC: G10L15/10, G10L15/14, G10L21/0208, G10L21/0216, G10L25/24, G10L25/45, G10L2021/02166
Inventor: 万新旺, 廖鹏程, 王吉, 沈利祥
Owner: NANJING UNIV OF POSTS & TELECOMM