Method for self-adaptively adjusting background noise in voice endpoint detection

An adaptive background-noise adjustment technique, applied in fields such as speech analysis and speech recognition, intended to improve voice endpoint detection efficiency

Status: Inactive
Publication Date: 2010-01-13
CHINA DIGITAL VIDEO BEIJING

AI Technical Summary

Problems solved by technology

[0008] The purpose of the present invention is to provide a method for adaptively adjusting background noise in speech endpoint detection, in view of the characteristics of the automatic subtitle genera




Example Embodiment

[0035] The present invention will be described in detail below with reference to the drawings and specific embodiments.

[0036] The background noise adaptive adjustment method in voice endpoint detection provided by the present invention adjusts the voice parameters promptly as the background noise changes, thereby improving voice endpoint detection efficiency under a complex noise background. The overall process of voice endpoint detection is shown in Figure 1. After the audio data is input, the audio file is parsed and the digital sample values are extracted, and the resulting audio sample sequence is band-pass filtered with a passband of 400 Hz to 3500 Hz. The main purpose is to filter out noise or music outside the frequency band of human speech, which also greatly reduces the impact of background music on voice endpoint detection. Subsequently, windowing is applied to the audio sample sequence, which is divided into frames of 10 ms length, a...
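The preprocessing described above (band-pass filtering to 400 Hz to 3500 Hz, then splitting into 10 ms frames) might be sketched as follows. This is only an illustration under stated assumptions: the source does not say which filter design or window shape is used, so the sketch substitutes a simple FFT brick-wall filter and non-overlapping rectangular frames. The function names `bandpass_fft` and `split_frames` and the 16 kHz sample rate are hypothetical.

```python
import numpy as np

def bandpass_fft(samples, sample_rate, low_hz=400.0, high_hz=3500.0):
    """Brick-wall band-pass: zero every FFT bin outside [low_hz, high_hz].

    Assumption: the patent text does not name a filter design; this FFT
    mask is a stand-in chosen for brevity.
    """
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(samples))

def split_frames(samples, sample_rate, frame_ms=10):
    """Cut the signal into non-overlapping frames of frame_ms milliseconds.

    Assumption: no overlap and no tapering window; trailing samples that
    do not fill a whole frame are dropped.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(samples) // frame_len
    return samples[: n_frames * frame_len].reshape(n_frames, frame_len)

# Usage: one second of a 100 Hz hum plus a 1000 Hz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 1000 * t)
filtered = bandpass_fft(signal, sample_rate)   # the 100 Hz hum is removed
frames = split_frames(filtered, sample_rate)   # 100 frames of 160 samples
```

A production system would more likely use a causal IIR or FIR filter so that audio can be processed as a stream; the FFT mask here needs the whole signal at once.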


Abstract

The invention relates to voice detection technology in automatic caption generation systems, and in particular to a method for adaptively adjusting background noise in voice endpoint detection. As the background noise changes in real time, the method recalculates the short-time energy Eb, the short-time zero-crossing rate Zb, and the short-time information entropy Hb of the background noise, then re-determines the short-time energy frequency value of each frame to obtain a new sequence of short-time energy frequency values. This allows endpoint detection to be carried out on continuous speech in a complex background noise environment and improves voice endpoint detection efficiency under such conditions.
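The re-estimation step in the abstract can be illustrated with a rough sketch. The abstract names the three background quantities (Eb, Zb, Hb) but this excerpt does not give their formulas or the rule that combines them into the short-time energy frequency value. Everything below is therefore an assumption: energy as a sum of squares, zero-crossing rate as the fraction of sign changes, entropy computed over the normalized power spectrum, and a hypothetical product of background-subtracted features as the combined value. All function names are illustrative.

```python
import numpy as np

def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return float(np.sum(np.asarray(frame, dtype=float) ** 2))

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.sign(frame)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return float(np.mean(signs[1:] != signs[:-1]))

def spectral_entropy(frame):
    """Shannon entropy of the normalized power spectrum (an assumed
    definition of 'short-time information entropy')."""
    power = np.abs(np.fft.rfft(frame)) ** 2
    total = power.sum()
    if total == 0:
        return 0.0
    p = power / total
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def frame_features(frames):
    """Return an (n_frames, 3) array of [energy, ZCR, entropy] per frame."""
    return np.array([[short_time_energy(f), zero_crossing_rate(f),
                      spectral_entropy(f)] for f in frames])

def background_estimate(features, noise_frames):
    """Eb, Zb, Hb as the mean features over frames assumed to be noise."""
    return features[noise_frames].mean(axis=0)

def energy_frequency_values(features, background):
    """HYPOTHETICAL combination: product of background-subtracted features.
    The actual formula is not stated in this excerpt."""
    e, z, h = (features - background).T
    return e * z * h

# Usage: quiet noise, a pure tone, then louder noise (background changed).
rng = np.random.default_rng(0)
quiet = 0.01 * rng.standard_normal((5, 160))
loud = 0.1 * rng.standard_normal((5, 160))
tone = np.sin(2 * np.pi * 1000 * np.arange(160) / 16000)[None, :]
features = frame_features(np.vstack([quiet, tone, loud]))
bg_before = background_estimate(features, slice(0, 5))   # old Eb, Zb, Hb
bg_after = background_estimate(features, slice(6, 11))   # recalculated
values = energy_frequency_values(features, bg_after)     # new sequence
```

The point of the sketch is the control flow: when the noise floor shifts, the background triple is recomputed from recent noise-only frames and every frame's combined value is re-derived against it, rather than keeping thresholds fixed from the start of the recording.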

Description

Technical Field

[0001] The invention relates to speech detection technology in automatic subtitle generation systems, and in particular to a method for adaptively adjusting background noise in speech endpoint detection.

Background

[0002] Speech endpoint detection is a new field of speech technology research that is applied in automatic subtitle generation systems. The current subtitle production method first requires a subtitle manuscript: a text file written before the TV program is made, which records the program title, what the host intends to say, what the interviewees say, and other content. When making TV programs, editors add audio and video materials to the storyboard of non-linear editing software and then edit them according to the gist of the program. Editing operations generally include modifying the position of the material, adding some special effects, adding subtitles,...


Application Information

IPC(8): G10L11/00, G10L11/02, G10L15/04
Inventor: 李祺, 马华东, 郑侃彦, 韩忠涛, 张婷
Owner CHINA DIGITAL VIDEO BEIJING