Self-adaptive voice endpoint detection method

An adaptive endpoint detection technology, applied in speech analysis, speech recognition, instrumentation, etc.

Inactive Publication Date: 2010-01-13
CHINA DIGITAL VIDEO BEIJING

AI Technical Summary

Problems solved by technology

[0016] The object of the present invention is to provide an adaptive speech endpoint detection method that addresses the characteristics of automatic subtitle generation systems and the defects of existing speech endpoint detection methods. The method can detect speech endpoints in continuous speech even when the background noise changes frequently, thereby improving the efficiency of speech endpoint detection against a complex noise background.

Embodiment Construction

[0095] The present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0096] The adaptive speech endpoint detection method provided by the present invention is applied in an automatic subtitle generation system. The system accepts as input one WAV audio file in PCM encoding (sampling frequency 48 kHz, 16 bits per sample, 2 channels (stereo)) together with the corresponding subtitle manuscript. The output is a subtitle file in SRT format whose content is each sentence of the subtitle manuscript with its start time point and end time point. The overall system structure is shown in Figure 2.
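The output side of this pipeline can be illustrated with a minimal sketch of how detected start and end times might be written as SRT entries. The function names (`srt_timestamp`, `srt_entry`, `build_srt`) and the `(start, end, text)` tuple layout are assumptions made for illustration and do not come from the patent; only the general SRT layout (a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, then the text) is relied on.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_entry(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT entry: index line, time range line, text line."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


def build_srt(segments) -> str:
    """Join (start, end, text) tuples into SRT content.

    `segments` is a hypothetical list pairing each manuscript sentence
    with the start and end times found by endpoint detection.
    Entries are numbered from 1 and separated by a blank line.
    """
    return "\n".join(
        srt_entry(i, start, end, text)
        for i, (start, end, text) in enumerate(segments, start=1)
    )


if __name__ == "__main__":
    print(build_srt([(0.0, 1.25, "Hello"), (1.5, 3.0, "World")]))
```

In this sketch the times are kept in seconds as floats and only converted to milliseconds when formatting, which keeps the detection code independent of the subtitle format.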

[0097] The voice endpoint detection process provided by the present invention is shown in Figure 1. The speech parameters can be adjusted promptly as the background noise changes, so as to...


Abstract

The invention relates to voice detection technology in an automatic caption generating system, in particular to a self-adaptive voice endpoint detection method. The method comprises the following steps: dividing an audio sampling sequence into frames of fixed length to form a frame sequence; extracting three audio characteristic parameters from the data of each frame, namely short-time energy, short-time zero-crossing rate and short-time information entropy; calculating a short-time energy frequency value for each frame from these audio characteristic parameters to form a short-time energy frequency value sequence; analyzing the short-time energy frequency value sequence, beginning with the first frame, to find a pair of voice starting and ending points; analyzing the background noise and, if it has changed, recalculating the audio characteristic parameters of the background noise and updating the short-time energy frequency value sequence; and repeating these steps until detection is finished. The method can perform voice endpoint detection on continuous voice when the background noise changes frequently, improving voice endpoint detection efficiency against a complex noise background.
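The framing and feature extraction steps named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the helper names, the sum-of-squares energy, the sign-change definition of the zero-crossing rate, and the amplitude-histogram form of the entropy (with samples assumed normalized to [-1, 1]) are all choices made here. The formula that combines the three features into the short-time energy frequency value is not given in this excerpt, so the sketch stops at the three features.

```python
import math


def split_frames(samples, frame_len):
    """Split samples into consecutive fixed-length frames.

    A trailing partial frame is dropped (an assumption of this sketch).
    """
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def short_time_energy(frame):
    """Sum of squared sample values in the frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (0 counts as non-negative)."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_entropy(frame, bins=16, lo=-1.0, hi=1.0):
    """Shannon entropy in bits of the frame's amplitude histogram.

    Assumed definition: samples are binned over [lo, hi] and the entropy
    of the resulting distribution is returned.
    """
    counts = [0] * bins
    for x in frame:
        idx = int((x - lo) / (hi - lo) * bins)
        counts[max(0, min(idx, bins - 1))] += 1
    total = len(frame)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)


def frame_features(frame):
    """Return (energy, zero-crossing rate, entropy) for one frame."""
    return (short_time_energy(frame), zero_crossing_rate(frame), short_time_entropy(frame))


if __name__ == "__main__":
    samples = [1, -1, 1, -1, 0, 0, 0, 0, 0.5]
    for frame in split_frames(samples, 4):
        print(frame_features(frame))
```

With 48 kHz input, a frame length such as 480 samples (10 ms) would be one plausible setting; the excerpt does not state the frame length the method uses.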

Description

Technical field

[0001] The invention relates to speech detection technology in an automatic subtitle generation system, in particular to an adaptive speech endpoint detection method.

Background technique

[0002] Speech endpoint detection technology is a new field of speech technology research, and it is applied in automatic subtitle generation systems. The current subtitle production method first requires a subtitle manuscript. The subtitle manuscript is a text file written before the TV program is made; it records the title of the program, what the host will say, what the interviewee said, and other content. When making TV programs, editors add audio and video materials to the storyboard of non-linear editing software and then edit them according to the gist of the program. Editing operations generally include modifying the position of the material, adding special effects, adding subtitles, and so on. When adding subtitles...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L11/00, G10L11/02, G10L15/04, G10L25/03, G10L25/78
Inventor: 李祺, 马华东, 郑侃彦, 韩忠涛, 张婷
Owner CHINA DIGITAL VIDEO BEIJING