Method for extracting short-time energy frequency value in voice endpoint detection

An endpoint detection and feature extraction method, applied in speech analysis, speech recognition, instruments, and similar fields, that addresses the poor performance, weak discrimination, and outright failure of existing methods.

Status: Inactive. Publication date: 2010-01-13

CHINA DIGITAL VIDEO BEIJING

Cites: 0. Cited by: 16.

Problems solved by technology

This method can accurately distinguish speech from noise such as car engines and door-closing sounds, but it is less effective at distinguishing speech from music.

[0008] No matter which audio parameters are used, traditional speech endpoint detection methods have serious shortcomings in specific noise environments.

For example, energy-based methods do not perform well in low-SNR environments, and information-entropy-based algorithms fail against music backgrounds.

Embodiment Construction

[0049] The present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0050] (1) Extraction of three audio characteristic parameters of short-term energy, short-term zero-crossing rate and short-term information entropy

[0051] 1. Short-term energy

[0052] Energy is one of the most frequently used audio feature parameters and the most intuitive representation of a speech signal. Energy analysis of speech rests on the fact that the signal's amplitude varies considerably over time. Energy can be used to distinguish unvoiced from voiced segments: larger energy values correspond to voiced segments, and smaller energy values correspond to unvoiced segments. For signals with a high signal-to-noise ratio, energy can be used to judge whether speech is present. The energy of noise without a speech signal is small, but the energy will increase significantly when th...
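The paragraph above is truncated before any formula appears, so the following is only a minimal sketch of how short-term energy is commonly computed: the sum of squared sample values within each fixed-length frame. The function names `frame_signal` and `short_time_energy`, the frame length, and the sample values are illustrative assumptions, not taken from the patent.

```python
def frame_signal(samples, frame_len):
    """Split samples into consecutive, non-overlapping frames of frame_len.

    Assumption: a trailing partial frame is simply dropped.
    """
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def short_time_energy(frame):
    """Sum of squared amplitudes in one frame (a common definition, assumed here)."""
    return sum(x * x for x in frame)


# Hypothetical toy signal: silence, a loud segment, then low-level noise.
samples = [0.0] * 4 + [0.5, -0.5, 0.5, -0.5] + [0.01, -0.01, 0.01, -0.01]
energies = [short_time_energy(f) for f in frame_signal(samples, 4)]
```

In this toy signal the loud middle frame has far more energy than the low-level noise frame, which is the separation that energy-based detection relies on when the signal-to-noise ratio is high.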

Abstract

The invention relates to voice detection technology in an automatic caption generating system, and in particular to a method for extracting a short-time energy frequency value in voice endpoint detection. The method comprises the following steps: dividing an audio sampling sequence into frames of fixed length to form a frame sequence; extracting three audio characteristic parameters from the data of each frame, namely short-time energy, short-time zero-crossing rate, and short-time information entropy; and calculating the short-time energy frequency value of each frame from these parameters to form a short-time energy frequency value sequence. By combining time-domain and frequency-domain characteristic parameters, the method exploits the strengths of each parameter while avoiding, to a certain extent, their individual weaknesses, so that it can effectively handle many different types of background noise.
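The first two steps of the abstract (framing, then extracting three parameters per frame) can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the entropy is assumed to be the Shannon entropy of the normalized power spectrum, the zero-crossing rate is assumed to be a raw count of sign changes, and all function names are hypothetical. The formula that combines the three parameters into the short-time energy frequency value is not given in this excerpt, so that step is deliberately left out.

```python
import cmath
import math


def split_frames(samples, frame_len):
    """Fixed-length, non-overlapping frames; a trailing partial frame is dropped."""
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def zero_crossing_count(frame):
    """Number of sign changes between adjacent samples (zero counts as non-negative)."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def spectral_entropy(frame):
    """Shannon entropy of the normalized power spectrum (assumed definition).

    Uses a naive DFT over bins 0..n/2, which is adequate for short demo frames.
    """
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        power.append(abs(s) ** 2)
    total = sum(power)
    if total == 0:
        return 0.0  # silent frame: no spectral distribution to measure
    probs = [p / total for p in power]
    return -sum(p * math.log(p) for p in probs if p > 0)


def extract_features(samples, frame_len):
    """Per-frame energy, zero-crossing count, and spectral entropy."""
    return [
        {
            "energy": sum(x * x for x in f),
            "zcr": zero_crossing_count(f),
            "entropy": spectral_entropy(f),
        }
        for f in split_frames(samples, frame_len)
    ]


# Hypothetical input: one rapidly alternating frame, then a single-impulse frame.
features = extract_features([1.0, -1.0] * 4 + [1.0] + [0.0] * 7, 8)
```

The alternating frame concentrates its power in one frequency bin, so its entropy is near zero while its zero-crossing count is high; the impulse frame has a flat spectrum, so its entropy is maximal while its zero-crossing count is zero. That the parameters respond so differently to the same input is the motivation the abstract gives for combining them.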

Description

Technical field

[0001] The invention relates to speech detection technology in an automatic subtitle generation system, and in particular to a method for extracting short-time energy-frequency values in speech endpoint detection.

Background

[0002] Speech endpoint detection is a new field of speech technology research that is applied in automatic subtitle generation systems. The current subtitle production method first requires a subtitle manuscript. This manuscript is a text file written in advance, before the TV program is made, which records the title of the program, what the host intends to say, what the interviewee says, and other content. When making TV programs, editors add audio and video materials to the storyboard of non-linear editing software, and then edit them according to the gist of the program. Editing operations generally include modifying the position of the material, adding some special effects, adding ...

Application Information

IPC(8): G10L11/00, G10L11/02, G10L15/04

Inventor: 李祺, 马华东, 郑侃彦, 韩忠涛, 张婷

Owner: CHINA DIGITAL VIDEO BEIJING
