End-point detecting method, apparatus and speech recognition system based on sliding window

Endpoint detection technology for speech, applied in speech recognition, speech analysis, instruments, etc., that addresses problems such as deterioration of the system recognition rate.

Active Publication Date: 2006-04-26

INST OF ACOUSTICS CHINESE ACAD OF SCI +2

0 Cites, 28 Cited by

AI Technical Summary
Problems solved by technology

A good endpoint detection algorithm makes the system robust; conversely, a poor endpoint detection algorithm sharply degrades the system's recognition rate.

Embodiment Construction

[0039] Figure 1 is a schematic diagram of a sliding window in the time domain. The present invention adopts the idea of a sliding window: a certain number of frames is taken as the window size, and the start and end of speech are judged according to whether the energy sum of all speech frames in the window is greater than or less than a certain parameter, which improves robustness. As shown in Figure 1, the horizontal axis represents the time of each frame of the input speech signal, and the vertical axis represents the signal (level) amplitude of each frame of the input speech signal. The white strip frame is the sliding window.
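
The window energy sum described in [0039] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `frame_energies` and `window_energies`, the use of non-overlapping frames, and the sum-of-squares definition of frame energy are all assumptions made for the example.

```python
def frame_energies(samples, frame_len):
    """Split samples into non-overlapping frames (assumption) and return
    the sum-of-squares energy of each complete frame (assumption)."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ]


def window_energies(energies, window_size):
    """Return the energy sum of every window position, where the window
    covers `window_size` frames and slides forward one frame at a time."""
    if window_size <= 0 or window_size > len(energies):
        return []
    total = sum(energies[:window_size])
    sums = [total]
    for i in range(window_size, len(energies)):
        # Add the frame entering the window, drop the frame leaving it.
        total += energies[i] - energies[i - window_size]
        sums.append(total)
    return sums
```

Because the decision is made on the sum over several frames rather than on a single frame, an isolated loud frame contributes only one term to the sum, which is one plausible reading of the robustness claim.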

[0040] Figure 2 is another schematic diagram of a sliding window. In Figure 2, the lower horizontal axis is still the time axis, which represents the time of each frame of the input speech signal. However, the vertical axis represents the frequency spectrum transformed in the frequency domai...

Abstract

The invention discloses an endpoint detection method and device. The method comprises the following steps: applying a window to the input speech signal; selecting a certain number of frames as the window size; determining the starting point of the background noise in the input speech signal; calculating the background noise energy; calculating the speech energy of the current frame and the window energy; comparing whether the total speech energy of the window is greater than the product of the background noise energy and the speech start-point signal-to-noise ratio; if not, sliding the window to the next frame and returning to the calculation of the current-frame speech energy; if so, judging the current frame to be the speech start point. The invention improves detection accuracy and robustness, as well as the overall recognition rate of the speech recognition system in which it is used.
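
The start-point loop in the abstract can be sketched as below. It is a hedged illustration under stated assumptions, not the claimed method: the function name `detect_start` and its parameters are hypothetical; the background noise energy is assumed to be the mean energy of the first `noise_frames` frames scaled to the window length; and the "current frame" is assumed to be the first frame of the window. The abstract does not specify any of these details.

```python
def detect_start(frame_energies, window_size, noise_frames, start_snr):
    """Return the index of the frame judged to be the speech start point,
    or None if no window exceeds the threshold."""
    if window_size <= 0 or noise_frames <= 0:
        raise ValueError("window_size and noise_frames must be positive")
    if len(frame_energies) < noise_frames:
        return None

    # Assumption: background noise energy is the mean energy of the
    # leading frames, scaled so it is comparable to a window energy sum.
    noise_energy = sum(frame_energies[:noise_frames]) / noise_frames * window_size
    threshold = noise_energy * start_snr

    first = noise_frames                       # begin scanning after the noise segment
    last = len(frame_energies) - window_size   # last valid window position
    if first > last:
        return None

    window_energy = sum(frame_energies[first:first + window_size])
    for current in range(first, last + 1):
        # Compare window energy with noise energy times the start-point SNR.
        if window_energy > threshold:
            return current  # assumption: current frame = first frame of window
        if current < last:
            # Slide the window forward by one frame.
            window_energy += frame_energies[current + window_size] - frame_energies[current]
    return None
```

With this convention the reported start point can precede the first loud frame by up to `window_size - 1` frames, since the window triggers as soon as enough energy enters it. Taking the last frame of the window as the current frame would shift the result the other way; the source does not say which is intended.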

Description

Technical field

[0001] The present invention relates to a method of endpoint detection (VAD). More specifically, the present invention relates to a method and device for detecting speech endpoints in a speech recognition system, and to a speech recognition system using the detection method.

Background technique

[0002] In a speech recognition application system, the input signal includes the speech signal of the user speaking, the background noise signal, and so on. The process of extracting the speech signal of the user's utterance from the input signal is called endpoint detection.

[0003] The difficulty in commercializing speech recognition systems lies in improving robustness. The robustness of a speech recognition system is affected by many uncertain factors, such as the speaker and the speech channel used in the environment. A speech recognition system may have a high recognition rate in a normal test, but when used in an actual e...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L15/20, G10L21/0232

Inventor: 余洪涌, 赵庆卫

Owner: INST OF ACOUSTICS CHINESE ACAD OF SCI
