Voice endpoint detection method and device

A voice endpoint detection technology, applied in voice analysis, speech recognition, instruments, etc., that addresses the low detection accuracy of existing endpoint detection and achieves high accuracy.

Active Publication Date: 2019-09-17
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
8 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0004] However, existing endpoint detection technology is inaccurate: its detection accuracy is low.

Embodiment Construction

[0063] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

[0064] Figure 1 is a schematic diagram of the speech recognition principle of the speech recognition system provided by an embodiment of the present invention. The problem that Automatic Speech Recognition (ASR) aims to solve is enabling computers to "understand" human speech and convert it into text. As shown in Figure 1, the recogn...

Abstract

A voice activity detection method and apparatus are provided by embodiments of the present application. The method includes: performing framing processing on a voice to be detected to obtain a plurality of audio frames to be detected; obtaining an acoustic feature of each audio frame to be detected, and sequentially inputting the acoustic features to a VAD model, wherein the VAD model is configured to classify the first N voice frames in the voice to be detected as noise frames, classify the frames from the (N+1)-th voice frame to the last voice frame as voice frames, and classify the M noise frames after the last voice frame as voice frames, where N and M are integers; and determining, according to a classification result output by the VAD model, the start point and end point of the voice segment.
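The classification scheme in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names `shifted_labels` and `endpoints`, the 0/1 label encoding, and the rule that the endpoints are the first and last frames classified as voice are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

NOISE, VOICE = 0, 1  # assumed label encoding


def shifted_labels(raw: List[int], n: int, m: int) -> List[int]:
    """Build the target labels described in the abstract (hypothetical helper).

    raw holds the true per-frame labels. The first n voice frames become
    noise, the frames from the (n+1)-th voice frame to the last voice frame
    stay voice, and the m noise frames after the last voice frame become voice.
    """
    voiced = [i for i, label in enumerate(raw) if label == VOICE]
    out = [NOISE] * len(raw)
    if not voiced:
        return out
    last = voiced[-1]
    if n < len(voiced):
        # From the (n+1)-th voice frame through the last voice frame.
        for i in range(voiced[n], last + 1):
            out[i] = VOICE
    # The m frames that follow the last voice frame (clipped to the audio).
    for i in range(last + 1, min(last + 1 + m, len(raw))):
        out[i] = VOICE
    return out


def endpoints(pred: List[int]) -> Optional[Tuple[int, int]]:
    """Return (start, end) frame indices, assuming the endpoints are the
    first and last frames classified as voice; None if no voice is found."""
    voiced = [i for i, label in enumerate(pred) if label == VOICE]
    if not voiced:
        return None
    return voiced[0], voiced[-1]


raw = [0, 0, 1, 1, 1, 1, 0, 0, 0]
labels = shifted_labels(raw, n=1, m=2)
print(labels)             # [0, 0, 0, 1, 1, 1, 1, 1, 0]
print(endpoints(labels))  # (3, 7)
```

With N = 1 and M = 2, the detected start moves one frame later than the true onset and the detected end moves two frames past the true offset.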

Description

Technical field

[0001] Embodiments of the present invention relate to the technical field of voice recognition, and in particular, to a voice endpoint detection method and device.

Background technique

[0002] With the development of human-computer interaction technology, speech recognition technology has become increasingly important. In a speech recognition system, speech endpoint detection is a very important technology, and is usually also called voice activity detection (VAD). Speech endpoint detection refers to finding the starting point and ending point of the speech part in a continuous sound signal.

[0003] In the prior art, a VAD model can be used to determine the start point and end point of a speech segment in a piece of audio, wherein the VAD model is a classification model. In the specific implementation, the audio is divided into frames, and the acoustic features of each audio frame are extracted and input into the V...
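The framing and feature-extraction step mentioned in [0003] can be sketched in Python as follows. This is only an illustration under stated assumptions: the excerpt does not specify the frame length, the hop size, or the acoustic feature, so `split_into_frames`, `log_energy`, and the use of log energy as the feature are hypothetical stand-ins.

```python
import math
from typing import List, Sequence


def split_into_frames(samples: Sequence[float], frame_len: int, hop: int) -> List[List[float]]:
    """Split audio samples into fixed-length frames (hypothetical helper).

    Frames start every `hop` samples; a trailing partial frame is dropped.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(list(samples[start:start + frame_len]))
    return frames


def log_energy(frame: Sequence[float]) -> float:
    """Log of the mean squared amplitude, used here as a stand-in feature.

    The small constant avoids log(0) on silent frames.
    """
    energy = sum(x * x for x in frame) / len(frame)
    return math.log(energy + 1e-10)


samples = [0.0, 0.0, 0.5, -0.5, 0.5, -0.5, 0.0, 0.0, 0.0, 0.0]
frames = split_into_frames(samples, frame_len=4, hop=2)
features = [log_energy(f) for f in frames]  # one feature value per frame
print(len(frames))
```

Each feature value would then be passed, in frame order, to the classification model.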

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L15/05, G10L15/08, G10L15/26
CPC: G10L15/05, G10L15/08, G10L15/26, G10L25/30, G10L25/84, G10L2025/783, G10L15/02, G10L15/063, G10L15/16, G10L15/22, G10L25/78, G10L25/87
Inventor: 李超, 朱唯鑫
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD