Voice endpoint detection method and device

A voice endpoint detection technology, applied in voice analysis, speech recognition, instruments, and similar fields, that addresses the low accuracy of existing endpoint detection and achieves high detection accuracy.

Active Publication Date: 2019-09-17

BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

8 Cites, 0 Cited by

Problems solved by technology

[0004] However, existing endpoint detection technology is inaccurate: its detection accuracy is low.

Embodiment Construction

[0063] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

[0064] Figure 1 is a schematic diagram of the speech recognition principle of the speech recognition system provided by an embodiment of the present invention. The problem that Automatic Speech Recognition (ASR) addresses is enabling computers to "understand" human speech and convert it into text. As shown in Figure 1, the recogn...

Abstract

Embodiments of the present application provide a voice activity detection (VAD) method and apparatus. The method includes: framing a voice to be detected to obtain a plurality of audio frames to be detected; obtaining an acoustic feature of each audio frame to be detected and sequentially inputting the acoustic features to a VAD model, where the VAD model is configured to classify the first N voice frames in the voice to be detected as noise frames, classify the frames from the (N+1)-th voice frame to the last voice frame as voice frames, and classify the M noise frames after the last voice frame as voice frames, where N and M are integers; and determining, according to a classification result output by the VAD model, a start point and an end point of the voice segment.
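
The labeling scheme in the abstract can be illustrated with a small Python sketch. This is not the patent's implementation: the function names `shifted_targets` and `endpoints` are hypothetical, the input is assumed to contain a single contiguous speech segment, and the rule for reading endpoints from the classifier output (first and last frame classified as voice) is an assumption.

```python
from typing import List, Optional, Tuple

NOISE, VOICE = 0, 1


def shifted_targets(truth: List[int], n: int, m: int) -> List[int]:
    """Build the frame labels the VAD model is meant to output.

    truth holds the true per-frame labels (NOISE or VOICE) and is assumed
    to contain one contiguous speech segment. The first n voice frames are
    relabeled NOISE and the m noise frames after the last voice frame are
    relabeled VOICE, as the abstract describes.
    """
    voiced = [i for i, label in enumerate(truth) if label == VOICE]
    targets = [NOISE] * len(truth)
    if not voiced:
        return targets
    if n >= len(voiced):
        raise ValueError("n must be smaller than the number of voice frames")
    first, last = voiced[0], voiced[-1]
    stop = min(last + m, len(truth) - 1)  # do not run past the end of the audio
    for i in range(first + n, stop + 1):
        targets[i] = VOICE
    return targets


def endpoints(predicted: List[int]) -> Optional[Tuple[int, int]]:
    """Assumed rule: start is the first frame classified VOICE, end is the last."""
    voiced = [i for i, label in enumerate(predicted) if label == VOICE]
    if not voiced:
        return None
    return voiced[0], voiced[-1]


truth = [0, 0, 1, 1, 1, 1, 0, 0, 0]
labels = shifted_targets(truth, n=1, m=2)
print(labels)             # [0, 0, 0, 1, 1, 1, 1, 1, 0]
print(endpoints(labels))  # (3, 7)
```

With N = 1 and M = 2, the detected start lands one frame after the true speech onset and the detected end lands two frames after the true speech offset.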

Description

Technical field

[0001] Embodiments of the present invention relate to the technical field of voice recognition, and in particular, to a voice endpoint detection method and device.

Background technique

[0002] With the development of human-computer interaction technology, speech recognition technology shows its importance. In a speech recognition system, a speech endpoint detection technology is a very important technology, and is usually also called a voice activity detection technology (voice activity detection, VAD). Speech endpoint detection refers to finding the starting point and ending point of the speech part in the continuous sound signal.

[0003] In the prior art, a VAD model can be used to determine the start point and end point of a speech segment in a piece of audio, wherein the VAD model is a classification model. In the specific implementation, the audio is divided into frames, and the acoustic features of each audio frame are extracted and input into the V...
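
The framing and feature-extraction step mentioned in [0003] might look like the following Python sketch. The source does not specify frame length, frame shift, or which acoustic feature is used, so the values below (4-sample frames with a 2-sample hop) and the log-energy feature are illustrative assumptions only, and the names `split_frames` and `log_energy` are hypothetical.

```python
import math
from typing import List, Sequence


def split_frames(samples: Sequence[float], frame_len: int, hop: int) -> List[List[float]]:
    """Cut samples into frames of frame_len samples, advancing hop samples each time.

    A trailing partial frame is dropped.
    """
    last_start = len(samples) - frame_len
    return [list(samples[s:s + frame_len]) for s in range(0, last_start + 1, hop)]


def log_energy(frame: Sequence[float], floor: float = 1e-10) -> float:
    """Log of the mean squared amplitude; floor avoids log(0) on silent frames."""
    return math.log(sum(x * x for x in frame) / len(frame) + floor)


# Toy signal: silence, then a louder stretch, then silence again.
samples = [0.0] * 4 + [1.0] * 4 + [0.0] * 2
frames = split_frames(samples, frame_len=4, hop=2)
features = [log_energy(f) for f in frames]
print(len(frames))  # 4
```

Each entry of `features` would then be passed, frame by frame, to the classifier. A real system would more likely use spectral features over windows of a few tens of milliseconds, but that choice is outside what the text states.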

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G10L15/05, G10L15/08, G10L15/26

CPC: G10L15/05, G10L15/08, G10L15/26, G10L25/30, G10L25/84, G10L2025/783, G10L15/02, G10L15/063, G10L15/16, G10L15/22, G10L25/78, G10L25/87

Inventor: 李超, 朱唯鑫

Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
