Voice signal endpoint detection method based on characteristic value code

A voice signal endpoint detection technique, applied in speech analysis, speech recognition, instruments, and similar fields. It addresses the neglect of a minimum short-term energy requirement, the low accuracy of endpoint detection, and the interruption between consonants and vowels. Its effects are to occupy less storage space, improve the accuracy rate, and avoid missed detections.

Active Publication Date: 2017-08-15

NANJING UNIV OF SCI & TECH

12 cites, 19 cited by


Problems solved by technology

[0010] In practice, it has been found that the traditional double-threshold method judges a frame to be a consonant whenever its short-term energy is lower than EL and its short-term zero-crossing rate is higher than ZH. This ignores the minimum short-term energy required of voiced segments and often causes false or missed detections.

In addition, it is also found that some consonants have a more obvious initial segment and stronger energy, but when the tail is close to the vowel, the energy is weakened and the zero-crossing rate is also reduced.

[0011] In addition, the result of speech signal endpoint detection depends heavily on the thresholds for short-term energy and short-term zero-crossing rate, and there is no unified, recognized method for setting them. If the thresholds are poorly chosen, false or missed detections occur easily, and the accuracy of endpoint detection drops significantly.
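The consonant rule of the traditional double-threshold method described in [0010] can be sketched as follows. This is a minimal illustration, not the patent's code: the function names, the threshold values, and the `e_min` floor are all assumptions made for the example.

```python
def is_consonant_double_threshold(energy, zcr, e_low, z_high):
    """Traditional rule as described: low energy plus a high zero-crossing rate."""
    return energy < e_low and zcr > z_high


def is_consonant_with_floor(energy, zcr, e_min, e_low, z_high):
    """Same rule, but the frame must also reach a minimum energy (hypothetical e_min)."""
    return e_min <= energy < e_low and zcr > z_high


# Hypothetical thresholds, chosen only for illustration.
E_MIN, E_LOW, Z_HIGH = 0.001, 0.10, 0.30

# A nearly silent but noisy frame passes the traditional rule (a false detection)...
faint_noise_old = is_consonant_double_threshold(0.0001, 0.45, E_LOW, Z_HIGH)
# ...and is rejected once a minimum energy is required.
faint_noise_new = is_consonant_with_floor(0.0001, 0.45, E_MIN, E_LOW, Z_HIGH)
```

With no lower bound on energy, any low-energy frame with a high zero-crossing rate counts as a consonant, which is the weakness the paragraph points out.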

Embodiment Construction

[0025] Speech signals can be divided into silent segments and voiced segments, and voiced segments can be further divided into consonant segments, vowel segments, and transition segments between consonants and vowels. These speech segments have obvious characteristics and are easy to distinguish. Between the consonant segment and the silent segment, however, there are generally inconspicuous, ambiguous speech segments. The present invention first treats these segments as suspected consonants and then discriminates them according to the characteristics of adjacent frames: some suspected consonants are merged into the clear consonant segment, and the rest are judged to be silent.
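The idea of accepting or rejecting suspected consonants from their neighbours can be sketched as below. The text here does not give the actual decision rules, so the rule used (a suspected run is kept only if it touches a clear consonant run) and the label letters `S`, `C`, `V`, `?` are assumptions for illustration only.

```python
from itertools import groupby


def resolve_suspected(labels):
    """Resolve runs of suspected-consonant frames ('?') using adjacent runs.

    Assumed rule (hypothetical): a '?' run next to a clear consonant run 'C'
    is merged into it; any other '?' run is treated as silence 'S'.
    """
    runs = [(label, len(list(group))) for label, group in groupby(labels)]
    resolved = []
    for i, (label, length) in enumerate(runs):
        if label == "?":
            prev_label = runs[i - 1][0] if i > 0 else None
            next_label = runs[i + 1][0] if i + 1 < len(runs) else None
            label = "C" if "C" in (prev_label, next_label) else "S"
        resolved.extend([label] * length)
    return resolved


frames = list("SS??CCVVS??S")
print("".join(resolve_suspected(frames)))
```

In this example the first suspected run borders a consonant run and is merged into it, while the second is surrounded by silence and is dropped.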

[0026] The present invention combines two characteristic parameters, short-time energy and short-time zero-crossing rate, and proposes a speech signal endpoint detection method based on eigenvalue coding. A threshold is se...
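The two characteristic parameters named above can be computed per frame as in the following sketch. It uses common textbook definitions (sum of squared samples for energy, fraction of sign changes for zero-crossing rate). The frame length, hop size, and function names are assumptions, and the patent may define or normalize these quantities differently.

```python
def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into full frames of frame_len, advancing by hop."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def short_time_energy(frame):
    """Sum of squared sample values in the frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Toy signal: one rapidly alternating frame followed by one silent frame.
signal = [1, -1, 1, -1, 0, 0, 0, 0]
frames = frame_signal(signal, frame_len=4, hop=4)
energies = [short_time_energy(f) for f in frames]
zcrs = [zero_crossing_rate(f) for f in frames]
```

Real systems typically use overlapping frames of a few tens of milliseconds; the non-overlapping four-sample frames here only keep the example easy to check by hand.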

Abstract

The invention discloses a voice signal endpoint detection method based on a characteristic value code. Characteristic parameters, namely the short-time energy and the short-time zero-crossing rate, are extracted frame by frame, and the average short-time energy, the average short-time zero-crossing rate, and the maximum short-time zero-crossing rate are calculated. Four thresholds are set for the short-time energy and one threshold for the short-time zero-crossing rate, based on the statistical results and empirical parameters. The voice characteristics are coded according to these thresholds, and endpoint detection is carried out on the voice signal according to five-grade determination rules applied to the characteristic value code of each frame. A minimum threshold is set for the short-time energy of voiced segments, and suspected voice frames are accepted or rejected according to the rules by combining the characteristics of adjacent frames. The five-grade determination rules handle a variety of complex conditions effectively, avoid false and missed detections, and substantially improve the accuracy of voice signal endpoint detection.
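The coding step in the abstract can be illustrated with the sketch below: four energy thresholds and one zero-crossing-rate threshold are derived from frame statistics, and each frame is mapped to a small code. The scaling factors, the threshold formulas, and the `(level, flag)` code layout are hypothetical, since the abstract does not state them, and the five-grade determination rules are not reproduced.

```python
from bisect import bisect_right
from statistics import mean


def make_thresholds(energies, zcrs,
                    energy_factors=(0.05, 0.2, 0.5, 1.0), zcr_weight=0.5):
    """Derive four energy thresholds and one ZCR threshold from frame statistics.

    The factors and formulas are placeholders standing in for the patent's
    "empirical parameters", which are not given in the abstract.
    """
    e_mean = mean(energies)
    z_mean, z_max = mean(zcrs), max(zcrs)
    energy_thresholds = [factor * e_mean for factor in energy_factors]
    zcr_threshold = z_mean + zcr_weight * (z_max - z_mean)
    return energy_thresholds, zcr_threshold


def encode_frame(energy, zcr, energy_thresholds, zcr_threshold):
    """Return (energy level 0..4, ZCR flag 0/1) as a hypothetical feature code."""
    level = bisect_right(energy_thresholds, energy)
    flag = 1 if zcr > zcr_threshold else 0
    return level, flag


# Toy per-frame features, chosen only for illustration.
energies = [0.0, 1.0, 4.0, 10.0, 5.0]
zcrs = [0.1, 0.5, 0.1, 0.1, 0.2]
e_thr, z_thr = make_thresholds(energies, zcrs)
codes = [encode_frame(e, z, e_thr, z_thr) for e, z in zip(energies, zcrs)]
```

Coding each frame into a few discrete levels means later decision rules can work on small integers per frame rather than raw feature values.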

Description

Technical field

[0001] The invention belongs to the field of speech signal processing, and in particular relates to a speech signal endpoint detection method based on eigenvalue coding.

Background technique

[0002] Speech signals can be divided into voiced segments and silent segments, and voiced segments can be further divided into consonant segments, vowel segments, and transition segments between consonants and vowels. In speech recognition and speaker recognition systems, silent segments mixed into the input significantly reduce recognition performance, so the start and end positions of each voiced segment must be detected. This is the endpoint detection of speech signals.

[0003] The speech signal is short-term stationary. Its short-term characteristics are extracted through frame processing; in the time domain they mainly include short-term energy and short-term zero-crossing rate. Endpoint detect...

Application Information

IPC(8): G10L15/05, G10L25/78

CPC: G10L15/05, G10L25/78, G10L2025/783

Inventor: 张二华, 王满洪, 王明合, 唐振民, 许昊

Owner: NANJING UNIV OF SCI & TECH
