Rapid keyword detection method based on quantile self-adaption cutting

A fast keyword detection technique based on quantile adaptive clipping of local paths. It addresses the low system efficiency that results when adaptive clipping cannot clip local paths to the greatest extent, and it thereby reduces the scale of the search space, improves detection efficiency, and increases recognition speed.

Active Publication Date: 2013-03-20
HARBIN INST OF TECH

AI Technical Summary

Problems solved by technology

[0003] The purpose of the present invention is to solve the problem that, during decoding in a keyword detection system, the adaptive clipping method cannot clip local paths to the greatest extent, which results in low system efficiency. To this end, the present invention provides a fast keyword detection method based on quantile adaptive clipping.



Examples


Embodiment 1

[0029] Embodiment 1: This embodiment is a fast keyword detection method based on quantile adaptive clipping, which is realized through the following steps:

[0030] Step 1. Input the speech signal to be detected, preprocess it, and perform feature extraction to obtain the speech feature vector sequence X = {x_1, x_2, ..., x_S}, where S is a natural number;

[0031] Step 2. Decode the speech feature vector sequence on the predefined recognition network according to the Viterbi decoding algorithm;

[0032] Step 3. At any time t, extend all local paths forward by one step to obtain the active models on each local path. Compute the probability that each state of each active model generates x_t, and accumulate these probabilities to obtain the probability score of the corresponding local path, where x_t ∈ X, 1 ≤ t ≤ S, and t is an integer;

[0033] Step 4. Carry out quantile-based stat...
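
The forward extension in Step 3 can be illustrated with a minimal sketch of one Viterbi time step. It is not the patent's implementation: it assumes scores are kept in the log domain (so accumulation becomes addition), that the recognition network is given as a dictionary of log transition probabilities, and that the per-state log emission probabilities for x_t have already been computed. The names viterbi_step, prev_scores, log_trans, and log_emit_t are hypothetical.

```python
def viterbi_step(prev_scores, log_trans, log_emit_t):
    """Extend every local path by one frame (one Viterbi time step).

    prev_scores: dict state -> best log score of a local path ending there at t-1
    log_trans:   dict (src, dst) -> log transition probability (assumed input)
    log_emit_t:  dict state -> log P(x_t | state) (assumed precomputed)
    Returns dict state -> (best log score at time t, best predecessor state).
    """
    new_scores = {}
    for (src, dst), log_a in log_trans.items():
        # Only paths that are still alive at t-1 can be extended.
        if src not in prev_scores or dst not in log_emit_t:
            continue
        # Accumulate: path score so far + transition + emission of x_t.
        candidate = prev_scores[src] + log_a + log_emit_t[dst]
        # Viterbi keeps only the best-scoring path into each state.
        if dst not in new_scores or candidate > new_scores[dst][0]:
            new_scores[dst] = (candidate, src)
    return new_scores


# Toy usage with two states; all numbers are made up for illustration.
step = viterbi_step(
    prev_scores={"a": -1.0, "b": -2.0},
    log_trans={("a", "a"): -0.5, ("a", "b"): -1.0, ("b", "b"): -0.2},
    log_emit_t={"a": -0.3, "b": -0.4},
)
```

The scores returned for time t are what a clipping step would then inspect before the next frame is processed.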

Embodiment 2

[0040] Embodiment 2: This embodiment differs from Embodiment 1 in that the quantile-based state-layer local path clipping in step 4 is performed as follows:

[0041] Step 1. Set the percentage α of local paths to be retained at time t and the weighting factor λ, where 0 < α < 1 and 1 < λ < 3;

[0042] Step 2. Save all local path probability scores at time t (that is, the local path probability scores obtained in step 3) in the array score[1...N], where N is the number of local paths at time t;

[0043] Step 3. Use a binary search algorithm to find the (N×α)-th largest number S_α in score[1...N], that is, the upper α quantile;

[0044] Step 4. Set the beam width for clipping at time t as beam(t) = λ × (S_max − S_α), with 1 < λ < 3;

[0045] Step 5. Set the clipping threshold at time t as thresh(t) = S_max − beam(t), where S_max is the maximum number in the array score[1...N];

[0046] Step 6. Traverse eac...
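
The threshold computation in the steps above can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the (N×α)-th largest score is found with heapq.nlargest rather than the binary search the text names, the rank N×α is rounded up to an integer, and, because the text of Step 6 is truncated here, the sketch assumes that paths scoring below thresh(t) are discarded. The function name quantile_beam_prune is hypothetical.

```python
import heapq
import math


def quantile_beam_prune(scores, alpha, lam):
    """Return (indices of kept paths, clipping threshold) for one time step.

    scores: local path scores at time t (log-domain scores assumed)
    alpha:  fraction of paths used to locate the upper quantile, 0 < alpha < 1
    lam:    weighting factor for the beam width, 1 < lam < 3
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if not 1.0 < lam < 3.0:
        raise ValueError("lam must satisfy 1 < lam < 3")
    if not scores:
        return [], None

    n = len(scores)
    # Rank of the upper alpha quantile; rounding up is an assumption.
    k = max(1, min(n, math.ceil(n * alpha)))
    top_k = heapq.nlargest(k, scores)   # stand-in for binary-search selection
    s_max, s_alpha = top_k[0], top_k[-1]

    beam = lam * (s_max - s_alpha)      # beam(t) = lam * (S_max - S_alpha)
    thresh = s_max - beam               # thresh(t) = S_max - beam(t)

    # Assumed Step 6: keep only paths whose score reaches the threshold.
    kept = [i for i, s in enumerate(scores) if s >= thresh]
    return kept, thresh


# Toy usage: 8 paths, alpha = 0.25 selects the 2nd largest score as S_alpha.
example_scores = [-10.0, -12.0, -11.0, -30.0, -50.0, -13.0, -20.0, -40.0]
kept, thresh = quantile_beam_prune(example_scores, alpha=0.25, lam=2.0)
```

In this toy case S_max = -10 and S_α = -11, so beam(t) = 2 and thresh(t) = -12. The beam adapts to how tightly the top-scoring paths cluster: a small gap between S_max and S_α gives a narrow beam.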

Embodiment 3

[0049] Embodiment 3: This embodiment differs from Embodiment 1 or 2 in the feature extraction of step 1, which obtains the feature vector sequence as follows. The speaker signal s(n) (i.e., the speech signal to be detected) is sampled, quantized, and pre-emphasized. The speaker signal is assumed to be short-term stationary, so it can be divided into frames; framing is implemented by weighting the signal with a movable finite-length window. Mel cepstrum coefficients (MFCC parameters) are then computed from the weighted speech signal s_w(n), yielding the feature vector sequence X = {x_1, x_2, ..., x_S}. Other steps and parameters are the same as those in Embodiment 1 or Embodiment 2.
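
The pre-emphasis and windowed framing described above can be sketched in plain Python. The text does not fix the details, so the following are assumptions for illustration only: a first-order pre-emphasis filter with coefficient 0.97, a Hamming window as the movable finite-length window, and frame length and shift given in samples. The MFCC computation itself is omitted, and the function names are hypothetical.

```python
import math


def pre_emphasize(signal, coeff=0.97):
    """First-order pre-emphasis: y[n] = s[n] - coeff * s[n-1] (coeff assumed)."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - coeff * signal[n - 1] for n in range(1, len(signal))
    ]


def hamming(length):
    """Hamming window of the given length (window type is an assumption)."""
    if length == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
        for n in range(length)
    ]


def frame_and_window(signal, frame_len, frame_shift):
    """Slide a finite-length window over the signal and weight each frame."""
    window = hamming(frame_len)
    frames = []
    # Only full frames are kept; trailing samples that do not fill a frame are dropped.
    for start in range(0, len(signal) - frame_len + 1, frame_shift):
        frames.append([signal[start + n] * window[n] for n in range(frame_len)])
    return frames


# Toy usage: a constant 10-sample signal, frames of 4 samples shifted by 2.
emphasized = pre_emphasize([1.0] * 10)
frames = frame_and_window(emphasized, frame_len=4, frame_shift=2)
```

Each weighted frame corresponds to one segment of s_w(n), from which a single feature vector x_t would be computed.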


Abstract

A fast keyword detection method based on quantile adaptive clipping relates to the fast detection of keywords in continuous speech. It solves the problem that, during decoding in a keyword detection system, the adaptive clipping method cannot clip local paths to the greatest extent, which results in low system efficiency. The method proceeds as follows: features are extracted from the speech to be detected to obtain a feature vector sequence; following Viterbi decoding, the probability that each active model state on a local path generates the feature vector is calculated and accumulated to obtain the local path probability score; quantile-based state-layer local path clipping is then carried out; and it is then determined whether the end of the speech has been reached. If it has, the lattice generated during decoding is backtracked to search for keywords, and the keyword candidates are confirmed based on posterior probability to obtain the recognition result; otherwise, decoding continues. The method can be readily embedded into an existing keyword detection system. At every moment of the decoding process, unlikely paths are effectively clipped, so that the scale of the search space is reduced to the greatest extent and the detection efficiency of the system is increased.

Description

Technical field

[0001] The invention relates to a method for quickly detecting keywords in continuous speech, and in particular to a method for quickly and adaptively clipping local paths in the Viterbi decoding process.

Background technique

[0002] Speech recognition is a technology in which a machine converts human voice signals into corresponding text or commands through recognition and understanding, and responds accordingly. Keyword detection is an important research field in speech recognition; it is the process of recognizing a given set of words in continuous speech. It is an unrestricted speech signal processing system that allows users to speak in a natural way without being restricted to a specific grammar. Compared with continuous speech recognition, keyword detection has the advantages of a high detection rate, strong practicability, and low time consumption, and it has broad application prospects. Although keyword detection technology has these ad...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L15/08, G10L25/54
Inventors: 韩纪庆, 袁浩, 李海洋
Owner HARBIN INST OF TECH