
Keyword detection model based on end-to-end deep convolutional neural network

A keyword detection model based on an end-to-end deep convolutional neural network, in the field of keyword detection. It addresses the limited computing and memory resources of embedded devices, and offers a simple model structure, a simple training method, and low computing and memory overhead.

Pending Publication Date: 2020-07-31
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

[0006] Aiming at the problem of the limited resources of embedded devices, the present invention provides a keyword detection model based on an end-to-end deep convolutional neural network.




Embodiment Construction

[0017] The technical solution of the present invention is further described below with reference to the accompanying drawings. The present invention proposes a keyword detection model based on an end-to-end deep convolutional neural network. The model is built mainly from a convolutional neural network; the network structure is shown in Figure 1 and the training procedure in Figure 2:

[0018] The model takes fixed-length speech of 2012 ms as input. The audio data is augmented with changes in volume, playback speed, and AGC gain. After MFCC feature extraction, the speech is converted into a fixed-length two-dimensional feature of dimension 100*10 and serialized as TFRecord. After three convolutional layers, one pooling layer, and two fully connected layers, the confidence of each command word is output by the softmax layer. The model structure parameters are: 48 10 4 2 1 64 8 4 1 132 4 1 2 1 32 128, wh...
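The pipeline above (fixed-length 100*10 MFCC features, three convolutional layers, one pooling layer, two fully connected layers, softmax confidences) can be sketched as follows. Only the input shape and the layer counts come from the text; since the patent's parameter string is truncated, the kernel sizes, strides, channel count, hidden width, and keyword count below are illustrative assumptions.

```python
import numpy as np

def conv_out_len(n, kernel, stride):
    """Output length of a 'valid' convolution along one axis."""
    return (n - kernel) // stride + 1

def softmax(x):
    e = np.exp(x - x.max())  # subtract the max for numerical stability
    return e / e.sum()

# Fixed-length input from the patent: 100 MFCC frames x 10 coefficients.
frames, coeffs = 100, 10

# Trace the time axis through 3 conv layers and 1 pooling layer.
# Kernel/stride values here are assumptions, not the patent's parameters.
t = conv_out_len(frames, kernel=10, stride=2)  # conv 1 -> 46
t = conv_out_len(t, kernel=8, stride=1)        # conv 2 -> 39
t = conv_out_len(t, kernel=4, stride=1)        # conv 3 -> 36
t = t // 2                                     # 2x max pooling -> 18

# Two fully connected layers ending in per-keyword softmax confidences.
# Channel count, hidden width, and keyword count are likewise assumptions.
channels, hidden, num_keywords = 64, 128, 10
rng = np.random.default_rng(0)
x = rng.standard_normal(t * channels)               # flattened conv output
w1 = rng.standard_normal((hidden, x.size)) * 0.01
w2 = rng.standard_normal((num_keywords, hidden)) * 0.01
conf = softmax(w2 @ np.maximum(w1 @ x, 0.0))        # ReLU FC, then softmax
assert abs(conf.sum() - 1.0) < 1e-9                 # a proper distribution
```

At inference time a command word would be reported as detected when its softmax confidence exceeds a decision threshold.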



Abstract

The invention discloses a keyword detection model based on an end-to-end deep convolutional neural network, belonging to the field of keyword recognition within speech recognition. The model takes fixed-length speech as input; after MFCC feature extraction, the speech is converted into fixed-length two-dimensional features, and after three convolutional layers, one pooling layer, and two fully connected layers, the confidence of each command word is output by a softmax layer. The model has a simple structure, a simple training method, relatively low computing and memory overhead, high recognition accuracy, and relatively strong noise robustness, making it well suited for deployment on embedded equipment.

Description

Technical Field

[0001] The invention relates to the field of keyword recognition within speech recognition, and in particular to a keyword detection model based on an end-to-end deep convolutional neural network.

Background

[0002] When a keyword recognition model is deployed on an embedded device, the device's limited resources lead to insufficient memory and computing power. There are generally two remedies: one is to quantize the model, converting floating-point weights and biases to fixed-point values; the other is to adjust the model structure and adopt a lightweight network. The first method works well on its own, but combining the two has been shown experimentally to be more effective.

[0003] Surveying the current state of speech recognition, DNN, RNN/LSTM, and CNN are considered the mainstream directions in speech recognition...
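The first remedy mentioned in [0002], quantizing floating-point weights and biases to fixed point, can be illustrated with a minimal symmetric int8 scheme. This is a generic sketch of weight quantization, not the patent's specific method, and the example weight values are made up.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric linear quantization: map float weights into int8,
    returning the quantized values and the scale to dequantize them."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

# Hypothetical weight values for illustration.
w = np.array([0.5, -1.27, 0.01, 1.0], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = q.astype(np.float32) * scale   # dequantized approximation
# Rounding error is bounded by half a quantization step.
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```

Each weight then occupies 1 byte instead of 4, roughly quartering the model's memory footprint, and fixed-point arithmetic avoids the floating-point unit that many embedded chips lack.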

Claims


Application Information

IPC(8): G10L15/06; G10L15/16; G06N3/08; G06N3/04
CPC: G10L15/063; G10L15/16; G06N3/08; G06N3/047; G06N3/045
Inventors: 黄凯, 陈飘
Owner: ZHEJIANG UNIV