
Keyword detection model based on end-to-end deep convolutional neural network

A keyword detection model based on an end-to-end deep convolutional neural network, in the field of keyword detection. It addresses the limited computing and memory resources of embedded devices, and offers a simple model structure, a simple training method, and low computing and memory overhead.

Pending Publication Date: 2020-07-31
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

[0006] Aiming at the problem of the limited resources of embedded devices, the present invention provides a keyword detection model based on an end-to-end deep convolutional neural network.




Embodiment Construction

[0017] The technical solution of the present invention is further described below with reference to the accompanying drawings. The present invention proposes a keyword detection model based on an end-to-end deep convolutional neural network. The model is built mainly from a convolutional neural network; the network structure is shown in Figure 1 and the training procedure in Figure 2:

[0018] The model takes fixed-length speech of 2012 ms as input. The audio data is augmented with changes in volume, playback speed, and AGC gain. After MFCC feature extraction, the speech is converted into a fixed-length two-dimensional feature of dimension 100*10 and serialized as TFRecord. After three convolutional layers, one pooling layer, and two fully connected layers, the confidence of each command word is output by the softmax layer. The model structure parameters are: 48 10 4 2 1 64 8 4 1 132 4 1 2 1 32 128, wh...
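The pipeline above (fixed-length 100*10 MFCC features, three convolutional layers, one pooling layer, two fully connected layers, softmax confidences) can be sketched as follows. Only the input shape and the layer counts come from the text; since the patent's parameter string is truncated, the kernel sizes, strides, channel count, hidden width, and keyword count below are illustrative assumptions.

```python
import numpy as np

def conv_out_len(n, kernel, stride):
    """Output length of a 'valid' convolution along one axis."""
    return (n - kernel) // stride + 1

def softmax(x):
    e = np.exp(x - x.max())  # subtract the max for numerical stability
    return e / e.sum()

# Fixed-length input from the patent: 100 MFCC frames x 10 coefficients.
frames, coeffs = 100, 10

# Trace the time axis through 3 conv layers and 1 pooling layer.
# Kernel/stride values here are assumptions, not the patent's parameters.
t = conv_out_len(frames, kernel=10, stride=2)  # conv 1 -> 46
t = conv_out_len(t, kernel=8, stride=1)        # conv 2 -> 39
t = conv_out_len(t, kernel=4, stride=1)        # conv 3 -> 36
t = t // 2                                     # 2x max pooling -> 18

# Two fully connected layers ending in per-keyword softmax confidences.
# Channel count, hidden width, and keyword count are likewise assumptions.
channels, hidden, num_keywords = 64, 128, 10
rng = np.random.default_rng(0)
x = rng.standard_normal(t * channels)               # flattened conv output
w1 = rng.standard_normal((hidden, x.size)) * 0.01
w2 = rng.standard_normal((num_keywords, hidden)) * 0.01
conf = softmax(w2 @ np.maximum(w1 @ x, 0.0))        # ReLU FC, then softmax
assert abs(conf.sum() - 1.0) < 1e-9                 # a proper distribution
```

At inference time a command word would be reported as detected when its softmax confidence exceeds a decision threshold.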



Abstract

The invention discloses a keyword detection model based on an end-to-end deep convolutional neural network, belonging to the field of keyword recognition within speech recognition. The model takes fixed-length speech as input; after MFCC feature extraction, the speech is converted into fixed-length two-dimensional features, and after three convolutional layers, one pooling layer, and two fully connected layers, the confidence of each command word is output by a softmax layer. The model has a simple structure, a simple training method, relatively low computing and memory overhead, high recognition accuracy, and relatively strong noise robustness, making it well suited for deployment on embedded equipment.

Description

Technical Field

[0001] The invention relates to the field of keyword recognition within speech recognition, and in particular to a keyword detection model based on an end-to-end deep convolutional neural network.

Background

[0002] When a keyword recognition model is deployed on an embedded device, the device's limited resources lead to insufficient memory and computing power. There are generally two remedies: one is to quantize the model, converting floating-point weights and biases to fixed-point values; the other is to adjust the model structure and adopt a lightweight network. The first method works well on its own, but combining the two has been shown experimentally to be more effective.

[0003] Surveying the current state of speech recognition, DNN, RNN/LSTM, and CNN are considered the mainstream directions in speech recognition...
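The first remedy mentioned in [0002], quantizing floating-point weights and biases to fixed point, can be illustrated with a minimal symmetric int8 scheme. This is a generic sketch of weight quantization, not the patent's specific method, and the example weight values are made up.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric linear quantization: map float weights into int8,
    returning the quantized values and the scale to dequantize them."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

# Hypothetical weight values for illustration.
w = np.array([0.5, -1.27, 0.01, 1.0], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = q.astype(np.float32) * scale   # dequantized approximation
# Rounding error is bounded by half a quantization step.
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```

Each weight then occupies 1 byte instead of 4, roughly quartering the model's memory footprint, and fixed-point arithmetic avoids the floating-point unit that many embedded chips lack.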

Claims


Application Information

IPC(8): G10L15/06; G10L15/16; G06N3/08; G06N3/04
CPC: G10L15/063; G10L15/16; G06N3/08; G06N3/047; G06N3/045
Inventors: 黄凯, 陈飘
Owner: ZHEJIANG UNIV