Audio and video wake-up method, system and device, and storage medium

An audio and video voice wake-up technology, applied to wake-up methods, systems, devices and storage media, that addresses problems such as a low wake-up rate and degraded system performance.

Pending Publication Date: 2021-09-14

UNIV OF SCI & TECH OF CHINA


AI Technical Summary

Problems solved by technology

[0012] In summary, with the existing HMM-GMM-based and deep learning-based voice wake-up schemes, the performance of the voice wake-up system drops sharply in real complex environments, especially noisy and far-field environments, and the wake-up rate remains relatively low.


Embodiment Construction

[0036] The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0037] The embodiment of the present invention provides an audio and video wake-up method based on teacher-student cross-modal learning. Compared with a single-modal voice wake-up model, this method achieves better system performance in complex environments such as high noise. At the same time, compared with the performance of an audio and video wake-up model obtained by random-initialization training using only multi-modal dynamic aud...

Abstract

The invention discloses an audio and video wake-up method, system and device, and a storage medium. It introduces a video modality to improve the performance of the wake-up system, so that the system can adapt to wake-up tasks in real complex scenes, which improves the wake-up rate and the interaction experience. Moreover, because the amount of audio and video multi-modal wake-up data is relatively small, the invention uses a cross-modal teacher-student model to transfer the effective information obtained by training on abundant, large-volume single-modal acoustic data. This mitigates the system performance loss caused by the relatively small amount of multi-modal audio and video wake-up training data and improves the wake-up rate.
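The teacher-student transfer described in the abstract can be illustrated with a generic knowledge-distillation loss. This is only a minimal sketch under stated assumptions: the text above does not give the loss the invention actually uses, so the soft-target KL term, the `alpha` weighting, the `temperature` value and the function names are all hypothetical. Here the teacher stands for a model trained on single-modal acoustic data, and the student stands for the audio and video wake-up model.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Generic distillation loss (an assumption, not the patent's formula).

    student_logits: outputs of the audio-video student for one example
    teacher_logits: outputs of the audio-only teacher for the same example
    label: index of the correct class (e.g. 0 = wake word, 1 = other)
    """
    # Hard-label term: cross-entropy of the student against the true label.
    student_probs = softmax(student_logits)
    hard = -math.log(student_probs[label])

    # Soft-label term: KL(teacher || student) on temperature-softened outputs.
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    soft = sum(p * math.log(p / q)
               for p, q in zip(teacher_soft, student_soft) if p > 0)

    # The temperature**2 factor keeps the soft term on a comparable scale.
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft
```

When the student reproduces the teacher's outputs exactly, the soft term is zero and only the hard-label term remains, so the teacher's knowledge acts as an extra training signal on top of the small multi-modal dataset.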

Description

Technical field

[0001] The present invention relates to the technical fields of voice signal processing and video signal processing, in particular to an audio and video wake-up method, system, device and storage medium.

Background

[0002] Voice wake-up, also known as wake-up word recognition, is a special speech recognition technology that aims to detect specific segments spoken by a speaker in real time in a continuous speech stream. It has been widely used in scenarios such as smart vehicles, service robots, and smart homes. Audio-video wake-up aims to further improve the performance of the wake-up model by using a video signal synchronized with the speech as an auxiliary input. Voice wake-up technology based on deep learning is currently a hotspot and the mainstream method in academic and industrial research. Research on voice wake-up can be divided into two categories: voice wake-up based on HMM-GMM and voice wake-up based on deep learning.

[0003] 1. Voice wake-up based ...
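The idea in [0002] of detecting a wake-up word in real time in a continuous stream can be sketched as a sliding-window decision over per-frame scores. This is a generic illustration, not the method of the invention: the function name, the `window` and `threshold` parameters, and the assumption that some model already produces a wake-word posterior per frame are all hypothetical.

```python
from collections import deque


def detect_wake_word(posteriors, window=3, threshold=0.8):
    """Return the index of the first frame at which the wake word fires.

    posteriors: iterable of per-frame wake-word probabilities in [0, 1],
                assumed to come from an upstream model (not shown here).
    window:     number of most recent frames averaged before deciding.
    threshold:  minimum smoothed probability required to trigger.
    Returns None if the smoothed score never reaches the threshold.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` scores
    for index, score in enumerate(posteriors):
        recent.append(score)
        if len(recent) < window:
            continue  # wait until a full window is available
        smoothed = sum(recent) / window
        if smoothed >= threshold:
            return index
    return None
```

For example, `detect_wake_word([0.1, 0.2, 0.9, 0.95, 0.9, 0.1])` returns `4`, the first frame whose three-frame average reaches 0.8, while a single isolated spike such as `[0.1, 0.99, 0.1, 0.1]` returns `None`. Averaging over a window is one simple way to trade a little latency for fewer false triggers in noisy input.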

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L15/06, G10L15/16, G10L15/22, G10L15/26, G06K9/62, G06N3/04, G06N3/08

CPC: G10L15/063, G10L15/16, G10L15/22, G10L15/26, G06N3/08, G10L2015/223, G06N3/047, G06N3/045, G06F18/2415, G06F18/241

Inventor: 周恒顺, 杜俊

Owner: UNIV OF SCI & TECH OF CHINA
