Combined model training method and system

A joint model training technique, applied to speech analysis and to instruments that determine direction or offset. It addresses two problems of earlier methods, namely that localization performance cannot be guaranteed and that the amount of calculation is increased, and it aims at accurate and robust DOA estimation and improved voice interaction.

Active Publication Date: 2019-05-03

AISPEECH CO LTD

Cites: 2; Cited by: 24

Problems solved by technology

[0005] Keyword-based target speaker localization method: because the mask network is trained separately, the estimated time-frequency mask and the localization task are independent of each other, so the best localization performance cannot be guaranteed. In addition, the input features it uses are pre-extracted sine and cosine inter-channel phase difference features, which add extra computation.

Joint training method for a time-frequency mask and DOA estimation network based on an acoustic vector sensor: it uses an acoustic vector sensor, which is more complex and costly than an ordinary microphone array. The estimated time-frequency mask is in the complex domain, which is more complex and more computationally expensive than a real-valued mask. The input features used are the sub-band inter-channel data ratio, power spectrum, coherence vector, and so on, and they must be extracted explicitly in advance, which adds extra computation.

Embodiment Construction

[0023] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

[0024] As shown in Figure 1, a flowchart of a joint model training method provided by an embodiment of the present invention includes the following steps:

[0025] S11: implicitly extracting the phase spectrum and the logarithmic magnitude spectrum of the noisy speech training set;

[0026] S12: Using the amplitude spectrum segment expanded b...
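To make step S11 concrete, the following is a minimal sketch of what a phase spectrum and a logarithmic magnitude spectrum are. It computes them explicitly with a short-time Fourier transform for illustration only; the patent describes implicit extraction, and the text shown here does not give the details. The function names, the Hann window, the frame length, the hop size, and the `eps` floor are all assumptions made for this example.

```python
import cmath
import math


def dft(frame):
    """One-sided discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def stft_features(signal, frame_len=8, hop=4, eps=1e-8):
    """Return (log_magnitude, phase), each a list of frames of frequency bins.

    frame_len, hop and eps are illustrative values, not taken from the patent.
    """
    # Periodic Hann window, a common choice for short-time analysis.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_len)
              for t in range(frame_len)]
    log_mag, phase = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(frame_len)]
        spectrum = dft(frame)
        # eps keeps the logarithm finite for empty bins.
        log_mag.append([math.log(abs(c) + eps) for c in spectrum])
        phase.append([cmath.phase(c) for c in spectrum])
    return log_mag, phase


# Toy input: a sinusoid that falls exactly on frequency bin 2 of an 8-point frame.
signal = [math.sin(2 * math.pi * 2 * t / 8) for t in range(32)]
log_mag, phase = stft_features(signal)
peak_bin = max(range(len(log_mag[0])), key=lambda k: log_mag[0][k])
```

For a microphone array, the same computation would be repeated per channel. The patent's point is that this kind of feature extraction is not done as a separate step in advance.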

Abstract

The embodiment of the invention provides a combined model training method. The method comprises the following steps: implicitly extracting the phase spectrum and the logarithmic magnitude spectrum of a noisy speech training set; using the magnitude spectrum segments of the expanded logarithmic magnitude spectrum as the input features of a time-frequency masking network, and using the noisy speech training set and a clean speech training set to determine a target mask label for training the time-frequency masking network; training the time-frequency masking network based on the input features and the target mask label, and estimating a soft threshold mask; and enhancing the phase spectrum of the noisy speech training set with the soft threshold mask, using the enhanced phase spectrum as the input features of a DOA (direction of arrival) estimation network, and training the DOA estimation network. The embodiment of the invention further provides a combined model training system. By setting the target mask label, extracting the input features implicitly, and training the time-frequency masking network and the DOA estimation network together, the embodiment makes the combined model better suited to the DOA estimation task.
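The data flow in the abstract, from a mask label to a mask-enhanced phase spectrum, can be sketched as follows. The text shown here does not give the formula for the target mask label or for the enhancement, so both are assumptions in this sketch: the label is a clipped ratio of clean to noisy magnitude, and the enhancement is an element-wise product of mask and phase. The function names `ratio_mask_label` and `enhance_phase` are hypothetical.

```python
def ratio_mask_label(clean_mag, noisy_mag, eps=1e-8):
    """Assumed target mask label: clean/noisy magnitude ratio, clipped to [0, 1].

    Inputs are lists of frames, each a list of non-negative magnitudes per bin.
    """
    return [
        [min(1.0, c / (n + eps)) for c, n in zip(clean_row, noisy_row)]
        for clean_row, noisy_row in zip(clean_mag, noisy_mag)
    ]


def enhance_phase(phase_channels, mask):
    """Assumed enhancement: weight every channel's phase by the same real mask.

    phase_channels is a list of channels, each a list of frames of phase values.
    Bins dominated by noise (mask near 0) contribute little to the result.
    """
    return [
        [
            [p * m for p, m in zip(phase_row, mask_row)]
            for phase_row, mask_row in zip(channel, mask)
        ]
        for channel in phase_channels
    ]


# Toy magnitudes: 2 frames x 2 frequency bins.
clean_mag = [[1.0, 0.0], [2.0, 1.0]]
noisy_mag = [[2.0, 1.0], [2.0, 4.0]]
mask = ratio_mask_label(clean_mag, noisy_mag)

# Toy phases for a 2-channel array, same frame and bin layout as the mask.
phase_channels = [
    [[0.4, -1.0], [0.2, 0.8]],
    [[0.6, 1.2], [-0.2, 0.4]],
]
enhanced = enhance_phase(phase_channels, mask)
```

In the patent, the mask comes from the trained time-frequency masking network rather than directly from clean speech, and the enhanced phase spectrum is the input to the DOA estimation network. The mask here is real-valued, in line with the patent's criticism of complex-domain masks as more expensive.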

Description

Technical field

[0001] The invention relates to the field of sound source localization, in particular to a joint model training method and system.

Background

[0002] Sound source localization is the task of estimating the speaker's DOA (direction of arrival) from the received speech signal. DOA estimation is essential for applications such as human-computer interaction and teleconferencing, and is also widely used in beamforming for speech enhancement. For example, when sound source localization is added to a video chat, the voice heard by the user at the other end reflects changes in the speaker's position, which improves the user experience.

[0003] To determine the direction of arrival, the keyword-based target speaker localization method can be used: a neural network estimates the time-frequency mask separately, and the estimated mask is then used to enhance the input features of the direction of arrival...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L15/06, G10L15/16, G01S3/808

Inventors: 钱彦旻, 张王优, 周瑛

Owner: AISPEECH CO LTD
