Target voice extraction method, device and equipment, medium and joint training method

A technology for extracting target speech, applied in speech analysis, speech recognition, instruments, etc. It addresses the problem that the output of the speech feature extraction model is neglected when the extraction system is optimized, and it improves signal-to-noise ratio and accuracy.

Active Publication Date: 2020-05-19
TENCENT TECH (SHENZHEN) CO LTD
6 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

Conventionally, the parameters of the speech extraction model are optimized based only on the extracted target speech data, without considering the output of the speech feature extraction model.


Detailed Description of Embodiments

[0026] The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present disclosure, not all of them. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present disclosure.

[0027] "First", "second" and similar words used in the present disclosure do not indicate any order, quantity or importance; they only distinguish different components. Likewise, "comprising", "comprises" and similar words mean that the element or item preceding the word covers the elements or items listed after the word and their equivalents, without excluding other elements or items. Words such as "connected" or "connected" a...

Abstract

The disclosure provides a target voice extraction method, device and equipment, a medium, and a joint training method. The target voice extraction system comprises a voice feature extraction model and a voice extraction model. The joint training method of the target voice extraction system comprises the following steps: performing feature extraction on training reference sample voice data by using the voice feature extraction model to obtain a reference voice feature vector, where the training reference sample voice data is clean voice data corresponding to the target object; performing feature fusion based on training voice data and the reference voice feature vector by using the voice extraction model to obtain a fusion feature vector, where the training voice data is noisy voice data from which target voice data corresponding to the target object is to be extracted; performing voice extraction based on the training voice data and the fusion feature vector by using the voice extraction model to obtain the target voice data; and performing joint training on the voice feature extraction model and the voice extraction model based on the reference voice feature vector and the target voice data.
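The final step of the abstract, joint training based on both the reference voice feature vector and the target voice data, can be pictured as a single objective with two terms. The sketch below is only an illustration under stated assumptions: the abstract does not specify the loss functions, so the mean squared error term, the speaker-classification cross-entropy term, the `weight` parameter, and all function names are hypothetical.

```python
import math

def extraction_loss(estimate, clean):
    """Assumed term 1: mean squared error between the extracted target
    voice samples and the clean target voice samples."""
    return sum((e - c) ** 2 for e, c in zip(estimate, clean)) / len(clean)

def speaker_loss(logits, speaker_id):
    """Assumed term 2: cross-entropy of a speaker classifier whose logits
    are computed from the reference voice feature vector."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[speaker_id]

def joint_loss(estimate, clean, logits, speaker_id, weight=0.5):
    """Hypothetical joint objective: both models receive gradients from
    the sum, so neither output is ignored during training."""
    return extraction_loss(estimate, clean) + weight * speaker_loss(logits, speaker_id)
```

Because both terms enter one scalar, an optimizer that minimizes `joint_loss` would update the feature extraction model and the extraction model together rather than tuning the extraction model alone.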

Description

Technical Field

[0001] The present disclosure relates to the field of speech data processing, and more specifically, to a target speech extraction method, device, equipment, medium and joint training method.

Background

[0002] With the rapid development of artificial intelligence technology, speech extraction systems based on artificial intelligence have emerged. For example, a neural network configured to extract speech data can extract target speech data corresponding to a specific speaker from noisy speech data; this extraction is also referred to as filtering. The noise may be background noise, or voice data corresponding to one or more speakers other than the specific speaker. The neural network may include two parts: a speech feature extraction model and a speech extraction model. The speech feature extraction model is used to extract the speech features of the specific speaker from the reference sample speech data of ...
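The two-part structure described in the background can be sketched as a small data flow. This is a toy stand-in, not the disclosed networks: averaging frames as the feature extractor, concatenation as the fusion step, a per-frame sigmoid gain as the extractor, and the `weights` values are all assumptions made for illustration.

```python
import math

def extract_reference_features(reference_frames):
    """Speech feature extraction model (toy stand-in): average the frames
    of clean reference speech into one fixed-length feature vector."""
    n = len(reference_frames)
    dim = len(reference_frames[0])
    return [sum(frame[i] for frame in reference_frames) / n for i in range(dim)]

def fuse(noisy_frames, reference_vector):
    """Feature fusion (assumed to be concatenation): append the reference
    vector to every frame of the noisy speech."""
    return [frame + reference_vector for frame in noisy_frames]

def extract_target(noisy_frames, fused_frames, weights):
    """Speech extraction model (toy stand-in): a sigmoid gain per frame,
    computed from the fused features, scales the noisy frame."""
    output = []
    for frame, fused in zip(noisy_frames, fused_frames):
        score = sum(w * x for w, x in zip(weights, fused))
        gain = 1.0 / (1.0 + math.exp(-score))
        output.append([gain * x for x in frame])
    return output

reference = [[1.0, 0.0], [0.0, 1.0]]  # clean speech of the specific speaker
noisy = [[0.5, 0.5], [2.0, -1.0]]     # noisy speech containing that speaker
weights = [0.0, 0.0, 0.0, 0.0]        # hypothetical, untrained parameters

ref_vec = extract_reference_features(reference)
fused = fuse(noisy, ref_vec)
target = extract_target(noisy, fused, weights)
```

With untrained zero weights every gain is 0.5, so the output is simply the noisy input at half amplitude; training would shape the gains so that frames dominated by the specific speaker pass through and the rest are attenuated.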

Application Information

IPC(8): G10L15/02, G10L15/06
CPC: G10L15/02, G10L15/063
Inventor: 纪璇, 于蒙, 张春雷, 苏丹, 俞栋, 于涛
Owner TENCENT TECH (SHENZHEN) CO LTD