
Speaker-independent single-channel voice separation method

A speaker-independent speech separation technology, applied in speech analysis, instruments, etc., that addresses the label ambiguity problem and the poor performance of earlier speaker-independent speech separation algorithms, and achieves a better separation effect.

Active Publication Date: 2020-08-25
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

In previous studies, the label ambiguity (or permutation) problem is the main reason for the poor performance of speaker-independent speech separation algorithms.

Method used

A bidirectional long short-term memory neural network estimates the complex ideal ratio mask (cIRM), and utterance-level permutation invariant training is used during model training to resolve the label ambiguity problem.


Examples


Embodiment Construction

[0041] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0042] When a DNN regression model is used to solve the source separation problem, the mainstream masking-based training targets include the ideal ratio mask (IRM), the phase-sensitive mask (PSM) and the complex ideal ratio mask (cIRM). Each of these is briefly introduced below.
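The three training targets named in [0042] can be illustrated per time-frequency bin. The patent's own formulas are truncated on this page, so the sketch below uses the commonly cited textbook definitions and should be read as an assumption rather than as the patent's exact equations. The function names, the `EPS` constant, and the convention that `s`, `n` and `y` are the complex STFT coefficients of target, interference and mixture are all hypothetical.

```python
import math

EPS = 1e-8  # hypothetical small constant that guards against division by zero


def ideal_ratio_mask(s, n):
    """IRM for one T-F bin: sqrt(|S|^2 / (|S|^2 + |N|^2)), a real value in [0, 1]."""
    ps, pn = abs(s) ** 2, abs(n) ** 2
    return math.sqrt(ps / (ps + pn + EPS))


def phase_sensitive_mask(s, y):
    """PSM for one T-F bin: |S|/|Y| * cos(theta_S - theta_Y), i.e. the real part of S / Y."""
    return (s / (y + EPS)).real


def complex_ideal_ratio_mask(s, y):
    """cIRM for one T-F bin: the complex ratio S / Y, so that cIRM * Y recovers S."""
    return s / (y + EPS)


# Toy bin: target s, interference n, mixture y = s + n (values chosen for illustration).
s, n = 3 + 4j, 1 - 2j
y = s + n
irm = ideal_ratio_mask(s, n)
psm = phase_sensitive_mask(s, y)
cirm = complex_ideal_ratio_mask(s, y)
recovered = cirm * y  # applying the complex mask to the mixture
```

Unlike the IRM and PSM, the cIRM carries both a real and an imaginary part, so applying it to the mixture restores the phase of the target as well as its magnitude.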

[0043] (1) Ideal ratio mask (IRM)

[0044] The speech signal is sampled at a certain frequency. At discrete time m, the target speech si...



Abstract

The invention discloses a speaker-independent single-channel speech separation method comprising the following steps: preparing a data set and preprocessing the data; building a single-channel speech separation model based on the complex ideal ratio mask (cIRM); training the model with utterance-level permutation invariant training; and feeding the mixed speech data into the trained model for separation. In this method, a bidirectional long short-term memory neural network estimates the cIRM, and the utterance-level permutation invariant training criterion resolves the label ambiguity problem, so that the cIRM is estimated effectively and accurately and single-channel speech separation achieves a good result.
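The training criterion named in the abstract (rendered there as "statement-level replacement invariance training", read here as utterance-level permutation invariant training) can be sketched as follows. The exact loss in the patent is not visible on this page, so the mean squared error between complex masks is an assumption, and the function names `utterance_mse` and `upit_loss` are hypothetical. The key idea shown is that one output-to-speaker assignment is chosen for the whole utterance, not frame by frame.

```python
from itertools import permutations


def utterance_mse(est, ref):
    """Mean squared error between two complex mask sequences over a whole utterance."""
    return sum(abs(e - r) ** 2 for e, r in zip(est, ref)) / len(ref)


def upit_loss(estimates, references):
    """Return (loss, best_perm) for the assignment with the lowest utterance-level error.

    best_perm[i] is the index of the reference assigned to estimate i;
    the same assignment applies to every frame of the utterance.
    """
    n = len(references)
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        # Average the per-source errors under this candidate assignment.
        loss = sum(utterance_mse(estimates[i], references[p])
                   for i, p in enumerate(perm)) / n
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Two toy "utterances" of complex masks; the network outputs them in swapped order.
refs = [[1 + 0j, 0.5 + 0.5j], [0j, 0.2 - 0.1j]]
ests = [refs[1], refs[0]]
loss, perm = upit_loss(ests, refs)
```

Because the minimum is taken over whole-utterance errors, the network is not penalized for which output stream it uses for which speaker, which is how this criterion sidesteps the label ambiguity problem. The exhaustive search costs n! evaluations, which is acceptable for the two- or three-speaker case.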

Description

Technical field

[0001] The invention belongs to the technical field of intelligent speech processing, and in particular relates to a speaker-independent single-channel speech separation method based on utterance-level permutation invariant training and complex ideal ratio masking.

Background

[0002] The goal of the speech source separation task is to extract multiple speech source signals, one for each speaker, from a mixed speech signal containing two or more speech sources. In general, speech separation problems can be divided into monaural (i.e., single-channel) and array-based (i.e., multi-channel) source separation problems, depending on the number of microphones or channels. For the former, the mainstream research approach is to extract the target speech from the mixed signal, or to remove the interference signal from it, based on the acoustic and statistical characteristics of the target speech and the interference signal. In multi-channel speech separation probl...


Application Information

IPC(8): G10L21/0272
CPC: G10L21/0272
Inventor: 张文, 宋君强, 任开军, 李小勇, 邓科峰, 周翱隆, 汪祥, 任小丽, 邵成成, 吴国溧
Owner: NAT UNIV OF DEFENSE TECH