Speech enhancement for target speakers and speech enhancement method

A target-speaker speech signal technology, applied in fields such as voice analysis and instruments, that addresses problems such as listening fatigue, reduced voice quality for hearing-impaired listeners, and the difficulty of setting up multiple microphones.

Active Publication Date: 2018-04-17

GMEMS TECH SHENZHEN LTD

9 Cites, 26 Cited by


AI Technical Summary

Problems solved by technology

[0002] Speech and sound play an important role in human-to-human interaction. However, ubiquitous environmental noise and interference significantly reduce the quality of the voice information captured by a microphone. Some applications, such as automatic speech recognition (ASR) and speaker recognition, are particularly susceptible to such noise and interference, and hearing-impaired people also suffer from the reduced voice quality. Although people with normal hearing can tolerate considerable noise and interference in the captured speech signal, listening fatigue tends to increase with the time spent exposed to speech with a low signal-to-noise ratio (SNR).

[0003] On many devices (such as smartphones, tablets, or laptops), it is not easy to set up more than one microphone. A microphone array can improve voice quality by means of beamforming, blind source separation (BSS), independent component analysis (ICA), and other suitable signal processing algorithms. However, the sound field covered by the microphone array may contain multiple speech sources, and these algorithms cannot by themselves decide which source signal to keep and which to suppress along with the noise and interference.

In the existing technology, a linear array is used, and the sound wave of the desired source is assumed to arrive at the array either from the direction facing its middle (broadside) or from either end (endfire). Correspondingly, broadside beamforming or endfire beamforming is used to enhance the desired speech signal. These prior-art techniques limit the effectiveness of microphone arrays, at least in some cases.
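To make the fixed-direction assumption concrete, the following is a minimal sketch of the per-microphone delays a delay-and-sum beamformer would compensate for on a uniform linear array. The function name `steering_delays`, the speed-of-sound constant, and the convention of measuring the angle from the array axis are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air at room temperature (assumed value)


def steering_delays(num_mics, spacing_m, angle_deg):
    """Relative arrival delays (seconds) of a plane wave at each microphone.

    The array is assumed uniform and linear. angle_deg is measured from the
    array axis: 90 degrees is broadside, 0 degrees is endfire.
    """
    positions = np.arange(num_mics) * spacing_m
    return positions * np.cos(np.deg2rad(angle_deg)) / SPEED_OF_SOUND


# Broadside: the wavefront reaches all microphones (almost exactly) together.
broadside = steering_delays(4, 0.05, 90.0)
# Endfire: each microphone lags its neighbour by spacing / speed of sound.
endfire = steering_delays(4, 0.05, 0.0)
```

Either choice fixes the look direction in advance, which is the limitation the passage describes: a speaker located anywhere else is not the one being enhanced.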


Embodiment Construction

[0024] The structural and working principles of the present invention are described in detail below with reference to the accompanying drawings:

[0025] Summary of the invention

[0026] The present invention describes a speech enhancement method for at least one of a plurality of target speakers. At least two of a plurality of microphones are used to capture audio mixture signals. A blind source separation (BSS) algorithm or an independent component analysis (ICA) algorithm separates the audio mixtures into nearly statistically independent audio components. For each audio component, at least one of a plurality of preset target-speaker characteristic data sets (speaker profiles) is used to evaluate the likelihood or probability that the selected audio component belongs to the desired target speaker. The audio components are then weighted according to that likelihood and mixed to generate a single extracted speech signal that best matches the tar...
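The weighting-and-mixing step can be sketched as follows. This is only an illustration under stated assumptions: the separated components and their per-component log-likelihood scores are taken as given (the BSS/ICA stage and the speaker model are not shown), and normalizing the scores with a softmax is an assumed choice, since the text does not specify how likelihoods are turned into weights. The names `likelihood_weights` and `mix_components` are hypothetical.

```python
import numpy as np


def likelihood_weights(log_likelihoods):
    """Turn per-component log-likelihoods into weights that sum to 1.

    Softmax normalization is an assumption made for this sketch.
    """
    ll = np.asarray(log_likelihoods, dtype=float)
    w = np.exp(ll - ll.max())  # subtract the max for numerical stability
    return w / w.sum()


def mix_components(components, log_likelihoods):
    """Weight each separated component by its score and sum them.

    components has shape (n_components, n_samples); the result is a single
    signal of length n_samples.
    """
    weights = likelihood_weights(log_likelihoods)
    return weights @ np.asarray(components, dtype=float)


# Toy data: the first component scores far higher against the target
# speaker's profile, so it dominates the extracted signal.
components = np.array([[1.0, 2.0, 3.0],
                       [9.0, 9.0, 9.0]])
extracted = mix_components(components, log_likelihoods=[0.0, -50.0])
```

With equal scores the weights reduce to a plain average, so the mixer degrades gracefully when the speaker model cannot tell the components apart.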


Abstract

A method of speech enhancement for target speakers is presented. A blind source separation (BSS) module separates a plurality of microphone-recorded audio mixtures into statistically independent audio components. At least one of a plurality of speaker profiles is used to score and weight each audio component, and a speech mixer first mixes the weighted audio components, then aligns the mixed signals, and finally adds the aligned signals to generate an extracted speech signal. Similarly, a noise mixer first weights the audio components, then mixes the weighted signals, and finally adds the mixed signals to generate an extracted noise signal. Post-processing further enhances the extracted speech signal with a Wiener filtering or spectral subtraction procedure that subtracts the shaped power spectrum of the extracted noise signal from that of the extracted speech signal.
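The post-processing step can be illustrated with a minimal power spectral subtraction sketch. It processes a single frame for brevity, whereas a practical system would typically work on short overlapping frames. The `over_subtraction` scalar (standing in for the "shaping" of the noise spectrum) and the spectral `floor` are assumed parameters, and the function name `spectral_subtract` is hypothetical.

```python
import numpy as np


def spectral_subtract(speech, noise, over_subtraction=1.0, floor=0.01):
    """Subtract the scaled noise power spectrum from the speech power spectrum.

    speech and noise are equal-length 1-D frames: the extracted speech signal
    and the extracted noise signal. The phase of the speech frame is kept.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    speech_spec = np.fft.rfft(speech)
    speech_power = np.abs(speech_spec) ** 2
    noise_power = over_subtraction * np.abs(np.fft.rfft(noise)) ** 2
    # Floor the result so no frequency bin goes negative or fully silent.
    clean_power = np.maximum(speech_power - noise_power, floor * speech_power)
    gain = np.sqrt(clean_power / np.maximum(speech_power, 1e-12))
    return np.fft.irfft(gain * speech_spec, n=len(speech))


# Toy frame: a tone plus a weaker interfering tone that also appears,
# on its own, in the extracted noise signal.
t = np.arange(256) / 256.0
tone = np.sin(2 * np.pi * 8 * t)
interference = 0.3 * np.sin(2 * np.pi * 40 * t)
enhanced = spectral_subtract(tone + interference, interference)
```

The floor is a common guard against the "musical noise" artifacts that arise when subtracted bins are clipped to zero; its value here is arbitrary.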

Description

Technical field

[0001] The present invention relates to a method for enhancing the digital speech signals of a target speaker by using signal processing algorithms and acoustic models. The present invention further relates to a speech enhancement system utilizing microphone array signal processing and speaker identification.

Background technique

[0002] Speech and sound play an important role in human-to-human interaction. However, ubiquitous environmental noise and interference significantly reduce the quality of the voice information captured by a microphone. Some applications, such as automatic speech recognition (ASR) and speaker recognition, are particularly susceptible to such noise and interference, and hearing-impaired people also suffer from the reduced voice quality. Although people with normal hearing can tolerate considerable noise and interference in the captured speech signal, the listening f...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L19/032, G10L19/26, G10L21/0216, G10L21/0272, G10L25/21

CPC: G10L19/032, G10L19/26, G10L21/0216, G10L21/0272, G10L25/21, G10L2021/02166, G10L21/0232, G10L21/028, G10L21/0308, G10L25/51

Inventors: 李细林, 卢延祯

Owner: GMEMS TECH SHENZHEN LTD
