Audio enhancement method and system

An audio enhancement and multi-channel audio technology, applied in speech analysis, instruments, and related fields. It addresses the high computational complexity and ambiguous sorting of existing algorithms, reducing the amount of calculation, overcoming the ambiguous sorting, and improving the speech enhancement effect.

Active Publication Date: 2021-10-12

AISPEECH CO LTD

6 Cites, 0 Cited by

AI Technical Summary
Problems solved by technology

[0003] However, the above method has two main defects. The first is that, after the CGMM model parameters are randomly initialized, the EM algorithm usually has to update the parameters iteratively more than 20 times for the CGMM model to achieve good results, so the computational complexity of the algorithm is very high.

The second defect is that, since the algorithm is performed in the frequency domain, the calculations for the individual frequency bands are independent of each other.

[0006] In the prior-art method, to support later use of the audio, such as recognition, the originally collected audio must be iterated over many times, so the computational complexity of the algorithm is very high.

The category corresponding to each masking value is uncertain, which leads to the ambiguous sorting problem.

Embodiment Construction

[0044] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art on the basis of the embodiments of the present invention, without creative effort, fall within the protection scope of the present invention.

[0045] To address the two defects of the existing method, the present invention applies a direction-of-arrival estimation method to the original multi-channel audio to obtain the spatial spectrum information of the original audio. The DOA (direction of arrival) corresponding to the peak va...
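The peak-selection step can be illustrated with a minimal sketch. It assumes the spatial spectrum is already available as a list of values sampled on a grid of candidate angles; the function name `pick_peaks`, the angle grid, the spectrum values, and the threshold are all hypothetical and are not taken from the patent. For simplicity the sketch does not treat the angle grid as circular, so the first and last grid points are never reported as peaks.

```python
def pick_peaks(spectrum, threshold):
    """Return indices of local maxima in `spectrum` that exceed `threshold`."""
    peaks = []
    for i in range(1, len(spectrum) - 1):
        value = spectrum[i]
        # A peak must clear the threshold and rise above its left neighbour;
        # ">=" on the right keeps only the first sample of a flat top.
        if value > threshold and value > spectrum[i - 1] and value >= spectrum[i + 1]:
            peaks.append(i)
    return peaks


# Hypothetical spatial spectrum sampled every 30 degrees from 0 to 330.
angles_deg = list(range(0, 360, 30))
spectrum = [0.1, 0.2, 0.9, 0.3, 0.1, 0.1, 0.2, 0.7, 0.2, 0.1, 0.4, 0.1]

peak_indices = pick_peaks(spectrum, threshold=0.5)
doa_estimates = [angles_deg[i] for i in peak_indices]
print(doa_estimates)  # [60, 210]
```

The local maximum at 300 degrees (value 0.4) is discarded because it does not exceed the threshold, so only two direction estimates remain.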

Abstract

The invention discloses an audio enhancement method. The spatial spectrum of the original multi-channel audio is obtained by a direction-of-arrival (DOA) estimation algorithm. Multiple peaks greater than a set threshold are acquired from the spatial spectrum, and the estimated direction values of those peaks are acquired according to the DOA estimation method. A spatial covariance matrix for each estimated direction value is obtained from the estimated direction values and the steering vector of the microphone array. The CGMM (complex Gaussian mixture model) is initialized and established according to the spatial covariance matrices, and the parameters of the CGMM are iteratively updated by the clustering method. Enhanced audio is obtained by enhancing the original multi-channel audio through the MVDR (minimum variance distortionless response) beamforming algorithm. This method reduces the number of EM iterations needed to update the parameters of the CGMM and greatly reduces the amount of calculation. At the same time, the category of the time-frequency point masking values obtained in each frequency band is definite, so the masking values of the same category in each frequency band can be merged together, which overcomes the ambiguous sorting problem.
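The step from estimated directions to initial spatial covariance matrices can be sketched as follows. This is an illustration under stated assumptions, not the patent's formulation: it assumes a far-field source, a uniform linear microphone array, a rank-one model `d d^H` plus a small diagonal loading term, and hypothetical values for the directions, frequency, microphone count, and spacing. The names `steering_vector` and `initial_covariance` are made up for this example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_vector(doa_deg, freq_hz, num_mics, spacing_m):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    # Relative propagation delay at each microphone for a plane wave from doa_deg.
    delays = np.arange(num_mics) * spacing_m * np.cos(np.deg2rad(doa_deg)) / SPEED_OF_SOUND
    return np.exp(-2j * np.pi * freq_hz * delays)


def initial_covariance(doa_deg, freq_hz, num_mics, spacing_m, loading=1e-3):
    """Rank-one covariance d d^H plus diagonal loading (assumed initialization)."""
    d = steering_vector(doa_deg, freq_hz, num_mics, spacing_m)
    return np.outer(d, d.conj()) + loading * np.eye(num_mics)


# Hypothetical direction estimates and array parameters.
doas = [40.0, 110.0]
covariances = [initial_covariance(a, freq_hz=1000.0, num_mics=4, spacing_m=0.05) for a in doas]
```

Because each mixture component starts from a covariance tied to a known direction, the component index carries the same meaning in every frequency band, which is the property the abstract relies on to merge masking values across bands.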

Description

Technical field

[0001] The invention belongs to the technical field of speech recognition, and in particular relates to an audio enhancement method and system.

Background

[0002] At present, the masking value of each time-frequency point is mostly obtained through a CGMM (complex Gaussian mixture model), and MVDR (minimum variance distortionless response) is then used for speech enhancement.

[0003] However, the above method has two main defects. The first is that, after the parameters of the CGMM model are randomly initialized, the EM algorithm usually has to update the parameters iteratively more than 20 times for the CGMM model to achieve good results, so the computational complexity of the algorithm is very high. The second defect is that since the algorithm is performed in the frequency domain, the calculations between the fr...
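The mask-then-MVDR pipeline mentioned in the background can be sketched for a single frequency bin. This is a generic textbook form of mask-based MVDR, not code from the patent: the weights are computed as `w = R^{-1} d / (d^H R^{-1} d)`, where `R` is a mask-weighted noise covariance and `d` is a steering vector. The random input data, the all-ones noise mask, the steering vector, and the function names are assumptions made for illustration.

```python
import numpy as np


def masked_covariance(stft_bin, mask):
    """Mask-weighted spatial covariance for one frequency bin.

    stft_bin: complex array of shape (num_mics, num_frames)
    mask:     real array of shape (num_frames,), values in [0, 1]
    """
    weighted = stft_bin * mask  # the mask broadcasts over the microphone axis
    return weighted @ stft_bin.conj().T / max(mask.sum(), 1e-8)


def mvdr_weights(noise_cov, steering, loading=1e-6):
    """MVDR weights w = R^{-1} d / (d^H R^{-1} d), with small diagonal loading."""
    regularized = noise_cov + loading * np.eye(noise_cov.shape[0])
    numerator = np.linalg.solve(regularized, steering)
    return numerator / (steering.conj() @ numerator)


# Hypothetical data: 4 microphones, 200 frames of noise in one frequency bin.
rng = np.random.default_rng(0)
num_mics, num_frames = 4, 200
frames = rng.standard_normal((num_mics, num_frames)) + 1j * rng.standard_normal((num_mics, num_frames))
noise_mask = np.ones(num_frames)  # assumed: every frame counted as noise
steering = np.exp(-2j * np.pi * 0.1 * np.arange(num_mics))  # assumed steering vector

weights = mvdr_weights(masked_covariance(frames, noise_mask), steering)
enhanced = weights.conj() @ frames  # beamformer output y = w^H x for each frame
```

The distortionless constraint means `weights.conj() @ steering` equals 1 up to rounding, so a signal arriving from the steered direction passes through unchanged while power from other directions is minimized.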

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G10L21/0216

CPC: G10L21/0216, G10L2021/02166

Inventor: 任维怡, 周强

Owner: AISPEECH CO LTD
