Audio enhancement method and system

An audio enhancement technology for multi-channel audio, applied in voice analysis, instruments, and related fields. It addresses the ordering ambiguity of masks across frequency bands and the high computational complexity of existing algorithms.

Active Publication Date: 2019-11-01

AISPEECH CO LTD

6 Cites, 9 Cited by

Problems solved by technology

[0003] However, the above method has two main defects. First, after the CGMM parameters are randomly initialized, the EM algorithm usually has to update the parameters iteratively more than 20 times before the CGMM gives good results, so the computational complexity of the algorithm is very high.

Second, because the algorithm operates in the frequency domain, the calculations for the individual frequency bands are independent of each other.

[0006] In prior-art implementations, the originally collected audio has to be iterated over many times to make it usable for later applications such as recognition, so the computational complexity of the algorithm is very high.

The category corresponding to each masking value is uncertain, which leads to an ordering (permutation) ambiguity across frequency bands.
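The ordering problem described above can be illustrated with a toy example. This is only a sketch with made-up mask values and a hypothetical helper (`average_over_bands`), not the patent's procedure: when each frequency band labels its classes independently, class 0 may mean speech in one band and noise in another, so combining masks by class index mixes the two.

```python
def average_over_bands(masks):
    """masks[f][k][t] -> averaged[k][t]: mean over bands f of the class-k mask."""
    n_bands = len(masks)
    n_classes = len(masks[0])
    n_frames = len(masks[0][0])
    return [
        [sum(masks[f][k][t] for f in range(n_bands)) / n_bands
         for t in range(n_frames)]
        for k in range(n_classes)
    ]

# Made-up masks over four time frames (illustrative values only).
speech = [0.9, 0.8, 0.1, 0.2]
noise = [1.0 - m for m in speech]

# Both bands use the same class order: class 0 is speech everywhere.
consistent = [[speech, noise], [speech, noise]]
# The second band happened to label its classes the other way round.
swapped = [[speech, noise], [noise, speech]]

print(average_over_bands(consistent)[0])  # still follows the speech mask
print(average_over_bands(swapped)[0])     # collapses to an uninformative mask
```

With a consistent class order the combined mask still tracks the speech activity; with a swapped order it flattens out and carries no information about which frames contain speech.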

Embodiment Construction

[0044] To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

[0045] To address the two defects of the existing method, the present invention applies a direction of arrival (DOA) estimation method to the original multi-channel audio to obtain the spatial spectrum information of the original audio. The DOA corresponding to the peak va...

Abstract

The invention discloses an audio enhancement method. The spatial spectrum of the original multi-channel audio is obtained through a direction of arrival (DOA) estimation algorithm. A plurality of peaks greater than a set threshold are found in the spatial spectrum, and an estimated direction value is obtained for each peak according to the DOA estimation method. A spatial covariance matrix for each estimated direction is obtained from the estimated direction values and the steering vector of the microphone array. A CGMM (complex Gaussian mixture model) is established and initialized with these spatial covariance matrices, and the parameters of the CGMM are updated iteratively by a clustering method. The original multi-channel audio is then enhanced by an MVDR (minimum variance distortionless response) beamforming algorithm to obtain the enhanced audio. The method reduces the number of times the EM algorithm has to update the CGMM parameters, which greatly reduces the amount of computation. In addition, the category of the time-frequency masking values obtained for each frequency band is known, so masking values of the same category can be combined across frequency bands, which solves the ordering ambiguity problem.
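The initialization steps in the abstract (threshold the spatial spectrum, take the peak directions, build one spatial covariance matrix per direction from the steering vector) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the uniform linear array geometry, the speed of sound, the rank-1 form with a small diagonal load, the toy spectrum values, and all function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s (assumed)

def pick_peaks(spectrum, threshold):
    """Indices of local maxima in `spectrum` that exceed `threshold`."""
    return [
        i for i in range(1, len(spectrum) - 1)
        if spectrum[i] > threshold
        and spectrum[i] > spectrum[i - 1]
        and spectrum[i] >= spectrum[i + 1]
    ]

def steering_vector(angle_deg, freq_hz, n_mics, spacing_m):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    delay = spacing_m * math.cos(math.radians(angle_deg)) / SPEED_OF_SOUND
    return [cmath.exp(-2j * math.pi * freq_hz * m * delay) for m in range(n_mics)]

def initial_covariance(d, eps=1e-3):
    """Rank-1 matrix d d^H plus a small diagonal load `eps` (assumed form)."""
    n = len(d)
    return [
        [d[i] * d[j].conjugate() + (eps if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]

# Toy spatial spectrum on a 10-degree grid (made-up values).
angles = list(range(0, 181, 10))
spectrum = [0.1, 0.2, 0.3, 0.9, 0.3, 0.2, 0.1, 0.1, 0.2, 0.1,
            0.2, 0.4, 0.8, 0.4, 0.2, 0.1, 0.3, 0.1, 0.1]

doas = [angles[i] for i in pick_peaks(spectrum, threshold=0.5)]
# One initial covariance per estimated direction, here for a single
# 1 kHz band with 4 microphones spaced 5 cm apart (all assumed).
covs = [initial_covariance(steering_vector(a, 1000.0, 4, 0.05)) for a in doas]
print(doas)
```

Because class k is tied to the k-th estimated direction in every frequency band, the class order is the same across bands, which is what allows masks of the same category to be combined later.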

Description

Technical field

[0001] The invention belongs to the technical field of speech recognition, and in particular relates to an audio enhancement method and system.

Background

[0002] At present, the masking values of the time-frequency points are mostly obtained through a CGMM (complex Gaussian mixture model), and MVDR (minimum variance distortionless response) beamforming is then used for speech enhancement.

[0003] However, the above method has two main defects. First, after the CGMM parameters are randomly initialized, the EM algorithm usually has to update the parameters iteratively more than 20 times before the CGMM gives good results, so the computational complexity of the algorithm is very high. Second, because the algorithm operates in the frequency domain, the calculations between the fr...
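For reference, the MVDR step mentioned in the background can be sketched for a single frequency band using the standard textbook weights w = R_n^{-1} d / (d^H R_n^{-1} d). This is a hedged sketch only: the noise covariance is assumed to be given (in a mask-based system it would be estimated from mask-weighted observations), and the helper names, the small Gaussian-elimination solver, and the toy vectors are illustrative assumptions rather than anything taken from the patent.

```python
import cmath

def solve(a, b):
    """Solve a x = b for a small complex matrix (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        acc = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = acc / m[r][r]
    return x

def mvdr_weights(noise_cov, d):
    """Standard MVDR weights: R_n^{-1} d / (d^H R_n^{-1} d)."""
    rinv_d = solve(noise_cov, d)
    denom = sum(di.conjugate() * ri for di, ri in zip(d, rinv_d))
    return [ri / denom for ri in rinv_d]

def apply_beamformer(w, y):
    """Beamformer output w^H y for one time-frequency point."""
    return sum(wi.conjugate() * yi for wi, yi in zip(w, y))

# Toy 3-microphone example: target steering vector d, interferer v (made up).
d = [cmath.exp(-0.5j * m) for m in range(3)]
v = [cmath.exp(1.2j * m) for m in range(3)]
# Assumed noise covariance: weak white noise plus a strong interferer.
noise_cov = [
    [(0.01 if i == j else 0.0) + 10.0 * v[i] * v[j].conjugate() for j in range(3)]
    for i in range(3)
]

w = mvdr_weights(noise_cov, d)
print(abs(apply_beamformer(w, d)))  # target direction passes with unit gain
print(abs(apply_beamformer(w, v)))  # interferer is strongly attenuated
```

The distortionless constraint shows up as a gain of one in the target direction, while the interferer that dominates the assumed noise covariance is suppressed.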


Application Information


IPC(8): G10L21/0216

CPC: G10L21/0216, G10L2021/02166

Inventors: 任维怡, 周强

Owner: AISPEECH CO LTD
