Beam forming method and system based on time-frequency masking value estimation

A time-frequency masking and beamforming technology, applied in speech analysis, speech recognition, and related fields. It addresses problems of existing approaches such as underused phase information, long iteration times, and degraded performance. The stated effects are that it resolves the mismatch between training and test data, improves accuracy, and has good application prospects.

Active Publication Date: 2021-04-30
PLA STRATEGIC SUPPORT FORCE INFORMATION ENG UNIV PLA SSF IEU +1

AI Technical Summary

Problems solved by technology

In existing speech enhancement processing, neural-network-based estimation of time-frequency masking values suffers from a mismatch between training and test data, which degrades performance. The time-frequency masking value estimation based on the spatial domain...

Examples

Embodiment Construction

[0026] To make the purpose, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and technical solutions.

[0027] In an embodiment of the present invention, as shown in Figure 1, a beamforming method based on time-frequency masking value estimation is provided for speech enhancement in speech recognition applications. The method includes the following steps:

[0028] S101. Obtain a multi-channel speech sequence, apply a Fourier transform to the speech sequence, and extract amplitude spectrum features and spatial features.

[0029] S102. Apply a logarithmic transformation to the amplitude spectrum features to obtain a multi-channel speech spectrum feature sequence; feed the multi-channel speech spectrum feature sequence into the pre-trained and optimized neural network model, and obtain the complex-valued time-frequency masking value by this neural...
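The feature extraction in S101 and the logarithmic transformation in S102 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the function names `stft` and `extract_features`, the frame length, the hop size, the Hann window, and the choice of inter-channel phase difference as the spatial feature are all assumptions, since the text shown here does not specify them.

```python
import numpy as np

def stft(x, frame_len=512, hop=128):
    """Short-time Fourier transform of a (channels, samples) array.

    Returns a complex array of shape (channels, frames, frame_len // 2 + 1).
    Frame length, hop size and the Hann window are illustrative choices.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (x.shape[1] - frame_len) // hop
    frames = np.stack(
        [x[:, i * hop:i * hop + frame_len] for i in range(n_frames)], axis=1
    )  # shape: (channels, frames, frame_len)
    return np.fft.rfft(frames * window, axis=-1)

def extract_features(x, frame_len=512, hop=128, eps=1e-8):
    """Hypothetical helper covering S101 and the log step of S102."""
    spec = stft(x, frame_len, hop)
    magnitude = np.abs(spec)                 # amplitude spectrum feature
    log_magnitude = np.log(magnitude + eps)  # logarithmic transformation
    # Assumed spatial feature: phase difference of each channel
    # relative to reference channel 0.
    ipd = np.angle(spec[1:] * np.conj(spec[:1]))
    return spec, log_magnitude, ipd
```

For a four-channel signal, `log_magnitude` has one row of features per channel, frame, and frequency bin, while `ipd` has one fewer channel because channel 0 serves as the reference.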

Abstract

The invention belongs to the technical field of speech enhancement and relates in particular to a beamforming method and system based on time-frequency masking value estimation. The method comprises the following steps: obtaining a multi-channel speech sequence and extracting amplitude spectrum features and spatial features through a Fourier transform; applying a logarithmic transformation to the amplitude spectrum features to obtain a multi-channel speech spectrum feature sequence, and feeding that sequence into a pre-trained and optimized neural network model to obtain a complex-valued time-frequency masking value; converting the complex-valued time-frequency masking value into a speech presence probability, and obtaining a time-frequency masking value by means of a probability model; calculating a speech signal covariance matrix from the time-frequency masking value and the multi-channel speech feature sequence, and performing eigenvalue decomposition on the covariance matrix to obtain the beamforming filter coefficients; and filtering the multi-channel speech features with a beamforming filter that uses those coefficients to obtain an enhanced speech signal. By combining a neural network with spatial clustering to estimate the time-frequency masking value, the invention improves the performance of beamforming and speech recognition.
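The last two steps of the abstract (mask-weighted covariance estimation, eigenvalue decomposition, and filtering) can be illustrated with the short sketch below. It assumes the time-frequency mask is already available as a real array in [0, 1], and it takes the principal eigenvector of the speech covariance matrix as the filter coefficients; the abstract does not say which eigenvector or normalization the patent uses, so that choice and all function names here are assumptions.

```python
import numpy as np

def masked_covariance(spec, mask, eps=1e-8):
    """Mask-weighted spatial covariance per frequency bin.

    spec: complex array, shape (channels, frames, freqs)
    mask: real array in [0, 1], shape (frames, freqs)
    Returns an array of shape (freqs, channels, channels).
    """
    weighted = spec * mask[None]  # weight every time-frequency bin
    cov = np.einsum("ctf,dtf->fcd", weighted, np.conj(spec))
    norm = np.maximum(mask.sum(axis=0), eps)
    return cov / norm[:, None, None]

def eigen_beamformer(cov):
    """Assumed design: use the principal eigenvector as filter coefficients."""
    # eigh sorts eigenvalues in ascending order, so the last
    # eigenvector belongs to the largest eigenvalue.
    _, vecs = np.linalg.eigh(cov)
    return vecs[..., -1]  # shape: (freqs, channels)

def apply_beamformer(weights, spec):
    """Filter the multi-channel spectrum: y(t, f) = w(f)^H x(t, f)."""
    return np.einsum("fc,ctf->tf", np.conj(weights), spec)
```

With `spec` of shape (channels, frames, freqs) and a matching `mask`, calling `apply_beamformer(eigen_beamformer(masked_covariance(spec, mask)), spec)` returns a single-channel spectrum of shape (frames, freqs), which an inverse transform would turn back into a waveform.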

Description

Technical field

[0001] The invention belongs to the technical field of speech enhancement and relates in particular to a beamforming method and system based on time-frequency masking value estimation.

Background

[0002] Research on speech coding and speech recognition is often carried out under laboratory conditions, that is, in environments with a high signal-to-noise ratio or without noise. When speech processing moves from the laboratory to practical applications, many methods become unusable because of real environmental noise and interference, and their performance drops rapidly. Processing that improves the perceived quality or the signal-to-noise ratio of noisy speech is therefore a practical problem that must be solved. The essence of speech enhancement is speech noise reduction. In other words, in daily life, the speech collected by the microphone is usually "polluted" speech with different noises...

Application Information

IPC(8): G10L21/0216, G10L25/30, G10L15/20
CPC: G10L15/20, G10L21/0216, G10L25/30, G10L2021/02166
Inventor: 屈丹, 郭晓波, 杨绪魁, 邱泽宇, 李真, 郝朝龙, 魏雪娟
Owner: PLA STRATEGIC SUPPORT FORCE INFORMATION ENG UNIV PLA SSF IEU