189 results about "Spectral subtraction" patented technology

Apparatus and method to classify sound to detect speech

Audio frames are classified as either speech, non-transient background noise, or transient noise events. Probabilities of speech or transient noise event, or other metrics may be calculated to indicate confidence in classification. Frames classified as speech or noise events are not used in updating models (e.g., spectral subtraction noise estimates, silence model, background energy estimates, signal-to-noise ratio) of non-transient background noise. Frame classification affects acceptance / rejection of recognition hypothesis. Classifications and other audio related information may be determined by circuitry in a headset, and sent (e.g., wirelessly) to a separate processor-based recognition device.
Owner:INTERMEC IP
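
The gating idea in this abstract (frames classified as speech or transient noise are excluded from background-noise model updates) can be sketched as follows. This is a minimal illustration, not the patented method: the function name `update_noise_estimate`, the label strings, and the smoothing constant `alpha` are all assumptions introduced here.

```python
import numpy as np

def update_noise_estimate(noise_psd, frame_psd, label, alpha=0.9):
    """Recursively average the noise PSD, but only on background-noise frames.

    `label` is a hypothetical classifier output: "background", "speech",
    or "transient". `alpha` is an assumed smoothing constant.
    """
    if label != "background":
        # Speech and transient frames leave the noise model untouched.
        return noise_psd
    return alpha * noise_psd + (1.0 - alpha) * frame_psd

noise_psd = np.ones(4)
frames = [
    (np.full(4, 2.0), "background"),
    (np.full(4, 100.0), "speech"),
    (np.full(4, 50.0), "transient"),
]
for frame_psd, label in frames:
    noise_psd = update_noise_estimate(noise_psd, frame_psd, label)
```

Only the first frame changes the estimate, so loud speech or a door slam does not inflate the noise spectrum that a later spectral subtraction stage would remove.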

Voice enhancement method and device using same

The invention provides a voice enhancement method. The method comprises the following steps: judging, with a decision device, whether the current frame is pure noise; if the current frame and several preceding frames are all pure noise, enhancing the frequency-domain signal with an improved spectral subtraction algorithm, and otherwise enhancing it with an algorithm based on a voice generation model; and transforming the processed frequency-domain signal to the time domain, applying de-emphasis, and obtaining the output signal. The invention also provides a device using the method. The method greatly improves the attenuation of residual noise while preserving voice intelligibility.
Owner:AAC TECH PTE LTD

Noise reduction method with self-controlling interference frequency

The present invention relates to a method with which speech is captured in a noisy environment with as high a speech quality as possible. To this end, a compact array of, for example, two single microphones is combined to form one system through signal processing methods consisting of adaptive beam formation and spectral subtraction. Through the combination with a spectral subtraction, the reference signal of the beam former is freed from speech signal components to the extent that a reference signal of the interference is formed and the beam former produces high gains.
Owner:CERENCE OPERATING CO +1

Speech recognition method, program and apparatus using multiple acoustic models

The present invention provides a speech recognition method for achieving a high recognition rate even under an environment where plural types of noise exist. Noise is eliminated by the spectral subtraction noise elimination method from each of speech data on which different types of noise are superposed, and acoustic models corresponding to each of the noise types are created based on the feature vectors obtained by analyzing the features of each of the speech data which have undergone the noise elimination. When a speech recognition is performed, a first speech feature analysis is performed on speech data to be recognized, and it is determined whether the speech data is a noise segment or a speech segment. When a noise segment is detected, the feature data thereof is stored, and when a speech segment is detected, the type of the noise is determined based on the feature data which has been stored, and a corresponding acoustic model is selected based on the result thereof. The noise is eliminated by the spectral subtraction noise elimination method from the speech data to be recognized, and a second feature analysis is performed on the speech data which has undergone the noise elimination to obtain a feature vector to be used in speech recognition.
Owner:SEIKO EPSON CORP

Microphone array multi-target voice enhancement method based on blind source separation and spectral subtraction

Inactive | CN106504763A | Solve environmental background noise | Reduce complexity | Speech analysis | Bandpass filtering | Computation complexity
The invention discloses a microphone array multi-target voice enhancement method based on blind source separation and spectral subtraction. The method comprises: collecting multi-channel, multi-target signals through a microphone array; band-pass filtering each collected single-channel signal to reject non-voice noise and interference, and applying pre-emphasis; windowing and framing the voice to obtain frame signals, applying a short-time Fourier transform to bring all frames into the frequency domain, and extracting the amplitude spectrum and phase spectrum of every frame; detecting the start and end points of the voice signal and estimating the noise power spectrum; reducing the background noise of the voice frames by spectral subtraction; combining the spectral subtraction output with the phase spectrum and applying an inverse short-time Fourier transform to obtain a time-domain voice signal; and then carrying out blind source separation to obtain all target signals. The method is simple to implement, has low resource requirements and low computational complexity, and achieves multi-target signal enhancement.
Owner:UNIV OF ELECTRONICS SCI & TECH OF CHINA

Optical fiber vibration identification system based on phi-OTDR technology and optical fiber vibration identification method thereof

The invention provides an optical fiber vibration identification system based on phi-OTDR technology and a corresponding identification method. A dual-path detection structure greatly extends the monitoring distance of the system. Dynamic updating of the characteristic threshold improves the system's adaptability to changes in environmental noise and allows vibration events to be located accurately. Spectral subtraction noise reduction greatly reduces background noise while preserving the characteristics and energy of the vibration signal, improving the signal-to-noise ratio and the detection sensitivity. Multi-feature pattern recognition is performed on the vibration signal in the time domain and the wavelet domain, which effectively avoids the influence of other complex time-dependent interference noise, raises the accuracy of vibration event detection and vibration type classification, and lowers the false alarm rate. The detection performance of OTDR-based vibration detection in real, complex noise environments is thereby improved, meeting the application requirements of major national projects in boundary security and long-distance pipeline safety.
Owner:ZHEJIANG UNIV

Noise suppressor and noise suppressing method

Speech / non-speech determining section 103 determines whether a speech spectrum belongs to a speech interval, which contains speech, or to a non-speech interval, which contains only noise. Noise spectrum estimating section 104 estimates a noise spectrum from the speech spectrum determined to be a non-speech interval. SNR estimating section 105 obtains speech signal power from the speech interval and noise signal power from the non-speech interval of the speech spectrum, and calculates the SNR as the ratio of the two values. Based on the speech / non-speech determination and the SNR value, suppression coefficient control section 106 outputs a suppression lower limit coefficient to spectral subtraction section 107. Spectral subtraction section 107 subtracts the estimated noise spectrum from the input speech spectrum and outputs a speech spectrum in which the noise is suppressed.
Owner:PANASONIC CORP
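
The role of the suppression lower limit coefficient can be illustrated with a small sketch. The abstract does not say how the coefficient is derived from the speech / non-speech decision and the SNR, so `lower_limit_coefficient` below is a purely hypothetical rule, and all function names and numeric values are assumptions made for illustration.

```python
import numpy as np

def estimate_snr_db(speech_power, noise_power):
    """SNR in dB from speech-interval power and non-speech-interval power."""
    return 10.0 * np.log10(max(speech_power, 1e-12) / max(noise_power, 1e-12))

def lower_limit_coefficient(is_speech, snr_db):
    """Hypothetical control rule (not from the patent): keep a larger floor
    for low-SNR speech to limit distortion, a smaller one otherwise."""
    if not is_speech:
        return 0.1
    return 0.3 if snr_db < 10.0 else 0.05

def spectral_subtract(speech_mag, noise_mag, floor_coeff):
    """Subtract the noise estimate, never going below floor_coeff * input."""
    return np.maximum(speech_mag - noise_mag, floor_coeff * speech_mag)

speech_mag = np.array([1.0, 0.5, 2.0])
noise_mag = np.array([0.4, 0.6, 0.5])
# Placeholder powers; in the abstract they come from separate intervals.
snr_db = estimate_snr_db(np.sum(speech_mag ** 2), np.sum(noise_mag ** 2))
floor_coeff = lower_limit_coefficient(True, snr_db)
enhanced = spectral_subtract(speech_mag, noise_mag, floor_coeff)
```

The floor keeps bins where the noise estimate exceeds the input from going negative or collapsing to zero, which is one common way to limit musical noise.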

Signal enhancement and speech recognition

Speech enhancement techniques are provided that are effective even for extemporaneous noise without a noise interval and for unknown extemporaneous noise. An example of a signal enhancement device includes: spectral subtraction means for subtracting a given reference signal from an input signal containing a target signal and a noise signal by spectral subtraction; an adaptive filter applied to the reference signal; and coefficient control means for controlling a filter coefficient of the adaptive filter in order to reduce components of the noise signal in the input signal. The signal enhancement device is provided with a database of a signal model that expresses a given feature of the target signal by means of a given statistical model, and the filter coefficient is controlled based on the likelihood of the signal model with respect to the output signal of the spectral subtraction means.
Owner:IBM CORP

Water supply pipeline leakage detecting and positioning method

The invention provides a water supply pipeline leakage detecting and positioning method. A signal collected by a sensor is enhanced through spectral subtraction, and the frequency spectrum variance of the enhanced signal is calculated. A double-threshold method is used for the decision: a frequency spectrum variance within the threshold range indicates that leakage has occurred, and the leakage point is then located. A BP neural network is used to form a filter that separates the water leakage signal from noise; the separated signal is subjected to generalized correlation analysis, time delay estimation is carried out with a weight function that has good signal-to-noise ratio selectivity, three sensor time delays are acquired, and a water leakage positioning model computes the position of the leakage point. Because spectral subtraction is used for signal enhancement and the double-threshold method for leakage judgment, the accuracy is higher. In addition, the noise spectrum is estimated with the non-leakage estimation method, so the signal-to-noise ratio increases more noticeably, and BP neural network filtering improves the accuracy of the generalized correlation time delay estimation, which effectively improves the positioning accuracy of the leakage point.
Owner:INNER MONGOLIA UNIVERSITY

Passive sound source two-dimensional DOA (direction of arrival) estimation method under complex environment

The invention discloses a passive sound source two-dimensional DOA (direction of arrival) estimation method for complex environments, comprising the steps of: (1) collecting voice signals in a room with a uniform circular microphone array; (2) preprocessing the voice signals received by the array with a spectral subtraction method; (3) estimating the relative time delay of each microphone with an M_AEDA algorithm; (4) determining a direction coefficient vector according to a direction coefficient formula; (5) multiplying the direction coefficient vector element-wise with the voice signals preprocessed in step (2) to form the input signal for minimum variance undistorted response processing; (6) processing the input signal with a minimum variance undistorted response algorithm; and (7) performing a spectrum peak search on the average output power to obtain the estimate of the sound source two-dimensional DOA. The method can locate the sound source accurately in reverberant, low signal-to-noise ratio environments; the location precision and accuracy rate are high; and the required equipment is simple, so the method is applicable to real-life uses such as video conferencing and robots.
Owner:LIAONING UNIVERSITY OF TECHNOLOGY

Method for spectral subtraction in speech enhancement

A method and system is provided for enhancing an audio signal based on spectral subtraction. The noise power spectrum for each frame of an audio signal is dynamically estimated based on a plurality of signal power spectrum values computed from a corresponding plurality of adjacent frames. An over-subtraction factor is then dynamically computed for each frame based on the noise power spectrum estimated for the frame. The signal power spectrum of the audio signal at each frame is then reduced in accordance with the over-subtraction factor computed for the corresponding frame.
Owner:INTEL CORP
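
A minimal sketch of per-frame over-subtraction follows. The abstract gives no formulas, so two stand-ins are assumed here and are not taken from the patent: the noise power spectrum is tracked as the per-bin minimum over a window of adjacent frames, and the over-subtraction factor follows the widely used Berouti-style linear rule in frame SNR. The function names, `window`, and `beta` are illustrative.

```python
import numpy as np

def min_tracking_noise(power_frames, window=5):
    """Per-bin noise estimate: minimum power over the current and the
    preceding (window - 1) frames. An assumed estimator, for illustration."""
    noise = np.empty_like(power_frames)
    for t in range(power_frames.shape[0]):
        start = max(0, t - window + 1)
        noise[t] = power_frames[start:t + 1].min(axis=0)
    return noise

def over_subtraction_factor(signal_power, noise_power):
    """Berouti-style rule (assumed): larger factor at low frame SNR."""
    snr_db = 10.0 * np.log10(
        max(signal_power.sum(), 1e-12) / max(noise_power.sum(), 1e-12)
    )
    return float(np.clip(4.0 - 0.15 * snr_db, 1.0, 4.75))

def enhance(power_frames, beta=0.01, window=5):
    """Power spectral subtraction with a per-frame factor and a floor."""
    noise = min_tracking_noise(power_frames, window)
    out = np.empty_like(power_frames)
    for t in range(power_frames.shape[0]):
        alpha = over_subtraction_factor(power_frames[t], noise[t])
        out[t] = np.maximum(power_frames[t] - alpha * noise[t],
                            beta * power_frames[t])
    return out

# Three frames, two bins: two noise-only frames, then a strong component.
power_frames = np.array([[1.0, 1.0], [1.0, 1.0], [101.0, 1.0]])
enhanced = enhance(power_frames)
```

In the noise-only frames the factor is large and the output drops to the floor; in the third frame the higher SNR lowers the factor, so the strong bin is left nearly intact.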

Non-air conduction speech reinforcement method based on multi-band spectrum subtraction

The present invention discloses a non-air conduction speech enhancement method based on multi-band spectral subtraction. Because the noise in radar-based non-air conduction speech is always colored and does not affect the speech signal uniformly across the whole frequency spectrum, the method divides the speech spectrum into five non-overlapping sections and gives each section its own spectral subtraction coefficient, which makes the algorithm both effective and targeted. The embodiment shows that the method effectively compensates for the poor targeting of traditional speech enhancement methods; moreover, it is efficient to implement, algorithmically simple, and produces an obvious effect. The method therefore has considerable practical value and application prospects.
Owner:FOURTH MILITARY MEDICAL UNIVERSITY
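
The per-band idea can be sketched in a few lines: split the spectrum into five non-overlapping bands and apply a separate subtraction coefficient in each. The band edges (equal-width splits), the coefficient values, and the spectral floor are assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np

def multiband_subtract(power, noise, coeffs, floor=0.002):
    """Power spectral subtraction with one coefficient per band.

    The spectrum is split into len(coeffs) contiguous, non-overlapping
    bands of (nearly) equal width; `floor` is an assumed spectral floor.
    """
    bands = np.array_split(np.arange(power.size), len(coeffs))
    out = np.empty_like(power)
    for idx, coeff in zip(bands, coeffs):
        out[idx] = np.maximum(power[idx] - coeff * noise[idx],
                              floor * power[idx])
    return out

power = np.full(10, 10.0)   # noisy-speech power spectrum (toy values)
noise = np.full(10, 2.0)    # estimated noise power spectrum
coeffs = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical per-band coefficients
enhanced = multiband_subtract(power, noise, coeffs)
```

With colored noise, bands where the noise dominates can be given a larger coefficient without over-suppressing the cleaner bands.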

Method and apparatus for resisting noise based on adaptive nonlinear spectral subtraction

The disclosed anti-noise method for speech recognition, based on adaptive nonlinear spectral subtraction, comprises: detecting speech, where input whose average SNR exceeds a set threshold is treated as speech information and otherwise as noise information; updating the current noise estimate from the noise frames identified in the previous step and performing a first spectral subtraction to obtain speech with a high SNR; and then performing a second spectral subtraction to further eliminate noise.
Owner:PANASONIC CORP

Speech enhancement method of multi-sub-band spectral subtraction based on phase adjustment and amplitude compensation

Active | CN103021420A | Eliminate isolated peaks | Suppress musical noise | Speech analysis | Fast Fourier transform | Signal-to-noise ratio (imaging)
The invention discloses a speech enhancement method of a multi-sub-band spectral subtraction based on phase adjustment and amplitude compensation. The method mainly includes truncating signals acquired by a microphone and performing fast Fourier transform (FFT); performing micro maximum search on an amplitude spectrum through a phase adjustment algorithm to obtain an adjusted amplitude spectrum of noisy speech; estimating the amplitude spectrum of noise; dividing a whole band into a plurality of sub-bands and calculating the signal to noise ratio of each sub-band; performing amplitude spectrum subtraction of an over-subtraction rule on each sub-band; performing amplitude compensation on speech spectrums after the spectrum subtraction; and obtaining time domain waveforms of the signals through fast Fourier inversion and signal overlapping.
Owner:INST OF AUTOMATION CHINESE ACAD OF SCI

Voice denoising method and system based on L1/2 sparse constraint convolution non-negative matrix decomposition

The invention discloses a voice denoising method and system based on L1/2 sparse constraint convolution non-negative matrix decomposition. In single-channel voice enhancement, the noisy voice signal v(i) is assumed to be the sum of the noise signal n(i) and the voice signal s(i), i.e., v(i)=n(i)+s(i). Noise-base information is obtained by training on specific noise with a CNMF method; then, taking the noise base as prior information, a voice base is obtained by decomposing the noisy voice with a CNMF_L1/2 method, and finally the denoised voice is synthesized. The method describes the inter-frame correlation of voice better, and a strong sparsity constraint is imposed on the voice-base coefficient matrix through an L1/2 regularization term, so the separated voice contains less residual noise. Compared with conventional methods such as spectral subtraction, Wiener filtering, and log-spectral minimum mean square error estimation, the enhanced voice is easier to understand.
Owner:ANHUI UNIVERSITY

Method for reducing noise by using hearing threshold of impaired hearing

Inactive | CN101901602A | Small noise reduction gain | Avoid musical noise | Speech analysis | Signal-to-noise ratio (imaging) | Spectral subtraction
The invention relates to a method for reducing a noise by using the hearing threshold of impaired hearing. The method comprises the following steps of: dividing an input speech signal into N subband signals, compensating the hearing, respectively reducing the noise of the subband signals, and comprehensively processing various paths of input subband signals into a path of output signals by a weighted stacking filter bank. The invention provides a generalized spectral subtraction method according to noise levels and hearing threshold adjustment parameters, wherein a posteriori signal-to-noise ratio is replaced by an a priori signal-to-noise ratio to adjust a gain function, so that musical noise generated by noise reduction can be effectively reduced. The noise reduction gain function is related to the signal-to-noise ratio, and is adjusted according to the noise levels and the hearing threshold of a patient; when the hearing threshold is high, the suppression to the noise is low, and speech distortion is reduced simultaneously; and when the hearing threshold is low, the suppression to the noise is increased so as to improve listening comfort, so that the method has pertinence to different hearing loss conditions and different noise conditions, and makes the suppression effect and the speech distortion relatively balanced.
Owner:INST OF ACOUSTICS CHINESE ACAD OF SCI
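
The step of driving the gain from an a priori rather than an a posteriori signal-to-noise ratio can be illustrated with the standard decision-directed estimator. This is a generic sketch, not the patented gain function: the Wiener-style gain, the smoothing constant `a`, and the floor `g_min` are assumptions, and the hearing-threshold adjustment described in the abstract is not modeled (it could, for instance, raise `g_min` when the threshold is high).

```python
import numpy as np

def a_priori_snr(post_snr, prev_gain, prev_post_snr, a=0.98):
    """Decision-directed a priori SNR estimate per frequency bin.

    Blends the previous frame's enhanced-signal SNR with the current
    frame's (a posteriori SNR - 1), clamped at zero.
    """
    return (a * prev_gain ** 2 * prev_post_snr
            + (1.0 - a) * np.maximum(post_snr - 1.0, 0.0))

def wiener_gain(prior_snr, g_min=0.1):
    """Wiener-style gain with an assumed lower limit g_min."""
    return np.maximum(prior_snr / (1.0 + prior_snr), g_min)

post_snr = np.array([5.0, 1.0])       # current a posteriori SNR (linear)
prev_post_snr = np.array([4.0, 1.0])  # previous frame's a posteriori SNR
prev_gain = np.array([0.5, 0.1])      # previous frame's gain
xi = a_priori_snr(post_snr, prev_gain, prev_post_snr)
gain = wiener_gain(xi)
```

Because the estimate is smoothed across frames, the gain fluctuates less from frame to frame than one computed directly from the a posteriori SNR, which is the usual explanation for the reduction in musical noise.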

Self-adaptive spectral subtraction real-time speech enhancement

The invention discloses a self-adaptive spectral subtraction method for real-time speech enhancement. The method comprises: establishing a dynamic threshold for discriminating voice from non-voice in noisy speech; deriving time-varying update rules for the noise spectrum from the dynamic threshold; and exploiting the correlation between adjacent frames to obtain a smoothed estimate of the clean voice spectrum. The method addresses the practical problem that voice signals are difficult to extract from non-stationary noise and strong background noise. It uses a fast noise-tracking algorithm to update the non-stationary noise smoothly, frame by frame, and can estimate the noise spectrum well. The algorithm effectively suppresses background noise and improves voice quality and intelligibility after noise reduction. The method has low computational cost, is easy to implement, and performs well in real time. It provides a new approach to denoising strong background noise and detecting weak signals.
Owner:HUNAN INT ECONOMICS UNIV

Method and apparatus for suppressing noise components contained in speech signal

Inactive | US20020128830A1 | Suppressing noise components contained in an input speech signal without impairing the spectrum of the speech signal | Speech recognition | Transmission | Time domain | Frequency spectrum
There is provided a method of suppressing noise components contained in an input speech signal. The method includes obtaining an input spectrum by executing frequency analysis of the input speech signal by a specific frame length, obtaining an estimated noise spectrum by estimating the spectrum of the noise components, obtaining the spectral slope of the estimated noise spectrum, multiplying the estimated noise spectrum by a spectral subtraction coefficient determined by the spectral slope, obtaining a subtraction spectrum by subtracting the estimated noise spectrum multiplied with the spectral subtraction coefficient from the input spectrum, and obtaining a speech spectrum by clipping the subtraction spectrum. The method may further include correcting the speech spectrum by smoothing in at least one of frequency and time domains. In this way, a speech spectrum in which noise components have been suppressed can be obtained.
Owner:KK TOSHIBA
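
The sequence in this abstract (estimate the noise spectrum's slope, derive a subtraction coefficient from it, subtract, clip) can be sketched as below. The slope is measured here as a least-squares line through the noise spectrum in dB, and the mapping from slope to coefficient is a hypothetical linear rule; neither choice is stated in the abstract, and all names and constants are illustrative.

```python
import numpy as np

def spectral_slope_db(noise_mag):
    """Slope (dB per bin) of a straight-line fit to the noise spectrum."""
    bins = np.arange(noise_mag.size)
    slope, _intercept = np.polyfit(bins, 20.0 * np.log10(noise_mag + 1e-12), 1)
    return slope

def coefficient_from_slope(slope_db, base=1.0, gain=0.5, upper=3.0):
    """Hypothetical mapping: steeper noise spectra get a larger coefficient."""
    return float(np.clip(base + gain * abs(slope_db), base, upper))

def suppress(input_mag, noise_mag):
    """Subtract the scaled noise estimate and clip negatives to zero."""
    coeff = coefficient_from_slope(spectral_slope_db(noise_mag))
    return np.maximum(input_mag - coeff * noise_mag, 0.0), coeff

input_mag = np.full(4, 3.0)
flat_out, flat_coeff = suppress(input_mag, np.ones(4))
# Noise falling by 2 dB per bin.
tilted_noise = 10.0 ** (-2.0 * np.arange(4) / 20.0)
tilted_out, tilted_coeff = suppress(input_mag, tilted_noise)
```

The smoothing in frequency or time mentioned in the abstract would be applied to the clipped spectrum afterwards and is omitted here.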

Audio-reverberation inhibiting device and inhibiting method thereof

Active | CN103440869A | Implement reverberation suppression | Improve auditory perception quality | Speech analysis | Sound producing devices | Computation complexity | Spectral subtraction
The invention discloses an audio-reverberation inhibiting device and an inhibiting method thereof. The device includes a reverberation-time blind estimation module, a later-period reverberation power spectrum estimation module, a spectral-subtraction module and a complex-cepstrum-domain filtering module. The reverberation time of the reverberant voice is estimated by the reverberation-time blind estimation module. The later-period reverberation power spectrum estimation module builds a statistical reverberation model from the estimated reverberation time and analyzes the reverberant voice to obtain the later-period reverberation power spectrum. The spectral-subtraction module includes a gain function construction module and a spectral-subtraction implementation module: a spectral-subtraction gain function is first constructed from the reverberant-voice power spectrum and the later-period reverberation power spectrum, and the gain function and the reverberant voice are then input into the spectral-subtraction implementation module to obtain the earlier-period voice. Finally, the earlier-period voice is input into the complex-cepstrum-domain filtering module to obtain the dereverberated voice. The device and method have low computational complexity, are suitable for real-time processing, and inhibit audio reverberation markedly while improving voice quality efficiently.
Owner:DALIAN UNIV OF TECH
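
One way to picture the later-period reverberation power spectrum estimate and the spectral-subtraction gain function is the exponential-decay statistical model often used for late reverberation, in which energy falls by 60 dB over the reverberation time. The abstract does not state which model the patent uses, so the decay model, the function names, and the parameters `delay_frames`, `frame_shift`, and `g_min` are assumptions for illustration only.

```python
import numpy as np

def late_reverb_psd(power_frames, t60, frame_shift, delay_frames):
    """Late-reverberation power spectrum under an assumed exponential decay.

    Energy decays by 60 dB over t60 seconds, so the power observed
    delay_frames earlier is scaled by exp(-2 * delta * delay_time).
    delay_frames must be at least 1.
    """
    delta = 3.0 * np.log(10.0) / t60
    decay = np.exp(-2.0 * delta * delay_frames * frame_shift)
    late = np.zeros_like(power_frames)
    late[delay_frames:] = decay * power_frames[:-delay_frames]
    return late

def subtraction_gain(power_frames, late, g_min=0.1):
    """Amplitude gain from power subtraction, limited below by g_min."""
    ratio = late / np.maximum(power_frames, 1e-12)
    return np.maximum(np.sqrt(np.maximum(1.0 - ratio, 0.0)), g_min)

power_frames = np.ones((10, 3))  # toy reverberant power spectrogram
late = late_reverb_psd(power_frames, t60=0.5, frame_shift=0.01, delay_frames=5)
gain = subtraction_gain(power_frames, late)
```

Multiplying the reverberant spectrum by `gain` would give an estimate of the earlier-period voice; the complex-cepstrum-domain filtering stage is not sketched here.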

Speech enhancement for target speakers

A method of speech enhancement for target speakers is presented. A blind source separation (BSS) module is used to separate a plurality of microphone-recorded audio mixtures into statistically independent audio components. At least one of a plurality of speaker profiles is used to score and weight each audio component, and a speech mixer is used to first mix the weighted audio components, then align the mixed signals, and finally add the aligned signals to generate an extracted speech signal. Similarly, a noise mixer is used to first weight the audio components, then mix the weighted signals, and finally add the mixed signals to generate an extracted noise signal. Post-processing further enhances the extracted speech signal with a Wiener filtering or spectral subtraction procedure, subtracting the shaped power spectrum of the extracted noise signal from that of the extracted speech signal.
Owner:GMEMS TECH SHENZHEN LTD

Speech enhancement method based on Gaussian mixture model (GMM) noise estimation

Inactive | CN104464728A | Easy to track | Accurate and pure voice signal | Speech recognition | Time domain | Frequency spectrum
The invention discloses a speech enhancement method based on Gaussian mixture model (GMM) noise estimation, wherein the GMM is used for estimating background noise and a spectral subtraction coefficient, spectral subtraction is conducted on noisy speech, and pure speech is recovered. Firstly, the noisy speech is preprocessed so as to obtain the amplitude and phase of the noisy speech, the amplitude is used for noise estimation and spectral subtraction, and the phase is used for recovering a time-domain signal; then, the GMM is used for estimating noise parameters and pure speech cepstrum characteristics from the noisy speech in real time, and the spectral subtraction coefficient is calculated according to the estimated pure speech cepstrum characteristics; finally, spectral subtraction is conducted on the frequency spectrum of the noisy speech, the time-domain signal is recovered, and enhanced speech is obtained according to an overlap-add method. According to the speech enhancement method, the capability of the speech enhancement algorithm to track non-stationary noise can be improved remarkably.
Owner:HOHAI UNIV

Multiple fingerprinting of petroleum oils using normalized time-resolved laser-induced fluorescence spectral subtractions

A method based on spectral subtractions of normalized time-resolved laser-induced fluorescence (TRLIF) spectra produces multiple fingerprints of petroleum oils simultaneously. The method utilizes the simultaneous excitation of the TRLIF spectra of six oil samples using synchronized optical shutters. Five of the samples are standard oil samples while the sixth is the targeted sample. Instead of one fingerprint for the targeted sample, the technique produces multiple fingerprints representing the spectral subtractions between the normalized TRLIF spectra of the target sample and those of each of the standard oil samples. The technique provides fingerprints of higher distinguishing ability than a prior method, allowing it to discriminate between closely similar petroleum oils even under weathered conditions. The technique requires no sample preparation and can be applied remotely. It can also identify the original grade of the petroleum oils from their weathered remains.
Owner:KING FAHD UNIVERSITY OF PETROLEUM AND MINERALS
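
The multiple-fingerprint idea (one difference spectrum per standard sample) reduces to a normalization followed by a set of subtractions. The sketch below assumes peak normalization and uses made-up three-point spectra; the abstract does not say how the TRLIF spectra are normalized, and the function names are illustrative.

```python
import numpy as np

def normalize(spectrum):
    """Scale a spectrum so that its maximum is 1 (assumed normalization)."""
    return spectrum / spectrum.max()

def fingerprints(target, standards):
    """One fingerprint per standard: normalized target minus normalized standard."""
    target_norm = normalize(target)
    return np.array([target_norm - normalize(s) for s in standards])

target = np.array([1.0, 4.0, 2.0])  # toy spectrum of the targeted sample
standards = np.array([              # toy spectra of five standard oils
    [2.0, 8.0, 4.0],
    [1.0, 2.0, 2.0],
    [4.0, 4.0, 1.0],
    [3.0, 6.0, 6.0],
    [5.0, 10.0, 2.0],
])
prints = fingerprints(target, standards)
```

A standard whose spectrum has the same shape as the target's yields an all-zero fingerprint, as the first row does here, while differently shaped standards leave a distinctive residual.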

Bird voice recognition method using anti-noise power normalization cepstrum coefficients (APNCC)

Inactive | CN102930870A | The average recognition effect is good | Noise robustness | Speech recognition | Multi band | Noise power spectrum
The invention provides a bird voice recognition technology based on novel noise-proof feature extraction by aiming at the problem of bird voice recognition in various kinds of background noise in ecological environment. The bird voice recognition technology comprises the following steps of firstly, obtaining noise power spectrums by a noise estimation algorithm suitable for highly nonstationary environment; secondly, performing the noise reduction on the voice power spectrums by a multi-band spectral subtraction method; thirdly, extracting anti-noise power normalization cepstrum coefficients (APNCC) by combining the voice power spectrums for noise reduction; and finally, performing contrast experiments under the conditions of different environments and signal to noise ratios (SNR) on the voice of 34 species of birds by means of extracted APNCC, power normalization cepstrum coefficient (PNCC) and Mel frequency cepstrum coefficients (MFCC) by a support vector machine (SVM). The experiments show that the extracted APNCC have a better average recognition effect and higher noise robustness and are more suitable for bird voice recognition in the environment with less than 30 dB of SNR.
Owner:FUZHOU UNIV

Network speech recognition method in English oral language machine examination system

The invention relates to a scheme for realizing network speech recognition in an English oral language machine examination system. The scheme improves the traditional spectral subtraction (SS) and cepstral mean normalization (CMN) noise reduction techniques and combines them with a probability-scale DP recognition method for a continuous-state hidden Markov model (HMM), yielding a speaker-independent network speech recognition scheme for an English network examination system; a network speech recognition apparatus for a physical environment is realized with the scheme. The method combines an SS method that adapts to the input amplitude spectrum with a CMN method based on a progressive adaptive MAP algorithm, which substantially reduces the influence of ambient noise on the recognition system. In addition, building on the traditional DP method, recognition is carried out with a probability-scale DP algorithm, so that a DSP speech recognition apparatus can be applied to speaker-independent speech recognition in different outdoor settings, and the scope and precision of the recognition system are improved.
Owner:SOUTHEAST UNIV

Improved GSC self-adaptive speech enhancement method

The invention relates to an improved GSC self-adaptive speech enhancement method that uses an improved GSC self-adaptive speech enhancement system. A weight coefficient of a post filter is estimated from the signal received by a microphone, and incoherent noise in the signal is removed with the post Wiener filter. A self-adaptive blocking matrix replaces the fixed blocking matrix of the conventional GSC structure so as to block target signals better and reduce target signal cancellation. The iteration scheme of the self-adaptive algorithm is improved to balance convergence rate against steady-state misadjustment while filtering noise interference signals. Because the spectral subtraction method removes noise easily with a small amount of calculation, an improved spectral subtraction method is then used to remove any remaining residual noise. The method provides self-adaptive speech enhancement for signals arriving from any angle and has a certain robustness for speech signal enhancement in a random environment, and the subsequent spectral subtraction stage further removes residual noise and improves the denoising capability of the entire system.
Owner:NANJING UNIV OF INFORMATION SCI & TECH

A voiceprint recognition method based on 3D convolution neural network

The invention discloses a voiceprint recognition method based on a 3D convolution neural network, which comprises the following steps: step 1, preprocessing the speech signal. Speech acquisition introduces considerable channel noise, which makes the recognition task much more difficult. Therefore, the input speech data is first denoised by spectral subtraction; that is, the estimated noise spectrum is subtracted from the noisy speech spectrum to obtain the spectrum of pure speech. Here the channel noise, which is the noise caused by the recording equipment, is eliminated, and all the information about the speaker is completely preserved while the channel noise is removed. Compared with other methods, the spectral subtraction method introduces the fewest constraint conditions, has the most direct physical meaning, and requires little calculation, so that the accuracy of recognition can be effectively improved.
Owner:GUANGDONG UNIV OF TECH

Speech enhancement processing method

The invention discloses a speech enhancement processing method. The method comprises the steps that a training sample is formed based on speech data and noise data; the training sample is preprocessed to obtain a processed denoising sample; the denoising sample is divided into multiple batches of denoising samples, a WGAN model is trained by adopting each batch of the denoising sample until training of multiple batches of the denoising samples is completed, and a final WGAN-MBGD model is obtained; an enhanced speech signal is output by adopting the final WGAN-MBGD model. The speech enhancement processing method has the advantages that the unstable adversarial network gradient is generated, the rate of convergence is quicker, the small-batch calculation is applied, the calculated amount is also reduced, spectral subtraction factors and spectral lower limit factors are introduced, and the residual noise is reduced by reducing the error among frequency spectrums.
Owner:SHANGHAI MARITIME UNIVERSITY