
41 results about "Gammatone filter" patented technology

A gammatone filter is a linear filter whose impulse response is the product of a gamma distribution and a sinusoidal tone. It is a widely used model of the auditory filters in the auditory system. The gammatone impulse response is given by g(t) = a t^(n-1) e^(-2πbt) cos(2πft + ϕ), where f (in Hz) is the center frequency, ϕ (in radians) is the phase of the carrier, a is the amplitude, n is the filter's order, b (in Hz) is the filter's bandwidth, and t (in seconds) is time.
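
As a minimal sketch of the impulse response formula above, the following evaluates g(t) directly and samples it on a uniform grid. The function names, the 16 kHz sampling rate, the 25 ms duration, and the example values f = 1000 Hz and b = 125 Hz are illustrative assumptions, not values taken from any of the listed patents.

```python
import math

def gammatone_ir(t, f, b, n=4, a=1.0, phi=0.0):
    """Evaluate g(t) = a * t**(n-1) * exp(-2*pi*b*t) * cos(2*pi*f*t + phi).

    The response is taken to be zero for t < 0 (causal filter).
    """
    if t < 0:
        return 0.0
    envelope = a * t ** (n - 1) * math.exp(-2 * math.pi * b * t)
    return envelope * math.cos(2 * math.pi * f * t + phi)

def sample_ir(f, b, fs=16000, duration=0.025, **kwargs):
    """Sample the impulse response at rate fs (Hz) for `duration` seconds."""
    count = int(round(duration * fs))
    return [gammatone_ir(k / fs, f, b, **kwargs) for k in range(count)]

# Hypothetical example channel: 1 kHz centre frequency, 125 Hz bandwidth.
ir = sample_ir(f=1000.0, b=125.0)
```

The envelope t^(n-1) e^(-2πbt) peaks at t = (n-1)/(2πb), so a larger bandwidth b gives a shorter, earlier-peaking response.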

Underwater acoustic signal target classification and recognition method based on deep learning

The invention belongs to the technical field of underwater acoustic signal processing, and particularly relates to an underwater acoustic signal target classification and recognition method based on deep learning. The method comprises the following steps: (1) carrying out feature extraction on an original underwater acoustic signal through a Gammatone frequency cepstral coefficient (GFCC) algorithm; (2) extracting instantaneous energy and instantaneous frequency by utilizing an improved empirical mode decomposition (MEMD) algorithm, fusing the instantaneous energy and the instantaneous frequency with the feature values extracted by the GFCC algorithm, and constructing a feature matrix; (3) establishing a Gaussian mixture model (GMM), and keeping the individual characteristics of the underwater acoustic signal target; and (4) finishing underwater target classification and recognition by using a deep neural network (DNN). The method addresses the problems that traditional underwater acoustic signal target classification and recognition methods rely on a single type of extracted feature and have poor noise resistance; it can effectively improve classification and recognition accuracy, and it retains a certain adaptability under conditions such as weak target acoustic signals and long distances.
Owner:HARBIN ENG UNIV

Bi-ear time delay estimating method based on frequency division and improved generalized cross correlation

The invention provides a binaural time delay estimation method based on frequency division and improved generalized cross correlation for reverberant environments, and relates to the field of sound source localization. A Gammatone filter is used to simulate the characteristics of the basilar membrane of the human ear, the voice signals are divided into frequency sub-bands, and the binaural cross-correlation delay is estimated under reverberation. Compared with the conventional generalized cross correlation delay estimation method, the method estimates the time delay more accurately, and the sound source localization system is more robust. Specifically, a Gammatone filter divides the binaural signals into sub-bands, and each sub-band signal is transformed back to the time domain after cepstral dereverberation and pre-filtering. The left-ear and right-ear signals of each sub-band are then subjected to a generalized cross correlation operation in which an improved phase transform weighting function is employed; the cross-correlation values of all sub-bands are summed, and the binaural time difference corresponding to the maximum cross-correlation value is obtained.
Owner:CHONGQING UNIV OF POSTS & TELECOMM
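
The sub-band cross-correlation step described in the entry above can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard phase transform (PHAT) weighting rather than the patent's improved weighting function, it omits the Gammatone decomposition and the cepstral dereverberation, and it uses a naive O(N²) DFT so that it needs no third-party library. All function names are hypothetical.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (fine for short illustrative frames)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out

def gcc_phat(left, right, eps=1e-12):
    """Circular GCC with standard PHAT weighting; index k is a lag of k samples."""
    spec_l, spec_r = dft(left), dft(right)
    cross = [l * r.conjugate() for l, r in zip(spec_l, spec_r)]
    weighted = [c / (abs(c) + eps) for c in cross]  # keep phase, drop magnitude
    return [v.real for v in dft(weighted, inverse=True)]

def estimate_delay(subbands_left, subbands_right):
    """Sum GCC-PHAT over sub-bands and return the lag (samples) of the maximum.

    A positive result means the left signal lags the right signal.
    """
    n = len(subbands_left[0])
    total = [0.0] * n
    for l, r in zip(subbands_left, subbands_right):
        for k, v in enumerate(gcc_phat(l, r)):
            total[k] += v
    best = max(range(n), key=lambda k: total[k])
    return best if best <= n // 2 else best - n  # map wrapped lags to negatives
```

Summing the weighted correlations before taking the maximum lets sub-bands with a clear peak outvote sub-bands dominated by reverberation or noise.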

Automatic recognition method for pharyngeal fricative in cleft palate speech based on PICGTFs and SSMC enhancement

The invention discloses an automatic recognition method for pharyngeal fricatives in cleft palate speech based on PICGTFs and SSMC enhancement, which relates to the field of speech signal processing. The method uses piecewise index compression Gammatone filters (PICGTFs) to filter the speech; the speech signal spectrogram in each channel is enhanced based on an SSMC (Softsign-based Multi-Channel) model and a DoG (Difference of Gaussian) model; a feature vector is extracted from each enhanced spectrogram and fed to a KNN classifier for pattern recognition to determine whether the speech belongs to the pharyngeal fricative class; and the same classification result is taken as the final recognition result of the algorithm. The method makes full use of the differences between pharyngeal fricatives and normal speech in the frequency-domain distribution of spectral energy. In comparison with the prior art, the detection result is objective and accurate, highly automated measurement is realized, reliable reference data are provided for clinical digital evaluation of pharyngeal fricatives, the development needs of precision medicine are met, and more accurate and effective signal classification and recognition can be carried out.
Owner:SICHUAN UNIV

Gammatone filter bank chip system supporting voice real-time decomposition/synthesis

Active | CN106486110A | Low operating frequency | Reduce the total number of interfaces | Speech synthesis | Decomposition | Gammatone filter
The invention puts forward a Gammatone filter bank chip system supporting voice real-time decomposition/synthesis, and belongs to the field of digital circuit design. The system comprises five parts, namely, an input module, a parameter module, a control module, a calculation module, and an output module. The input module activates the control module after receiving a frame of voice data, adjusts the delay of each channel according to the delay of human ear basilar membranes on different sub-bands, and sends the voice data to the control module. The control module makes the parameter module read the parameters of the corresponding channels and transmit the parameters to the calculation module. The calculation module completes the Gammatone filtering algorithm of each channel, and saves the result to the output module. After the calculation module completes calculation of all the channels of the frame of voice data, the output module allows the stored data to be read externally. With the system, the number of clocks consumed for calculation of channels is reduced, and less power is consumed. A parameter configurable function is realized, and the parameters of the system can be adjusted flexibly according to the need. Voice decomposition and synthesis is realized.
Owner:TSINGHUA UNIV

Parkinson's disease speech detection method based on characteristics of power normalized cepstrum coefficients

The invention discloses a Parkinson's disease speech detection method based on characteristics of power normalized cepstrum coefficients. The method addresses the problem that Parkinson's disease speech detection is prone to interference from noise. The robustness of the extracted characteristics is enhanced through methods such as a Gammatone filter, noise removal and power normalization, and the detection method comprises the following steps: (1) a Parkinson's disease speech library and a healthy speech library are established; (2) characteristics of the power normalized cepstrum coefficients are extracted from a speech signal, specifically, the speech signal is first preprocessed, then the Gammatone filter is used for filtering to obtain a short-time speech power spectrum, then the short-time power spectrum is weighted and smoothed, and finally the characteristics of the power normalized cepstrum coefficients are calculated; (3) the outer product is used for obtaining characteristic vectors; (4) the characteristic vectors are subjected to power normalization and l2-norm normalization; (5) an SVM is used for training a Parkinson's disease speech model and a healthy speech model; and (6) an SVM classification method is used for classifying, and Parkinson's disease speech detection is realized.
Owner:JILIN UNIV
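
Step (4) of the entry above, power normalization followed by l2-norm normalization, might look like the sketch below. The abstract does not define the power normalization, so the signed power with exponent 0.5 (a signed square root) used here is an assumption, as are the function name and the `alpha` parameter.

```python
import math

def power_l2_normalize(vec, alpha=0.5):
    """Apply a signed power (assumed form) and then scale to unit l2 norm."""
    # Signed power: keeps the sign, compresses the magnitude when alpha < 1.
    powered = [math.copysign(abs(v) ** alpha, v) for v in vec]
    norm = math.sqrt(sum(p * p for p in powered))
    # An all-zero vector is returned unchanged to avoid dividing by zero.
    return [p / norm for p in powered] if norm > 0 else powered

features = power_l2_normalize([4.0, -9.0])
```

After this step every non-zero feature vector has unit length, which keeps the SVM in steps (5) and (6) from being dominated by high-energy utterances.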

Target identification method based on continuous spectral characteristics of Gammatone frequency band

The invention discloses a target identification method based on continuous spectral characteristics of a Gammatone frequency band. The target identification method comprises the following steps: firstly, performing windowing treatment on original target radiation noise data, selecting a Hamming window, establishing a corresponding window function, and performing fast Fourier transform on the windowed signal; then determining the number of Gammatone filter banks, determining the center frequency of each filter at equal distance across the original signal frequency band, calculating the impulse response of the Gammatone filter banks, performing fast Fourier transform on the impulse response, performing normalized processing, and establishing a corresponding Gammatone filter bank impulse response function. Compared with a conventional continuous spectral characteristic extraction, classification and identification method, the target identification method first performs preliminary identification, performs continuous spectrum in a sub band, can extract a stable sub-band continuous spectrum as a typical sub-band sample, and is more precise in whole-band continuous spectrum estimation, so that the correct identification rate for target radiation noise is increased.
Owner:750 TEST SITE OF CHINA SHIPBUILDING IND CORP
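
The windowing and center-frequency steps in the entry above can be sketched as follows. The Hamming coefficients 0.54 and 0.46 are the standard ones; placing each center frequency at the midpoint of an equal-width sub-band is an assumption, since the abstract only says the frequencies are equally spaced. Function names and the 0 to 1000 Hz band are hypothetical.

```python
import math

def hamming(n):
    """Standard symmetric Hamming window of length n (n > 1)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def equally_spaced_centres(f_low, f_high, count):
    """Center frequencies at the midpoints of `count` equal-width sub-bands."""
    width = (f_high - f_low) / count
    return [f_low + (i + 0.5) * width for i in range(count)]

def apply_window(frame):
    """Multiply a frame of samples by a Hamming window of the same length."""
    return [w * x for w, x in zip(hamming(len(frame)), frame)]

centres = equally_spaced_centres(0.0, 1000.0, 4)
windowed = apply_window([1.0] * 5)
```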

Multi-sound-source positioning method based on Gammatone filter and histogram

The invention discloses a multi-sound-source positioning method based on a Gammatone filter and a histogram. A microphone array collects the sound source signal, sub-band signals are obtained through a Gammatone filter bank, and framing and windowing are carried out before conversion to the frequency domain; next, a steered response power value is calculated, a histogram is drawn, the counts for the main peak orientation and the secondary peak orientation are tallied, and the orientations of the primary and secondary sound sources are estimated. According to the method disclosed by the invention, the frequency bands overlap with each other and are not separated, so that phase wrapping is avoided; the side lobes are suppressed by the averaging effect of the spatial spectra of the multiple frequency components, so that the main lobe stands out; the spacing between array elements is not strictly limited to half a wavelength; multi-frame information is not needed, and the sound source does not have to be assumed static over consecutive frames, so real-time multi-sound-source positioning is realized; and all the sub-band information in the same frame is fused through the histogram to serve as the decision quantity for azimuth estimation, so that the method is simple and easy to operate, the calculation amount is low, and the positioning success rate for the main and secondary sound sources is remarkably improved, with the improvement being particularly pronounced for the secondary sound sources.
Owner:NANJING INST OF TECH +1
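
The histogram fusion described in the entry above can be illustrated with a small sketch. It assumes that each sub-band has already produced one azimuth estimate in integer degrees (for example, the peak of its steered response power), and it simply bins those estimates and reads off the two most populated bins. The 5° bin width and the function name are assumptions.

```python
from collections import Counter

def primary_secondary_azimuths(subband_azimuths, bin_width=5):
    """Histogram per-sub-band azimuth estimates and return the top two bins."""
    bins = Counter((az // bin_width) * bin_width for az in subband_azimuths)
    # Sort by descending count; break ties by the smaller azimuth.
    ranked = sorted(bins.items(), key=lambda item: (-item[1], item[0]))
    return [az for az, _ in ranked[:2]]

# Hypothetical single-frame estimates from six sub-bands, in degrees.
peaks = primary_secondary_azimuths([30, 31, 33, 120, 122, 75])
```

Because the vote is taken across sub-bands of one frame, no multi-frame history is needed, which matches the single-frame property the abstract emphasizes.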

Fault location method of capsule type floor heating device

Inactive | CN108919187A | Heating up fast | Disposable cooling radiation transfer fast | Position fixation | Time domain | Sound sources
The invention discloses a fault location method of a capsule type floor heating device. The fault location method is applied to a floor heating track. The fault location method comprises the steps that fault speech signals are acquired and convolved with a unit impulse response to obtain binaural signals; the binaural signals are passed through a Gammatone filter to obtain frequency-divided sub-band signals; dereverberation by minimum phase decomposition is carried out on each sub-band signal; and a generalized cross-correlation function is obtained by calculating the cross-correlation of the sub-bands after the inverse transformation from the cepstrum domain to the time domain. The fault location method treats binaural speech localization as a multi-classification problem: it feeds the GCCF (Generalized Cross-Correlation Function) and the interaural level difference as localization features into a DNN whose top layer is a softmax regression, outputs the probability that the sound source is located in each direction, and takes the azimuth with the maximum probability as the position of the sound source. The fault location method is simple and convenient to install, and provides a capsule type floor heating device that can locate faults.
Owner:毛述春

A real-time decomposition/synthesis method of digital speech based on auditory perception characteristics

The invention discloses a digital speech real-time decomposition / synthesis method based on auditory perception characteristics, and relates to the field of voice signal processing. The method comprises the following steps: forming an N-order Gammatone filter through N-stage-cascaded second-order band-pass filters, and then, constructing an arbitrary-order Gammatone digital filter model and parameters thereof; in the speech decomposition stage, decomposing an input speech into M paths of signals by adopting a floating-point algorithm or a fixed-point algorithm and through M paths of Gammatone filters; and in the speech synthesis stage, introducing time delay in a Gammatone filterbank to accord with characteristics of the human ear better, human ear basilar membrane time delay being inversely proportional to frequency, and finally, carrying out speech synthesis operation. Through reference to equiloudness curve characteristics of the human ear, the speech decomposition / synthesis method is improved, and thus the final speech synthesis effect is allowed to be close to the effect of an ideal band-pass filter. The method can be applied to speech equipment of a mobile phone, an artificial cochlea and a hearing aid and the like.
Owner:TSINGHUA UNIV
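
The statement in the entry above that basilar membrane delay is inversely proportional to frequency can be sketched as a per-channel delay table. The proportionality constant `k` (delay in seconds equals k divided by the center frequency in Hz), the 16 kHz sampling rate, and the function names are hypothetical; the abstract gives no numeric values.

```python
def channel_delays(centre_freqs_hz, k=1.0, fs=16000):
    """Delay of each channel in samples, assuming delay_seconds = k / f."""
    return [round(k / f * fs) for f in centre_freqs_hz]

def delay_signal(x, d):
    """Delay a sample list by d samples, zero-padding and keeping the length."""
    d = min(d, len(x))
    return [0.0] * d + x[:len(x) - d]

delays = channel_delays([100.0, 1000.0, 4000.0])
shifted = delay_signal([1.0, 2.0, 3.0, 4.0], 2)
```

Low-frequency channels receive the longest delays, mirroring the longer travel time to the apex of the basilar membrane.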

A binaural sound source localization method based on deep learning in digital hearing aids

The invention discloses a binaural sound source localization method based on deep learning in a digital hearing aid. First, the binaural sound source signals are decomposed into several channels through a gammatone filter, and high-energy channels are selected through weighting coefficients. The first type of features, namely the interaural time difference (ITD) and the interaural intensity difference (IID), is then extracted based on the head-related transfer function (HRTF) and used as the input of deep learning, which divides the horizontal plane into four quadrants to narrow down the localization range. Next, the second type of head-related features is extracted, namely the interaural level difference (ILD) and the interaural phase difference (IPD). Finally, to obtain more accurate positioning, the four features of the first and second types are used as the input of a further deep learning stage, which yields the azimuth angle of the sound source. The method achieves precise positioning over 72 azimuth angles from 0° to 360° on the horizontal plane with a step size of 5°.
Owner:BEIJING UNIV OF TECH
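
The coarse-to-fine labelling in the entry above (four quadrants first, then 72 azimuth classes at 5° steps) and one of the named cues, the interaural level difference, can be sketched as follows. The quadrant boundaries at multiples of 90° and the energy-ratio definition of ILD in decibels are assumptions; the abstract does not specify either.

```python
import math

def azimuth_classes(step_deg=5):
    """Azimuth labels covering 0° up to (but excluding) 360°."""
    return list(range(0, 360, step_deg))

def quadrant(azimuth_deg):
    """Quadrant index 0..3, assuming boundaries at 0°, 90°, 180° and 270°."""
    return int(azimuth_deg % 360 // 90)

def ild_db(left, right):
    """Interaural level difference as 10*log10 of the channel energy ratio."""
    energy_l = sum(x * x for x in left)
    energy_r = sum(x * x for x in right)
    return 10.0 * math.log10(energy_l / energy_r)

classes = azimuth_classes()
```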

Speech Enhancement and Frequency Response Compensation Fusion Method in Digital Hearing Aid

The invention provides a speech enhancement and frequency response compensation fusion method in a digital hearing aid. The method includes the steps of (1) obtaining estimated noise and initial enhanced speech with an MCRA method, filtering the estimated noise and the initial enhanced speech separately through a gammatone filter, dividing the signal into M frequency bands in imitation of the cochlea's perception mechanism, and at the same time obtaining a time-frequency representation of the signal; (2) computing masking threshold values of the frequency bands from factors such as the auditory masking characteristic of the human ear and the per-band signal-to-noise ratios; (3) dynamically computing masking values of the noisy speech in the time-frequency domain using the hearing curve of a hearing-impaired patient, so that speech enhancement and frequency response compensation are processed at the same time; and (4) synthesizing the output speech of the hearing aid using the masking values. The method makes full use of the working mechanism of the human ear, preserves the speech characteristics, eliminates the musical noise introduced by spectral subtraction, greatly improves the speech intelligibility of the hearing aid's output signals, and achieves low complexity and low power consumption.
Owner:湖南汨罗循环经济产业园区科技创新服务中心

A Gammatone filter bank chip system supporting real-time speech decomposition/synthesis

Active | CN106486110B | Low operating frequency | Reduce the total number of interfaces | Speech synthesis | Decomposition | Gammatone filter
The invention puts forward a Gammatone filter bank chip system supporting voice real-time decomposition / synthesis, and belongs to the field of digital circuit design. The system comprises five parts, namely, an input module, a parameter module, a control module, a calculation module, and an output module. The input module activates the control module after receiving a frame of voice data, adjusts the delay of each channel according to the delay of human ear basilar membranes on different sub-bands, and sends the voice data to the control module. The control module makes the parameter module read the parameters of the corresponding channels and transmit the parameters to the calculation module. The calculation module completes the Gammatone filtering algorithm of each channel, and saves the result to the output module. After the calculation module completes calculation of all the channels of the frame of voice data, the output module allows the stored data to be read externally. With the system, the number of clocks consumed for calculation of channels is reduced, and less power is consumed. A parameter configurable function is realized, and the parameters of the system can be adjusted flexibly according to the need. Voice decomposition and synthesis is realized.
Owner:TSINGHUA UNIV