Method for achieving MFCC (Mel Frequency Cepstrum Coefficient) parameter extraction by field-programmable gate array
A gate-array-based data processing technique, applied in speech analysis, speech recognition, instruments, and related fields, that addresses problems such as long hardware development cycles, reduced calculation accuracy, and complex design.
Examples
Embodiment 1
[0083] An example of the present invention is shown in Figures 1-10: a field-programmable gate array (FPGA) comprising a pre-emphasis processing module (1), a framing processing module (2), a windowing processing module (3), a discrete power spectrum estimation module (4), a Mel filter bank module (5), a natural logarithm acquisition module (6) and a discrete cosine transform module (7), characterized in that the output of the pre-emphasis processing module (1) is connected to the input of the framing processing module (2); the output of the framing processing module (2) is connected to the input of the windowing processing module (3), and its enable control terminal is connected to the enable terminals of the windowing processing module (3) and the discrete power spectrum estimation module (4), respectively; the output of the windowing processing module (3) is connected to the input of the discrete power spectrum estimation module (4); the output of the discrete p...
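The later part of the module chain can be illustrated with a small software reference model. The following is only a sketch, not the patented hardware design: it assumes the modules are chained in their numbered order, takes one already pre-emphasized frame as input, and covers modules (3) to (7). The Hamming window, the filter count `N_FILTERS`, the coefficient count `N_COEFFS`, the common Mel-scale formula and all function names are assumptions that do not appear in this excerpt.

```python
import cmath
import math

SAMPLE_RATE = 8000   # Hz, matching the 8 kHz example signal in the source
N_FILTERS = 24       # assumption: the excerpt does not give the filter count
N_COEFFS = 12        # assumption: the excerpt does not give the coefficient count


def window(frame):
    """Module (3): Hamming window (the window type is an assumption)."""
    n_len = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * n / (n_len - 1)))
            for n, x in enumerate(frame)]


def power_spectrum(frame):
    """Module (4): |DFT|^2 for bins 0..N/2, using a naive DFT for clarity."""
    n_len = len(frame)
    spectrum = []
    for k in range(n_len // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_len)
                  for n, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Module (5): triangular filters spaced evenly on the Mel scale."""
    def to_mel(f):
        return 2595 * math.log10(1 + f / 700)

    def to_hz(m):
        return 700 * (10 ** (m / 2595) - 1)

    top = to_mel(sample_rate / 2)
    edges = [to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(n_fft * f / sample_rate) for f in edges]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):          # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc_frame(frame):
    """Chain modules (3)-(7) for one pre-emphasized frame."""
    spectrum = power_spectrum(window(frame))
    bank = mel_filter_bank(N_FILTERS, len(frame), SAMPLE_RATE)
    energies = [sum(w * p for w, p in zip(row, spectrum)) for row in bank]
    # Module (6): natural logarithm, floored to avoid log(0).
    log_e = [math.log(max(e, 1e-12)) for e in energies]
    # Module (7): DCT-II over the log filter-bank energies.
    return [sum(le * math.cos(math.pi * i * (m + 0.5) / N_FILTERS)
                for m, le in enumerate(log_e))
            for i in range(1, N_COEFFS + 1)]
```

Calling `mfcc_frame` on a list of 256 samples returns `N_COEFFS` cepstral coefficients. A floating-point model like this is typically useful as a golden reference when checking fixed-point hardware results, since the naive DFT is far too slow for real-time use.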
Embodiment 2
[0095] A method for extracting speech MFCC parameters using the above field-programmable gate array (FPGA). Assuming that the speech signal to be processed is a single-channel audio signal sampled at 8 kHz with 8-bit quantization, the steps of the method are as follows:
[0096] 1) Preprocessing the speech signal to be tested
[0097] a. Apply pre-emphasis to the speech signal to be tested by passing it through a pre-emphasis processing module whose system function is $H(z) = 1 - 0.9375z^{-1}$, where z is a complex variable. This boosts the high-frequency part of the speech spectrum, thereby increasing the resolution of the high-frequency part of the speech;
[0098] b. Divide the speech signal to be tested into frames. Framing is realized using two FIFOs that store data mutually. The frame length is 256 sampling values per frame, and the frame shift is 128 sampling valu...
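The two preprocessing steps above can be sketched in software as follows. This is a minimal floating-point illustration, not the FPGA implementation: the function names, the zero initial condition, and the choice to drop a trailing partial frame are assumptions, and the two-FIFO buffering is replaced here by plain list slicing.

```python
def pre_emphasis(samples, coeff=0.9375):
    """Apply y[n] = x[n] - coeff * x[n-1], i.e. H(z) = 1 - coeff * z^-1."""
    out = []
    prev = 0  # assumption: the sample before the first one is zero
    for x in samples:
        out.append(x - coeff * prev)
        prev = x
    return out


def split_frames(samples, frame_len=256, frame_shift=128):
    """Cut the signal into overlapping frames.

    A trailing partial frame is dropped (assumption).
    """
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += frame_shift
    return frames


signal = [16, 16, 32] + [0] * 509      # 512 toy samples
emphasized = pre_emphasis(signal)
frames = split_frames(emphasized)
print(emphasized[:3])   # [16.0, 1.0, 17.0]
print(len(frames))      # 3
```

With a frame shift of 128 and a frame length of 256, consecutive frames overlap by half. Note that 0.9375 equals 1 - 2^-4, so a fixed-point design could form the product with a 4-bit shift and a subtraction; whether the patent does so is not stated in this excerpt.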