Audio data processing method and device
An audio data processing technology, applied in speech analysis, character and pattern recognition, instruments, and related fields. It addresses problems such as error-proneness, high cost, and difficulty in collecting samples, and achieves low cost and improved accuracy.
Examples
Embodiment 1
[0031] See Figure 1, which is a schematic diagram of the audio data processing flow in Embodiment 1 of the present disclosure. The specific steps are:
[0032] Step 101, acquire audio data to be processed.
[0033] Step 102, extract filter bank features of the audio data.
[0034] Filter bank (Fbank) is one of the methods for extracting speech feature parameters. Because its extraction approach, which is closely related to cepstrum-based methods, follows the principles of human hearing, it is one of the most common and effective speech feature extraction algorithms.
[0035] The Fbank features of the audio signal can be extracted with the Filter Bank algorithm. Fbank feature extraction is equivalent to Mel-Frequency Cepstral Coefficient (MFCC) extraction without the final discrete cosine transform step (a lossy transform). Compared with MFCC features, Fbank features retain more of the original speech data.
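The extraction described in Step 102 can be illustrated with a minimal sketch of a conventional log-mel filter bank pipeline (pre-emphasis, framing, windowing, power spectrum, triangular mel filters, logarithm). This is not taken from the disclosure: the function names `mel_filterbank` and `fbank_features` are hypothetical, and the 16 kHz sample rate, 25 ms frames, 10 ms step, 512-point FFT, 40 filters, and 0.97 pre-emphasis coefficient are assumed values chosen only for illustration.

```python
import numpy as np


def hz_to_mel(hz):
    # Common mel-scale formula (assumed; the disclosure does not specify one).
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale.

    Returns an array of shape (n_filters, n_fft // 2 + 1).
    """
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):        # rising edge of the triangle
            fb[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):       # falling edge of the triangle
            fb[m - 1, k] = (right - k) / (right - center)
    return fb


def fbank_features(signal, sample_rate=16000, frame_len=0.025, frame_step=0.010,
                   n_fft=512, n_filters=40, preemph=0.97):
    """Log-mel filter bank (Fbank) features, shape (n_frames, n_filters)."""
    signal = np.asarray(signal, dtype=np.float64)
    # Pre-emphasis boosts high frequencies before spectral analysis.
    emphasized = np.append(signal[0], signal[1:] - preemph * signal[:-1])

    flen = int(round(frame_len * sample_rate))
    fstep = int(round(frame_step * sample_rate))
    # Zero-pad signals shorter than one frame so at least one frame exists.
    emphasized = np.pad(emphasized, (0, max(0, flen - len(emphasized))))
    n_frames = 1 + (len(emphasized) - flen) // fstep

    # Slice the signal into overlapping frames and apply a Hamming window.
    idx = np.arange(flen)[None, :] + fstep * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(flen)

    # Power spectrum of each frame, then mel filter bank energies.
    power = (np.abs(np.fft.rfft(frames, n_fft)) ** 2) / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T

    # Log compression; the floor avoids log(0) on silent frames.
    return np.log(np.maximum(energies, np.finfo(float).eps))
```

Under these assumed settings, one second of 16 kHz audio yields a matrix with one row per 10 ms frame and one column per mel filter. Stopping here, without a discrete cosine transform, is what distinguishes Fbank features from MFCC features.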
[0036] The embodiment of the present disclosure does not lim...
Embodiment 2
[0082] See Figure 5, which is a schematic diagram of the audio data processing flow in Embodiment 2 of the present disclosure. The specific steps are:
[0083] Step 501, acquire audio data to be processed.
[0084] Step 502, extract filter bank features of the audio data.
[0085] Fbank is one of the methods for extracting speech feature parameters. Because its extraction approach, which is closely related to cepstrum-based methods, follows the principles of human hearing, it is one of the most common and effective speech feature extraction algorithms.
[0086] The Fbank features of the audio signal can be extracted with the Filter Bank algorithm. Fbank feature extraction is equivalent to MFCC extraction with the final discrete cosine transform step (a lossy transform) removed. Compared with MFCC features, Fbank features retain more of the original speech data.
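The relationship between Fbank and MFCC features stated above can be sketched as follows. This is an illustrative example, not the disclosure's implementation: the helper names `dct2_matrix`, `fbank_to_mfcc`, and `mfcc_to_fbank` are hypothetical, and the choice of 13 retained coefficients is an assumed, conventional value. In common practice the loss comes from keeping only the first few DCT coefficients; a full-length orthonormal DCT is invertible.

```python
import numpy as np


def dct2_matrix(n):
    """Orthonormal DCT-II basis of shape (n, n); rows are basis vectors."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    basis[0] *= np.sqrt(1.0 / n)
    basis[1:] *= np.sqrt(2.0 / n)
    return basis


def fbank_to_mfcc(fbank, n_ceps=13):
    """Apply the DCT along the filter axis and keep the first n_ceps coefficients."""
    basis = dct2_matrix(fbank.shape[1])
    return fbank @ basis[:n_ceps].T


def mfcc_to_fbank(mfcc, n_filters):
    """Best reconstruction of Fbank features from the retained coefficients."""
    basis = dct2_matrix(n_filters)
    return mfcc @ basis[:mfcc.shape[1]]


# Synthetic stand-in for log-mel features: 5 frames, 40 filters (assumed sizes).
rng = np.random.default_rng(0)
fbank = rng.normal(size=(5, 40))

full = mfcc_to_fbank(fbank_to_mfcc(fbank, n_ceps=40), 40)       # all coefficients
truncated = mfcc_to_fbank(fbank_to_mfcc(fbank, n_ceps=13), 40)  # typical MFCC size

full_error = np.max(np.abs(full - fbank))
truncated_error = np.max(np.abs(truncated - fbank))
```

Here `full_error` is at floating-point precision while `truncated_error` is not, which is one way to see why features taken before the DCT step retain more of the original speech data.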


