Method for identifying sound faults based on mel energy spectrum and convolutional neural network
A convolutional neural network and fault identification technology, applied in speech analysis, instruments, etc. It addresses the problems of large differences in discrimination between staff members, high detection costs, and slow information transmission, achieving strong separability, improved working conditions, and savings in manpower.
Examples
Embodiment 1
[0043] A sound fault identification method based on mel energy spectrum and convolutional neural network, comprising the following steps:
[0044] S1: Pre-emphasize the voice signal to increase the high-frequency resolution of the sound;
[0045] The general transfer function of pre-emphasis is H(z) = 1 - a*z^-1, where a is the pre-emphasis coefficient. The present invention implements pre-emphasis with a first-order FIR high-pass filter: if x(n) is the voice sample at time n, the result after pre-emphasis is y(n) = x(n) - a*x(n-1), with a = 0.95 here.
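The difference equation above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the patent's implementation: the function name `pre_emphasize` is hypothetical, and passing the first sample through unchanged (treating x(-1) as 0) is an assumption, since the text does not say how the boundary is handled.

```python
import numpy as np

def pre_emphasize(x, a=0.95):
    """Apply y(n) = x(n) - a*x(n-1) along the first axis.

    Assumption: x(-1) is treated as 0, so the first sample is unchanged.
    """
    x = np.asarray(x, dtype=np.float64)
    y = x.copy()
    y[1:] -= a * x[:-1]
    return y

# A constant (zero-frequency) signal is strongly attenuated after the
# first sample, which is the high-pass behaviour described in the text.
print(pre_emphasize([1.0, 1.0, 1.0, 1.0]))
```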
[0046] S2: Frame the voice signal. Along the time axis, a segment of audio data is taken at a fixed interval to form a frame, and that interval is the frame step size. Since the sound signal has short-term stationary characteristics, framing the audio helps to further subdivide the characteristics of...
Embodiment 2
[0075] A sound fault identification method based on mel energy spectrum and convolutional neural network, comprising the following steps:
[0076] S1: Pre-emphasize the input audio data according to the formula y(n) = x(n) - 0.95*x(n-1);
[0077] S2: Average the two audio channels into a single channel, then divide the data into frames of 612 samples each with a step size of 306 samples;
[0078] S3: Apply a window to each frame; the window is a Hamming window with coefficient a = 0.46;
[0079] S4: Perform a fast Fourier transform on each frame of data to generate the energy spectrum;
[0080] S5: Pass the energy spectrum through a bank of mel-scale triangular band-pass filters. The number of filters is 64 and the maximum frequency is 22050 Hz (half of the 44100 Hz sampling frequency);
[0081] S6: Convert the data generated by S5 into a mel energy spectrum, with the frequency domain as the Y axis and the time domain as the X axis
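Steps S1 to S6 can be sketched end to end in NumPy as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation. The frame length (612), step (306), Hamming coefficient (0.46), filter count (64), and maximum frequency (22050 Hz) come from the text. Everything else is assumed: the function names, the `(samples, channels)` input layout, an FFT length equal to the frame length with no zero-padding, the common mel mapping 2595*log10(1 + f/700), and the absence of any log scaling of the output.

```python
import numpy as np

SAMPLE_RATE = 44100   # Hz, from the text
FRAME_LEN = 612       # samples per frame, from the text
HOP = 306             # step size in samples, from the text
N_MELS = 64           # number of triangular filters, from the text
F_MAX = 22050.0       # Hz, half the sampling rate, from the text

def hz_to_mel(f):
    # Assumed mel formula; the source does not state which variant it uses.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=np.float64) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=np.float64) / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr, f_max):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(f_max), n_mels + 2))
    bin_hz = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_mels, bin_hz.size))
    for i in range(n_mels):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (bin_hz - lo) / (mid - lo)
        falling = (hi - bin_hz) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mel_energy_spectrum(audio, a=0.95):
    """Return an array of shape (N_MELS, n_frames): frequency on Y, time on X."""
    x = np.asarray(audio, dtype=np.float64)
    # S1: pre-emphasis y(n) = x(n) - a*x(n-1), applied along the sample axis.
    y = x.copy()
    y[1:] -= a * x[:-1]
    # S2: average the channels (assumed layout: samples x channels), then frame.
    if y.ndim == 2:
        y = y.mean(axis=1)
    n_frames = 1 + (y.size - FRAME_LEN) // HOP
    idx = np.arange(FRAME_LEN)[None, :] + HOP * np.arange(n_frames)[:, None]
    frames = y[idx]
    # S3: Hamming window written out with coefficient 0.46.
    n = np.arange(FRAME_LEN)
    window = 0.54 - 0.46 * np.cos(2.0 * np.pi * n / (FRAME_LEN - 1))
    # S4: FFT of each windowed frame, squared magnitude as the energy spectrum.
    energy = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    # S5: weight the energy spectrum with the mel-scale triangular filters.
    fb = mel_filterbank(N_MELS, FRAME_LEN, SAMPLE_RATE, F_MAX)
    # S6: transpose so that rows are mel bands and columns are frames.
    return (energy @ fb.T).T

# Usage: one second of synthetic stereo noise stands in for a real recording.
rng = np.random.default_rng(0)
stereo = rng.standard_normal((SAMPLE_RATE, 2))
print(mel_energy_spectrum(stereo).shape)
```

Because the step is half the frame length, adjacent frames overlap by 50 percent. With a 612-point FFT the frequency bins are about 72 Hz apart, so the lowest mel filters are narrower than one bin and cover at most a single bin; a longer, zero-padded FFT would be one way to address that, though the source does not say whether it does so.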
...