Short-wave voice endpoint detection method based on image recognition
An image recognition and endpoint detection technique, applied in speech analysis, image analysis, image data processing, and related fields, that addresses the problem that adjusting endpoint detection parameters is impractical.
Examples
Specific Embodiment 1
[0077] Specific Embodiment 1: typical noise background
[0078] Step 1. Read in the file and draw the time-domain waveform (see Figure 4). The time-domain waveform after speech preprocessing is shown in Figure 5.
[0079] The speech is divided into frames with a frame length of 200 and a frame shift of 80. Framing yields a 200 × 2964 two-dimensional matrix in which each column of 200 values is one frame. A Fourier transform is applied to each column to obtain the spectrum of that frame, giving 2964 spectra. Plotting these with time on the horizontal axis and frequency on the vertical axis gives the spectrogram (see Figure 6). Taking the low-frequency part (0 Hz to 3500 Hz) and converting it to grayscale gives the grayscale spectrogram (see Figure 7). (For clarity, Figure 7, Figure 8, and Figure 9 are rotated 90 degrees clockwise.)
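The framing and per-frame Fourier transform described in paragraph [0079] can be illustrated with a minimal Python sketch. It is not the patent's implementation. The frame length (200), frame shift (80), and 0 Hz to 3500 Hz band come from the text; the 8000 Hz sample rate, the Hamming window, the log-magnitude scaling, and the function names `frame_signal` and `gray_spectrogram` are assumptions made only for illustration.

```python
import numpy as np

FRAME_LEN = 200      # samples per frame (from the text)
FRAME_SHIFT = 80     # samples between frame starts (from the text)
SAMPLE_RATE = 8000   # Hz; ASSUMPTION, the text does not state the sample rate
MAX_FREQ = 3500.0    # Hz; upper edge of the low-frequency band (from the text)


def frame_signal(x, frame_len=FRAME_LEN, frame_shift=FRAME_SHIFT):
    """Split a 1-D signal into overlapping frames, one frame per column."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // frame_shift
    # Row i, column j holds sample number j * frame_shift + i.
    idx = (np.arange(frame_len)[:, None]
           + frame_shift * np.arange(n_frames)[None, :])
    return x[idx]  # shape: (frame_len, n_frames)


def gray_spectrogram(x, sample_rate=SAMPLE_RATE, max_freq=MAX_FREQ):
    """Return an 8-bit grayscale spectrogram limited to 0..max_freq Hz."""
    frames = frame_signal(x)
    n = frames.shape[0]
    # Hamming window: ASSUMPTION, the text does not name a window.
    windowed = frames * np.hamming(n)[:, None]
    # One spectrum per column (frame).
    mag = np.abs(np.fft.rfft(windowed, axis=0))
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    band = mag[freqs <= max_freq, :]
    # Log-magnitude scaling: ASSUMPTION, chosen for display contrast.
    log_mag = 20.0 * np.log10(band + 1e-10)
    lo, hi = log_mag.min(), log_mag.max()
    if hi == lo:
        return np.zeros(log_mag.shape, dtype=np.uint8)
    gray = (log_mag - lo) / (hi - lo)
    return np.round(255.0 * gray).astype(np.uint8)
```

With these settings, a signal of 237240 samples produces 2964 frames, matching the 200 × 2964 matrix in the text. Rows of the returned image are frequency bins and columns are frames, so time runs along the horizontal axis.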
[0080] In Figure 7, a white region is visible in the middle, containing parallel ripples, that is, the voiceprint, which is the ...
Specific Embodiment 2
[0082] Specific Embodiment 2: strong noise background
[0083] The steps are the same as in Embodiment 1, and the experimental results are as follows:
[0084] Note that under a strong noise background, part of the strong noise spectrum remains after speech enhancement. As shown in Figure 14, the speech segments appear in the spectrogram as high-energy regions with parallel lines. After the speech segment, because of the strong noise, a low-energy noise spectrum remains in the spectrogram in the form of dots. As shown in Figure 15, when line segments are identified, part of this noise spectrum is identified as line segments, which causes misjudgment during endpoint detection. The final detection results are shown in Figure 16 and Figure 17: all speech segments in the speech are recognized, but some parts containing only strong noise are misjudged as speech.


