
Short-wave voice endpoint detection method based on image recognition

An image recognition and endpoint detection technology, applied in speech analysis, image analysis, and image data processing, that addresses the problem that adjusting endpoint detection parameters is unrealistic.

Active Publication Date: 2018-05-18
UNIV OF ELECTRONIC SCI & TECH OF CHINA
14 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

When the environment changes and real-time communication is required, re-adjusting the endpoint detection parameters is unrealistic, so traditional voice processing methods are no longer applicable.



Examples


Specific Embodiment 1

[0077] Specific embodiment 1, typical noise background

[0078] Step 1. Read in the audio file and plot the time-domain waveform (see Figure 4). The time-domain waveform after speech preprocessing is shown in Figure 5.

[0079] The speech is divided into frames with a frame length of 200 and a frame shift of 80. Framing yields a 200*2964 two-dimensional matrix in which each column of 200 samples is one frame. A Fourier transform is applied to each column to obtain the spectrum of that frame, giving 2964 spectra. Plotting them with time on the horizontal axis and frequency on the vertical axis gives the spectrogram (see Figure 6). Taking the low-frequency part (0 Hz to 3500 Hz) and converting it to grayscale gives the grayscale spectrogram (see Figure 7). (For clarity, Figure 7, Figure 8, and Figure 9 are rotated 90 degrees clockwise.)
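The framing and spectrogram step in [0079] can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation. The frame length (200), frame shift (80), and 0 Hz to 3500 Hz band come from the text. The 8000 Hz sampling rate, the Hamming window, the log-magnitude scaling, the synthetic noise input, and the function names `frame_signal` and `gray_spectrogram` are all assumptions made for the example. The input length is chosen only so that the frame count matches the 2964 frames mentioned above.

```python
import numpy as np

def frame_signal(x, frame_len=200, frame_shift=80):
    """Split a 1-D signal into overlapping frames, one frame per column."""
    n_frames = 1 + (len(x) - frame_len) // frame_shift
    # Index matrix: entry (i, j) is the sample index j * frame_shift + i.
    idx = np.arange(frame_len)[:, None] + frame_shift * np.arange(n_frames)[None, :]
    return x[idx]

def gray_spectrogram(x, fs=8000, f_max=3500.0, frame_len=200, frame_shift=80):
    """Return an 8-bit grayscale image of the low-frequency spectrogram.

    fs, the Hamming window, and the dB scaling are assumptions; the source
    only specifies the frame length, frame shift, and frequency band.
    """
    frames = frame_signal(x, frame_len, frame_shift)
    window = np.hamming(frame_len)[:, None]
    # Column-wise Fourier transform: one magnitude spectrum per frame.
    spectrum = np.abs(np.fft.rfft(frames * window, axis=0))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    low = spectrum[freqs <= f_max, :]          # keep 0 Hz .. f_max only
    log_mag = 20.0 * np.log10(low + 1e-10)     # small offset avoids log(0)
    lo, hi = log_mag.min(), log_mag.max()
    gray = np.zeros_like(log_mag) if hi == lo else (log_mag - lo) / (hi - lo)
    return (255 * gray).astype(np.uint8)

# Synthetic stand-in for the preprocessed speech (hypothetical data).
rng = np.random.default_rng(0)
x = rng.standard_normal(237240)
frames = frame_signal(x)
img = gray_spectrogram(x)
print(frames.shape, img.shape)
```

In the resulting image, rows are frequency bins and columns are frames, so time runs along the horizontal axis as described in the text.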

[0080] In Figure 7, a white region is visible in the middle that contains parallel ripples, that is, the voiceprint, which is the ...

Specific Embodiment 2

[0082] Specific embodiment 2, strong noise background

[0083] The steps are the same as in Embodiment 1, and the experimental results are as follows:

[0084] Note that under a strong noise background, part of the strong noise spectrum remains after speech enhancement. As shown in Figure 14, the speech segments are the regions where the energy is high and parallel lines are present. After the speech segment, because of the strong noise, a low-energy noise spectrum remains in the spectrogram in the form of dots. As shown in Figure 15, when line segments are identified, part of this noise spectrum is identified as line segments, which causes misjudgment during endpoint detection. The final test results are shown in Figure 16 to Figure 17: all speech segments in the recording are recognized, but some parts containing only strong noise are misjudged as speech.


Abstract

The invention belongs to the field of voice detection, and particularly relates to a short-wave voice endpoint detection method based on image recognition. The method comprises the steps of: first, preprocessing the data to increase the signal-to-noise ratio; next, dividing the data into frames of a specific length and performing a short-time Fourier transform to obtain a spectrogram; finally, using an image recognition method to search for the voiceprint in the spectrogram and determining the voice segments in the data according to the voiceprint distribution. The preprocessed voice has a similar signal-to-noise ratio across inputs, so parameters do not need to be adjusted in subsequent steps. Therefore, the method can adaptively select voice segments against different background noises.
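The last step, determining voice segments from the voiceprint distribution, can be illustrated with a small sketch. It assumes that the image recognition stage has already produced one boolean per frame saying whether a voiceprint was found there. That per-frame flag representation, the function name `flags_to_segments`, and the example flags are hypothetical. The frame length of 200 and frame shift of 80 are taken from Embodiment 1.

```python
def flags_to_segments(flags, frame_len=200, frame_shift=80):
    """Convert per-frame voiceprint flags into (start, end) sample ranges.

    Each run of consecutive True frames becomes one segment. The end index
    is exclusive and covers the full length of the last frame in the run.
    """
    segments = []
    start = None
    for i, is_voice in enumerate(flags):
        if is_voice and start is None:
            start = i                      # a voiced run begins at frame i
        elif not is_voice and start is not None:
            end = (i - 1) * frame_shift + frame_len
            segments.append((start * frame_shift, end))
            start = None
    if start is not None:                  # run extends to the final frame
        end = (len(flags) - 1) * frame_shift + frame_len
        segments.append((start * frame_shift, end))
    return segments

# Hypothetical flags for eight frames: voiced at frames 1-3 and frame 6.
flags = [False, True, True, True, False, False, True, False]
segments = flags_to_segments(flags)
print(segments)
```

A single isolated voiced frame, like frame 6 here, still produces a segment. Dot-like noise that is mistaken for a line segment would therefore surface as a short spurious segment of this kind.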

Description

Technical field

[0001] The invention belongs to the field of voice detection, and in particular relates to an image recognition-based short-wave voice endpoint detection method.

Background

[0002] Despite the continuous emergence of new radio communication systems, short-wave radio stations are still widely valued for their autonomous communication capability and wide coverage. However, the radio waves used in short-wave communication must be reflected by the ionosphere, so the noise is relatively large. The strong background noise makes it impossible for monitors to work for long periods, so noise reduction must be applied, including to non-speech segments. In this situation, to avoid missing any speech, the performance of the speech endpoint detection method is particularly important.

[0003] In traditional speech processing, there are already many endpoint detection methods b...


Application Information

IPC(8): G10L25/84, G10L25/18, G10L25/03, G10L25/45, G10L25/27, G06T7/13
CPC: G10L25/03, G10L25/18, G10L25/27, G10L25/45, G10L25/84, G06T7/13
Inventor: 陈章鑫, 杨孟文, 司进修, 黄际彦
Owner: UNIV OF ELECTRONIC SCI & TECH OF CHINA