Short-wave voice endpoint detection method based on image recognition

A technology combining image recognition and endpoint detection, applied in speech analysis, image analysis, image data processing, etc. It addresses the problem that adjusting endpoint detection parameters is unrealistic.

Active Publication Date: 2018-05-18

UNIV OF ELECTRONIC SCI & TECH OF CHINA

Problems solved by technology

In changing environments where real-time communication is required, adjusting the endpoint detection parameters is unrealistic, and traditional voice processing methods are no longer applicable.

Examples

Specific Embodiment 1

[0077] Specific embodiment 1, typical noise background

[0078] Step 1. Read in the file and draw the time-domain graph (see Figure 4). The time-domain graph after speech preprocessing is shown in Figure 5.

[0079] The speech is divided into frames with a frame length of 200 and a frame shift of 80. The framed data form a two-dimensional matrix of 200*2964, in which each column of 200 numbers is one frame. A Fourier transform is applied to each column to obtain the frequency spectrum of that frame, giving 2964 spectra. A spectrogram is drawn with time on the horizontal axis and frequency on the vertical axis (see Figure 6). The low-frequency part (0 Hz to 3500 Hz) is then taken and converted to grayscale to obtain the grayscale spectrogram (see Figure 7). For clarity, Figure 7, Figure 8 and Figure 9 are rotated 90 degrees clockwise.
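The framing and per-frame transform described in [0079] can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the sampling rate `fs=8000`, the Hamming window, the logarithmic magnitude scaling, and the function names `frame_signal` and `gray_spectrogram` are all assumptions that do not appear in the source. Only the frame length (200), the frame shift (80), the one-frame-per-column layout, and the 0 Hz to 3500 Hz band come from the text.

```python
import numpy as np

def frame_signal(x, frame_len=200, frame_shift=80):
    """Split a 1-D signal into overlapping frames, one frame per column."""
    n_frames = 1 + (len(x) - frame_len) // frame_shift
    # Row i, column j holds sample j * frame_shift + i.
    idx = np.arange(frame_len)[:, None] + frame_shift * np.arange(n_frames)[None, :]
    return x[idx]  # shape: (frame_len, n_frames)

def gray_spectrogram(x, fs=8000, frame_len=200, frame_shift=80, f_max=3500.0):
    """Return an 8-bit grayscale spectrogram limited to 0..f_max Hz.

    fs, the window, and the dB scaling are assumptions for illustration.
    """
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, frame_shift)
    window = np.hamming(frame_len)[:, None]  # assumed window, not from the source
    spectrum = np.abs(np.fft.rfft(frames * window, axis=0))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    low = spectrum[freqs <= f_max, :]         # keep the low-frequency part only
    log_mag = 20.0 * np.log10(low + 1e-10)    # small offset avoids log(0)
    span = log_mag.max() - log_mag.min()
    scaled = (log_mag - log_mag.min()) / (span if span > 0 else 1.0)
    return np.round(255.0 * scaled).astype(np.uint8)
```

With these assumed settings, each column of the returned array is one frame and each row is one frequency bin, so the array can be displayed directly as an image with time on the horizontal axis.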

[0080] In Figure 7, the white part can be seen in the middle, with parallel ripples, that is, the voiceprint, which is the ...

Specific Embodiment 2

[0082] Specific embodiment 2, strong noise background

[0083] The steps are the same as in Embodiment 1, and the experimental results are as follows:

[0084] It should be noted that against a strong noise background, part of the strong noise spectrum remains after speech enhancement. As shown in Figure 14, the speech segments are the regions of the spectrogram where the energy is high and parallel lines are present. After the speech segment, because of the strong noise, a low-energy noise spectrum remains in the spectrogram in the form of dots. As shown in Figure 15, when line segments are identified, part of the noise spectrum is also identified as line segments, which causes misjudgment during endpoint detection. The final test results are shown in Figure 16 to Figure 17: all the speech segments in the recording are recognized, but some parts containing only strong noise are misjudged as speech.

Abstract

The invention belongs to the field of voice detection, and particularly relates to a short-wave voice endpoint detection method based on image recognition. The method comprises the following steps: first, preprocessing the data to increase the signal-to-noise ratio; next, dividing the data into frames of a specific length and performing a short-time Fourier transform to obtain a spectrogram; finally, using an image recognition method to search for voiceprints in the spectrogram and determining the voice segments in the data according to the voiceprint distribution. Because the preprocessed voices have similar signal-to-noise ratios, no parameters need to be adjusted in the subsequent steps. The method can therefore adaptively select voice segments against different background noises.
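The last step of the abstract, determining voice segments from the voiceprint distribution, can be illustrated with a small sketch. It assumes that the image recognition step has already produced one boolean flag per frame indicating whether a voiceprint was found there; that per-frame flag representation and the function name `flags_to_segments` are hypothetical and not taken from the source. The frame length and frame shift defaults are the values used in Specific Embodiment 1.

```python
def flags_to_segments(flags, frame_len=200, frame_shift=80):
    """Merge runs of voiceprint-positive frames into (start, end) sample ranges.

    flags is a hypothetical per-frame sequence of truthy/falsy values.
    The end index is exclusive and covers the whole last frame of each run.
    """
    segments = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i  # a run of voiced frames begins
        elif not flag and start is not None:
            # the run ended at frame i - 1
            segments.append((start * frame_shift, (i - 1) * frame_shift + frame_len))
            start = None
    if start is not None:
        # the signal ends while a run is still open
        segments.append((start * frame_shift, (len(flags) - 1) * frame_shift + frame_len))
    return segments
```

For instance, flags of `[0, 1, 1, 0, 0, 1]` yield two segments, one covering frames 1 to 2 and one covering frame 5, expressed as sample ranges.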

Description

Technical field

[0001] The invention belongs to the field of voice detection, and in particular relates to a short-wave voice endpoint detection method based on image recognition.

Background technique

[0002] Despite the continuous emergence of new radio communication systems, short-wave radio stations are still widely valued for their autonomous communication capability and wide coverage. However, the radio waves used in short-wave communication must be reflected by the ionosphere, so the noise is relatively large. Strong background noise makes it impossible for monitoring personnel to work for long periods, so noise reduction is required, and the non-speech segments must be processed for noise reduction as well. To avoid missing any speech in this process, the performance of the speech endpoint detection method is particularly important.

[0003] In traditional speech processing, there are already many endpoint detection methods b...

Application Information

IPC(8): G10L25/84, G10L25/18, G10L25/03, G10L25/45, G10L25/27, G06T7/13

CPC: G10L25/03, G10L25/18, G10L25/27, G10L25/45, G10L25/84, G06T7/13

Inventor: 陈章鑫, 杨孟文, 司进修, 黄际彦

Owner UNIV OF ELECTRONIC SCI & TECH OF CHINA
