Real-time speech paragraph tracking method in complex noise scene

A real-time speech paragraph tracking technology, applied in speech analysis, instruments, etc. It addresses the difficulty of tracking speech paragraphs in complex noise, improving the listening experience and suppressing noise.

Active Publication Date: 2020-06-09
AVIC HUADONG OPTOELECTRONICS (SHANGHAI) CO LTD
Cites: 8; Cited by: 0

AI Technical Summary

Problems solved by technology

Speech tracking in a noise scene with a single set of statistical properties is relatively easy to handle, while speech paragraph tracking in a complex noise scene is a difficult problem.

Examples

Embodiment 1

[0033] Referring to Figures 1-5, Embodiment 1: in this embodiment of the invention, a real-time speech paragraph tracking method in a complex noise scene comprises the following steps:

[0034] A. Preprocessing. Frame and window the input audio signal. Take 16 ms (256 samples) of data as one frame $x_i(n)$, where $i$ is the frame index. Apply a window to each frame; the window function is a Hamming window:

[0035]
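
Step A can be sketched as follows. This is a minimal illustration, not the patent's implementation: the window uses the standard Hamming definition w(n) = 0.54 - 0.46 cos(2πn/(N-1)), which may differ from the formula omitted above, and the names `hamming`, `frame_and_window`, and `hop` are hypothetical. The 16 kHz sampling rate is only implied by 256 samples per 16 ms, and frame overlap is not stated in the source, so the default here is no overlap.

```python
import math

FRAME_LEN = 256  # 16 ms of audio; implies a 16 kHz sampling rate (256 / 0.016 s)


def hamming(n_points):
    """Standard Hamming window: w(n) = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def frame_and_window(signal, frame_len=FRAME_LEN, hop=None):
    """Split `signal` into frames x_i(n) and apply a Hamming window to each."""
    hop = hop or frame_len  # assumption: no overlap unless the caller sets `hop`
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


# Usage: 1024 samples yield four non-overlapping 256-sample frames.
frames = frame_and_window([1.0] * 1024)
print(len(frames), len(frames[0]))
```

Trailing samples that do not fill a whole frame are dropped in this sketch; a real-time implementation would buffer them until the next block arrives.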

[0036] B. Calculate the discrete Fourier transform coefficients $Y_i(\omega_k)$ of the input audio frame, where $k$ is the index of the spectral component:

[0037] $Y_i(\omega_k) = Y_k \exp(j\theta_y(k))$
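
The polar form in step B separates each coefficient into a magnitude Y_k and a phase θ_y(k). The sketch below shows this with a direct O(N²) DFT written with the standard library only; a production system would normally use an FFT. The function names `dft` and `magnitude_phase` are hypothetical.

```python
import cmath
import math


def dft(frame):
    """Direct discrete Fourier transform of one (windowed) frame."""
    n_points = len(frame)
    return [sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_points)
                for n in range(n_points))
            for k in range(n_points)]


def magnitude_phase(frame):
    """Return (Y_k, theta_y(k)) for every spectral component k."""
    coeffs = dft(frame)
    mags = [abs(c) for c in coeffs]            # Y_k
    phases = [cmath.phase(c) for c in coeffs]  # theta_y(k), in radians
    return mags, phases


# Usage: a cosine with exactly 4 cycles in 32 samples peaks at bin k = 4.
N = 32
tone = [math.cos(2 * math.pi * 4 * n / N) for n in range(N)]
mags, phases = magnitude_phase(tone)
print(round(mags[4], 6))
```

Keeping the phase alongside the magnitude matters later: a gain applied to the magnitude can be recombined with the unchanged phase to resynthesize the frame.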

[0038] C. Assuming that the first L frames are noise frames, calculate the power of the initial noise, that is, the arithmetic mean of the Fourier transform magnitude spectrum:

[0039] Assuming that the data after the first L frames is a noisy signal, calculate the power of the noisy signal:

[0040] $|Y_i(\omega_k)|^2$;
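
Step C can be sketched as below. The source's formula for the initial noise estimate is not reproduced in this excerpt, so the sketch assumes the average is taken over the squared magnitudes of the leading L frames, separately for each spectral bin; averaging the magnitudes first and squaring afterwards would be an equally plausible reading. The names `initial_noise_power` and `noisy_power` are hypothetical.

```python
def initial_noise_power(mag_spectra, num_noise_frames):
    """Per-bin mean of |Y_i(w_k)|^2 over the leading noise-only frames.

    `mag_spectra` is a list of magnitude spectra, one list per frame.
    """
    if not 0 < num_noise_frames <= len(mag_spectra):
        raise ValueError("num_noise_frames must be between 1 and the frame count")
    n_bins = len(mag_spectra[0])
    return [sum(mag_spectra[i][k] ** 2 for i in range(num_noise_frames))
            / num_noise_frames
            for k in range(n_bins)]


def noisy_power(mag_spectrum):
    """Power |Y_i(w_k)|^2 of one noisy frame."""
    return [m ** 2 for m in mag_spectrum]


# Usage: two leading noise frames, then one louder noisy frame.
spectra = [[1.0, 2.0], [3.0, 2.0], [10.0, 10.0]]
noise = initial_noise_power(spectra, 2)
print(noise, noisy_power(spectra[2]))
```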

[0041] D. Calculate the posterior signal-to-noi...

Abstract

The invention discloses a real-time speech paragraph tracking method in a complex noise scene. The method comprises the following steps: A, preprocessing; B, calculating the discrete Fourier transform coefficients of an input audio frame; C, assuming that the first L frames are noise frames, calculating the power of the initial noise, i.e., the arithmetic mean of the Fourier transform magnitude spectrum, and, assuming that the data after those frames is a noisy signal, calculating the power of the noisy signal; D, calculating the posterior signal-to-noise ratio; E, calculating the prior signal-to-noise ratio; F, performing voice activity detection; G, updating the noise spectrum; and H, calculating the gain coefficient. The spectral properties of the stationary noise in the scene are estimated from the noise in the pauses between phrases, and a gain function is then designed to enhance speech and suppress the stationary noise. On this basis, voiced sound detection is carried out, speech paragraphs are tracked, and the various noises between speech paragraphs are masked. In this way, the accuracy of speech detection is improved, the noise superimposed on speech segments is suppressed, and the noise between speech segments that degrades the listening experience is completely masked.
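
The excerpt available here truncates before steps D to H are specified, so the following is only a generic sketch of how such a per-frame loop is commonly built, not the patented method. It assumes the textbook posterior SNR (noisy power over noise power), a decision-directed prior SNR estimate, a threshold on the mean posterior SNR as the voice activity detector, recursive smoothing for the noise update, and a Wiener gain. The names `init_state` and `process_frame` and the values of `alpha`, `beta`, and `vad_threshold` are hypothetical.

```python
def init_state(noise_power):
    """Hypothetical state: noise estimate plus last frame's gain and posterior SNR."""
    k = len(noise_power)
    return {"noise": list(noise_power), "gain": [1.0] * k, "gamma": [1.0] * k}


def process_frame(power, state, alpha=0.98, beta=0.9, vad_threshold=2.0):
    """Run assumed versions of steps D-H on one frame of noisy power |Y|^2."""
    noise = state["noise"]
    # D. Posterior SNR: noisy power divided by the current noise estimate.
    gamma = [p / max(n, 1e-12) for p, n in zip(power, noise)]
    # E. Prior SNR, decision-directed form (an assumption, not from the source).
    xi = [alpha * g * g * gp + (1.0 - alpha) * max(gm - 1.0, 0.0)
          for g, gp, gm in zip(state["gain"], state["gamma"], gamma)]
    # F. Voice activity detection: threshold on the mean posterior SNR.
    is_speech = sum(gamma) / len(gamma) > vad_threshold
    # G. Update the noise spectrum only on frames judged to be noise.
    if not is_speech:
        noise = [beta * n + (1.0 - beta) * p for n, p in zip(noise, power)]
    # H. Gain coefficient: Wiener gain computed from the prior SNR.
    gain = [x / (1.0 + x) for x in xi]
    state.update(noise=noise, gain=gain, gamma=gamma)
    return gain, is_speech


# Usage: a noise-level frame followed by a much louder frame.
state = init_state([1.0, 1.0])
print(process_frame([1.0, 1.0], state)[1])      # judged as noise
print(process_frame([100.0, 100.0], state)[1])  # judged as speech
```

Freezing the noise estimate during speech frames is what lets the pauses between phrases drive the stationary-noise estimate, which matches the idea described in the abstract even though the exact update rule here is assumed.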

Description

Technical field

[0001] The invention relates to the technical field of speech processing, in particular to a real-time speech paragraph tracking method in a complex noise scene.

Background

[0002] Engineering implementations of speech signal processing have to cope with complex noise scenarios, including stationary noise, transient noise, time-varying noise, and strong noise with differing statistical characteristics. When a proximity pickup device is used for voice collection, voice communication, or speech recognition, background noise is easily picked up by the microphone. This directly degrades the perceived quality of voice communication and further degrades the performance of back-end processing modules such as speech recognition. In a complex noise scene, suppressing the stationary noise mixed into the speech, masking the other types of noise mixed in among the speech paragraphs, and tracking the clean speech paragraphs can effectively improve the listening experience...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L25/84, G10L25/27, G10L25/45, G10L25/21, G10L25/93, G10L21/0216
CPC: G10L25/84, G10L25/27, G10L25/45, G10L25/21, G10L25/93, G10L21/0216
Inventors: 马翼平, 张玮
Owner: AVIC HUADONG OPTOELECTRONICS (SHANGHAI) CO LTD