A Voice Separation and Tracking Method for Public Security Criminal Investigation Monitoring

A voice separation and tracking technology for criminal investigation, applied in voice analysis, instrumentation, computing, etc. It addresses the unstable configuration of nonlinear multi-microphone combinations, the high time complexity of training and testing, and the inability to achieve end-to-end voice tracking. Its effects are real-time speech separation and tracking, reduced delay, and reduced generalization error.

Active Publication Date: 2021-07-09

GUANGDONG UNIV OF TECH


Problems solved by technology

[0004] 1. Aligning and capturing the position information of multiple target speakers by combining multiple microphone arrays. This method suffers from the nonlinear combination of multiple microphones and from unstable configuration;

[0005] 2. Using visual information as auxiliary information to improve the performance of the speech separation and tracking system. This method must process and analyze speech information and visual information simultaneously, and in practical applications the collected audio and images are delayed, so the method cannot be applied;

[0006] 3. Feeding the effective bit coding vector or the speech information of the target speaker to the speech separation system as an additional input. This method cannot achieve end-to-end speech tracking, and because the identity information of the target speaker is introduced as input, its time complexity in training and testing is too high compared with a standalone speech tracking algorithm.



Embodiment Construction

[0078] The accompanying drawings are for illustrative purposes only and cannot be construed as limiting the patent;

[0079] To better illustrate this embodiment, some parts in the drawings may be omitted, enlarged, or reduced, and they do not represent the dimensions of the actual product;

[0080] For those skilled in the art, it is understandable that some well-known structures and descriptions thereof may be omitted in the drawings.

[0081] The technical solutions of the present invention will be further described below in conjunction with the accompanying drawings and embodiments.

[0082] Figure 1 shows a flow chart of the voice separation and tracking method for public security criminal investigation monitoring in this embodiment.

[0083] The voice separation and tracking method for public security criminal investigation monitoring proposed in this embodiment comprises the following steps:

[0084] S1. Import the initial speech accord...


Abstract

The invention relates to the technical field of voice signal recognition and processing, and proposes a voice separation and tracking method for public security criminal investigation monitoring, comprising the following steps: importing the initial speech in time order and applying frame-by-frame windowing to it to obtain a windowed speech signal; performing time-frequency decomposition on the windowed speech signal, obtaining a two-dimensional time-frequency signal through the short-time Fourier transform; performing endpoint detection on the time-frequency signal in the frequency domain and applying filtering to the signal segments that correspond to empty speech segments; using a bidirectional long short-term memory network structure to perform speech separation on the filtered time-frequency signal, and outputting the speech waveforms of multiple target speakers; establishing and training a target speaker model based on GMM-UBM, taking the speech waveforms of the target speakers as model input, obtaining the GMM model of each target speaker through adaptation, then recognizing the speech waveforms and outputting the serial number of the target speaker, which is the speech tracking result.
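The front-end steps named in the abstract (framing and windowing, short-time Fourier transform, frequency-domain endpoint detection) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not taken from the patent: the Hann window, the frame length and hop, the energy-ratio threshold, the decision to drop empty frames rather than filter them some other way, and all function names. The naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def frame_signal(signal, frame_len, hop):
    """Split samples into overlapping frames, preserving time order."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def hann_window(n):
    """Periodic Hann window of length n (window choice is an assumption)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency bins, via a naive O(n^2) DFT."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def stft(signal, frame_len=64, hop=32):
    """Time-frequency magnitudes: one list of bin magnitudes per frame."""
    window = hann_window(frame_len)
    return [dft_magnitudes([x * w for x, w in zip(frame, window)])
            for frame in frame_signal(signal, frame_len, hop)]


def detect_speech_frames(spectrogram, ratio=0.1):
    """Flag frames whose spectral energy exceeds ratio * the maximum energy.

    A simple energy rule standing in for the patent's endpoint detection,
    whose exact criterion is not given in the abstract.
    """
    energies = [sum(m * m for m in frame) for frame in spectrogram]
    threshold = ratio * max(energies)
    return [e > threshold for e in energies]


def drop_empty_frames(spectrogram, flags):
    """Keep only the frames flagged as containing signal."""
    return [frame for frame, keep in zip(spectrogram, flags) if keep]


# Toy input: silence, a tone centered on DFT bin 8, then silence again.
tone = [math.sin(2.0 * math.pi * 8 * t / 64) for t in range(256)]
signal = [0.0] * 256 + tone + [0.0] * 256

spec = stft(signal)
flags = detect_speech_frames(spec)
kept = drop_empty_frames(spec, flags)
```

In this sketch `kept` plays the role of the filtered time-frequency signal that would be passed on to the separation network; a practical system would use an FFT-based transform and keep phase information so that waveforms can be reconstructed.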

Description

Technical field

[0001] The invention relates to the technical field of voice signal recognition and processing, and more specifically to a voice separation and tracking method for public security criminal investigation monitoring.

Background technique

[0002] In the field of public security criminal investigation and monitoring, it is difficult to extract important information from an audio clip because the clip contains interference such as background noise, multiple speakers, and reverberation. Therefore, when processing the speech signal, the speech signals of the individual speakers must first be separated and then processed separately. At the same time, because of the particular nature of criminal investigation monitoring, the voice signals of multiple speakers are collected by the same pickup, which makes them difficult to separate and process. In addition, in the actual monitoring process ...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G10L17/06, G10L17/18, G10L21/0272, G10L25/78, G06K9/62

CPC: G10L17/06, G10L17/18, G10L21/0272, G10L25/78, G06F18/23213

Inventor: 郝敏, 李扬, 刘航

Owner: GUANGDONG UNIV OF TECH
