Voice separation method, voice separation device, electronic equipment and storage medium

A voice separation technology, applied to voice separation devices, electronic equipment and storage media, that addresses problems such as the difficulty of separating short-duration speech and of estimating the number of speakers.

Active Publication Date: 2021-06-08

北京远鉴信息技术有限公司

5 cites, 0 cited by



Problems solved by technology

[0004] Existing speech recognition technology has difficulty separating short-duration speech and cannot use context information to estimate the number of speakers accurately. In view of this, the application provides a speech separation method, a speech separation device, electronic equipment and a storage medium.



Embodiment Construction

[0026] In order to make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. The components of the embodiments, as generally described and illustrated in the figures herein, may be arranged and designed in a variety of different configurations. Accordingly, the following detailed description of the embodiments provided in the accompanying drawings is not intended to limit the scope of the claimed application, but merely represents selected embodiments of the application. Based on the embodiments of the present application, every other embodiment obtained by those skilled in the art withou...


Abstract

The application provides a speech separation method, a speech separation device, electronic equipment, and a storage medium. The speech separation method includes: obtaining the original audio and extracting a spectrogram feature sequence from it using a sliding time window; inputting the spectrogram feature sequence into a pre-trained speech segmentation model to obtain an embedded feature sequence; inputting the embedded feature sequence into a pre-trained speech clustering model to obtain the predicted label sequence corresponding to the embedded feature sequence; and performing single-speaker voice restoration based on the predicted label sequence to generate the separated speech. The method, device, electronic equipment and storage medium of the present application address the problem of unsatisfactory speech separation: speech segments belonging to a single speaker can be separated from short-duration audio files in which multiple people speak alternately, and the number of speakers can be accurately estimated using contextual information.
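
The four steps in the abstract can be sketched in Python with NumPy. This is only an illustrative outline, not the patented implementation. The abstract does not describe the two pre-trained models, so `embed` and `cluster` are hypothetical stand-ins: L2 normalisation in place of the segmentation model, and greedy cosine-similarity clustering in place of the clustering model. All function names, the window and hop parameters, and the 0.5 threshold are assumptions made for this example.

```python
import numpy as np


def spectrogram_windows(audio, win, hop):
    """Slide a fixed-length window over the audio and return one
    log-magnitude spectrum per window, plus each window's start sample."""
    starts = np.arange(0, len(audio) - win + 1, hop)
    taper = np.hanning(win)
    feats = np.stack([
        np.log1p(np.abs(np.fft.rfft(audio[s:s + win] * taper)))
        for s in starts
    ])
    return feats, starts


def embed(feats):
    """Stand-in for the pre-trained segmentation model (assumption):
    the embedding is simply the L2-normalised feature vector."""
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    return feats / np.maximum(norms, 1e-12)


def cluster(embeddings, threshold=0.5):
    """Stand-in for the pre-trained clustering model (assumption):
    greedy cosine-similarity clustering that opens a new cluster, i.e. a
    new speaker, whenever no existing centroid is similar enough."""
    sums, labels = [], []
    for e in embeddings:
        sims = [float(e @ (c / max(np.linalg.norm(c), 1e-12))) for c in sums]
        if sims and max(sims) >= threshold:
            k = int(np.argmax(sims))
            sums[k] = sums[k] + e      # update the running centroid sum
        else:
            k = len(sums)              # no match: start a new speaker
            sums.append(e.copy())
        labels.append(k)
    return np.array(labels)


def restore(audio, starts, win, labels):
    """For each predicted label, gather the samples covered by that
    label's windows and join them into one single-speaker signal."""
    separated = {}
    for k in np.unique(labels):
        mask = np.zeros(len(audio), dtype=bool)
        for s in starts[labels == k]:
            mask[s:s + win] = True
        separated[int(k)] = audio[mask]
    return separated


def separate(audio, win, hop):
    """Run the four steps: features, embeddings, labels, restoration."""
    feats, starts = spectrogram_windows(audio, win, hop)
    labels = cluster(embed(feats))
    return labels, restore(audio, starts, win, labels)
```

Calling `separate(audio, win, hop)` on a one-dimensional sample array at least `win` samples long returns the per-window label sequence and a dictionary mapping each label to its concatenated audio. The number of distinct labels serves as the estimated speaker count. Because the stand-in clustering opens clusters on the fly, that count does not need to be fixed in advance, which loosely mirrors the abstract's goal of estimating the number of speakers.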

Description

Technical field

[0001] The present application relates to the technical field of voice processing, in particular to a voice separation method, a voice separation device, electronic equipment and a storage medium.

Background

[0002] The "cocktail party problem" is a classic problem in the field of computer speech processing. It refers to the fact that when a single speaker is talking, speech recognition technology can often accurately identify what the speaker is saying, but when the scene contains multiple speakers, the accuracy of speech recognition is greatly reduced.

[0003] Generally speaking, the audio to be separated contains a large amount of short-duration speech. Because short-duration speech carries little information and is not highly distinguishable, it is difficult to separate, and because context information cannot be used to accurately estimate the number of speakers, separation is harder still. The...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G10L15/02, G10L15/06, G10L15/16, G10L15/183, G10L15/26, G06F16/35, G06K9/62, G06N3/04, G06N3/08

CPC: G10L15/02, G10L15/063, G10L15/183, G10L15/16, G10L15/26, G06F16/35, G06N3/084, G06N3/045, G06F18/214

Inventor: 史王雷, 王秋明

Owner: 北京远鉴信息技术有限公司
