Voice separation method, voice separation device, electronic equipment and storage medium

A voice separation technology, applied to voice separation methods, voice separation devices, electronic equipment and storage media, that addresses the difficulty of separating short-duration speech and of accurately estimating the number of speakers.

Active Publication Date: 2021-06-08
北京远鉴信息技术有限公司

AI Technical Summary

Problems solved by technology

[0004] Because existing speech recognition technology has difficulty separating short-duration speech and cannot use context information to accurately estimate the number of speakers, the application provides a speech separation method, a speech separation device, electronic equipment and a storage medium.



Embodiment Construction

[0026] To make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. The described embodiments are only some of the embodiments of this application, not all of them. The components of the embodiments, as generally described and illustrated in the figures herein, may be arranged and designed in a variety of different configurations. Accordingly, the following detailed description of the embodiments provided in the accompanying drawings is not intended to limit the scope of the claimed application, but merely represents selected embodiments of the application. Based on the embodiments of the present application, every other embodiment obtained by those skilled in the art withou...


Abstract

The application provides a speech separation method, a speech separation device, electronic equipment, and a storage medium. The speech separation method includes: obtaining original audio and extracting a spectrogram feature sequence from it with a sliding time window; inputting the spectrogram feature sequence into a pre-trained speech segmentation model to obtain an embedded feature sequence; inputting the embedded feature sequence into a pre-trained speech clustering model to obtain the predicted label sequence corresponding to the embedded feature sequence; and performing single-speaker voice restoration based on the predicted label sequence to generate separated speech. The method, device, electronic equipment and storage medium of the present application address the problem of unsatisfactory speech separation: speech segments belonging to a single speaker can be separated from short-duration audio in which multiple people speak alternately, and the number of speakers can be accurately estimated using contextual information.
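The first and last steps of the pipeline in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the window length, hop size, Hann weighting, and the function names `sliding_windows`, `spectrogram_features` and `restore_speakers` are all assumptions made for illustration, and the two pre-trained models are not reproduced. A hand-written label sequence stands in for their output.

```python
import numpy as np

def sliding_windows(audio, win, hop):
    """Split a 1-D signal into overlapping windows of `win` samples, stepping by `hop`."""
    n = 1 + (len(audio) - win) // hop
    return np.stack([audio[i * hop : i * hop + win] for i in range(n)])

def spectrogram_features(audio, win=400, hop=160):
    """Return one magnitude spectrum per Hann-weighted window (assumed feature type)."""
    frames = sliding_windows(audio, win, hop) * np.hanning(win)
    return np.abs(np.fft.rfft(frames, axis=1))

def restore_speakers(audio, labels, hop=160):
    """Group audio by predicted label.

    Window i contributes only its first `hop` samples, so the pieces
    concatenated for each label do not overlap.
    """
    chunks = {}
    for i, label in enumerate(labels):
        chunks.setdefault(int(label), []).append(audio[i * hop : (i + 1) * hop])
    return {label: np.concatenate(parts) for label, parts in chunks.items()}

# Synthetic 2-second "conversation": a 200 Hz tone followed by an 800 Hz tone.
sr = 16000
t = np.arange(sr) / sr
audio = np.concatenate([np.sin(2 * np.pi * 200 * t), np.sin(2 * np.pi * 800 * t)])

feats = spectrogram_features(audio)

# Hypothetical label sequence standing in for the segmentation + clustering models:
# first half of the windows -> speaker 0, second half -> speaker 1.
labels = np.where(np.arange(len(feats)) < len(feats) // 2, 0, 1)

separated = restore_speakers(audio, labels)
```

In this sketch the number of speakers is simply the number of distinct labels, `len(separated)`. In the method described above, that estimate would come from the clustering model's predicted label sequence rather than from a fixed split.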

Description

Technical field

[0001] The present application relates to the technical field of voice processing, in particular to a voice separation method, a voice separation device, electronic equipment and a storage medium.

Background

[0002] The "cocktail party problem" is a classic problem in the field of computer speech processing: when a single speaker speaks, speech recognition technology can often accurately identify what the speaker is saying, but when the scene contains multiple speakers, the accuracy of speech recognition drops sharply.

[0003] Generally speaking, the audio to be separated contains a large amount of short-duration speech. Because short-duration speech carries little information and is not highly distinguishable, it is difficult to separate, and because context information cannot be used to accurately estimate the number of speakers, separation is difficult. The...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L15/02, G10L15/06, G10L15/16, G10L15/183, G10L15/26, G06F16/35, G06K9/62, G06N3/04, G06N3/08
CPC: G10L15/02, G10L15/063, G10L15/183, G10L15/16, G10L15/26, G06F16/35, G06N3/084, G06N3/045, G06F18/214
Inventor 史王雷王秋明
Owner 北京远鉴信息技术有限公司