Sound separation method, sound separation device and computer readable storage medium

A sound separation method, applied in the computer field, that addresses the inability of existing systems to recognize a target sentence in a noisy environment, to strengthen the voice of a selected speaker, and to weaken the volume of other speakers.

Active Publication Date: 2019-09-17

PING AN TECH (SHENZHEN) CO LTD

6 Cites, 19 Cited by

Problems solved by technology

[0002] Shortcomings of existing systems: In a noisy indoor environment, such as a cocktail party, many sound sources are active at the same time. Human hearing can easily focus on one sound in such an environment and "block out" the others. Existing computer speech recognition systems, however, cannot accurately identify a target sentence in a noisy environment, and cannot strengthen the voice of a selected person while weakening the volume of other people. In other words, existing systems fail to solve the "cocktail party effect" problem.

Embodiment Construction

[0049] It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0050] The invention provides a sound separation method. Figure 1 is a schematic flowchart of a sound separation method provided by an embodiment of the present invention. The method may be performed by a device, and the device may be implemented in software and/or hardware.

[0051] In this embodiment, the sound separation method includes:

[0052] S10. Obtain original audio and video samples.

[0053] In this embodiment, the original audio and video samples include audio and video from multiple application scenarios. For example, the historical audio and video files of a conference room are obtained, and about 10,000 hours of audio and video data are selected from them.

[0054] S11. Divide the original audio and video samples into multiple audio and video segments, and extract the video stream and au...
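As a rough illustration of the segmentation part of step S11, the sketch below computes the time boundaries of consecutive segments for one recording. The fixed segment length of 3 seconds and the function name `segment_bounds` are assumptions made for this example; the visible text does not state how long each segment is or how the streams are extracted.

```python
import math


def segment_bounds(duration_s, segment_s=3.0):
    """Return (start, end) times in seconds that tile [0, duration_s].

    segment_s is a hypothetical segment length; the last segment is
    shortened so that it ends exactly at duration_s.
    """
    if duration_s < 0 or segment_s <= 0:
        raise ValueError("duration_s must be >= 0 and segment_s must be > 0")
    count = math.ceil(duration_s / segment_s)
    bounds = []
    for i in range(count):
        start = i * segment_s  # computed from the index to avoid float drift
        end = min(start + segment_s, duration_s)
        bounds.append((start, end))
    return bounds


if __name__ == "__main__":
    for start, end in segment_bounds(10.0):
        print(f"segment from {start:.1f}s to {end:.1f}s")
```

Each pair of boundaries could then be handed to whatever tool demultiplexes the video stream and the audio stream for that time range.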

Abstract

The invention discloses a sound separation method comprising the following steps: dividing an original audio and video sample into a plurality of audio and video segments, and extracting a video stream and an audio stream from each audio and video segment; determining face features in the video stream of each audio and video segment; obtaining audio features from the audio stream of each audio and video segment by using an audio conversion compression method; combining the face features and the audio features of each audio and video segment to generate audio-visual features for that segment; taking the audio-visual features of each audio and video segment as the input of a sound separation model, and training the sound separation model to obtain a trained sound separation model; and taking target audio and video data as the input of the trained sound separation model, and outputting the audio data of a person in the target audio and video data. The invention further provides a sound separation device and a computer readable storage medium. According to the invention, accurate mapping between a sound and its speaker can be achieved, and voice separation quality is significantly improved.
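The feature steps in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the abstract does not say what the "audio conversion compression method" is, so the sketch assumes a short-time Fourier transform followed by power-law compression, and it assumes that face features and audio features are combined by aligning them in time and concatenating them. All function names, frame sizes, and the exponent 0.3 are hypothetical.

```python
import numpy as np


def stft_magnitude(signal, frame_len=256, hop=128):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform.

    Returns an array of shape (n_frames, frame_len // 2 + 1).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


def compress(magnitude, power=0.3):
    """Power-law compression (assumed form of the 'compression' step)."""
    return magnitude ** power


def fuse(face_features, audio_features):
    """Concatenate face and audio features frame by frame.

    Video usually has fewer frames than the spectrogram, so each audio
    frame is paired with the face feature at the same relative position.
    """
    n_audio = audio_features.shape[0]
    n_face = face_features.shape[0]
    idx = (np.arange(n_audio) * n_face) // n_audio
    return np.concatenate([face_features[idx], audio_features], axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio_stream = rng.standard_normal(16000)    # placeholder for one audio segment
    face_features = rng.standard_normal((25, 8))  # placeholder per-frame face features
    audio_features = compress(stft_magnitude(audio_stream))
    audio_visual = fuse(face_features, audio_features)
    print(audio_features.shape, audio_visual.shape)
```

The resulting audio-visual array is the kind of per-segment input that a separation model would be trained on; the model itself is not sketched here because the abstract gives no detail about its architecture.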

Description

Technical field

[0001] The present invention relates to the field of computer technology, in particular to a sound separation method, device and computer-readable storage medium.

Background

[0002] Shortcomings of existing systems: In a noisy indoor environment, such as a cocktail party, many sound sources are active at the same time. Human hearing can easily focus on one sound in such an environment and "block out" the others. Existing computer speech recognition systems, however, cannot accurately identify a target sentence in a noisy environment, and cannot strengthen the voice of a selected person while weakening the volume of other people. In other words, existing systems fail to solve the "cocktail party effect" problem.

Contents of the invention

[0003] The present invention provides a sound separation method, device and computer-readable storage medium, the main purp...

Application Information

IPC(8): G10L21/0208, G10L21/0272, G10L25/30, G10L25/57, G06K9/00, G06K9/62

CPC: G10L21/0208, G10L21/0272, G10L25/30, G10L25/57, G10L2021/02087, G06V40/172, G06F18/241

Inventor: 王健宗, 程宁

Owner: PING AN TECH (SHENZHEN) CO LTD
