Voice detection method based on multiple sound areas, related device and storage medium

A voice detection and sound area technology, applied in voice analysis, voice recognition, instruments, etc. It addresses the problems of large transmission loss of the voice signal, low signal strength on arrival at the microphone array, and the resulting poor processing effect.

Active Publication Date: 2020-10-27
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, when there are multiple speakers in the environment, judging the main speaker only by the strength of the arriving signal is flawed, because the main speaker may be farther from the microphone array than an interfering speaker. Even if the main speaker talks louder than the interfering speaker, the speech signal suffers greater propagation loss over the longer path, so its strength on reaching the microphone array may actually be lower, which degrades subsequent speech processing.
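The effect described in [0004] can be illustrated with a small worked example. This is only a sketch under stated assumptions: it uses idealized free-field spherical spreading (level drops by 20·log10 of the distance ratio), and the source levels and distances are made-up numbers, not values from the patent.

```python
import math


def received_level_db(source_level_db: float, distance_m: float,
                      ref_distance_m: float = 1.0) -> float:
    """Level at the microphone array under idealized spherical spreading.

    Assumption: free field, point source, no reverberation. The level
    falls by 20*log10(distance / reference distance).
    """
    return source_level_db - 20.0 * math.log10(distance_m / ref_distance_m)


# Hypothetical scene: the main speaker is louder but farther away.
main_speaker_db = received_level_db(source_level_db=70.0, distance_m=4.0)
interferer_db = received_level_db(source_level_db=64.0, distance_m=1.0)

# A strongest-arrival rule picks whichever level is higher at the array.
picked = "main speaker" if main_speaker_db > interferer_db else "interferer"
print(f"main: {main_speaker_db:.2f} dB, interferer: {interferer_db:.2f} dB")
print(f"strongest-arrival rule picks: {picked}")
```

With these assumed numbers the main speaker is 6 dB louder at the source but loses about 12 dB over the longer path, so a rule based only on arrival strength selects the interfering speaker.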



Embodiment Construction

[0087] The embodiments of the present application provide a multi-sound-area-based voice detection method, a related device, and a storage medium. In a multi-sound-source scenario, control signals are used to retain or suppress speech signals from different directions, so that the voice of each user can be separated and enhanced in real time. This improves the accuracy of voice detection and, in turn, the effect of subsequent voice processing.

[0088] The terms "first", "second", "third", "fourth", etc. (if any) in the description and claims of this application and in the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances, such that the embodiments of the application described herein can, for example, be practiced in sequences other than those illustrated or described herein. Furthermore, the terms "comprising...


Abstract

The invention discloses a voice detection method based on multiple sound areas, applied to the field of artificial intelligence. The voice detection method comprises the following steps: obtaining sound area information corresponding to each sound area in N sound areas; generating a control signal corresponding to each sound area according to the sound area information corresponding to that sound area; processing the voice input signal corresponding to each sound area with the control signal corresponding to that sound area to obtain a voice output signal corresponding to each sound area; and generating a voice detection result according to the voice output signal corresponding to each sound area. The invention further discloses a voice detection device and a storage medium. According to the invention, voice signals from different directions can be processed in parallel based on the multiple sound areas, and in a multi-sound-source scenario the voice signals from different directions are retained or suppressed by the control signals, so that the voice of each user can be separated and enhanced in real time and the accuracy of voice detection is improved.
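The four steps in the abstract can be sketched as a small per-area pipeline. This is a minimal illustration, not the patented implementation: the `SoundAreaInfo` fields, the retain/suppress gain, and the energy-threshold detector are all hypothetical stand-ins, since this excerpt does not specify what the sound area information contains or how the control signal is applied.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SoundAreaInfo:
    """Hypothetical sound area information (fields are assumptions)."""
    area_id: int
    user_present: bool


def generate_control_signal(info: SoundAreaInfo) -> float:
    """Step 2: map area information to a control signal.

    Assumption: 1.0 means "retain", 0.0 means "suppress".
    """
    return 1.0 if info.user_present else 0.0


def apply_control(voice_input: List[float], control: float) -> List[float]:
    """Step 3: process the area's input with its control signal.

    Assumption: a plain gain stands in for the real signal processing.
    """
    return [control * sample for sample in voice_input]


def detect_voice(voice_output: List[float], threshold: float = 0.01) -> bool:
    """Step 4: hypothetical detector based on mean signal energy."""
    if not voice_output:
        return False
    energy = sum(s * s for s in voice_output) / len(voice_output)
    return energy > threshold


def run_pipeline(infos: List[SoundAreaInfo],
                 inputs: Dict[int, List[float]]) -> Dict[int, bool]:
    """Run steps 1 to 4 independently for each of the N sound areas."""
    results: Dict[int, bool] = {}
    for info in infos:  # step 1: area information is supplied by the caller
        control = generate_control_signal(info)
        output = apply_control(inputs[info.area_id], control)
        results[info.area_id] = detect_voice(output)
    return results
```

Because each area is handled independently, the loop body could run in parallel across areas, which matches the abstract's point that signals from different directions are processed in parallel.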

Description

Technical field

[0001] The present application relates to the field of artificial intelligence, and in particular to a multi-sound-area-based voice detection method, a related device, and a storage medium.

Background

[0002] With the wide application of far-field voice in people's daily life, performing voice activity detection (VAD), separation, enhancement, and recognition for each possible sound source in a multi-sound-source (or multi-user) scenario has become a bottleneck that limits the voice interaction performance of many intelligent voice products.

[0003] In the traditional scheme, a monophonic pre-processing system is designed around a main speaker detection algorithm. The pre-processing system generally uses azimuth estimation combined with signal strength estimation, or azimuth estimation combined with spatial spectrum estimation, to estimate the speaker with the strongest signal energy (that is, the signal energy reaching the microphone array) ...


Application Information

IPC(8): G10L21/028, G10L25/27, G10L15/18, G10L15/22
CPC: G10L21/028, G10L25/27, G10L15/22, G10L15/1815, G10L2021/02166, G10L2021/02087, G10L21/0208, G10L25/84, G06T7/20, G06T2207/30201, G10L17/02, G10L17/22, G10L25/21
Inventor: 郑脊萌, 陈联武, 黎韦伟, 段志毅, 于蒙, 苏丹, 姜开宇
Owner TENCENT TECH (SHENZHEN) CO LTD