Segmentation clustering method and system for multi-person voice in complex environment

A segmentation and clustering technology for multi-person speech, applied in speech analysis, speech recognition, instruments, etc. It addresses the problems that existing approaches cannot solve the task directly and effectively, that data requirements are high, and that optimization of speaker segmentation and clustering tasks has not been seen. Its effects are strong discrimination, good discriminative ability, and fewer classification errors.

Active Publication Date: 2020-04-24
AISPEECH CO LTD
4 Cites, 14 Cited by

AI Technical Summary

Benefits of technology

The redundant clustering algorithm improves the accuracy of identifying individual speakers by grouping voice segments into clusters according to their similarity. This helps distinguish speakers even when many different voices are speaking at once, so that segments are not misidentified because of background noise or interference from other sources such as loudspeakers.

Problems solved by technology

This patent describes an algorithm called Vadelta, which helps detect and cluster sounds without misidentifying them because of noise interference between them. Current algorithms for complicated scenarios with many simultaneous talkers either require advanced signal analysis techniques, such as wavelets or convolutional networks, or are limited because they often produce incorrect results when some segments were misidentified during the final step. Optimizing speaker separation and clustering in such scenarios therefore remains a challenge.

Examples

Embodiment Construction

[0053] To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below in conjunction with the drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained from them by persons of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

[0054] In the following, the embodiments of the present application are introduced first. Experimental data are then used to show how the solution of the present application differs from the prior art and what beneficial effects it can achieve.

[0055] Please refer to Figure 1, which is a flow chart of a method for segmen...

Abstract

The invention discloses a segmentation and clustering method and system for multi-person voice in a complex environment. The method comprises the following steps: acquiring multiple continuous multi-person speaking voice segment audios from multi-person speaking audio; normalizing the voice segment audios according to acoustic features to obtain normalized audios; acquiring multiple sections of to-be-processed audio; extracting voiceprint information features from the to-be-processed audio sections; obtaining similarity scores among all the to-be-processed audio segments by setting scoring criteria; obtaining category labels for the multiple speakers from those similarity scores through a multi-stage redundant clustering algorithm; and segmenting and clustering the multi-person speaking audio according to the category labels. With the redundant clustering method, the cluster centers of the target speakers become more dispersed and more clearly distinguished. Unclear voice segments of a target speaker in a complex environment are also discriminated better, which reduces the speaker classification error of the segmentation and clustering task in complex environments.
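
The pipeline in the abstract (normalize, score, cluster in stages, label) can be illustrated with a small Python sketch. The abstract does not specify the scoring criterion or the details of the multi-stage redundant clustering, so everything concrete here is an assumption made for illustration: cosine similarity as the score, plain feature vectors standing in for voiceprint features, and a two-stage agglomerative scheme that first stops at more clusters than speakers (the "redundant" clusters) and then merges them down. The function names and the `redundancy` parameter are hypothetical and are not taken from the patent.

```python
import math

def normalize(vec):
    """Scale a feature vector to unit length (assumed stand-in for normalization)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def cosine_score(a, b):
    """Assumed scoring criterion: cosine similarity of two unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def centroid(vectors):
    """Unit-length mean of a group of vectors."""
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    return normalize(mean)

def average_linkage(ga, gb, emb):
    """Mean pairwise score between the segments of two groups."""
    total = sum(cosine_score(emb[i], emb[j]) for i in ga for j in gb)
    return total / (len(ga) * len(gb))

def centroid_linkage(ga, gb, emb):
    """Score between the centers of two groups."""
    ca = centroid([emb[i] for i in ga])
    cb = centroid([emb[i] for i in gb])
    return cosine_score(ca, cb)

def merge_until(groups, emb, target, linkage):
    """Repeatedly merge the two highest-scoring groups until `target` remain."""
    groups = [list(g) for g in groups]
    while len(groups) > target:
        candidates = [
            (linkage(groups[p], groups[q], emb), p, q)
            for p in range(len(groups))
            for q in range(p + 1, len(groups))
        ]
        _, p, q = max(candidates)
        groups[p].extend(groups[q])
        del groups[q]
    return groups

def cluster_segments(features, num_speakers, redundancy=2):
    """Return one speaker label per segment (hypothetical two-stage scheme)."""
    emb = [normalize(f) for f in features]
    singletons = [[i] for i in range(len(emb))]
    # Stage 1: over-cluster, keeping more clusters than speakers.
    over_target = min(len(emb), num_speakers * redundancy)
    over = merge_until(singletons, emb, over_target, average_linkage)
    # Stage 2: merge the redundant clusters down to the speaker count.
    final = merge_until(over, emb, num_speakers, centroid_linkage)
    labels = [0] * len(emb)
    for label, group in enumerate(final):
        for i in group:
            labels[i] = label
    return labels

# Toy 2-D "voiceprints": three segments per speaker.
segments = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0],
            [0.1, 1.0], [0.0, 0.9], [0.2, 1.0]]
labels = cluster_segments(segments, num_speakers=2)
```

In this toy run the first three segments end up with one label and the last three with another. Keeping surplus clusters in the first stage is one plausible reading of "redundant": uncertain segments can sit in their own small cluster until the second stage decides where they belong.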

Application Information

Owner AISPEECH CO LTD