Overlapped voice detection method and system

An overlapped voice detection technology, applied in voice analysis, instruments, etc., that addresses the problems of being unable to identify who is speaking and being unable to handle overlapping voices

Inactive Publication Date: 2012-09-19
RICOH KK

AI Technical Summary

Problems solved by technology

US7295970B discloses a method that trains and uses separate speaker models for overlapping voices. Although that patent also describes a way to detect overlapping speech and to distinguish it from single-speaker speech, the disclosed method cannot identify who is speaking within an overlapping speech segment.
[0004] US76468




Embodiment Construction

[0022] Hereinafter, specific embodiments of the present invention will be described in detail with reference to the accompanying drawings.

[0023] Figure 1 shows a flow chart of the overlapping speech detection method according to the present invention. First, at step S11, a voice input is received through the voice input module 301, which is, for example, the recording component of a voice recording device. Then, at step S12, the input voice is sent to the voice segmentation module 302, which divides the received voice data into a plurality of voice segments in time order. The segmentation is based on voice energy, and each resulting voice segment is between 100 milliseconds and 1 second long; for example, a segment may be 200 milliseconds, 300 milliseconds, or 500 milliseconds long.
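Step S12 can be illustrated with a minimal sketch. The text only states that segmentation is energy-based and that segments last between 100 milliseconds and 1 second, so everything else here is an assumption for illustration: the function names `frame_energies` and `segment_by_energy`, the 10 ms frame size, the relative energy threshold, and the rule of cutting at the first low-energy frame once the minimum length is reached. It is not the patent's actual procedure.

```python
from typing import List, Tuple


def frame_energies(samples: List[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame.

    Samples after the last full frame are ignored.
    """
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def segment_by_energy(
    samples: List[float],
    sample_rate: int,
    frame_ms: int = 10,          # assumed analysis frame size
    min_ms: int = 100,           # lower bound from the text
    max_ms: int = 1000,          # upper bound from the text
    rel_threshold: float = 0.1,  # assumed: "low energy" = below 10% of peak
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of time-ordered segments."""
    frame_len = sample_rate * frame_ms // 1000
    energies = frame_energies(samples, frame_len)
    if not energies:
        return []
    threshold = rel_threshold * max(energies)
    min_frames = min_ms // frame_ms
    max_frames = max_ms // frame_ms

    segments = []
    seg_start = 0  # index of the first frame of the current segment
    for i, energy in enumerate(energies):
        length = i - seg_start + 1
        # Cut at a low-energy frame once the segment is long enough,
        # or force a cut when the maximum length is reached.
        if (length >= min_frames and energy < threshold) or length >= max_frames:
            segments.append((seg_start * frame_len, (i + 1) * frame_len))
            seg_start = i + 1
    # Keep whatever remains; this last segment may be shorter than min_ms.
    if seg_start < len(energies):
        segments.append((seg_start * frame_len, len(energies) * frame_len))
    return segments
```

With a 1 kHz sample rate, a signal of 300 ms of sound, 50 ms of silence, and another 300 ms of sound is cut at the first silent frame, while a continuous 2.5 s signal is cut every second by the maximum-length rule.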

[0024] Subsequently, at step S13, the non-speech segment removal module 303 detects the non-speech segment ...



Abstract

The invention provides an overlapped voice detection method and an overlapped voice detection system. The method comprises the following steps: based on the Bayesian information criterion, finding the voice segments that contain the voice of only a single speaker among the plurality of voice segments in the overlapped voice, and assigning the same identifier to voice segments belonging to the same speaker; randomly selecting sample data from the voice segments of each class (segments sharing an identifier) and combining the selected sample data, so as to obtain combination results that reflect all possible voice overlaps; establishing a single-speaker voice segment model and an overlapped voice segment model on the basis of the obtained single-speaker voice segments and the combined multi-speaker overlapped voice segments; and finally detecting each voice segment by using the single-speaker voice segment model and the overlapped voice segment model, and labeling each voice segment according to the detection result.
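The combination step can be sketched as follows, under stated assumptions. The abstract does not say how sample data are combined or how the overlap possibilities are enumerated. This sketch assumes that every subset of two or more speakers counts as one overlap possibility and that combining means adding waveform samples. The name `synthesize_overlaps` and the `pools` structure are hypothetical.

```python
import itertools
import random
from typing import Dict, List, Tuple


def synthesize_overlaps(
    pools: Dict[str, List[List[float]]],
    rng: random.Random,
) -> Dict[Tuple[str, ...], List[float]]:
    """Build one synthetic overlapped segment per speaker combination.

    pools maps a speaker identifier to that speaker's single-speaker
    segments, each segment being a list of waveform samples.
    """
    speakers = sorted(pools)
    mixtures = {}
    # Assumption: every subset of two or more speakers is one
    # "overlap possibility".
    for size in range(2, len(speakers) + 1):
        for combo in itertools.combinations(speakers, size):
            # Randomly pick one segment from each speaker in the subset.
            chosen = [rng.choice(pools[s]) for s in combo]
            # Assumption: combine by adding samples, truncated to the
            # shortest chosen segment.
            length = min(len(seg) for seg in chosen)
            mixtures[combo] = [
                sum(seg[i] for seg in chosen) for i in range(length)
            ]
    return mixtures
```

For three labeled speakers this yields four synthetic classes (three pairs and one triple), which could then serve as training data for the overlapped voice segment model alongside the original single-speaker segments.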

Description

Technical field

[0001] The invention relates to a method and a system for labeling speakers in a segment of speech, and in particular to identifying the speakers in a segment of speech.

Background technique

[0002] In some practical applications, a piece of speech must be analyzed to determine how many people are speaking in it and which parts are spoken by which person. This is especially important when a piece of speech contains multiple speakers. At present, especially in conference settings, overlapping speech (several people speaking at once) is a major source of error in speaker labeling. Current speech annotation systems struggle to correctly identify overlapping speech segments that contain multiple speakers. Existing speech segment recognition systems can usually recognize only segments that contain a single speaker, and for speech segments containing multiple speakers,...


Application Information

IPC(8): G10L11/00, G10L25/27, G10L25/90
Inventor: 尹悦燕, 鲁耀杰, 王磊, 史达飞, 郑继川
Owner: RICOH KK