Method and device for voice segmentation

A speech segmentation technology, applied in speech analysis, speech recognition, instruments, etc., that addresses the problems of low precision and poor segmentation effect, improving both segmentation precision and segmentation quality.

Active Publication Date: 2018-03-06

PING AN TECH (SHENZHEN) CO LTD

3 Cites, 0 Cited by


AI Technical Summary

Problems solved by technology

Traditional speech segmentation technology is based on the global background model and the Gaussian mixture model. Owing to technical limitations, the segmentation accuracy of this approach is low, and the segmentation effect is especially poor for dialogues in which speakers alternate frequently or overlap.



Embodiment Construction

[0045] The principles and features of the present invention are described below in conjunction with the accompanying drawings, and the examples given are only used to explain the present invention, and are not intended to limit the scope of the present invention.

[0046] Figure 1 is a schematic flow chart of an embodiment of the voice segmentation method of the present invention. As shown in Figure 1, the method includes the following steps:

[0047] Step S1: when the automatic answering system receives mixed voice sent by a terminal, it divides the mixed voice into multiple short voice segments and labels each short voice segment with a corresponding speaker identifier;

[0048] This embodiment can be applied to an automatic answering system of a call center, for example, an automatic answering system of an insurance call center, an automatic answering system of various customer service call centers, and the like. The automatic ...
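Step S1 above can be illustrated with a minimal sketch. The excerpt does not say how the mixed voice is split or how speaker identifiers are assigned, so everything below is an assumption made for illustration: fixed-length windows stand in for the short voice segments, mean absolute amplitude stands in for a real speaker feature, and a two-cluster 1-D k-means stands in for the labeling. The names `Segment`, `split_fixed`, and `label_segments` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Segment:
    start: int    # index of the first sample (inclusive)
    end: int      # index past the last sample (exclusive)
    speaker: int  # speaker identifier assigned to this short segment


def split_fixed(n_samples: int, seg_len: int) -> List[Tuple[int, int]]:
    """Cut [0, n_samples) into consecutive windows of at most seg_len samples."""
    if seg_len <= 0:
        raise ValueError("seg_len must be positive")
    return [(s, min(s + seg_len, n_samples)) for s in range(0, n_samples, seg_len)]


def mean_abs(samples: Sequence[float], start: int, end: int) -> float:
    """Toy per-segment feature (an assumption, not a real voiceprint feature)."""
    return sum(abs(x) for x in samples[start:end]) / (end - start)


def label_segments(samples: Sequence[float], seg_len: int, n_iter: int = 10) -> List[Segment]:
    """Split the mixed signal and give each short segment a speaker id (0 or 1)."""
    if not samples:
        return []
    bounds = split_fixed(len(samples), seg_len)
    feats = [mean_abs(samples, s, e) for s, e in bounds]
    # Two-speaker 1-D k-means, seeded with the smallest and largest feature value.
    centers = [min(feats), max(feats)]
    labels = [0] * len(feats)
    for _ in range(n_iter):
        labels = [0 if abs(f - centers[0]) <= abs(f - centers[1]) else 1 for f in feats]
        for k in (0, 1):
            members = [f for f, lab in zip(feats, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return [Segment(s, e, lab) for (s, e), lab in zip(bounds, labels)]


# A quiet speaker, then a loud speaker, then the quiet speaker again.
mixed = [0.1] * 40 + [0.9] * 40 + [0.1] * 20
segments = label_segments(mixed, seg_len=20)
```

In this toy input the five 20-sample segments separate cleanly by loudness; real mixed voice would need proper acoustic features and a labeling method the excerpt does not describe.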


Abstract

Disclosed are a speech segmentation method and device. The speech segmentation method comprises: when mixed speech sent by a terminal is received, segmenting the mixed speech into a plurality of short speech segments, and labelling each short speech segment with a corresponding speaker identifier (S1); and using a recurrent neural network to build a voiceprint model from the short speech segments corresponding to each speaker identifier, and adjusting the corresponding segmentation boundaries in the mixed speech based on the voiceprint models so as to segment out the effective speech segment corresponding to each speaker identifier (S2). The method can effectively improve the precision of speech segmentation and gives a better segmentation effect, especially for conversations with frequent speaker alternation and for overlapping speech.
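The boundary adjustment in step S2 can be sketched under stated assumptions. The abstract says the boundaries are adjusted based on per-speaker voiceprint models built with a recurrent neural network, but gives no detail on how. The sketch below assumes those models have already produced a score per frame and per speaker (plain numbers here, no RNN), reassigns each frame to the best-scoring speaker, smooths isolated flips with a small majority vote, and merges runs into segments. The function names and the smoothing step are illustrative choices, not the patent's procedure.

```python
from typing import List, Sequence, Tuple


def argmax(row: Sequence[float]) -> int:
    """Index of the highest score in one frame's score row."""
    return max(range(len(row)), key=lambda k: row[k])


def smooth(labels: List[int], radius: int = 1) -> List[int]:
    """Majority vote over a window; a frame keeps its label unless another
    label has strictly more votes in the window."""
    out = []
    for t in range(len(labels)):
        window = labels[max(0, t - radius): t + radius + 1]
        best = labels[t]
        for cand in sorted(set(window)):
            if window.count(cand) > window.count(best):
                best = cand
        out.append(best)
    return out


def merge_runs(labels: List[int]) -> List[Tuple[int, int, int]]:
    """Collapse consecutive equal labels into (start, end, speaker) segments,
    with end exclusive."""
    segs = []
    start = 0
    for t in range(1, len(labels) + 1):
        if t == len(labels) or labels[t] != labels[start]:
            segs.append((start, t, labels[start]))
            start = t
    return segs


def adjust_boundaries(frame_scores: Sequence[Sequence[float]], radius: int = 1):
    """frame_scores[t][k] is the (assumed) score of speaker k's model at frame t."""
    labels = [argmax(row) for row in frame_scores]
    return merge_runs(smooth(labels, radius))


# Hypothetical scores for two speakers over seven frames; frames 2 and 3
# are noisy around the true speaker change.
scores = [
    [0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.7, 0.3],
    [0.2, 0.8], [0.1, 0.9], [0.2, 0.8],
]
effective = adjust_boundaries(scores)
```

With these scores the raw per-frame decisions flip back and forth at frames 2 and 3; the majority vote settles the boundary at frame 3, giving one segment per speaker.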

Description

Technical field

[0001] The invention relates to the technical field of speech processing, in particular to a method and device for speech segmentation.

Background

[0002] At present, many of the voice recordings received by call centers contain the voices of multiple people mixed together. In such cases the voice needs to be segmented (speaker diarization) before further voice analysis can be performed on the target voice. Speech segmentation means that, when the voices of multiple speakers are combined and recorded in one channel, the voice of each speaker in the signal is extracted separately. Traditional speech segmentation technology is based on the global background model and the Gaussian mixture model. Owing to technical limitations, the segmentation accuracy of this approach is low, and the segmentation effect is especially poor for dialogues in which speakers alternate frequently or overlap.

Contents of the invention

[0003] The object of the p...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G10L15/04

CPC: G10L15/04

Inventor: 王健宗, 郭卉, 肖京

Owner: PING AN TECH (SHENZHEN) CO LTD
