Monophonic speaker separation model, training method and separation method

A monophonic speaker separation technology, applied in the field of deep learning, that addresses problems such as customer emotional fluctuations and interference with clustering algorithms, achieving low production cost, high accuracy, and strong robustness.

Inactive Publication Date: 2020-04-14
浙江百应科技有限公司

AI Technical Summary

Problems solved by technology

However, in scenarios such as collections, there are many scene sounds, and the clustering algorithm is easily disturbed by outliers (noise, car horns, etc.). At the same time, the emotions of customer service staff and customers often fluctuate during the collection process, making accurate speaker separation impossible.



Embodiment Construction

[0035] The technical solutions of the present invention will be further described in detail below through specific embodiments in combination with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0036] At present, some speaker separation is still done by manual listening and manual separation, while other speaker separation solutions use unsupervised learning: the recording is divided into small audio segments, and features are extracted from each segment for clustering. However, in scenarios such as sales, return visits, and collections, there are often a large number of scene sounds, and the clustering algorithm is easily distu...


Abstract

The invention discloses a monophonic speaker separation model, a training method and a separation method. The monophonic speaker separation method comprises the following steps: acquiring audio containing a first speaker and a second speaker; segmenting the audio to obtain at least one segmented audio; inputting at least one part of the segmented audio to the monophonic speaker separation model to obtain at least one first embedding corresponding to that segmented audio; inputting pre-recorded audio containing only the second speaker to the monophonic speaker separation model to obtain a second embedding corresponding to that audio; and judging whether the cosine similarity between the at least one first embedding and the second embedding is less than a preset threshold. If so, the corresponding segmented audio is determined to be first-speaker audio; if not, it is determined to be second-speaker audio.
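
The decision step described in the abstract can be sketched as follows. This is a minimal illustration and not the patented implementation: the embeddings are assumed to be plain lists of floats already produced by the separation model, and the function names, label strings, and the default threshold of 0.7 are hypothetical choices made only for this example.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def label_segments(
    segment_embeddings: List[Vector],
    reference_embedding: Vector,
    threshold: float = 0.7,  # hypothetical value; the source only says "preset threshold"
) -> List[str]:
    """Label each segment by comparing it with the second speaker's reference embedding.

    Similarity below the threshold -> first speaker;
    otherwise -> second speaker (the speaker with pre-recorded audio).
    """
    labels = []
    for embedding in segment_embeddings:
        similarity = cosine_similarity(embedding, reference_embedding)
        labels.append("first_speaker" if similarity < threshold else "second_speaker")
    return labels


# Toy 2-dimensional embeddings, used only to exercise the rule.
reference = [1.0, 0.0]
segments = [[1.0, 0.0], [0.0, 1.0]]
print(label_segments(segments, reference))
```

In practice the threshold would presumably be tuned on labelled recordings, since the source does not state a value.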

Description

Technical field

[0001] The invention relates to the field of deep learning, in particular to a monophonic speaker separation model, a training method and a separation method.

Background

[0002] At present, in sales, return visits, collections and other scenarios, most companies still record calls in a single (monophonic) channel. Because the voices of the customer and the customer service agent are on the same channel, once the recording is converted into text by ASR (Automatic Speech Recognition), it is impossible to know whether a given piece of text belongs to the customer or to the agent, so recordings have to be listened to manually one by one during voice quality inspection. Moreover, some recordings are several minutes long while the useful information lasts only a few seconds. This wastes a great deal of enterprise resources: it greatly increases labor cost, lowers efficiency, cannot guarantee the quality of the inspection, and m...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L21/0272, G10L21/0308, G06N3/08, G06N3/04
CPC: G10L21/0272, G10L21/0308, G06N3/08, G06N3/045
Inventor: 王磊
Owner: 浙江百应科技有限公司