Interactive voice segmentation and clustering method, device and equipment

An interactive clustering method, applied in speech analysis, speech recognition, and related fields, that addresses inaccurate speech segmentation and clustering results and improves their accuracy.

Pending Publication Date: 2022-07-05

XIAMEN KUAISHANGTONG TECH CORP LTD


Problems solved by technology

[0004] In view of this, the object of the present invention is to propose an interactive speech segmentation and clustering method, device and equipment that address the inaccurate results of existing speech segmentation and clustering.

Embodiment Construction

[0037] To make the purposes, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention. Accordingly, the following detailed description of the embodiments provided in the accompanying drawings is not intended to limit the scope of the invention as claimed, but merely represents selected embodiments of the invention. Based on the embodiments of the present invention, all other embodiments obtained b...

Abstract

The invention discloses an interactive voice segmentation and clustering method, device, equipment and storage medium. The method comprises: preprocessing the audio data to be processed to obtain N classes of speech; reviewing the N classes by listening and merging the classes that belong to the same person to obtain M classes, where M corresponds to the number of people in the audio dialogue; based on the M classes, calculating a center vector for each class and the similarity of each speech segment contained in that class, and marking the segments whose similarity is lower than a preset value; and reviewing the marked segments by listening and reassigning them to obtain the audio classification result. The method can improve the accuracy of speech segmentation and clustering results.
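The merge-and-flag steps of the abstract can be illustrated with a minimal Python sketch. Several details here are assumptions, not taken from the patent: the similarity is taken to be cosine similarity between a segment's embedding and its class center, the center vector is taken to be the arithmetic mean of the class's embeddings, the preset value is set to 0.8, and the names `merge_classes` and `flag_low_similarity` are hypothetical.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def center_vector(vectors):
    """Element-wise mean of a list of vectors (assumed definition of the center)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def merge_classes(segments, mapping):
    """Relabel segments using the reviewer's N-to-M class mapping.

    segments: list of (segment_id, class_label, embedding)
    mapping:  dict from initial class label to merged speaker label
    """
    return [(seg_id, mapping[label], emb) for seg_id, label, emb in segments]


def flag_low_similarity(segments, threshold=0.8):
    """Return ids of segments whose similarity to their class center is below threshold."""
    by_speaker = defaultdict(list)
    for seg_id, label, emb in segments:
        by_speaker[label].append((seg_id, emb))

    flagged = []
    for members in by_speaker.values():
        center = center_vector([emb for _, emb in members])
        for seg_id, emb in members:
            if cosine(emb, center) < threshold:
                flagged.append(seg_id)
    return flagged


# Toy data: three initial classes (N = 3); the reviewer decides c0 and c1 are one person (M = 2).
demo = [
    ("s1", "c0", [1.0, 0.0]),
    ("s2", "c1", [0.9, 0.1]),
    ("s3", "c1", [0.0, 1.0]),  # sounds like the other speaker
    ("s4", "c2", [0.0, 1.0]),
    ("s5", "c2", [0.1, 0.9]),
]
merged = merge_classes(demo, {"c0": "A", "c1": "A", "c2": "B"})
to_review = flag_low_similarity(merged, threshold=0.8)
print(to_review)
```

In this toy run only `s3` falls below the assumed threshold, so it is the one segment a reviewer would listen to again and reassign.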

Description

Technical field

[0001] The present invention relates to the technical field of speech processing, and in particular to an interactive speech segmentation and clustering method, device and equipment.

Background

[0002] Speech segmentation and clustering addresses the problem of who spoke when in an audio recording. In a voice file in which multiple people talk alternately, it marks the start and end time of each person's speech. Based on this technology, subsequent voiceprint extraction for different speakers, automatic speech recognition, and target speaker detection can be performed.

[0003] At present, the main implementation method is to split the speech into segments, cluster the segments with a voiceprint algorithm, and then calculate the start and end times from the clustering result. Since the algorithm cannot directly obtain the number of speakers, a similarity threshold is generally used for clustering. However, this method has the probl...
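The conventional pipeline outlined in [0003] can be sketched roughly as follows. This is an illustration under stated assumptions, not the method of any particular system: the clustering is a simple greedy threshold scheme chosen for brevity, the similarity is cosine similarity, the threshold of 0.7 is arbitrary, and `threshold_cluster` and `speaker_turns` are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def threshold_cluster(embeddings, threshold=0.7):
    """Greedy clustering: join the most similar existing cluster, or open a new one.

    The number of clusters is not given in advance; it falls out of the threshold.
    """
    sums = []    # running sum of embeddings per cluster (same direction as the mean)
    labels = []
    for emb in embeddings:
        best, best_sim = None, threshold
        for idx, total in enumerate(sums):
            sim = cosine(emb, total)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            sums.append(list(emb))
            labels.append(len(sums) - 1)
        else:
            sums[best] = [x + y for x, y in zip(sums[best], emb)]
            labels.append(best)
    return labels


def speaker_turns(shards, labels):
    """Merge consecutive shards with the same label into (label, start, end) turns.

    shards: time-ordered list of (start_seconds, end_seconds)
    """
    turns = []
    for (start, end), label in zip(shards, labels):
        if turns and turns[-1][0] == label:
            turns[-1] = (label, turns[-1][1], end)
        else:
            turns.append((label, start, end))
    return turns


shards = [(0.0, 1.5), (1.5, 3.0), (3.0, 4.5), (4.5, 6.0)]
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = threshold_cluster(embeddings, threshold=0.7)
turns = speaker_turns(shards, labels)
print(labels, turns)
```

Because the cluster count depends entirely on the threshold, a threshold set too high splits one speaker into several clusters and one set too low merges different speakers, which is why the speaker count cannot be read off reliably from this kind of clustering alone.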

Application Information

IPC(8): G10L15/00, G10L15/01, G10L15/02, G10L15/04, G10L15/06, G10L25/27

CPC: G10L15/005, G10L15/01, G10L15/02, G10L15/04, G10L15/06, G10L25/27, G10L2015/0631

Inventor: 洪国强, 肖龙源, 李稀敏, 叶志坚

Owner XIAMEN KUAISHANGTONG TECH CORP LTD
