Voiceprint recognition method, device, equipment and storage medium based on distance coding

A technology combining voiceprint recognition and distance coding, applied in speech analysis, instruments and related fields. It addresses the problems of ignoring the relationships among data in the original space, poor general adaptability, and poor voiceprint recognition, and thereby achieves good recognition performance with improved sensitivity and learning effect.

Active Publication Date: 2022-05-03

SICHUAN CHANGHONG ELECTRIC CO LTD

14 Cites, 0 Cited by


Problems solved by technology

[0005] The present invention provides a method, device, equipment and storage medium for voiceprint recognition based on distance coding. It addresses the problem that existing techniques consider only the class label of the voice data in the target space and ignore the relationships among the data in the original space. Under such a system, data that lie relatively close together in the original space may end up relatively far apart after being mapped into the embedding vector space, which degrades voiceprint recognition. It also addresses the low efficiency and limited general adaptability of existing voiceprint recognition systems.


Examples


Embodiment 1

[0039] Referring to Figure 1, this embodiment provides a voiceprint recognition method based on distance coding, including a training phase and a recognition phase. The training phase includes:

[0040] Step S11: acquire speech data with speaker labels and extract a basic feature representation for each utterance, thereby forming a training set.

[0041] In this step, the basic feature representation of speech can be a frequency feature, such as Mel-frequency cepstral coefficients (MFCC) or constant-Q cepstral coefficients (CQCC), or an embedded representation extracted by a neural network, such as a d-vector or x-vector. The speech data may be complete sentences, or speech fragments divided into speech units such as phonemes, syllables, and words.

[0042] In this embodiment, we illustrate with a training set formed from a database covering 200 people with a total of 100,000 utterances. Each piece of data in the dataset provides the original speech ...
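Step S11 can be pictured as pairing one fixed-length feature vector per utterance with its speaker label. The sketch below is only illustrative: it assumes frame-level features (for example MFCC frames) are already available as arrays, and it mean-pools them into one vector per utterance. The pooling choice, the function name `build_training_set` and the toy data are assumptions made for this example, not details taken from the patent.

```python
import numpy as np

def build_training_set(utterances):
    """Turn (frames, speaker_label) pairs into a feature matrix and a label array.

    `frames` is a (num_frames, feat_dim) array of frame-level features.
    Mean pooling over frames is an assumption made for this sketch.
    """
    features = np.stack(
        [np.asarray(frames, dtype=float).mean(axis=0) for frames, _ in utterances]
    )
    labels = np.array([label for _, label in utterances])
    return features, labels

# Toy stand-in for real speech: 3 utterances with 13-dimensional frames.
rng = np.random.default_rng(0)
toy_utterances = [
    (rng.normal(size=(50, 13)), "spk_a"),
    (rng.normal(size=(80, 13)), "spk_a"),
    (rng.normal(size=(65, 13)), "spk_b"),
]
X, y = build_training_set(toy_utterances)
print(X.shape, y.tolist())
```

Each row of `X` then plays the role of the basic feature representation of one utterance, and `y` holds the matching speaker labels.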

Embodiment 2

[0076] Referring to Figure 2, this embodiment provides a voiceprint recognition device based on distance coding. The device must go through the training stage before proceeding to the recognition stage. It includes: a speech processing module, a similarity matrix training module, an embedding vector generation module, an anchor point set generation module, an encoding module, a regression model training module and a recognition module.

[0077] The speech processing module is used in the training stage to obtain speech data with speaker labels and to extract a basic feature representation for each utterance, thereby forming a training set. It also performs the front-end processing and extracts the basic feature representations.

[0078] Specifically, the basic feature representation of speech can be frequency features, such as Mel frequency cepstral coefficient (MFCC), constant Q cepstral coefficient (CQCC), etc., or it can be an embedded representation based on neural network extraction, s...

Embodiment 3

[0094] Based on the method of Embodiment 1 and the device of Embodiment 2, this embodiment provides a computer device including a memory and a processor, both connected to a bus. The memory stores a computer program, and when the processor executes the computer program it implements the voiceprint recognition method based on distance coding described in Embodiment 1.

[0095] Those of ordinary skill in the art can understand that all or part of the process in the method of Embodiment 1 can be completed by a program instructing the related hardware, software, firmware or a combination thereof, and the program can be stored in a computer-readable storage medium. When executed, the program may include the procedures of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM) or a random access memory (RAM) and th...


Abstract

The invention discloses a voiceprint recognition method, device, equipment and storage medium based on distance coding. The training stage comprises: acquiring voice data with speaker labels and extracting basic feature representations from it to form a training set; computing the distances between the basic feature representations to construct a similarity matrix; performing eigenvalue decomposition on the similarity matrix, taking the eigenvectors corresponding to the D largest eigenvalues to form a matrix, and transposing it to obtain the embedding vectors; selecting M pieces of speech data from the training set and defining the set of their basic feature representations as the anchor point set; using the basic feature representations in the anchor point set to encode the basic feature representation of each piece of speech data in the training set, generating a coding vector; and training a regression model that maps the coding vector of each piece of speech data to its corresponding embedding vector. In the recognition stage, a similarity judgment is carried out. The invention introduces the positional relationships of the original feature space into the speaker embedding vectors, thereby obtaining better recognition performance.
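The training and recognition stages summarized in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not specify how distances become similarities, how anchors are chosen, which regression model is used, or how similarity is judged. The Gaussian kernel, the random anchor selection, the ridge regression, the cosine score, and all function names and parameters (`train`, `encode`, `embed`, `sigma`, `ridge`) are choices made for this example only.

```python
import numpy as np

def pairwise_sq_dists(A, B):
    """Squared Euclidean distances between the rows of A and the rows of B."""
    d = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.maximum(d, 0.0)  # clip tiny negatives caused by rounding

def encode(X, anchors):
    """Distance coding: describe each feature vector by its distances to the anchors."""
    return np.sqrt(pairwise_sq_dists(X, anchors))

def train(X, D, M, sigma=1.0, ridge=1e-3, seed=0):
    # 1) Similarity matrix from pairwise distances (Gaussian kernel: an assumption).
    S = np.exp(-pairwise_sq_dists(X, X) / (2.0 * sigma**2))
    # 2) Eigendecomposition; keep the eigenvectors of the D largest eigenvalues.
    #    eigh returns eigenvectors as columns, so row i is already the
    #    D-dimensional embedding of utterance i.
    vals, vecs = np.linalg.eigh(S)
    E = vecs[:, np.argsort(vals)[::-1][:D]]
    # 3) Anchor set: M utterances from the training set (random choice: an assumption).
    rng = np.random.default_rng(seed)
    anchors = X[rng.choice(len(X), size=M, replace=False)]
    # 4) Coding vectors for every training utterance, plus a bias column.
    C1 = np.hstack([encode(X, anchors), np.ones((len(X), 1))])
    # 5) Regression from coding vectors to embeddings (ridge regression: an assumption).
    W = np.linalg.solve(C1.T @ C1 + ridge * np.eye(C1.shape[1]), C1.T @ E)
    return anchors, W

def embed(X, anchors, W):
    """Map new feature vectors to embeddings via anchor coding and the regressor."""
    C = encode(np.atleast_2d(X), anchors)
    return np.hstack([C, np.ones((len(C), 1))]) @ W

def cosine(a, b):
    """Cosine similarity, used here as a stand-in for the similarity judgment."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Toy data: two well-separated "speakers", 20 utterances each, 5-dim features.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, size=(20, 5)), rng.normal(3.0, 0.3, size=(20, 5))])
anchors, W = train(X, D=2, M=8)
emb = embed(X, anchors, W)
print(emb.shape, cosine(emb[0], emb[1]), cosine(emb[0], emb[-1]))
```

At recognition time, an enrolled utterance and a test utterance would both pass through `embed`, and their score would be compared against a threshold tuned on held-out data. Because new utterances only need distances to the M anchors, they can be embedded without recomputing the eigendecomposition.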

Description

Technical field

[0001] The present invention relates to the technical field of voiceprint recognition, in particular to a voiceprint recognition method, device, equipment and storage medium based on distance coding.

Background

[0002] With the rapid development of artificial intelligence technology, more and more products incorporating artificial intelligence technology appear in people's daily life. Among them, voiceprint recognition, as an important identity information identification method, has also achieved good development and wide application in recent years, especially in the field of security and smart device products.

[0003] However, existing voiceprint recognition techniques only consider the class label of speech data in the target space, that is, the speaker label, while ignoring the relationship of the data in the original space. Such a system may make part of the data that is relatively close in the original space be farther away after being mapp...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G10L17/04, G10L17/02, G10L17/08, G10L17/18, G10L19/00, G10L25/24

CPC: G10L17/04, G10L17/02, G10L17/08, G10L17/18, G10L19/00, G10L25/24

Inventor: 汪欣

Owner: SICHUAN CHANGHONG ELECTRIC CO LTD
