Voiceprint recognition method, device, equipment and storage medium based on distance coding

A voiceprint recognition technology based on distance coding, applied in speech analysis, instruments, etc. It addresses the problems of ignoring the relationships among data in the original space, poor general adaptability, and poor voiceprint recognition performance, so as to achieve good recognition performance, improved sensitivity, and a better learning effect.

Active Publication Date: 2022-05-03
SICHUAN CHANGHONG ELECTRIC CO LTD

AI Technical Summary

Problems solved by technology

[0005] The present invention provides a voiceprint recognition method, device, equipment and storage medium based on distance coding. It addresses the problem that existing techniques consider only the class label of the voice data in the target space and ignore the relationships among the data in the original space. In such a system, data that are relatively close in the original space may end up far apart after being mapped into the embedding vector space, which degrades voiceprint recognition. It also addresses the low efficiency and limited general adaptability of existing voiceprint recognition systems.

Examples

Embodiment 1

[0039] Referring to Figure 1, this embodiment provides a voiceprint recognition method based on distance coding, comprising a training phase and a recognition phase. The training phase includes:

[0040] Step S11: acquire speech data with speaker labels and extract a basic feature representation for each utterance, thereby forming a training set.

[0041] In this step, the basic feature representation of speech can be a frequency feature, such as Mel-frequency cepstral coefficients (MFCC) or constant-Q cepstral coefficients (CQCC), or it can be an embedded representation extracted by a neural network, such as a d-vector or x-vector. The speech data may be complete sentences, or speech fragments divided into speech units such as phonemes, syllables, and words.

[0042] In this embodiment, we illustrate with a training set formed from a database covering 200 speakers with a total of 100,000 utterances. Each piece of data in the dataset provides the original speech ...

Embodiment 2

[0076] Referring to Figure 2, this embodiment provides a voiceprint recognition device based on distance coding. The device must complete the training stage before proceeding to the recognition stage. The device includes: a speech processing module, a similarity matrix training module, an embedding vector generation module, an anchor point set generation module, an encoding module, a regression model training module and a recognition module.

[0077] The speech processing module is used, in the training stage, to obtain speech data with speaker labels and to extract a basic feature representation for each utterance, thereby forming a training set; it also performs front-end processing and extracts the basic feature representations.

[0078] Specifically, the basic feature representation of speech can be a frequency feature, such as Mel-frequency cepstral coefficients (MFCC) or constant-Q cepstral coefficients (CQCC), or it can be an embedded representation extracted by a neural network, s...

Embodiment 3

[0094] Based on the method of Embodiment 1 and the device of Embodiment 2, this embodiment provides a computer device including a memory and a processor, both connected to a bus. The memory stores a computer program, and when the processor executes the computer program it implements the voiceprint recognition method based on distance coding described in Embodiment 1.

[0095] Those of ordinary skill in the art can understand that all or part of the process in the method of Embodiment 1 can be completed by a program instructing related hardware, software, firmware or a combination thereof, and that the program can be stored in a computer-readable storage medium. When the program is executed, it may include the procedures of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM) or a random access memory (RAM) and th...

Abstract

The invention discloses a voiceprint recognition method, device, equipment and storage medium based on distance coding. The method includes a training stage: acquiring voice data with speaker labels and extracting basic feature representations from it; constructing a similarity matrix from the distances between the basic feature representations; performing eigenvalue decomposition on the similarity matrix, taking the eigenvectors corresponding to the D largest eigenvalues to form a matrix, and forming the embedding vectors after transposition; selecting M pieces of speech data from the training set and defining the set of their basic feature representations as the anchor point set; using the basic feature representations in the anchor point set to encode the basic feature representation of each piece of speech data in the training set, generating a coding vector; and training a regression model that maps the coding vector of each piece of speech data to its corresponding embedding vector. Recognition stage: performing similarity judgment. The invention introduces the positional relationships of the original feature space into the speaker embedding vector, thereby obtaining better recognition performance.
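The training and recognition stages summarized in the abstract can be sketched roughly as follows. This is a minimal illustration using NumPy, not the patented implementation: the abstract does not say how distances become similarities, how anchors are chosen, what form the encoding or the regression model takes, or how similarity is judged. The Gaussian kernel, random anchor selection, Euclidean distance coding, linear least-squares regression, cosine similarity, the 0.5 threshold, the synthetic data, and all function names below are assumptions made for illustration.

```python
import numpy as np

def pairwise_sq_dists(A, B):
    """Squared Euclidean distances between rows of A (n, d) and rows of B (m, d)."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def train(features, D, M, sigma=1.0, seed=0):
    """features: (N, d) array, one basic feature representation per utterance."""
    rng = np.random.default_rng(seed)
    # Similarity matrix from pairwise distances (Gaussian kernel: an assumption).
    S = np.exp(-pairwise_sq_dists(features, features) / (2.0 * sigma ** 2))
    # Eigendecomposition of the symmetric matrix; eigh returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1][:D]      # indices of the D largest eigenvalues
    E = eigvecs[:, order].T                    # (D, N): eigenvectors stacked as rows
    embeddings = E.T                           # (N, D): row i embeds utterance i
    # Anchor set: M utterances picked at random (selection rule: an assumption).
    anchors = features[rng.choice(len(features), size=M, replace=False)]
    # Distance coding: each utterance is described by its distances to the anchors.
    codes = np.sqrt(pairwise_sq_dists(features, anchors))        # (N, M)
    # Regression model: linear least squares with a bias column (an assumption).
    X = np.hstack([codes, np.ones((len(codes), 1))])
    W, *_ = np.linalg.lstsq(X, embeddings, rcond=None)           # (M + 1, D)
    return anchors, W

def embed(x, anchors, W):
    """Map one basic feature vector x (d,) to a D-dimensional embedding."""
    code = np.sqrt(pairwise_sq_dists(x[None, :], anchors))       # (1, M)
    return (np.hstack([code, np.ones((1, 1))]) @ W)[0]

def cosine(u, v):
    """Cosine similarity, with a small constant to avoid division by zero."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

# Synthetic stand-in for real features: two "speakers", 20 utterances each, d = 5.
rng = np.random.default_rng(0)
features = np.vstack([rng.normal(0.0, 1.0, (20, 5)),
                      rng.normal(3.0, 1.0, (20, 5))])
anchors, W = train(features, D=3, M=8, sigma=2.0)

# Recognition stage: compare two utterances via their embeddings.
score = cosine(embed(features[0], anchors, W), embed(features[1], anchors, W))
accept = score >= 0.5                          # arbitrary illustrative threshold
print(score, accept)
```

In this sketch a new utterance never needs the eigendecomposition: it is encoded against the stored anchor set and passed through the fitted regression weights, which is what lets the embedding reflect distances in the original feature space at recognition time.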

Description

Technical field

[0001] The present invention relates to the technical field of voiceprint recognition, in particular to a voiceprint recognition method, device, equipment and storage medium based on distance coding.

Background

[0002] With the rapid development of artificial intelligence technology, more and more products incorporating it appear in people's daily life. Among them, voiceprint recognition, as an important means of identity recognition, has developed well and been widely applied in recent years, especially in security and smart device products.

[0003] However, existing voiceprint recognition techniques only consider the class label of speech data in the target space, that is, the speaker label, while ignoring the relationships among the data in the original space. Such a system may make part of the data that is relatively close in the original space be farther away after being mapp...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L17/04, G10L17/02, G10L17/08, G10L17/18, G10L19/00, G10L25/24
CPC: G10L17/04, G10L17/02, G10L17/08, G10L17/18, G10L19/00, G10L25/24
Inventor: 汪欣
Owner: SICHUAN CHANGHONG ELECTRIC CO LTD