Voiceprint recognition method, device, equipment and storage medium based on distance coding
A voiceprint recognition technology based on distance coding, applied in speech analysis, instruments, etc. It addresses the problems of ignoring relationships in the original data space, poor general adaptability, and poor voiceprint recognition performance, achieving good recognition performance, improved sensitivity, and better learning.
Examples
Embodiment 1
[0039] Referring to Figure 1, this embodiment provides a voiceprint recognition method based on distance coding, comprising a training phase and a recognition phase. The training phase includes:
[0040] Step S11: acquire speech data with speaker labels and extract a basic feature representation for each utterance, thereby forming a training set.
[0041] In this step, the basic feature representation of speech can be a frequency-domain feature, such as Mel frequency cepstral coefficients (MFCC) or constant Q cepstral coefficients (CQCC), or it can be an embedding extracted by a neural network, such as a d-vector or x-vector. The speech data may be complete sentences, or speech fragments divided into speech units such as phonemes, syllables, and words.
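As an illustration of the MFCC option named in step S11, the following is a minimal NumPy sketch of a textbook MFCC pipeline (framing, Hamming window, power spectrum, mel filterbank, log, DCT-II). It is not taken from the patent: the function names, the 16 kHz sample rate, the 25 ms / 10 ms framing, and the 26-filter / 13-coefficient settings are all assumptions chosen for the example, and pre-emphasis and liftering are omitted for brevity.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) array of MFCCs (illustrative settings)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply a Hamming window.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # Log mel-filterbank energies (small constant avoids log(0)).
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    log_energies = np.log(power @ fbank.T + 1e-10)
    # Unnormalized DCT-II over the filter axis, keeping the first n_coeffs.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None]
                   * (2 * n[None, :] + 1) / (2 * n_filters))
    return log_energies @ basis.T

# One second of synthetic noise stands in for a real utterance.
rng = np.random.default_rng(0)
features = mfcc(rng.standard_normal(16000))
```

Each row of `features` is the feature vector of one frame. How frame-level features are pooled into one representation per utterance is not covered by this sketch.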
[0042] This embodiment is illustrated with a training set built from a database covering 200 speakers with a total of 100,000 utterances. Each piece of data in the dataset provides the original speech ...
Embodiment 2
[0076] Referring to Figure 2, this embodiment provides a voiceprint recognition device based on distance coding. The device must complete the training stage before entering the recognition stage. The device includes: a speech processing module, a similarity matrix training module, an embedding vector generation module, an anchor point set generation module, an encoding module, a regression model training module, and a recognition module.
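The module list mentions an anchor point set and an encoding module, but this excerpt does not describe how the encoding is computed. As a generic illustration only, one common reading of "distance coding" is to represent a vector by its distances to a fixed set of anchor points. The sketch below assumes Euclidean distance and hand-picked anchors; the function name `distance_code`, the metric, and the anchor values are all assumptions, not the patent's stated method.

```python
import numpy as np

def distance_code(x, anchors):
    """Encode each row of x as its Euclidean distances to every anchor.

    x:       (n_samples, dim) or (dim,) array of embedding vectors
    anchors: (n_anchors, dim) array of anchor points (assumed given)
    returns: (n_samples, n_anchors) array of distances
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    anchors = np.asarray(anchors, dtype=float)
    diff = x[:, None, :] - anchors[None, :, :]   # pairwise differences
    return np.sqrt((diff ** 2).sum(axis=-1))

# Toy example: two hypothetical anchors in a 2-D embedding space.
anchors = np.array([[0.0, 0.0], [3.0, 4.0]])
code = distance_code([0.0, 0.0], anchors)
```

Here `code` has one row per input vector and one column per anchor, so the dimensionality of the code is set by the size of the anchor set rather than by the original embedding.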
[0077] The speech processing module is used in the training stage to obtain speech data with speaker labels and to extract a basic feature representation for each utterance, thereby forming a training set; as the front end, it processes speech and extracts the basic feature representations.
[0078] Specifically, the basic feature representation of speech can be frequency features, such as Mel frequency cepstral coefficient (MFCC), constant Q cepstral coefficient (CQCC), etc., or it can be an embedded representation based on neural network extraction, s...
Embodiment 3
[0094] Based on the method of Embodiment 1 and the device of Embodiment 2, this embodiment provides a computer device including a memory and a processor, both connected to a bus. The memory stores a computer program, and when the processor executes the computer program, the voiceprint recognition method based on distance coding described in Embodiment 1 is implemented.
[0095] Those of ordinary skill in the art can understand that all or part of the process in the method of Embodiment 1 can be completed by a program instructing related hardware, software, firmware, or a combination thereof. The program can be stored in a computer-readable storage medium, and when executed, the program may include the procedures of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM) or a random access memory (Random Access Memory, RAM) and th...