Voiceprint recognition method based on gender, nationality and emotion information

A voiceprint recognition technology, applied in the field of speaker recognition, addressing problems such as reduced speaker recognition accuracy, the limited learning ability of neural networks, and increased network depth or complexity.

Active Publication Date: 2020-06-05
TIANJIN UNIV


Problems solved by technology

These attributes affect the accuracy of speaker recognition in speaker verification tasks. Gender and nationality information can provide additional verification of a speaker's identity and thereby raise the recognition rate. However, when the emotions contained in different utterances of the same speaker are inconsistent, extraction of the speaker's personalized features is severely affected, reducing the system recognition rate.


[Drawings: three figures illustrating the voiceprint recognition method based on gender, nationality and emotion information]


Embodiment Construction

[0054] The present invention is described in further detail below with reference to the accompanying drawings and tables.

[0055] This embodiment uses the VOXCELEB1 dataset, widely used in speaker recognition, to illustrate the invention. The overall system algorithm comprises four steps: data preprocessing, feature extraction, training of the neural network model parameters, and use of a score fusion tool. The specific steps are as follows:

[0056] 1) Data preprocessing

[0057] In the data preprocessing stage, the length of each training utterance is first limited: utterances shorter than 1 second are skipped, and utterances longer than 3 seconds are randomly cropped to 3 seconds. All training utterances are then normalized.
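The preprocessing step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the 16 kHz sampling rate and the choice of mean/variance normalization are assumptions (the patent only says "normalized").

```python
import numpy as np

SAMPLE_RATE = 16000          # assumed sampling rate; the patent does not state one
MIN_LEN = 1 * SAMPLE_RATE    # skip utterances shorter than 1 second
CROP_LEN = 3 * SAMPLE_RATE   # randomly crop longer utterances to 3 seconds

def preprocess(utterance, rng=np.random.default_rng(0)):
    """Length-limit and normalize one training utterance (1-D float array).

    Returns None for utterances shorter than 1 s (skipped); otherwise a
    crop of at most 3 s, normalized to zero mean and unit variance
    (one common choice - the exact normalization is an assumption).
    """
    x = np.asarray(utterance, dtype=np.float64)
    if len(x) < MIN_LEN:
        return None                       # too short: skip this utterance
    if len(x) > CROP_LEN:
        start = rng.integers(0, len(x) - CROP_LEN + 1)
        x = x[start:start + CROP_LEN]     # random 3-second crop
    return (x - x.mean()) / (x.std() + 1e-8)
```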

[0058] 2) Feature extraction

[0059] A 512-dimensional spectrogram is extracted using the Librosa tool. See above for a detailed description of spectrograms.
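A 512-bin magnitude spectrogram can be computed as sketched below with plain NumPy (with Librosa this would be roughly `np.abs(librosa.stft(y, n_fft=1024, hop_length=160))[:512]`). The window size, hop length, and the convention of dropping the Nyquist bin to land on 512 dimensions are all assumptions; the patent only states the dimensionality.

```python
import numpy as np

def spectrogram_512(y, n_fft=1024, hop=160):
    """Magnitude spectrogram with 512 frequency bins per frame.

    n_fft=1024 yields 513 rfft bins; dropping the Nyquist bin gives the
    512 dimensions the embodiment mentions. Parameters are illustrative.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames, axis=1))   # shape (n_frames, 513)
    return spec[:, :512].T                       # shape (512, n_frames)
```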

[0...



Abstract

The invention discloses a voiceprint recognition method based on gender, nationality and emotion information. The method comprises the following steps: first, data preprocessing; second, feature extraction; third, neural network parameter training, in which, according to the specific structure of the neural network, the input order of the training sentences is shuffled, 128 sentences are then randomly selected as a training batch, and the data are iterated over 80 times. The training files required by the score fusion tool are the development-set and test-set results of each system. The VOXCELEB1 test set is used as the test set, and the development set is a test file generated from the utterances of the 1211 training speakers, comprising 40,000 test pairs. The final test-set scoring result is obtained through 100 iterations. The invention improves the recognition rate.
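The training schedule the abstract describes (shuffle the input order each pass, draw batches of 128 sentences, iterate 80 times over the data) can be sketched as a batch iterator. Whether the original drops the final partial batch is not stated; this sketch drops it.

```python
import numpy as np

def batch_iterator(n_utts, batch_size=128, n_epochs=80, seed=0):
    """Yield (epoch, index_batch) pairs: the training order is reshuffled
    every epoch, then consumed in batches of `batch_size`, for `n_epochs`
    full passes, as the abstract describes. Partial final batches are
    dropped (an assumption)."""
    rng = np.random.default_rng(seed)
    for epoch in range(n_epochs):
        order = rng.permutation(n_utts)          # shuffle the input order
        for start in range(0, n_utts - batch_size + 1, batch_size):
            yield epoch, order[start:start + batch_size]
```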

Description

technical field

[0001] The invention relates to the field of text-independent speaker recognition, in particular to multi-task and adversarial domain-adaptation training, and specifically to a voiceprint recognition method based on gender, nationality and emotion information.

Background technique

[0002] Speech contains attributes of different kinds, such as content, gender, nationality, emotion and age. These attributes affect the accuracy of speaker recognition in speaker verification tasks. Gender and nationality information can provide additional verification of a speaker's identity and thereby raise the recognition rate. However, when the emotions contained in different utterances of the same speaker are inconsistent, extraction of the speaker's personalized features is severely affected, reducing the system recognition rate.

[0003] Existing methods improve system performance from three aspects: 1) increase ...
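A multi-task objective consistent with this background - speaker identity as the main task, gender and nationality as helpful auxiliary tasks, and emotion trained adversarially so the embedding becomes emotion-invariant - can be sketched as below. The loss weights, the plain negated emotion term (standing in for a gradient-reversal layer), and all function names are illustrative assumptions, not the patent's formulation.

```python
import numpy as np

def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one example."""
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def total_loss(spk_logits, spk, gen_logits, gen, nat_logits, nat,
               emo_logits, emo, w_gen=0.1, w_nat=0.1, w_emo=0.1):
    """One plausible combined objective: speaker ID plus weighted gender
    and nationality losses, minus a weighted emotion loss (the sign flip
    is what a gradient-reversal layer effects during backpropagation).
    Weights are assumptions."""
    return (cross_entropy(spk_logits, spk)
            + w_gen * cross_entropy(gen_logits, gen)
            + w_nat * cross_entropy(nat_logits, nat)
            - w_emo * cross_entropy(emo_logits, emo))
```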

Claims


Application Information

IPC(8): G10L17/02; G10L17/04; G10L17/18; G10L17/22; G10L25/60; G10L25/63
CPC: G10L17/02; G10L17/04; G10L17/18; G10L17/22; G10L25/63; G10L25/60; Y02D10/00
Inventors: 党建武, 李凯, 王龙标
Owner: TIANJIN UNIV