Voiceprint recognition method based on gender, nationality and emotion information

A voiceprint recognition technology, applied in the field of speaker recognition, addressing problems such as reduced speaker recognition accuracy, the limited learning ability of neural networks, and increased network depth or complexity.

Active Publication Date: 2020-06-05
TIANJIN UNIV


Problems solved by technology

These attributes affect the accuracy of speaker recognition in speaker verification tasks. Gender and nationality information can provide additional verification of a speaker's identity and thereby raise the recognition rate. However, when the emotions contained in different utterances of the same speaker are inconsistent, extraction of the speaker's personalized features is severely affected, reducing the system recognition rate.


[Drawings: three figures illustrating the voiceprint recognition method based on gender, nationality and emotion information]


Embodiment Construction

[0054] The present invention is described in further detail below with reference to the accompanying drawings and tables.

[0055] This embodiment uses the VOXCELEB1 dataset, widely used in speaker recognition, to illustrate the invention. The overall system algorithm comprises four steps: data preprocessing, feature extraction, training of the neural network model parameters, and use of a score fusion tool. The specific steps are as follows:

[0056] 1) Data preprocessing

[0057] In the data preprocessing stage, the length of each training utterance is first limited: utterances shorter than 1 second are skipped, and utterances longer than 3 seconds are randomly cropped to 3 seconds. All training utterances are then normalized.
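The preprocessing step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the 16 kHz sampling rate and the choice of mean/variance normalization are assumptions (the patent only says "normalized").

```python
import numpy as np

SAMPLE_RATE = 16000          # assumed sampling rate; the patent does not state one
MIN_LEN = 1 * SAMPLE_RATE    # skip utterances shorter than 1 second
CROP_LEN = 3 * SAMPLE_RATE   # randomly crop longer utterances to 3 seconds

def preprocess(utterance, rng=np.random.default_rng(0)):
    """Length-limit and normalize one training utterance (1-D float array).

    Returns None for utterances shorter than 1 s (skipped); otherwise a
    crop of at most 3 s, normalized to zero mean and unit variance
    (one common choice - the exact normalization is an assumption).
    """
    x = np.asarray(utterance, dtype=np.float64)
    if len(x) < MIN_LEN:
        return None                       # too short: skip this utterance
    if len(x) > CROP_LEN:
        start = rng.integers(0, len(x) - CROP_LEN + 1)
        x = x[start:start + CROP_LEN]     # random 3-second crop
    return (x - x.mean()) / (x.std() + 1e-8)
```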

[0058] 2) Feature extraction

[0059] A 512-dimensional spectrogram is extracted using the Librosa tool. See above for a detailed description of spectrograms.
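A 512-bin magnitude spectrogram can be computed as sketched below with plain NumPy (with Librosa this would be roughly `np.abs(librosa.stft(y, n_fft=1024, hop_length=160))[:512]`). The window size, hop length, and the convention of dropping the Nyquist bin to land on 512 dimensions are all assumptions; the patent only states the dimensionality.

```python
import numpy as np

def spectrogram_512(y, n_fft=1024, hop=160):
    """Magnitude spectrogram with 512 frequency bins per frame.

    n_fft=1024 yields 513 rfft bins; dropping the Nyquist bin gives the
    512 dimensions the embodiment mentions. Parameters are illustrative.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames, axis=1))   # shape (n_frames, 513)
    return spec[:, :512].T                       # shape (512, n_frames)
```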

[0...



Abstract

The invention discloses a voiceprint recognition method based on gender, nationality and emotion information. The method comprises the following steps: first, data preprocessing; second, feature extraction; third, neural network parameter training, in which, according to the specific structure of the neural network, the input order of the training sentences is shuffled, 128 sentences are then randomly selected as a training batch, and the data are iterated over 80 times. The training files required by the score fusion tool are the development-set and test-set results of each system. The VOXCELEB1 test set is used as the test set, and the development set is a test file generated from the utterances of the 1211 training speakers, comprising 40,000 test pairs. The final test-set scoring result is obtained through 100 iterations. The invention improves the recognition rate.
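The training schedule the abstract describes (shuffle the input order each pass, draw batches of 128 sentences, iterate 80 times over the data) can be sketched as a batch iterator. Whether the original drops the final partial batch is not stated; this sketch drops it.

```python
import numpy as np

def batch_iterator(n_utts, batch_size=128, n_epochs=80, seed=0):
    """Yield (epoch, index_batch) pairs: the training order is reshuffled
    every epoch, then consumed in batches of `batch_size`, for `n_epochs`
    full passes, as the abstract describes. Partial final batches are
    dropped (an assumption)."""
    rng = np.random.default_rng(seed)
    for epoch in range(n_epochs):
        order = rng.permutation(n_utts)          # shuffle the input order
        for start in range(0, n_utts - batch_size + 1, batch_size):
            yield epoch, order[start:start + batch_size]
```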

Description

technical field

[0001] The invention relates to the field of text-independent speaker recognition, in particular to multi-task and adversarial domain-adaptation training, and specifically to a voiceprint recognition method based on gender, nationality and emotion information.

Background technique

[0002] Speech contains attributes of different kinds, such as content, gender, nationality, emotion and age. These attributes affect the accuracy of speaker recognition in speaker verification tasks. Gender and nationality information can provide additional verification of a speaker's identity and thereby raise the recognition rate. However, when the emotions contained in different utterances of the same speaker are inconsistent, extraction of the speaker's personalized features is severely affected, reducing the system recognition rate.

[0003] Existing methods improve system performance from three aspects: 1) increase ...
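A multi-task objective consistent with this background - speaker identity as the main task, gender and nationality as helpful auxiliary tasks, and emotion trained adversarially so the embedding becomes emotion-invariant - can be sketched as below. The loss weights, the plain negated emotion term (standing in for a gradient-reversal layer), and all function names are illustrative assumptions, not the patent's formulation.

```python
import numpy as np

def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one example."""
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def total_loss(spk_logits, spk, gen_logits, gen, nat_logits, nat,
               emo_logits, emo, w_gen=0.1, w_nat=0.1, w_emo=0.1):
    """One plausible combined objective: speaker ID plus weighted gender
    and nationality losses, minus a weighted emotion loss (the sign flip
    is what a gradient-reversal layer effects during backpropagation).
    Weights are assumptions."""
    return (cross_entropy(spk_logits, spk)
            + w_gen * cross_entropy(gen_logits, gen)
            + w_nat * cross_entropy(nat_logits, nat)
            - w_emo * cross_entropy(emo_logits, emo))
```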

Claims


Application Information

IPC(8): G10L17/02; G10L17/04; G10L17/18; G10L17/22; G10L25/60; G10L25/63
CPC: G10L17/02; G10L17/04; G10L17/18; G10L17/22; G10L25/63; G10L25/60; Y02D10/00
Inventors: 党建武, 李凯, 王龙标
Owner: TIANJIN UNIV