A Speaker Age and Gender Classification Method Based on Residual Network and Fusion Features

A technology that combines fusion features with a classification method, applied in speech analysis, speech recognition, and related fields. It addresses problems such as inconsistent recognition results, the difficulty of recognizing a speaker's gender and age, and the system overhead that hinders practical application.

Active Publication Date: 2022-08-05
UNIV OF ELECTRONICS SCI & TECH OF CHINA
10 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

At present, speaker gender and age classification relies on techniques similar to those of voiceprint recognition, chiefly traditional statistical methods and deep neural network methods. Gender classification achieves a high recognition rate, but because the relationship between a speaker's voice characteristics and age is more complicated, the accuracy of age classification is not very high.
[0003] Current speaker gender and age identification faces four difficulties. First, because age estimation is uncertain, most current research handles the speaker's gender and age separately in order to protect the accuracy of gender classification, which increases system overhead and makes practical application harder. Second, it is difficult to find feature parameters that fully characterize a speaker's gender and age, which also makes identification harder. Third, traditional statistical methods are limited and cannot accurately extract the speech characteristics of gender and age from a large amount of speech data. Fourth, speech datasets are lacking: identifying a speaker's gender and age usually requires collecting speech data oneself, and differences in collection equipment also lead to differences in recognition results.

Examples

Embodiment

[0073] The invention proposes an end-to-end speaker gender and age classification method that classifies the speaker's gender and age simultaneously. First, the original speech is processed to obtain its MFCC coefficients (13 dimensions), the MFCC first-order difference (13 dimensions), and the fundamental frequency F0. The three parameters are spliced into a 27-dimensional mixed parameter, which serves as the input of the network. The network consists of 4 residual layers, a fully connected layer, and a sampling layer. The mixed parameters first pass through the 4 residual layers, which extract the speaker's speech information features. Each of the four residual layers is composed of a convolution layer and several residual blocks. After the four convolution layers, 512-dimensional feature parameters are obtained. The extracted feature parameters are output, scored in the sampling layer, and the final judgment result is output...
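
The splicing step described above (13 MFCC dimensions, 13 first-order difference dimensions, and 1 F0 value per frame, giving 27 dimensions) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, the MFCC and F0 values are assumed to have been extracted already by some other tool, and the difference is computed as a simple frame-to-frame subtraction because the source does not specify the delta window.

```python
from typing import List, Sequence

MFCC_DIM = 13  # from the text: 13 MFCC coefficients per frame


def first_order_delta(frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Frame-to-frame difference of the MFCC vectors.

    Assumption: a plain difference between adjacent frames; the source
    does not say how the first-order difference is computed. The first
    frame has no predecessor, so its delta is set to all zeros.
    """
    deltas = [[0.0] * len(frames[0])] if frames else []
    for prev, cur in zip(frames, frames[1:]):
        deltas.append([c - p for p, c in zip(prev, cur)])
    return deltas


def fuse_features(mfcc: Sequence[Sequence[float]],
                  f0: Sequence[float]) -> List[List[float]]:
    """Splice MFCC (13), MFCC delta (13), and F0 (1) into 27-dim frames."""
    if len(mfcc) != len(f0):
        raise ValueError("mfcc and f0 must have the same number of frames")
    for frame in mfcc:
        if len(frame) != MFCC_DIM:
            raise ValueError(f"each MFCC frame must have {MFCC_DIM} values")
    deltas = first_order_delta(mfcc)
    # Per frame: [13 MFCC values] + [13 delta values] + [1 F0 value]
    return [list(m) + d + [pitch] for m, d, pitch in zip(mfcc, deltas, f0)]
```

Each returned frame is a 27-value list that could then be fed to the network as its input. In practice the delta is often computed over a wider regression window, so the simple difference here should be read as a placeholder.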

Abstract

The present invention provides a speaker age and gender classification method based on a residual network and fusion features. The invention combines MFCC parameters with the fundamental frequency F0 and uses the composite features as the speech features for speaker gender and age classification. It trains the speaker gender and age recognition model with a convolutional residual network. The residual network solves the exploding and vanishing gradients caused by increasing depth in deep neural networks, so a deeper network can be used in training to extract deeper speech features, thereby improving recognition accuracy. According to gender and age, the invention divides speakers into only six categories: minor males and females (age < 18), adult males and females (18 <= age < 55), and senior males and females (age >= 55), to improve recognition.
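
The six-category division in the abstract can be written out as a small labeling function. This is only a sketch of the age thresholds stated above (under 18, 18 to under 55, and 55 or older); the function name and the label strings are hypothetical and do not come from the patent.

```python
def speaker_class(gender: str, age: int) -> str:
    """Map a speaker's gender and age to one of six class labels.

    Thresholds follow the abstract: minor (< 18), adult (18 <= age < 55),
    senior (>= 55). The label format is an illustrative assumption.
    """
    if gender not in ("male", "female"):
        raise ValueError("gender must be 'male' or 'female'")
    if age < 0:
        raise ValueError("age must be non-negative")
    if age < 18:
        group = "minor"
    elif age < 55:
        group = "adult"
    else:
        group = "senior"
    return f"{group}_{gender}"


# All six possible labels, e.g. for building a training label index.
ALL_CLASSES = [f"{group}_{gender}"
               for group in ("minor", "adult", "senior")
               for gender in ("male", "female")]
```

For example, `speaker_class("female", 17)` falls in the minor group, while ages 18 and 55 each start the next group, matching the inclusive lower bounds in the abstract.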

Description

Technical field

[0001] The invention belongs to the technical field of voiceprint recognition, and in particular relates to a method for classifying the age and gender of speakers based on a residual network and fusion features.

Background art

[0002] With the application of deep neural networks to voiceprint recognition, the technology has made great breakthroughs and is gradually being applied in real scenarios. However, compared with voiceprint recognition, the accuracy of attribute classification such as speaker gender and age still needs to be improved. At present, speaker gender and age classification is similar to voiceprint recognition technology and relies chiefly on traditional statistical methods and deep neural network methods. Although gender classification has a high recognition rate, the relationship between a speaker's voice characteristics and age is more complicated, and the accuracy of age classification ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L15/08, G10L15/06, G10L25/18, G10L25/24, G10L25/30, G10L25/45
CPC: G10L15/08, G10L15/063, G10L25/18, G10L25/24, G10L25/27, G10L25/45
Inventor: 文军, 汪伟, 宋文豪
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA