Speaker identification method, device, equipment, storage medium and program product

A speaker recognition technology, applied in speech recognition, neural learning methods, and character and pattern recognition, that addresses the high labor cost, long training period, and poor recognition performance of speaker recognition models. It improves feature extraction precision and recognition accuracy and shortens the training period.

Pending Publication Date: 2021-04-09

BEIJING BAIDU NETCOM SCI & TECH CO LTD

Problems solved by technology

[0002] In related technologies, when a speaker recognition model is applied in different fields, differences in data characteristics between fields, such as data encoding, data content, data channels, and data dimensions, cause the model to recognize speakers poorly in cross-field applications.

[0003] Consequently, the speaker recognition model has to be optimized. This usually requires collecting a large amount of labeled data from the target field, then retraining and verifying the model, which brings drawbacks such as a long training period and high labor cost.

Embodiment Construction

[0042] Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and they should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

[0043] As shown in Figure 1, an embodiment of the present disclosure provides a method for generating a speaker recognition model. The method relates to the field of artificial intelligence technology, and in particular to the fields of speech recognition, big data, deep learning, and cloud computing. The method may include the following steps:

[0044] Step S101: Obtain an...

Abstract

The invention provides a speaker recognition method, device, equipment, storage medium and program product, and relates to the field of artificial intelligence, in particular to speech recognition, deep learning, big data, and cloud computing. The specific implementation scheme is as follows: obtaining an initial model that comprises a feature extraction network; acquiring sample features of source domain sample audio and target domain sample audio, where the source domain sample audio carries a speaker label and a domain label, and the target domain sample audio carries only a domain label; performing frame-by-frame feature extraction on the sample features of the source domain sample audio and the target domain sample audio with the feature extraction network to obtain source domain clause features and target domain clause features; and training the initial model with the source domain clause features and the target domain clause features to generate a speaker recognition model, which is used to recognize the speaker of to-be-recognized audio in the target domain. The disclosed technology can improve the training efficiency of the speaker recognition model and shorten the training period.
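The data flow in the abstract can be pictured with a small sketch. It is illustrative only and is not the patent's implementation: the `SampleAudio` class, the 0/1 domain encoding, the toy per-frame transform, and the mean pooling that turns frame outputs into one clause feature are all assumptions introduced here. What it shows is the supervision available in each domain: every sample can feed a domain-related objective, while only source domain samples can feed a speaker objective.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Feature = List[float]


@dataclass
class SampleAudio:
    """Hypothetical container for one sample audio (not from the patent)."""
    frames: List[Feature]                 # per-frame sample features
    domain_label: int                     # assumed encoding: 0 = source, 1 = target
    speaker_label: Optional[int] = None   # present only for source domain audio


def extract_clause_feature(frames: List[Feature]) -> Feature:
    """Toy stand-in for the feature extraction network.

    Each frame is transformed on its own (here a simple ReLU), then the
    frame outputs are averaged into a single clause-level feature.
    Both choices are assumptions made for illustration.
    """
    frame_outputs = [[max(0.0, x) for x in frame] for frame in frames]
    dim = len(frame_outputs[0])
    count = len(frame_outputs)
    return [sum(f[i] for f in frame_outputs) / count for i in range(dim)]


def split_training_signals(
    batch: List[SampleAudio],
) -> Tuple[List[Tuple[Feature, int]], List[Tuple[Feature, int]]]:
    """Pair each clause feature with the labels that exist for its sample."""
    speaker_items: List[Tuple[Feature, int]] = []
    domain_items: List[Tuple[Feature, int]] = []
    for sample in batch:
        feature = extract_clause_feature(sample.frames)
        # A domain label is available for source and target samples alike.
        domain_items.append((feature, sample.domain_label))
        # A speaker label is available only for source domain samples.
        if sample.speaker_label is not None:
            speaker_items.append((feature, sample.speaker_label))
    return speaker_items, domain_items


if __name__ == "__main__":
    batch = [
        SampleAudio(frames=[[1.0, -2.0], [3.0, 4.0]], domain_label=0, speaker_label=7),
        SampleAudio(frames=[[0.0, 2.0]], domain_label=1),
    ]
    speaker_items, domain_items = split_training_signals(batch)
    print(len(speaker_items), "speaker-labeled item(s)")
    print(len(domain_items), "domain-labeled item(s)")
```

How the two label sets are turned into training losses is not stated in the abstract, so the sketch stops at assembling them.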

Description

Technical field

[0001] The present disclosure relates to the technical field of artificial intelligence, in particular to the fields of speech recognition, big data, deep learning and cloud computing.

Background

[0002] In related technologies, when a speaker recognition model is applied in different fields, differences in data characteristics between fields, such as data encoding, data content, data channels, and data dimensions, cause the model to recognize speakers poorly in cross-field applications.

[0003] Consequently, the speaker recognition model has to be optimized. This usually requires collecting a large amount of labeled data from the target field, then retraining and verifying the model, which brings drawbacks such as a long training period and high labor cost.

Summary of the invention

[0004] The present disclosure provides a method, device, equipment, storage medium and pro...

Application Information

IPC(8): G10L15/06, G10L15/02, G06K9/46, G06K9/62, G06N3/04, G06N3/08, G10L15/16, G10L15/22, G10L15/26

CPC: G10L15/063, G10L15/02, G10L15/22, G10L15/16, G06N3/084, G06V10/449, G06N3/045, G06F18/214

Inventor: 赵情恩, 曾新贵, 熊新雷, 陈蓉, 肖岩

Owner BEIJING BAIDU NETCOM SCI & TECH CO LTD
