Method and system for multi-label distribution learning in natural language processing classification model

A natural language processing classification model technology, applied in natural language data processing, semantic analysis, electronic digital data processing, and related areas. It addresses the problem that inaccurate label distributions hinder a model's generalization ability.

Pending Publication Date: 2020-10-20
北京北大软件工程股份有限公司

AI Technical Summary

Problems solved by technology

However, multi-label learning still has shortcomings: for many samples it is not clear-cut whether they belong to a given label, yet each sample is forced into a binary state of either carrying the label or not carrying it.
The label distributions that existing techniques compute for samples are inaccurate, which hinders improvement of the model's generalization ability.



Examples


Embodiment Construction

[0049] To make the purpose, technical solution, and advantages of the present application clearer, the technical solution of the present invention is described in detail below with reference to the drawings and embodiments. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by persons of ordinary skill in the art from the embodiments in the present application, without creative effort, fall within the protection scope of the present application.

[0050] In one embodiment, the present invention provides a method for multi-label distribution learning in a natural language processing classification model which, as shown in Figure 1, includes the following steps:

[0051] Get training samples;

[0052] Calculate the label vector of each label and the sample vector of each sample according to the data of all samples;

[0053] Calculate the correlation between each...
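The vector and correlation steps above can be sketched as follows. This is only an illustration under stated assumptions, not the patent's method: the excerpt does not say how the vectors or the correlation are computed, so the sketch assumes sample vectors are precomputed embeddings, takes each label vector as the mean of the vectors of the samples carrying that label, and uses cosine similarity as the correlation. The names `label_vectors` and `cosine_correlation` are hypothetical.

```python
import numpy as np

def label_vectors(sample_vecs, labels):
    """Assumed definition: a label vector is the mean of the vectors of
    all samples that carry that label.

    sample_vecs: (n_samples, dim) float array of sample embeddings.
    labels:      (n_samples, n_labels) binary multi-label matrix.
    """
    counts = labels.sum(axis=0, keepdims=True).T       # (n_labels, 1)
    # Guard against labels with no samples to avoid division by zero.
    return (labels.T @ sample_vecs) / np.maximum(counts, 1)

def cosine_correlation(sample_vecs, label_vecs, eps=1e-12):
    """Assumed correlation measure: cosine similarity between every
    sample vector and every label vector, shape (n_samples, n_labels)."""
    s = sample_vecs / (np.linalg.norm(sample_vecs, axis=1, keepdims=True) + eps)
    l = label_vecs / (np.linalg.norm(label_vecs, axis=1, keepdims=True) + eps)
    return s @ l.T

# Toy data: three samples, two labels; the third sample carries both labels.
sample_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
labels = np.array([[1, 0], [0, 1], [1, 1]])

label_vecs = label_vectors(sample_vecs, labels)
corr = cosine_correlation(sample_vecs, label_vecs)
```

In this toy data the first sample correlates more strongly with the first label than with the second, while the third sample, which carries both labels, correlates equally with both.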


Abstract

The invention discloses a method and system for multi-label distribution learning in a natural language processing classification model, and belongs to the field of natural language processing. After training samples are obtained, a label vector is calculated for each label and a sample vector for each sample; the correlation between each sample and each label is calculated from the label vectors and the sample vectors; the label distribution of each sample is calculated from these correlations; and finally the natural language processing classification model is updated according to the label distribution. The advantage of the method is that samples processed by the updated natural language processing classification model obtain more accurate labels, and the generalization ability of the model is greatly improved.
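The last two steps of the abstract, turning correlations into a label distribution and using that distribution to update the model, can be sketched as follows. This is a minimal sketch under assumptions the source does not confirm: the distribution is taken as a temperature-scaled softmax over each sample's correlations, and the model update is represented by a cross-entropy loss against that soft target. The function names, the `temperature` parameter, and the toy correlation values are all hypothetical.

```python
import numpy as np

def label_distribution(corr, temperature=1.0):
    """Assumed mapping: softmax over each sample's correlations, so every
    row is a probability distribution over labels."""
    z = corr / temperature
    z = z - z.max(axis=1, keepdims=True)               # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def soft_cross_entropy(logits, target_dist):
    """Cross-entropy between model logits and a soft target distribution.
    Minimizing this loss is one assumed way to 'update the model
    according to the label distribution'."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return float(-(target_dist * log_probs).sum(axis=1).mean())

# Toy correlations for two samples and three labels (illustrative values).
corr = np.array([[0.9, 0.4, 0.1],
                 [0.2, 0.2, 0.2]])
dist = label_distribution(corr, temperature=0.5)

# Untrained model stand-in: all-zero logits give a uniform prediction.
logits = np.zeros((2, 3))
loss = soft_cross_entropy(logits, dist)
```

A sample with equal correlation to every label receives a uniform distribution, while a lower temperature concentrates more mass on the most correlated label. In a real training loop the loss would be minimized with respect to the classifier's parameters by a gradient-based optimizer.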

Description

Technical field

[0001] The present invention relates to the field of natural language processing, and in particular to a method and system for multi-label distribution learning in natural language processing classification models.

Background technique

[0002] Natural language processing tasks fall mainly into three categories: document-level classification tasks, sentence-level classification tasks, and word-level classification tasks. In traditional classification tasks, a sample often belongs to more than one category, which gave rise to multi-label learning. However, multi-label learning still has shortcomings: for many samples it is not clear-cut whether they belong to a given label, yet each sample is forced into a binary state of either carrying the label or not carrying it. The label distribution obtained by the existing technology when calculating the label of the sample is inaccur...


Application Information

IPC(8): G06F16/35, G06F40/211, G06F40/284, G06F40/30, G06K9/62
CPC: G06F16/35, G06F40/211, G06F40/284, G06F40/30, G06F18/2411, G06F18/214
Inventor: 叶蔚, 刘培阳, 张世琨, 张君福
Owner 北京北大软件工程股份有限公司