Taxpayer industry classification-oriented label noise learning method

A label noise learning method for taxpayer industry classification, applied in the fields of neural learning methods, text database clustering/classification, instruments, etc. It addresses the difficulty of labeling and acquiring anchor points, avoids introducing new errors, and suits a broad range of application scenarios.

Pending Publication Date: 2022-07-29
XI AN JIAOTONG UNIV
2 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Traditional mixture proportion estimation methods are suited only to binary classification and rely on anchor points (samples that clearly belong to a certain class). Taxpayer industry classification, by contrast, involves many industry categories, making it a multi-class problem in which anchor points are difficult to label and obtain.
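To illustrate why anchor points matter, the sketch below shows a common anchor-based estimator from the label-noise literature: for each class, the sample that a noisy-label classifier scores highest for that class is treated as an anchor, and its predicted distribution becomes one row of the transition matrix. This is background only, not the patent's improved method; the function name and the toy posteriors are hypothetical.

```python
def estimate_transition_from_anchors(noisy_posteriors):
    """Build a transition matrix from per-sample noisy-label posteriors.

    noisy_posteriors: list of probability vectors, one per sample, as
    predicted by a classifier trained on the noisy labels (assumed given).
    Row i of the result is the posterior of the sample that looks most
    like class i, i.e. the presumed anchor point for class i.
    """
    num_classes = len(noisy_posteriors[0])
    transition = []
    for i in range(num_classes):
        # The presumed anchor for class i is the sample with the largest p[i].
        anchor = max(noisy_posteriors, key=lambda p: p[i])
        transition.append(list(anchor))
    return transition


# Toy example with two hypothetical classes and three samples.
posteriors = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
T_hat = estimate_transition_from_anchors(posteriors)
```

If no sample truly belongs to a class with near certainty, the chosen row is biased, which is the weakness the passage above describes for settings with many classes.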


Examples

Embodiment Construction

[0102] Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure will be more thoroughly understood, and will fully convey the scope of the present disclosure to those skilled in the art. It should be noted that the embodiments of the present invention and the features of the embodiments may be combined with each other under the condition of no conflict. The present invention will be described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.

[0103] As shown in Figure 1, in a specific implementation of the present invention, a label noise learning metho...

Abstract

The invention discloses a label noise learning method for taxpayer industry classification, comprising the following steps. First, text information and non-text information are extracted from taxpayer industry information, and text embedding and non-text encoding are performed, based on an XLNet pre-trained text network and an encoding technique respectively, to obtain feature information. Second, a TextCNN network for taxpayer industry classification is constructed; the number of layers, the shape of the convolution kernels, and the input and output dimensions of each layer are determined from the feature information and the number of target classes; the XLNet network and the TextCNN network are connected in series; and an end-to-end training device is built using the noisy taxpayer industry label data as supervision. Third, a conditional transition matrix is estimated with an improved mixture proportion estimation method. Finally, the network parameters are learned in the training device, and the conditional transition matrix is used as a linear layer after the TextCNN network, realizing the conversion from noisy-label prediction to true taxpayer industry label prediction, so that taxpayer industry classification can be carried out.
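The final step, placing the estimated matrix as a linear layer after the classifier, can be sketched as follows. This is a minimal sketch assuming the standard forward-correction formulation from the label-noise literature, in which the classifier's class probabilities are multiplied by the matrix during training and the uncorrected probabilities are used at prediction time; whether the patent uses exactly this formulation is an assumption. The matrix T, the logits, and the three-class setup are hypothetical, and plain Python lists stand in for the XLNet and TextCNN outputs.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def noisy_posterior(clean_probs, transition):
    """Apply the transition matrix as a fixed linear layer.

    transition[i][j] is assumed to be P(noisy label = j | true label = i),
    so each row sums to 1.
    """
    num_noisy = len(transition[0])
    return [
        sum(clean_probs[i] * transition[i][j] for i in range(len(clean_probs)))
        for j in range(num_noisy)
    ]


def forward_corrected_loss(logits, noisy_label, transition):
    """Cross-entropy between the noise-adjusted prediction and the noisy label."""
    probs = noisy_posterior(softmax(logits), transition)
    return -math.log(max(probs[noisy_label], 1e-12))


def predict_clean(logits):
    """At inference, skip the transition layer and take the argmax class."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Toy example with three hypothetical industry classes.
T = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.0, 0.1, 0.9],
]
logits = [2.0, 0.5, -1.0]          # stand-in for the classifier output
noisy = noisy_posterior(softmax(logits), T)
loss = forward_corrected_loss(logits, noisy_label=1, transition=T)
label = predict_clean(logits)
```

Because the matrix sits after the softmax, the supervision from noisy labels is matched against the noise-adjusted distribution, while the layer before it is pushed toward the true-label distribution.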

Description

Technical field

[0001] The invention belongs to the technical field of text classification with label noise, and in particular relates to a label noise learning method for taxpayer industry classification.

Background

[0002] In recent years, the market economy has continued to prosper, the number of enterprises has increased, and the division of labor has been continuously refined. As a result, upgrading and further developing the tax system has become an urgent need.

[0003] Taxpayer industry classification is a prerequisite for determining applicable tax policies and preferential policies, and an important link in tax collection. At present, China divides taxpayer industries into 20 top-level categories and 97 major classes. Because of the large number of categories, the traditional manual classification method requires a lot of human resources and is limited by the professional knowledge and experience of the classifier, which inevitably introduces classification errors, ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F40/205, G06F40/289, G06K9/62, G06N3/04, G06N3/08, G06Q40/00
CPC: G06F16/35, G06F40/289, G06F40/205, G06N3/08, G06Q40/10, G06N3/045, G06F18/2431, G06F18/2415
Inventor: 郑庆华, 曹书植, 阮建飞, 赵锐, 董博, 师斌
Owner: XI AN JIAOTONG UNIV