
Taxpayer industry classification method based on multistage generative model

A generative-model-based classification technique, applied to biological neural network models, neural learning methods, text database clustering/classification, and related fields. It addresses three problems: the difficulty of handling the dependence between label noise and features, poor model classification accuracy, and the failure to account for noise in the category labels of the training data. The intended effects are to solve the taxpayer industry classification problem, prevent the decline in classification accuracy, and produce credible prediction results.

Active Publication Date: 2021-05-28
XI AN JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

[0010] However, the existing technical solutions described in the above literature have two main problems. First, when constructing taxpayer industry classification models, they do not consider noise in the category labels of the training data, so the trained model has poor classification accuracy.
Second, the existing technology considers only the one-way mapping from the feature level to the label level. Noise in industry category labels is often directly related to features such as business scope, and a one-way mapping has difficulty handling this dependence between label noise and features.



Examples


Embodiment Construction

[0122] The present invention will be further described below in conjunction with specific embodiments with reference to the accompanying drawings.

[0123] As shown in Figure 1, in a specific implementation of the present invention, the process of predicting a taxpayer's true industry category label comprises the following steps:

[0124] Step 1: Construct the taxpayer text and non-text feature vectors. The implementation process is shown in Figure 2 and includes the following steps:

[0125] S101: Extract taxpayer text information and non-text information

[0126] Extract the text information from the taxpayer industry information table: from the registered taxpayer information table and the registered taxpayer information extension table, extract the taxpayer name, the industry detail codes of the taxpayer's main and subsidiary industries, and the business scope text.
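The text-extraction part of S101 could be sketched roughly as follows. This is only an illustration: the `TaxpayerRecord` class, its field names, and the `extract_text_fields` helper are hypothetical, since the excerpt does not give the actual table schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TaxpayerRecord:
    # Hypothetical field names; the real table schema is not given in the text.
    name: str
    main_industry_code: str
    subsidiary_industry_codes: List[str] = field(default_factory=list)
    business_scope: str = ""


def extract_text_fields(record: TaxpayerRecord) -> Dict[str, object]:
    """Collect the text fields named in S101 into one dictionary."""
    return {
        "name": record.name.strip(),
        # Main industry code first, then any subsidiary industry codes.
        "industry_codes": [record.main_industry_code,
                           *record.subsidiary_industry_codes],
        "business_scope": record.business_scope.strip(),
    }


record = TaxpayerRecord(
    name=" Example Trading Co. ",
    main_industry_code="5211",
    subsidiary_industry_codes=["5212"],
    business_scope="retail of food and beverages ",
)
text_fields = extract_text_fields(record)
```

Keeping the extracted fields separate, rather than concatenating them immediately, leaves the later embedding and encoding step free to treat each field differently.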

[0127] Extract the non-text information from the taxpayer industry information table, and extra...


Abstract

The invention discloses a taxpayer industry classification method based on a multistage generative model. The method comprises the following steps: first, the text and non-text information to be mined is extracted from the taxpayer industry information, embedded and encoded, and feature processing is applied to the encoded information; second, the noisy taxpayer industry category labels are converted into multi-complementary labels; third, a multistage generative model over the label and feature levels is constructed under a bidirectional mapping framework; fourth, the model is trained on the encoded features and the generated multi-complementary labels; finally, the label-level prediction of the true label is taken as the final taxpayer industry category for the test data. Converting noisy labels into multi-complementary labels reduces the label noise rate, and introducing bidirectional mapping between the feature and label levels copes with the feature-dependent noise present in taxpayer industry category labels, so the method can effectively improve taxpayer industry classification accuracy.
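To illustrate why converting a noisy label into multi-complementary labels can lower the noise rate, here is a minimal sketch. It assumes complementary labels are drawn uniformly at random from the classes other than the noisy label; the excerpt does not state the patent's actual conversion rule, and the function names `to_multi_complementary` and `complementary_noise_rate` are hypothetical.

```python
import random
from typing import List, Optional


def to_multi_complementary(noisy_label: int, num_classes: int,
                           num_complementary: int,
                           rng: Optional[random.Random] = None) -> List[int]:
    """Draw classes the sample is asserted NOT to belong to.

    Assumption: uniform sampling without replacement from every class
    except the given (possibly noisy) label.
    """
    if not 0 <= noisy_label < num_classes:
        raise ValueError("noisy_label out of range")
    if not 1 <= num_complementary <= num_classes - 1:
        raise ValueError("num_complementary must be in [1, num_classes - 1]")
    rng = rng or random.Random()
    candidates = [c for c in range(num_classes) if c != noisy_label]
    return sorted(rng.sample(candidates, num_complementary))


def complementary_noise_rate(label_noise_rate: float, num_classes: int) -> float:
    """Probability that one uniformly drawn complementary label is wrong.

    A complementary label is wrong only when it equals the true class. That
    requires the original label to be wrong (probability label_noise_rate)
    and the draw to hit the true class (probability 1 / (num_classes - 1)).
    """
    return label_noise_rate / (num_classes - 1)


complementary = to_multi_complementary(noisy_label=3, num_classes=10,
                                       num_complementary=4,
                                       rng=random.Random(0))
rate = complementary_noise_rate(label_noise_rate=0.3, num_classes=10)
```

Under this uniform-sampling assumption, an ordinary label that is wrong 30% of the time across 10 classes yields complementary labels that are each wrong only 0.3 / 9 of the time, roughly 3.3%.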

Description

Technical field

[0001] The invention belongs to the field of industry classification, and in particular relates to a taxpayer industry classification method based on a multistage generative model, which addresses classification under noisy taxpayer industry category labels.

Background

[0002] At present, the industry classification in tax registration information is mainly determined by the responsible tax authority personnel during tax registration, who rely on experience to judge the taxpayer's business scope and actual business operations. This results in discrepancies between the industry classification in the tax registration information and the actual industry classification; that is, the taxpayer industry category labels are noisy. Therefore, how to train a noise-robust classifier based on the noisy taxpayer industry category labels, identify and correct the inconsistency between the existing taxpayer's business...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F16/33, G06F40/242, G06F40/289, G06K9/62, G06N3/04, G06N3/08, G06Q40/00
CPC: G06F16/35, G06F16/3335, G06F40/289, G06F40/242, G06N3/084, G06Q40/10, G06N3/048, G06N3/045, G06F18/2132, G06F18/2415
Inventor: 郑庆华, 董博, 吴雨萱, 赵锐, 阮建飞, 师斌
Owner: XI AN JIAOTONG UNIV