
Multistep label transformation-based Boosting improvement method under biased data in integrated learning

An ensemble (integrated) learning and labeling technology, applied in the field of data recognition. It addresses three problems: biased data makes prediction tasks harder, weakens the fitting ability of the algorithm, and forced transformations destroy the inherent statistical laws of the data.

Status: Inactive. Publication Date: 2018-08-14
XIDIAN UNIV
Cites: 0. Cited by: 3.

AI Technical Summary

Problems solved by technology

[0003] In summary, the problem with the prior art is that, during the iterative process of the Boosting algorithm, bias in the data increases the difficulty of the prediction task, weakens the model's learning ability, and lowers prediction accuracy.
[0005] First, data distributions in real industrial settings are diverse, and a single formula transformation can hardly handle the variety of biased cases. In addition, the Boosting algorithm lacks a strategy for correcting the data immediately when bias or a long tail appears during fitting, and different correction strategies often impose their own requirements on the data. If the corresponding transformation is not carried out, the correction strategy cannot proceed. If it is carried out, it often forces a mapping onto the data, which destroys the data's inherent statistical laws, impairs the subsequent fitting ability of the algorithm, and fails to improve prediction accuracy. How to correct the data distribution reasonably while preserving the algorithm's fitting ability is therefore crucial.



Examples


Embodiment Construction

[0059] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below in conjunction with the examples. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it.

[0060] The present invention reduces the impact of data bias on the Boosting algorithm as far as possible and improves the algorithm's flexibility and fitting ability. It solves the technical problem that, during the iterative process of the Boosting algorithm, bias in the data increases the difficulty of the prediction task, weakens the model's learning ability, and lowers prediction accuracy.

[0061] The application principle of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0062] As shown in Figure 1, the improved ...



Abstract

The invention belongs to the technical field of data recognition and discloses an improved Boosting method based on multistep label transformation for biased data in ensemble (integrated) learning. The method comprises the following steps: preparing a training data set and a test data set; carrying out bias detection and label transformation on the original labels of the samples in the training data set; iterating the Boosting process using multistage label transformation, in which, at the end of each training stage, the fitting residual of the current stage is calculated and, when a transformation criterion is satisfied, sigmoid compression transformation and Box-Cox transformation are applied; determining the number of fitting stages through an early stopping mechanism, thereby completing the training process; and carrying out prediction and inverse transformation on the test data stage by stage to complete the prediction process and obtain the prediction result. The method substantially mitigates the influence of data bias on the algorithm, increases the algorithm's flexibility, improves the fitting ability of the Boosting algorithm to a certain extent, and makes the algorithm more robust.
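The abstract names three building blocks: bias detection on the labels, Box-Cox transformation, and sigmoid compression, each of which must be invertible so that predictions can be mapped back. The following is a minimal sketch of those pieces in plain Python, not the patented procedure itself. The use of sample skewness as the bias test, the threshold `skew_threshold`, the Box-Cox parameter `lam`, and the sigmoid `scale` are all assumptions made for illustration; the excerpt does not state the actual criterion or parameter values.

```python
import math

def skewness(values):
    """Population skewness (third standardized moment) of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5

def boxcox(y, lam):
    """Box-Cox transform of a strictly positive value y."""
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam

def inv_boxcox(z, lam):
    """Inverse of boxcox for the same lam."""
    return math.exp(z) if lam == 0 else (lam * z + 1.0) ** (1.0 / lam)

def sigmoid(r, scale=1.0):
    """Compress a real value into (0, 1); scale is a hypothetical tuning knob."""
    return 1.0 / (1.0 + math.exp(-r / scale))

def inv_sigmoid(s, scale=1.0):
    """Inverse of sigmoid (scaled logit), defined for 0 < s < 1."""
    return scale * math.log(s / (1.0 - s))

def transform_labels(labels, skew_threshold=1.0, lam=0.0):
    """Apply Box-Cox only when the labels look skewed (assumed criterion).

    Returns the (possibly transformed) labels and a flag recording whether
    the transform was applied, so prediction can invert it later.
    """
    if abs(skewness(labels)) > skew_threshold and min(labels) > 0:
        return [boxcox(y, lam) for y in labels], True
    return list(labels), False

# Long-tailed toy labels: the transform is triggered and is exactly invertible.
labels = [1.0, 1.2, 1.5, 2.0, 2.5, 3.0, 50.0, 400.0]
transformed, applied = transform_labels(labels)
restored = [inv_boxcox(z, 0.0) for z in transformed] if applied else transformed
```

Recording the `applied` flag per stage is one way to support the staged inverse transformation that the abstract describes: at prediction time, each stage's output is mapped back with the inverse of whichever transform that stage used.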

Description

Technical field

[0001] The invention belongs to the technical field of data recognition, and in particular relates to an improved Boosting method based on multi-step label transformation for biased data in ensemble (integrated) learning.

Background technique

[0002] At present, the existing technologies commonly used in the industry are as follows. Boosting is a family of algorithms that can upgrade a weak learner to a strong learner, and it is an important representative branch of ensemble learning. The algorithms in this family share a similar working mechanism: first train a base learner from the initial training set, then train the next base learner according to the performance of the current one, and repeat until the number of base learners reaches a value T specified in advance. Finally, the T base learners are weighted and combined. At present, typical Boosting algorithms include Gradient Boosting Decision Tree (GBDT), Extreme Gradient Bo...
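The working mechanism described in the background (train a base learner, train the next one on how the current ensemble performs, stop at T learners, combine them with weights) can be sketched with one member of the family: gradient boosting with squared loss. This is an illustrative sketch only. The one-split regression stump, the 1-D inputs, the uniform weight `lr`, and all function names are assumptions for the example, not details taken from the patent.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump on 1-D inputs by least squares."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        thr = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, thr, lmean, rmean)
    if best is None:
        # All inputs identical: fall back to a constant predictor.
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]

def stump_predict(stump, x):
    thr, lmean, rmean = stump
    return lmean if x <= thr else rmean

def boost(xs, ys, T=20, lr=0.5):
    """Train T stumps, each fitted to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(T):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump_predict(stump, x) for p, x in zip(preds, xs)]
    return base, lr, stumps

def predict(model, x):
    """Weighted combination of the base value and all T stumps."""
    base, lr, stumps = model
    return base + lr * sum(stump_predict(s, x) for s in stumps)

# Toy step-function data: the ensemble approaches the two plateaus.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
model = boost(xs, ys, T=20, lr=0.5)
```

Here every learner receives the same weight `lr`; other Boosting variants assign a different weight to each base learner, which is why the background speaks of a weighted combination in general terms.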


Application Information

IPC(8): G06N99/00
CPC: G06N20/00
Inventor: 孙红光, 盛敏, 李伟民, 史琰, 李建东, 文娟, 张琰, 刘俊宇
Owner: XIDIAN UNIV