Model training method and apparatus based on gradient boosting decision tree

A decision tree and model training technology, applied in the field of information technology, that addresses insufficient accumulation of data, the resulting inability to obtain enough labeled samples, and the resulting inability to obtain a qualified model, with the effect of improving training data sufficiency.

Active Publication Date: 2021-10-26
ADVANCED NEW TECH CO LTD
Cites: 7; Cited by: 0

AI Technical Summary

Benefits of technology

The present patent provides a method and apparatus for training a model using the GBDT algorithm. The technique improves training data sufficiency by dividing the training process into two stages. In the first stage, decision trees are trained using labeled samples from a data domain similar to the target service scenario, and the training residual is determined. In the second stage, decision trees are trained using labeled samples from the target service scenario and the training residual from the first stage. The models trained in the first and second stages are then integrated to obtain the final model applied to the target service scenario. This approach allows for efficient training even when the data in the target scenario is insufficient. The patent improves machine learning technology and enables earlier and effective training of various prediction models, even if specific, contextual data has not been accumulated to a level sufficient for traditional model training purposes.

Problems solved by technology

If there are only a small number of labeled samples, it is usually impossible to obtain a qualified model.
However, in practice, there may be insufficient accumulation of data in some service scenarios.
Consequently, it can be impossible to obtain enough labeled samples from the data domain of a certain service scenario when a model designed for applying to the service scenario needs to be trained, and no qualified model can be obtained.

Embodiment Construction

[0019] The present disclosure is based on the transfer learning mechanism in the technical field of machine learning. When a model applied to a target service scenario needs to be obtained, and the data accumulated in the target service scenario is insufficient, data accumulated in a service scenario similar to the target service scenario can be used for model training. Illustratively, the similar service scenario and the target service scenario are associated with the same data features or with at least a threshold quantity of overlapping data features.
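
The similarity criterion in [0019] can be illustrated with a small check. This is only a sketch: the function name `is_similar_scenario`, the parameter `min_overlap`, and the feature names are hypothetical and do not come from the patent text, which does not state how the threshold is chosen.

```python
def is_similar_scenario(source_features, target_features, min_overlap):
    """Return True if two scenarios share identical features or enough overlapping ones.

    min_overlap is a hypothetical threshold; the source text only says
    "a threshold quantity of overlapping data features".
    """
    source_set = set(source_features)
    target_set = set(target_features)
    if source_set == target_set:
        return True  # same data features
    return len(source_set & target_set) >= min_overlap


# Hypothetical feature names, for illustration only.
source = ["age", "amount", "device", "region"]
target = ["age", "amount", "device", "merchant"]
similar = is_similar_scenario(source, target, min_overlap=3)
```

With the hypothetical lists above, three features overlap, so the source scenario would qualify under a threshold of 3.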

[0020] Specifically, the present disclosure combines the transfer learning idea with the GBDT algorithm and improves the GBDT algorithm flow. In the implementation of the present specification, based on the GBDT algorithm flow, data generated in a service scenario similar to a target service scenario is used for training. After a certain training suspension condition is met, the training is suspended and the current training residual is calculated; then, the...
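
The two-stage flow described so far can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes squared-error loss, one-dimensional inputs, depth-1 trees (stumps), a fixed tree count `n_stage1` as the suspension condition, and that stage two fits the residual of the frozen stage-one ensemble on the target samples. All function and parameter names are hypothetical.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree to the residuals by minimizing squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # candidate thresholds; the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = mean(left), mean(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    if best is None:  # all inputs identical: predict the mean residual everywhere
        m = mean(residuals)
        return (xs[0], m, m)
    return best[1:]  # (threshold, left value, right value)


def predict(trees, x, lr):
    """Sum the shrunken outputs of all stumps trained so far."""
    return sum(lr * (lv if x <= t else rv) for t, lv, rv in trees)


def boost(xs, ys, trees, n_new, lr):
    """Append n_new stumps, each fit to the current residual y - F(x)."""
    for _ in range(n_new):
        residuals = [y - predict(trees, x, lr) for x, y in zip(xs, ys)]
        trees.append(fit_stump(xs, residuals))
    return trees


def train_two_stage(source, target, n_stage1=20, n_stage2=10, lr=0.5):
    """Stage 1 on similar-scenario samples, stage 2 on target-scenario samples."""
    trees = []
    boost(*source, trees, n_stage1, lr)  # assumed suspension condition: tree count
    boost(*target, trees, n_stage2, lr)  # continue from the stage-one residual
    return trees  # the integrated model is the full list of trees


# Toy data (invented for illustration): many source samples, few target samples.
source = ([float(x) for x in range(10)], [2.0 * x for x in range(10)])
target = ([1.0, 4.0, 7.0], [7.0, 13.0, 19.0])
model = train_two_stage(source, target)
```

In this sketch the final model is simply the concatenation of the stage-one and stage-two trees, so prediction with `predict(model, x, lr)` uses both sets of trees together.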

Abstract

Disclosed are a model training method and apparatus based on gradient boosting decision tree (GBDT). A GBDT algorithm flow is divided into two stages. In the first stage, labeled samples are obtained from a data domain of a service scenario similar to a target service scenario to sequentially train several decision trees, and training residual generated after the training in the first stage is determined; in the second stage, labeled samples are obtained from a data domain of the target service scenario, and several decision trees continue to be trained based on the training residual. Finally, a model applied to the target service scenario is actually obtained by integrating the decision trees trained in the first stage with the decision trees trained in the second stage.
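
One way to write the two stages down, assuming squared-error loss and using symbols that are not in the original text ($N_1$ and $N_2$ for the number of trees in each stage, $f_k$ for the $k$-th decision tree, $(x_i, y_i)$ for a labeled sample from the target service scenario):

```latex
\begin{aligned}
F_{N_1}(x) &= \sum_{k=1}^{N_1} f_k(x)
  && \text{(stage one, trained on the similar scenario)} \\
r_i &= y_i - F_{N_1}(x_i)
  && \text{(training residual carried into stage two)} \\
F(x) &= F_{N_1}(x) + \sum_{k=N_1+1}^{N_1+N_2} f_k(x)
  && \text{(integrated model for the target scenario)}
\end{aligned}
```

Under this reading, the stage-two trees $f_{N_1+1}, \dots, f_{N_1+N_2}$ are fit successively, starting from the residual $r_i$, and the final model is the sum of all $N_1 + N_2$ trees.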

Description

BACKGROUND

Technical Field

[0001] Implementations of the present specification pertain to the field of information technologies, and in particular, to a model training method and apparatus based on gradient boosting decision tree (GBDT).

Description of the Related Art

[0002] Machine learning is an important branch of computer technologies. Many machine learning methods require proper training based on relevant data in particular context. In many cases, when a prediction model that is designed for applying to a certain service scenario needs to be trained, a large amount of data needs to be obtained from a data domain of this service scenario for labeling, so as to obtain labeled samples for model training. If there are only a small number of labeled samples, it is usually impossible to obtain a qualified model. It should be noted that a data domain of a certain service scenario is actually a set of service data generated based on the service scenario.

[0003] However, in practice, there may ...

Application Information

Patent Type & Authority: Patents (United States)
IPC(8): G06N5/00, G06K9/62, G06N20/20, G06N20/10, G06N20/00, G06N5/02
CPC: G06N5/003, G06K9/6256, G06N20/00, G06N20/10, G06N20/20, G06N5/02, G06N5/025, G06N5/027, G06F18/2155, G06F18/214, G06N5/01, G06F18/2148
Inventors: CHEN, CHAOCHAO; ZHOU, JUN
Owner: ADVANCED NEW TECH CO LTD