Layered sampling tree method and device for fitting variable joint distribution

A variable joint-distribution technology applied in the field of machine learning. It addresses the problem that the overall distribution of a generated sample set differs greatly from that of the target sample set and therefore cannot meet the needs of scene simulation, and it achieves the effect of improving simulation accuracy.

Pending Publication Date: 2022-02-15
度小满科技(北京)有限公司

AI Technical Summary

Problems solved by technology

[0003] The technical problem to be solved by the embodiments of the present invention is to provide a hierarchical sampling tree method, device, equipment and medium for fitting the joint distribution of variables. It addresses the problem in the prior art that, when the distribution of each feature is sampled sequentially, the overall distribution of the generated sample set differs greatly from that of the target sample set and therefore cannot meet the requirements of scene simulation.


Examples

Example 1

[0047] See Figure 1.

[0048] As shown in Figure 1, this embodiment provides a hierarchical sampling tree method for fitting the joint distribution of variables, which includes at least the following steps:

[0049] S1. Obtain all characteristic variables with values of 0-1 in the sample data set, arrange the characteristic variables according to the preset numbering sequence, and create a corresponding initial node structure;

[0050] S2. Traverse each sample in the sample data set, check the value of each feature of the sample according to the order of the characteristic variables, until all samples in the sample data set are checked, and generate a corresponding initial hierarchical sampling tree;

[0051] S3. Perform node correction on the initial hierarchical sampling tree until all nodes in the initial hierarchical sampling tree are traversed to obtain a corrected hierarchical sampling tree;

[0052] S4. A corresponding sample is generated each time throug...
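The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it uses a plain prefix tree in which each level corresponds to one variable in the preset order and each node counts how many samples took the value 0 or 1 at that level. The excerpt does not specify the node correction of S3, so that step is omitted here. The names `build_tree` and `draw_sample` are hypothetical.

```python
import random

def build_tree(samples):
    """S1-S2 (sketch): count the 0/1 prefixes of every sample in variable order."""
    root = {}
    for sample in samples:
        node = root
        for value in sample:
            # Each entry holds [count, child node] for one value (0 or 1).
            entry = node.setdefault(value, [0, {}])
            entry[0] += 1
            node = entry[1]
    return root

def draw_sample(root, k, rng):
    """S4 (sketch): walk down from the root, picking each value by its count share."""
    node, out = root, []
    for _ in range(k):
        values = list(node)
        weights = [node[v][0] for v in values]
        value = rng.choices(values, weights=weights)[0]
        out.append(value)
        node = node[value][1]
    return tuple(out)

# Assumed toy data: three 0-1 variables, four samples.
samples = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 0, 0)]
tree = build_tree(samples)
rng = random.Random(0)
generated = [draw_sample(tree, 3, rng) for _ in range(1000)]
```

Because every value is drawn conditionally on the path taken so far, this sketch can only generate combinations that occur in the input samples, which is the sense in which the joint distribution is preserved.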

Example 2

[0084] See Figure 2.

[0085] As shown in Figure 2, this embodiment provides a hierarchical sampling tree system for fitting the joint distribution of variables, including:

[0086] The initial node module 100 is used to obtain all characteristic variables with a value of 0-1 in the sample data set, and arrange the characteristic variables according to the preset numbering order to create a corresponding initial node structure;

[0087] The initial node module 100 obtains all the characteristic variables with a value of 0-1 in the sample data set. Assuming that the sample data has k such variables, they are arranged in a fixed numbering order, such as v_1, v_2, …, v_k, to create the corresponding initial node structure. Each layer of the sampling tree is a node containing the above k domains (fields), and each domain contains a child node pointer and a count field (converted into a proportion at the end).
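The node layout described in [0087] might be represented as below. This is only a sketch: the class names `Field` and `Node` and the helpers `make_node` and `proportions` are hypothetical, and normalising the counts within a single node is an assumption, since the excerpt does not say how counts become proportions.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Field:
    count: int = 0                   # raw count, converted into a proportion at the end
    child: Optional["Node"] = None   # pointer to the node one layer down

@dataclass
class Node:
    fields: List[Field]              # one field per variable v_1 ... v_k

def make_node(k: int) -> Node:
    """Create an empty node with k fields (assumed initial node structure)."""
    return Node(fields=[Field() for _ in range(k)])

def proportions(node: Node) -> List[float]:
    """Turn the counts of one node into proportions (assumed normalisation)."""
    total = sum(f.count for f in node.fields)
    return [f.count / total if total else 0.0 for f in node.fields]

node = make_node(3)
node.fields[0].count = 3
node.fields[2].count = 1
shares = proportions(node)  # [0.75, 0.0, 0.25]
```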

[0088] Th...

Abstract

The invention discloses a layered sampling tree method, device, equipment and medium for fitting the joint distribution of variables. The method comprises the steps of: obtaining all characteristic variables with 0-1 values in a sample data set, arranging the characteristic variables according to a numbering sequence, and creating an initial node structure; traversing the samples of the sample data set and checking the value of each feature of each sample according to the characteristic variable sequence until all samples of the sample data set are checked, thereby generating an initial layered sampling tree; performing node correction on the initial layered sampling tree until all nodes are traversed, to obtain a corrected layered sampling tree; and generating one sample at a time through the corrected layered sampling tree, repeating the sampling process until the required number of samples is generated, to obtain a sampled data set. Joint distribution information of multiple 0-1 variables in the sample data set can be captured efficiently in the fitting stage, and a simulated sample data set with the same joint distribution as the target sample set can be generated accurately in the subsequent inference stage.

Description

Technical field

[0001] The invention relates to the technical field of machine learning, in particular to a hierarchical sampling tree method, device, equipment and medium for fitting the joint distribution of variables.

Background

[0002] In order to better reflect the predictive performance of a model under a specific sample data distribution, it is often necessary to simulate and generate a sample set that conforms to a certain distribution law, and then evaluate the performance of the model on the basis of this sample set. In today's era of big data, a large number of the input variables of many models are sparse features with a value of 0-1, and these features are not independent. Under this condition, especially when the number of samples is relatively small compared to the number of feature combinations, the sample set generated by the conventional method, which samples each feature sequentially according to its own distribution, and the target sample set ...
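The problem described in the background can be illustrated with a small experiment. The sketch below uses made-up toy data (two perfectly correlated 0-1 features) and compares sampling each feature independently from its own marginal with sampling whole rows; all variable names are hypothetical and the data are not from the patent.

```python
import random

# Assumed toy target set: the two features always agree.
target = [(1, 1)] * 50 + [(0, 0)] * 50
rng = random.Random(0)

# Conventional approach: sample each feature from its own marginal frequency.
p = [sum(s[j] for s in target) / len(target) for j in range(2)]
independent = [
    tuple(int(rng.random() < p[j]) for j in range(2)) for _ in range(10000)
]

# Joint approach: sample whole rows, which keeps the dependence between features.
joint = [rng.choice(target) for _ in range(10000)]

# Share of generated samples whose two features disagree.
mixed_independent = sum(a != b for a, b in independent) / len(independent)
mixed_joint = sum(a != b for a, b in joint) / len(joint)
```

In the target set the two features never disagree. Row-wise sampling preserves that property, while per-feature sampling produces disagreeing pairs in roughly half of the draws even though each marginal frequency is matched.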


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06N20/00
CPC: G06N20/00
Inventors: 林熙东, 杨青
Owner: 度小满科技(北京)有限公司