Automatic regression diagnosis method on big data platform

An automatic regression diagnosis technique for big data platforms, applicable to structured data retrieval, database management systems, complex mathematical operations, and similar fields. It addresses the large time and cost burden of manual regression diagnosis and improves efficiency.

Active Publication Date: 2021-10-26
上海派拉软件股份有限公司
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Existing linear regression model diagnosis has to be done manually: professional statistical engineers must work through large amounts of data by hand, which costs a great deal of time and engineering expense.


Examples

Embodiment Construction

[0027] The present invention is further illustrated below with reference to specific embodiments. It should be understood that these examples only illustrate the present invention and are not intended to limit its scope. It should also be understood that, after reading the teachings of the present invention, those skilled in the art can make various changes or modifications to it, and these equivalent forms also fall within the scope defined by the appended claims of the present application.

[0028] Commonly used regression model evaluation indicators include:

[0029] R squared:

[0030] R squared measures what percentage of the variation in the dependent variable is explained by the independent variables through the regression equation. It is a commonly used indicator of how well the model fits. The value of R squared ranges from 0 to 1.
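As a minimal sketch of the indicator described in [0030], the snippet below computes R squared with the standard definition 1 - SS_res / SS_tot. The function name `r_squared` and the sample values are illustrative assumptions and do not come from the patent text. The 0 to 1 range holds for a least-squares fit with an intercept scored on its own training data; on held-out data the same formula can go below 0.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant (otherwise SS_tot is zero).
    """
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: variation the model fails to explain.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total sum of squares: variation of the dependent variable around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(y_true, [1.0, 2.0, 3.0, 4.0])    # predictions match exactly
mean_only = r_squared(y_true, [2.5, 2.5, 2.5, 2.5])  # always predicts the mean
```

A model that reproduces the data exactly scores at the top of the range, while a model that only ever predicts the mean scores at the bottom.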

[0031] Judgment criteria: For the regression model, the larger the value of R s...

Abstract

The invention relates to an automatic regression diagnosis method on a big data platform, characterized in that it comprises the following steps: importing data sources into the big data platform; randomly sampling the data imported into the big data platform; on the platform's nodes, applying the core algorithm to each sample obtained by random sampling to obtain the regression model corresponding to that sample; and cross-validating the obtained regression models to obtain the final regression model. With the method provided by the present invention, regression modeling can be carried out fully automatically: the work of regression diagnosis is completed directly by the algorithm, without a large investment in modeling cost, and the efficiency of regression modeling and diagnosis is improved.
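The four steps in the abstract can be sketched roughly as follows. This is an illustration under several assumptions, not the patented implementation: the "core algorithm" is not specified in the text shown here, so ordinary least squares on a single variable stands in for it; a plain loop stands in for the platform's nodes; and "cross-validation" is read as scoring each model on the samples it was not fitted on and keeping the best one. All names (`fit_line`, `auto_regression`, and so on) and the synthetic data are hypothetical.

```python
import random


def fit_line(points):
    """Ordinary least squares for y = a*x + b on a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def r_squared(model, points):
    """R squared of a fitted (a, b) model on a list of (x, y) pairs."""
    a, b = model
    mean_y = sum(y for _, y in points) / len(points)
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    return 1.0 - ss_res / ss_tot


def auto_regression(data, n_samples=5, sample_size=400, seed=0):
    rng = random.Random(seed)
    # Step 2: random sampling of the imported data (samples may overlap).
    samples = [rng.sample(data, sample_size) for _ in range(n_samples)]
    # Step 3: one regression model per sample; this loop stands in for the nodes.
    models = [fit_line(s) for s in samples]

    # Step 4: score each model on the samples it was NOT fitted on.
    def cv_score(i):
        others = [s for j, s in enumerate(samples) if j != i]
        return sum(r_squared(models[i], s) for s in others) / len(others)

    scores = [cv_score(i) for i in range(n_samples)]
    best = max(range(n_samples), key=scores.__getitem__)
    return models[best], scores[best]


# Step 1: synthetic data standing in for the imported data source (y = 3x + 2 + noise).
data_rng = random.Random(42)
xs = [data_rng.uniform(0.0, 10.0) for _ in range(2000)]
data = [(x, 3.0 * x + 2.0 + data_rng.gauss(0.0, 1.0)) for x in xs]

(a, b), score = auto_regression(data)
```

Selecting the single best-scoring model is only one way to combine the per-sample models; averaging their coefficients would be another reasonable reading of the abstract.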

Description

Technical field

[0001] The invention relates to a method for automatic model diagnosis of a linear regression model on a big data platform.

Background

[0002] In applied statistics, data analysis, and the now-popular field of big data analysis, the linear regression model has become one of the most classic and widely used models because its modeling process is simple and its results are easy to interpret. However, linear regression modeling cannot avoid the need for model diagnostics.

[0003] A regression model uses the independent variables and the corresponding dependent variable in historical data to establish a model of the form Y = a(1)*X(1) + a(2)*X(2) + a(3)*X(3) + ... + b, where Y is the dependent variable, X(1), X(2), X(3), ... are the independent variables, and different Xs represent different data dimensions, and use the model to predict the value of the corresponding depen...
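To make the model form in [0003] concrete, the sketch below fits the coefficients a(1), ..., a(k) and the intercept b by least squares and then predicts Y for a new observation. Solving the normal equations with a small Gaussian elimination is an assumption chosen to keep the example dependency-free; the text does not say how the patent estimates the coefficients. The helper names and the toy data (generated from Y = 2*X(1) - 1*X(2) + 0.5) are hypothetical.

```python
def solve(A, y):
    """Solve the square system A z = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z


def fit(X, Y):
    """Least-squares fit of Y = a(1)*X(1) + ... + a(k)*X(k) + b via the normal equations."""
    rows = [list(x) + [1.0] for x in X]  # the trailing 1.0 carries the intercept b
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    XtY = [sum(r[i] * y for r, y in zip(rows, Y)) for i in range(k)]
    *a, b = solve(XtX, XtY)
    return a, b


def predict(a, b, x):
    """Evaluate Y = a(1)*x(1) + ... + a(k)*x(k) + b for one observation x."""
    return sum(ai * xi for ai, xi in zip(a, x)) + b


# Historical data: two independent variables per row, one dependent value each.
X = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
Y = [2.0 * x1 - 1.0 * x2 + 0.5 for x1, x2 in X]

a, b = fit(X, Y)
y_new = predict(a, b, (6.0, 2.0))
```

Because the toy data is noise-free, the fit recovers the generating coefficients up to floating-point error; with real data the residuals are what the diagnostic indicators then examine.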

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/25, G06F17/18
CPC: G06F17/18
Inventor: 张毅骏, 张瑞瑞, 陈远猷, 张瀚潇
Owner: 上海派拉软件股份有限公司