Rochester model-naive Bayesian model-based data classification system

A Bayesian model and data classification technology, applied in special data processing applications, electrical digital data processing, instruments, etc., can solve problems such as poor prediction accuracy and poor model robustness, and achieves high accuracy and good robustness when evaluating solvency and creditworthiness.

Inactive Publication Date: 2010-06-02
HEFEI JOYIN INFORMATION TECH

AI Technical Summary

Problems solved by technology

The main advantages of the Rochester regression model are its robustness, its strong interpretability, and its ability to generate a linear scorecard. Its disadvantage is that its prediction accuracy is poor compared with some other models, such as neural networks and the naive Bayesian model.
The Naive Bayesian model is a


Embodiment Construction

[0014] The present invention will be described in detail below in conjunction with the accompanying drawings and embodiments.

[0015] As shown in Figure 1, the data classification system of the present invention includes a data processing module 1, a sample sampling module 2, a model building module 3, and a data testing module 4.

[0016] The main function of the data processing module 1 is to determine the sample stratification rules applicable to the hybrid dynamic model according to the degree to which the various sample variables are missing in the input original sample set. That is, the data processing module 1 calculates the missing value ratio of each sample variable in the original sample set, the correlation between the sample variables, and the type and distribution of each sample variable; when the missing value ratio of a certain type of sample variable exceeds a fixed threshold, and the correlation between the variables e...
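
The missing-value calculation described in [0016] can be illustrated with a minimal Python sketch. It is not the patent's implementation: the paragraph above is truncated, so the correlation, type, and distribution parts of the rule are left out entirely. The function names (`missing_ratios`, `stratify`), the threshold value 0.3, and the rule "a sample goes to the lacking layer if it is missing any variable whose missing value ratio exceeds the threshold" are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Sample = Dict[str, Optional[float]]  # None marks a missing value


def missing_ratios(samples: List[Sample]) -> Dict[str, float]:
    """Fraction of samples in which each variable is missing."""
    if not samples:
        return {}
    variables = sorted({name for s in samples for name in s})
    n = len(samples)
    return {
        v: sum(1 for s in samples if s.get(v) is None) / n
        for v in variables
    }


def stratify(
    samples: List[Sample], threshold: float = 0.3
) -> Tuple[List[Sample], List[Sample]]:
    """Split samples into (saturated, lacking) layers.

    Assumed rule (hypothetical): variables whose missing ratio exceeds
    `threshold` are treated as sparse, and any sample missing a sparse
    variable is placed in the lacking layer.
    """
    ratios = missing_ratios(samples)
    sparse = {v for v, r in ratios.items() if r > threshold}
    saturated: List[Sample] = []
    lacking: List[Sample] = []
    for s in samples:
        if any(s.get(v) is None for v in sparse):
            lacking.append(s)
        else:
            saturated.append(s)
    return saturated, lacking


# Toy data: "income" is missing in half of the samples, "age" in a quarter.
samples = [
    {"income": 5.0, "age": 30.0},
    {"income": None, "age": 41.0},
    {"income": None, "age": 25.0},
    {"income": 3.2, "age": None},
]
ratios = missing_ratios(samples)
saturated, lacking = stratify(samples, threshold=0.3)
```

Under this assumed rule a sample in the saturated layer can still miss a rarely absent variable (the last toy sample has no `age`), so a real system would need an imputation or filtering step that the visible text does not describe.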

Abstract

The invention relates to a Rochester model-naive Bayesian model-based data classification system, which comprises a data processing module, a sampling module, a modeling module and a data testing module. The data processing module divides an original sample set into a saturated layer and a lacking layer according to the missing value ratio of each sample variable in the input original sample set, the correlation among the sample variables, and the sample attributes. The sampling module randomly extracts training samples and testing samples from the saturated layer and the lacking layer to form a training sample set and a testing sample set, each of which comprises a saturated layer and a lacking layer. The modeling module models the training samples in the saturated layer with a Rochester regression model and the training samples in the lacking layer with a naive Bayesian model, obtaining a hybrid dynamic model that contains both the Rochester regression model and the naive Bayesian model. The data testing module inputs the testing samples in the saturated layer into the Rochester regression model of the hybrid dynamic model, inputs the testing samples in the lacking layer into the naive Bayesian model of the hybrid dynamic model, and performs a test to obtain and output scoring results. The system combines the functions of the Rochester regression model and the naive Bayesian model so that their advantages complement each other, and it can be widely applied in the financial, retail and telecommunications industries.
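
The modeling and testing steps of the abstract can be sketched in plain Python under several assumptions. The "Rochester regression model" is assumed here to be a logistic regression, which the linear scorecard description suggests but the text does not state. Features are simplified to binary values with `None` for missing ones, the naive Bayesian model skips missing features and uses Laplace smoothing, and a sample is routed to the lacking layer whenever any feature is missing. All function names are hypothetical, and this is an illustration rather than the patented method.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Row = Sequence[Optional[int]]  # binary features; None marks a missing value
NBModel = Tuple[Dict[int, float], Dict[int, List[List[float]]]]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_logistic(w: List[float], x: Row) -> float:
    """P(y=1 | x) for complete rows; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X: List[Row], y: List[int],
                 lr: float = 0.5, epochs: int = 2000) -> List[float]:
    """Batch gradient descent on the logistic log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_logistic(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def fit_naive_bayes(X: List[Row], y: List[int], n_values: int = 2) -> NBModel:
    """Categorical naive Bayes; missing features are not counted."""
    classes = (0, 1)
    prior = {c: (sum(1 for yi in y if yi == c) + 1) / (len(y) + 2)
             for c in classes}
    # Start every count at 1 (Laplace smoothing).
    counts = {c: [[1.0] * n_values for _ in X[0]] for c in classes}
    for xi, yi in zip(X, y):
        for j, v in enumerate(xi):
            if v is not None:
                counts[yi][j][v] += 1
    likelihood = {c: [[cnt / sum(row) for cnt in row] for row in counts[c]]
                  for c in classes}
    return prior, likelihood


def predict_naive_bayes(model: NBModel, x: Row) -> float:
    """P(y=1 | observed features of x); missing features are skipped."""
    prior, likelihood = model
    log_post = {}
    for c in prior:
        lp = math.log(prior[c])
        for j, v in enumerate(x):
            if v is not None:
                lp += math.log(likelihood[c][j][v])
        log_post[c] = lp
    m = max(log_post.values())
    expd = {c: math.exp(lp - m) for c, lp in log_post.items()}
    return expd[1] / (expd[0] + expd[1])


def score_sample(x: Row, w: List[float], nb: NBModel) -> float:
    """Assumed routing: rows with a missing value go to naive Bayes."""
    if any(v is None for v in x):
        return predict_naive_bayes(nb, x)
    return predict_logistic(w, x)


# Toy training data for each layer (labels follow the first feature).
X_sat, y_sat = [[0, 0], [0, 1], [1, 0], [1, 1]], [0, 0, 1, 1]
X_lack = [[0, None], [None, 0], [1, None], [None, 1], [1, 1], [0, 0]]
y_lack = [0, 0, 1, 1, 1, 0]

w = fit_logistic(X_sat, y_sat)
nb = fit_naive_bayes(X_lack, y_lack)
scores = [score_sample(x, w, nb) for x in ([1, 1], [0, 0], [1, None])]
```

Each returned score is a probability of the positive class. Converting it to a scorecard scale, as credit scoring systems commonly do, is outside what the abstract specifies and is not attempted here.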

Description

Technical field

[0001] The invention relates to a data classification system, in particular to a data classification system based on a Rochester model-naive Bayesian model.

Background technique

[0002] Data mining is more and more widely used in the financial, retail and telecommunications industries. In the financial field, managers can use data mining to analyze, classify and grade customers' repayment ability and credit, thereby reducing the cost and blindness of issuing loans, improving the ratio structure of bank assets and liabilities across various types of investment products, improving the efficiency of capital use, and optimizing the asset structure. At the same time, it can also discover the leading factors and key links that play a decisive role in various capital operation businesses, so as to formulate corresponding financial policies. In the retail industry, data mining can help identify customer buying behavior, discover customer buying patterns and tren...

Application Information

IPC(8): G06F17/30
Inventor: 尹留志
Owner: HEFEI JOYIN INFORMATION TECH