Approximate processing method and system for maximum information coefficient between single variable and multiple variables

A maximum information coefficient approximate processing technology, applied in the fields of electrical digital data processing, special data processing applications, digital data information retrieval, etc., with the stated effect of being simple.

Inactive Publication Date: 2019-11-08
XIDIAN UNIV

AI Technical Summary

Problems solved by technology

[0016] Existing techniques for calculating the maximum information coefficient cannot calculate it between a single variable and multiple variables, yet in real large data sets, correlations between a single variable and multiple variables are widespread.



Embodiment Construction

[0034] In order to make the object, technical solution and advantages of the present invention clearer, the present invention is described in further detail below in conjunction with the examples. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it.

[0035] The term "multiple variables" as used in the present invention is defined in contrast to the correlation between one single variable and another single variable: the invention mainly investigates the correlation between a single variable and multiple variables. The maximum information coefficient as used in the present invention is a statistic that measures the degree of correlation between a single variable and multiple variables.

[0036] In existing data processing, the statistical definition of the maximum information coefficient does not consider the correlation between a single variable and multiple variables. The existing algo...
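For reference, the existing two-variable maximum information coefficient is commonly written as below. This is the standard formulation from the literature, not text of this patent; the symbols $D$ (a data set of $n$ sample pairs), $G$ (a grid with $x$ columns and $y$ rows), $I(D|_G)$ (the mutual information of the distribution that $D$ induces on the cells of $G$), and the bound $B(n)$ are introduced here for illustration only.

```latex
\mathrm{MIC}(D) \;=\; \max_{x \cdot y \,<\, B(n)}
  \frac{\displaystyle \max_{G \in \mathcal{G}(x,\,y)} I\!\left(D|_{G}\right)}
       {\log \min(x,\, y)},
\qquad B(n) = n^{0.6} \ \text{(a common default)}
```

The definition ranges over grids on a two-dimensional scatter plot only, which is why it does not directly apply to one variable against several.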



Abstract

The invention belongs to the technical field of data mining, and discloses a method and a system for calculating the maximum information coefficient between a single variable and multiple variables in a big data set, used for calculating the maximum information coefficient between a single variable Y and m variables (X1, X2, ..., Xm). The method comprises: searching for the optimal grid division of the (m + 1)-dimensional space; fixing the division of the m variables (X1, X2, ..., Xm) by using the maximum information coefficient algorithm for a single variable and m-1 variables; dividing the variable Y to find a more suitable division for the variable Y and the m variables (X1, X2, ..., Xm); and calculating the normalized maximum mutual information value, which serves as the maximum information coefficient value. This solves the problem that existing methods for calculating the maximum information coefficient cannot be applied between a single variable and multiple variables; with the approximate processing method and system, the maximum information coefficient value between a single variable and multiple variables can be calculated.
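The idea of scoring a grid over the (m + 1)-dimensional space by normalized mutual information can be sketched as follows. This is a minimal illustration, not the patented procedure: it replaces the recursive grid search with a brute-force scan over equal-frequency grids, and the function names (`equal_frequency_bins`, `grid_score`, `approx_mic`), the normalization by `log(min(y_bins, x_bins**m))`, and the grid-size bound `n**0.6` (borrowed from the two-variable MIC) are all assumptions made for the example.

```python
import numpy as np

def equal_frequency_bins(values, n_bins):
    """Assign each value to one of n_bins roughly equal-count bins, by rank."""
    ranks = np.argsort(np.argsort(values, kind="stable"), kind="stable")
    return (ranks * n_bins) // len(values)

def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two discrete label arrays."""
    _, a_idx = np.unique(labels_a, return_inverse=True)
    _, b_idx = np.unique(labels_b, return_inverse=True)
    joint = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(joint, (a_idx, b_idx), 1.0)   # contingency counts
    joint /= len(labels_a)                  # joint probabilities
    pa = joint.sum(axis=1, keepdims=True)   # marginal of a, shape (A, 1)
    pb = joint.sum(axis=0, keepdims=True)   # marginal of b, shape (1, B)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])))

def grid_score(y, X, y_bins, x_bins):
    """Normalized MI between Y and the joint grid cell of (X1..Xm)."""
    m = X.shape[1]
    cell = np.zeros(len(y), dtype=np.int64)
    for j in range(m):
        # Combine the per-variable bin indices into one joint cell label.
        cell = cell * x_bins + equal_frequency_bins(X[:, j], x_bins)
    mi = mutual_information(equal_frequency_bins(y, y_bins), cell)
    # Assumed normalization: MI cannot exceed log of the smaller side.
    return mi / np.log(min(y_bins, x_bins ** m))

def approx_mic(y, X, max_bins=4):
    """Best normalized MI over a small family of equal-frequency grids."""
    n, m = X.shape
    best = 0.0
    for y_bins in range(2, max_bins + 1):
        for x_bins in range(2, max_bins + 1):
            if y_bins * x_bins ** m > n ** 0.6:  # assumed grid-size bound
                continue
            best = max(best, grid_score(y, X, y_bins, x_bins))
    return best

rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(2000, 2))
print("dependent:  ", approx_mic(X_demo[:, 0] + X_demo[:, 1], X_demo))
print("independent:", approx_mic(rng.uniform(size=2000), X_demo))
```

A brute-force scan like this grows exponentially in m, which is one reason an approximate search that fixes the division of the m variables before dividing Y is attractive.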

Description

Technical field

[0001] The invention belongs to the technical field of data mining, and in particular relates to an approximate processing method and system for the maximum information coefficient between a single variable and multiple variables.

Background technique

[0002] Because things are universally connected, mining the correlation between variables from data is a basic task with a wide range of applications, such as the correlation between genes, the correlation between genes and cancer, the correlation between population growth and birth rate, the relationship between sex, diet structure and birth rate, and the correlation between diet structure and cancer incidence jointly affecting population growth.

[0003] At present, the statistics used to measure the degree of correlation between variables mainly include the following types. Pearson correlation, which measures how well two variables fit a line, can only detect the statistical line...
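The point in [0003] that Pearson correlation measures fit to a line can be illustrated with a small sketch. The data below are synthetic and chosen for the example; they do not come from the patent.

```python
import numpy as np

# Symmetric sample of x so that the quadratic case has no linear trend.
x = np.linspace(-1.0, 1.0, 201)

y_linear = 2.0 * x + 1.0   # exact linear dependence on x
y_quadratic = x ** 2       # exact, but nonlinear, dependence on x

r_linear = np.corrcoef(x, y_linear)[0, 1]
r_quadratic = np.corrcoef(x, y_quadratic)[0, 1]

print(f"Pearson r, linear:    {r_linear:.3f}")
print(f"Pearson r, quadratic: {r_quadratic:.3f}")
```

Both `y_linear` and `y_quadratic` are fully determined by `x`, yet only the linear case yields a Pearson coefficient near 1; the quadratic case comes out near 0, which is the kind of relationship a grid-based statistic such as the maximum information coefficient is meant to capture.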


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/2458
CPC: G06F16/2462
Inventor: 张军英, 王月, 杨利英
Owner: XIDIAN UNIV