Automated data mining preprocessing method

A data preprocessing and data mining technology, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses the lack of feedback and parameter control in existing methods, which leaves preprocessing quality unguaranteed, and achieves rich information, strong adaptability, and strong preprocessing capability.

Active Publication Date: 2016-03-30
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

Existing approaches can automate data preprocessing to a certain extent by relying on historical databases, but they lack feedback and parameter control, so the quality of preprocessing cannot be guaranteed.




Embodiment Construction

[0045] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not constitute a conflict with each other.

[0046] As shown in Figure 1, the automated data mining preprocessing method of the present invention comprises the following steps:

[0047] Step 1: Establish a database and a preprocessing rule base; create a new data table in the database with a standardized name; sample the data to be preprocessed and import the sample into the new table; and at the same time check the value o...
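The text of Step 1 is truncated here, so the following is only a minimal sketch of the parts that are stated: creating a new table, importing a sample, and (per the Abstract) computing statistics for each field. The use of SQLite, the function names `import_sample` and `field_statistics`, the table name, the numeric column type, and the particular statistics are all assumptions made for illustration, not details taken from the patent.

```python
import random
import sqlite3
import statistics


def import_sample(conn, table, rows, fraction, seed=0):
    """Create `table` and load a random sample of `rows` into it.

    `rows` is a non-empty list of dicts that share the same numeric fields
    (an assumption of this sketch). Returns the list of column names.
    """
    rng = random.Random(seed)
    k = min(len(rows), max(1, round(len(rows) * fraction)))
    sample = rng.sample(rows, k)

    columns = list(rows[0].keys())
    col_sql = ", ".join(f'"{c}" REAL' for c in columns)
    conn.execute(f'CREATE TABLE "{table}" ({col_sql})')

    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f'INSERT INTO "{table}" VALUES ({placeholders})',
        [tuple(r[c] for c in columns) for r in sample],
    )
    return columns


def field_statistics(conn, table, columns):
    """Compute simple per-field statistics over the sampled table."""
    stats = {}
    for c in columns:
        cursor = conn.execute(f'SELECT "{c}" FROM "{table}"')
        values = [v for (v,) in cursor if v is not None]
        if not values:
            stats[c] = {"count": 0}
            continue
        stats[c] = {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": statistics.fmean(values),
        }
    return stats


if __name__ == "__main__":
    # Hypothetical input data, used only to exercise the sketch.
    data = [{"age": float(i), "income": float(i * 100)} for i in range(1, 11)]
    conn = sqlite3.connect(":memory:")
    cols = import_sample(conn, "pp_customer_sample", data, fraction=0.5)
    print(field_statistics(conn, "pp_customer_sample", cols))
```

Sampling before import keeps the statistics pass cheap on large inputs; a fixed seed is used here only so that repeated runs pick the same sample.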


Abstract

The present invention discloses an automated data mining preprocessing method. The method comprises: establishing a database and a preprocessing rule base; creating a new data table in the database and naming it in a standardized way; sampling the data to be preprocessed, importing the sampled data into the new data table, and at the same time computing statistics on the values of each field of the sampled data; extracting keywords A, B and C of the data table and querying whether these keywords exist in the preprocessing rule base; if not, adding the keywords of the data table and all of its fields to the preprocessing rule base, then processing all the data to be preprocessed with a binning method and a data smoothing method to generate a new rule, and adding the new rule to the original rule base. By scoring the preprocessing result and feeding the score back, the method adjusts a field mapping function so as to improve preprocessing quality.
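The rule-base lookup and the binning and smoothing step described above can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the abstract does not say which binning or smoothing variant is used, so the sketch picks equal-frequency binning with smoothing by bin means, and the rule-base layout (a dict keyed by the keyword tuple) and the names `smooth_by_bin_means` and `preprocess_table` are hypothetical. The scoring and feedback step is not covered.

```python
from statistics import fmean


def smooth_by_bin_means(values, n_bins):
    """Equal-frequency binning, then replace each value by its bin mean."""
    if not values:
        return []
    # Indices of the values in ascending order, so results map back
    # to the original positions.
    order = sorted(range(len(values)), key=lambda i: values[i])
    smoothed = [0.0] * len(values)
    size, extra = divmod(len(values), n_bins)
    start = 0
    for b in range(n_bins):
        end = start + size + (1 if b < extra else 0)
        idx = order[start:end]
        if idx:
            mean = fmean(values[i] for i in idx)
            for i in idx:
                smoothed[i] = mean
        start = end
    return smoothed


def preprocess_table(rule_base, keywords, fields, data, n_bins=3):
    """Look up the table's keywords; register a new rule if none exists.

    `rule_base` is a dict keyed by the keyword tuple (assumed layout).
    `data` maps each field name to a list of numeric values.
    """
    key = tuple(keywords)
    rule = rule_base.get(key)
    if rule is None:
        # No matching rule: record the keywords and fields, and store
        # the parameters that were used as the new rule.
        rule = {"fields": list(fields), "method": "bin_means", "n_bins": n_bins}
        rule_base[key] = rule
    return {f: smooth_by_bin_means(data[f], rule["n_bins"]) for f in rule["fields"]}


if __name__ == "__main__":
    rules = {}
    table = {"price": [4, 8, 15, 21, 21, 24, 25, 28, 34]}
    print(preprocess_table(rules, ("A", "B", "C"), ["price"], table))
    print(rules)
```

On a second call with the same keywords, the stored rule is reused rather than regenerated, which is the point of keeping a rule base.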

Description

Technical field

[0001] The invention belongs to the field of data mining, and more specifically relates to an automated data mining preprocessing method.

Background

[0002] In engineering applications of data mining, data preprocessing often accounts for 80% or more of the work. Scholars have studied data mining methods extensively and achieved certain results. However, data mining preprocessing, and in particular how to automate it, remains an open problem. At present, some enterprises and research institutions have proposed data mining preprocessing methods.

[0003] For example, Chinese invention CN200910236744.2 proposes a method, system and device for data preprocessing in a data mining system, wherein data preprocessing corresponds to multiple preprocessing methods with a set execution order. The main technical solution includes: determining the current preprocessing method corresponding to data preprocessing; when it is determined th...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/211, G06F16/219
Inventor: 莫益军, 尹强, 廖振松
Owner: HUAZHONG UNIV OF SCI & TECH