A key feature extraction method based on improved minimum spanning tree for high-dimensional data

Key-feature and high-dimensional-data technology, applied to electrical digital data processing, special data processing applications, instruments, etc. It addresses three problems: high-dimensional sample data are not preprocessed, the application background of the data set is not considered, and the sample data cannot be mined and analyzed.

Inactive Publication Date: 2018-12-28
WUHAN UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0005] However, the above-mentioned patents have two major flaws: ① The problem of preprocessing high-dimensional sample data is not solved. When the sample data contain missing values and outliers, it is very likely that they cannot be used directly in data mining and analysis, which reduces the analyzability of the data; ② the application background of the data set is not considered, whereas the sample data processing algorithm needs to be analyzed and adjusted according to the actual data set in order to obtain satisfactory feature extraction results.



Examples


Embodiment 1

[0070] The key feature extraction method of high-dimensional data based on the improved minimum spanning tree includes the following steps:

[0071] Step 1. Preprocessing the hot-rolled strip data, including: data cleaning, data integration and discretization of continuous attributes.

[0072] 1. Data cleaning

[0073] Abnormal values in the hot rolling process data are found with the upper and lower limit search method, since each attribute has its own reasonable range in the actual production process. Reasonable upper and lower limits are set for the value of each characteristic attribute according to prior knowledge of the actual production process, and data beyond this range are considered outliers. Since the data samples containing missing values and outliers account for a very small proportion of the data set, and the data of each strip are independent of each other, the operation of directly delet...
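The upper and lower limit search described above can be illustrated with a minimal sketch. The attribute names and limit values below are hypothetical, chosen only for illustration. The sketch also assumes that samples with a missing or out-of-range value are dropped, which is how the truncated sentence above appears to continue.

```python
from typing import Dict, List, Optional, Tuple

Sample = Dict[str, Optional[float]]
Limits = Dict[str, Tuple[float, float]]

# Hypothetical attributes and limits; real limits would come from
# prior knowledge of the production process.
LIMITS: Limits = {
    "thickness_mm": (1.0, 20.0),
    "finishing_temp_c": (800.0, 950.0),
}


def is_clean(sample: Sample, limits: Limits) -> bool:
    """Return True if every limited attribute is present and in range."""
    for name, (low, high) in limits.items():
        value = sample.get(name)
        if value is None:
            return False  # missing value
        if not (low <= value <= high):
            return False  # outlier (NaN also fails this comparison)
    return True


def clean(samples: List[Sample], limits: Limits) -> List[Sample]:
    """Keep only samples without missing values or outliers (assumed policy)."""
    return [s for s in samples if is_clean(s, limits)]


samples: List[Sample] = [
    {"thickness_mm": 3.2, "finishing_temp_c": 880.0},
    {"thickness_mm": 3.1, "finishing_temp_c": None},    # missing value
    {"thickness_mm": 55.0, "finishing_temp_c": 870.0},  # out of range
]
cleaned = clean(samples, LIMITS)
print(len(cleaned))
```

Dropping whole samples is only reasonable under the condition stated in the text: affected samples are rare and the strips are independent of each other.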

Embodiment 2

[0136] The algorithm flow of the present invention is as follows:

[0137] Step 1. Initialize, define input and output data sets, complete data cleaning, data integration and attribute discretization;

[0138] Step 2. Remove irrelevant attributes;

[0139] Step 3. Construct a minimum spanning tree;

[0140] Step 4. Complete the segmentation of the minimum spanning tree, and select key (representative) features based on the correlation measure between attributes.
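Steps 2 to 4 can be sketched as follows, under several assumptions that the text above does not confirm. The sketch assumes that:

- the correlation measure is a precomputed score per feature against the target (`su_c`) and per feature pair (`su_ff`);
- irrelevant attributes are those whose score against the target does not exceed a hypothetical `threshold`;
- the tree is a plain minimum spanning tree built with Kruskal's algorithm;
- a tree edge is cut when its weight is smaller than both endpoint-to-target scores.

The patent's "improved" construction is not reproduced here.

```python
from typing import Dict, FrozenSet, List, Tuple


def find(parent: Dict[str, str], x: str) -> str:
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def select_features(
    su_c: Dict[str, float],
    su_ff: Dict[FrozenSet[str], float],
    threshold: float,
) -> List[str]:
    # Step 2 (assumed rule): drop features weakly related to the target.
    relevant = sorted(f for f, v in su_c.items() if v > threshold)

    # Step 3: minimum spanning tree over the complete feature graph (Kruskal).
    edges: List[Tuple[float, str, str]] = sorted(
        (su_ff[frozenset((a, b))], a, b)
        for i, a in enumerate(relevant)
        for b in relevant[i + 1:]
    )
    parent = {f: f for f in relevant}
    tree: List[Tuple[float, str, str]] = []
    for w, a, b in edges:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[ra] = rb
            tree.append((w, a, b))

    # Step 4 (assumed rule): cut an edge when both endpoints relate more
    # strongly to the target than to each other; keep the other edges.
    parent = {f: f for f in relevant}
    for w, a, b in tree:
        if not (w < su_c[a] and w < su_c[b]):
            parent[find(parent, a)] = find(parent, b)

    # One representative per subtree: the feature most related to the target.
    best: Dict[str, str] = {}
    for f in relevant:
        root = find(parent, f)
        if root not in best or su_c[f] > su_c[best[root]]:
            best[root] = f
    return sorted(best.values())


# Hypothetical scores for four features A..D.
su_c = {"A": 0.8, "B": 0.7, "C": 0.6, "D": 0.05}
su_ff = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.95,
    frozenset(("B", "C")): 0.3,
}
selected = select_features(su_c, su_ff, threshold=0.1)
print(selected)
```

In this toy input, D is dropped as irrelevant, the B-C edge is cut, and A represents the subtree {A, B} because it has the higher score against the target.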

[0141] First, make the following variable assumptions: let TR be the intermediate variable for calculating the symmetric uncertainty between a feature attribute F_i and the target attribute C; let FC be the intermediate variable for calculating the symmetric uncertainty between two feature attributes F_i and F_j; for the subtree T_i, the feature attribute with the largest symmetric uncertainty value with respect to the target attribute C; k is the number of nodes in the conne...
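For reference, symmetric uncertainty between two discrete variables is commonly defined as SU(X, Y) = 2·I(X; Y) / (H(X) + H(Y)), where H is the Shannon entropy and I is the mutual information. The sketch below computes it for already-discretized attribute columns. The function names are illustrative, and the zero-entropy convention (returning 0) is an assumption rather than something the patent text states.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a discrete sample."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def symmetric_uncertainty(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), in the range [0, 1]."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0.0:
        return 0.0  # assumed convention when both variables are constant
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy    # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


feature = [0, 0, 1, 1]
same = [0, 0, 1, 1]
unrelated = [0, 1, 0, 1]
su_same = symmetric_uncertainty(feature, same)
su_unrelated = symmetric_uncertainty(feature, unrelated)
print(su_same, su_unrelated)
```

A value near 1 means one attribute almost fully determines the other, and a value near 0 means they are close to independent. This is why the measure can be used both for feature-to-target relevance (TR) and for feature-to-feature redundancy (FC).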


Abstract

The invention provides a method for extracting key features of high-dimensional data based on an improved minimum spanning tree, comprising the following steps: 1, preprocessing hot-rolled strip data, including data cleaning, data integration and discretization of continuous attributes; 2, removing irrelevant features; 3, constructing a minimum spanning tree; 4, segmenting the minimum spanning tree, removing redundant attributes and extracting key feature variables. The invention effectively avoids the problem of redundant operation failure caused by single points and multiple divergences, and markedly improves the efficiency of extracting the key feature attributes that affect the finishing rolling temperature, thereby improving the modeling accuracy of the finishing rolling temperature and the reliability of rolling control.

Description

Technical field

[0001] The invention relates to a method for extracting key features of high-dimensional data based on an improved minimum spanning tree, and belongs to the field of high-dimensional data mining.

Background art

[0002] In modern industrial production, a large number of production objects have complex industrial characteristics, such as drastic changes in working conditions, strong nonlinearity, strong coupling, time-varying parameters, and mathematical models that are difficult to describe accurately. Existing control methods cannot adapt to frequently changing working conditions and rely too heavily on the accuracy of the controlled-object model, which can easily lead to low control accuracy and poor tracking of given signals, so they cannot fully meet the needs of modern industrial production. However, a large amount of industri...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 刘斌, 黄卫华, 王昳晗, 蒋峥
Owner: WUHAN UNIV OF SCI & TECH