Improved association rule report data mining method based on mutual exclusion expression

A technique for mining report data using mutual exclusion relationships, applied in the field of data science, that addresses problems such as insufficient computing memory, high computing time cost, and low computing efficiency.

Inactive Publication Date: 2020-06-19
HARBIN INST OF TECH

AI Technical Summary

Problems solved by technology

[0005] As the scale of data used for association rule mining keeps growing, traditional association rule mining methods struggle with structured data reports that have been converted into transaction data by threshold division, running short of computing memory and incurring high computing time costs.
The purpose of the present invention is to overcome the low calculation efficiency that existing association rule mining methods suffer when such converted reports contain too many mutually exclusive data items, and to provide an improved association rule mining method based on mutual exclusion expression that improves calculation efficiency and mines the implicit associations between data indicators.



Examples


Specific embodiment

[0057] The teaching quality data of colleges and universities is structured, multi-dimensional data. Indicators such as the number of students, the quality of the teaching staff, school-running conditions, and the school's financial bonuses are implicitly correlated, so the teaching quality data calls for association rule mining and analysis.

[0058] Step 1: Preprocess the structured data to be processed (Table 1) and convert it uniformly into Boolean transaction data (Table 2). Then group the data items according to the mutually exclusive relationships between data indicators to obtain a binary sparse matrix with group labels.
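Step 1 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function name `to_transactions`, the indicator names, and the threshold values are all hypothetical, and the sketch assumes that the Boolean items derived from one indicator form one mutually exclusive group, since a value falls into exactly one threshold range.

```python
from bisect import bisect_right

def to_transactions(rows, thresholds):
    """Convert numeric rows into Boolean item columns with group labels.

    rows: list of dicts mapping indicator name -> numeric value.
    thresholds: dict mapping indicator name -> sorted list of cut points
                (hypothetical values; the source does not list them).
    Returns (items, groups, matrix): the item names, the group label of
    each item, and a 0/1 matrix with one row per input row.
    """
    items, groups = [], []
    offset = {}  # indicator -> index of its first item column
    for name, cuts in thresholds.items():
        offset[name] = len(items)
        # n cut points split the value range into n + 1 bins
        for level in range(len(cuts) + 1):
            items.append(f"{name}_bin{level}")
            groups.append(name)  # items of one indicator share a group

    matrix = []
    for row in rows:
        bits = [0] * len(items)
        for name, cuts in thresholds.items():
            level = bisect_right(cuts, row[name])  # bin the value falls in
            bits[offset[name] + level] = 1
        matrix.append(bits)
    return items, groups, matrix

# Illustrative data only, not taken from the patent's tables.
rows = [
    {"students": 12000, "funding": 3.5},
    {"students": 4000, "funding": 0.8},
    {"students": 25000, "funding": 3.5},
]
thresholds = {"students": [5000, 20000], "funding": [1.0]}
items, groups, matrix = to_transactions(rows, thresholds)
```

Because every row sets exactly one bit per group, two items with the same group label can never appear together in a transaction, which is the property the later mining steps can exploit.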

[0059] From the original structured data in Table 1, based on each indicator, select an appropriate threshold and convert it to the corresponding tr...


Abstract

The invention discloses an improved association rule report data mining method based on mutual exclusion expression. It relates to knowledge discovery and data mining in the field of data science, and addresses the high memory consumption and low efficiency of traditional association rule algorithms when they process massive data. The method comprises three steps: 1, converting the data into transaction data based on data threshold ranges, and obtaining a binary sparse matrix with group labels based on the data logic; 2, obtaining the set of all frequent 1-itemsets and removing the infrequent itemsets to obtain a new grouping result; and 3, iteratively self-joining the frequent itemsets to search for larger frequent itemsets, pruning the candidate itemsets at each iteration until no new frequent itemset can be generated, thereby obtaining the association rule mining result. The basic idea of the method is to convert structured data into transaction data, generate groups based on mutual exclusion relationships, and then perform rule mining, which reduces computing memory and improves computing efficiency. The method is widely applicable and has high social and economic value.
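Steps 2 and 3 resemble an Apriori-style search, and a hedged Python sketch of that reading follows. The function name `mine_frequent_itemsets`, the `group_of` mapping, and the sample transactions are hypothetical. The sketch also assumes that the mutual exclusion groups are used to discard any candidate holding two items from the same group before its support is counted; the abstract does not spell out the exact pruning rule.

```python
from itertools import combinations

def mine_frequent_itemsets(transactions, group_of, min_support):
    """Apriori-style search that skips mutually exclusive candidates.

    transactions: list of sets of item names.
    group_of: dict mapping item name -> mutual-exclusion group label.
    min_support: minimum absolute number of supporting transactions.
    Returns a dict mapping frozenset(itemset) -> support count.
    """
    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    # Step 2: keep only the frequent single items.
    current = {}
    for item in sorted({i for t in transactions for i in t}):
        single = frozenset([item])
        count = support(single)
        if count >= min_support:
            current[single] = count
    frequent = dict(current)

    # Step 3: self-join, prune, count; stop when nothing new is frequent.
    k = 2
    while current:
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) != k:
                continue
            # Assumed mutual-exclusion pruning: at most one item per group.
            if len({group_of[i] for i in union}) < k:
                continue
            # Classic Apriori pruning: every (k-1)-subset must be frequent.
            if all(frozenset(s) in current for s in combinations(union, k - 1)):
                candidates.add(union)
        current = {}
        for cand in candidates:
            count = support(cand)
            if count >= min_support:
                current[cand] = count
        frequent.update(current)
        k += 1
    return frequent

# Illustrative transactions only, not taken from the patent.
transactions = [
    {"A_hi", "B_hi"},
    {"A_hi", "B_hi"},
    {"A_lo", "B_hi"},
    {"A_lo", "B_lo"},
]
group_of = {"A_hi": "A", "A_lo": "A", "B_hi": "B", "B_lo": "B"}
result = mine_frequent_itemsets(transactions, group_of, min_support=2)
```

In this sample the pair of `A_hi` and `A_lo` is rejected by the group check alone, without a pass over the transactions, which is where the saving in time and memory would come from under the stated assumption.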

Description

Technical field

[0001] The invention relates to knowledge discovery and data mining methods in the field of data science, and in particular to an improved association rule report data mining method based on mutual exclusion expression.

Background

[0002] With the explosive growth of data volume in the information age, people have found that massive data hides enormous value. For structured data, traditional or modern data mining methods can give good results. In association rule mining, however, structured data reports are difficult to convert into transaction data (datasets in which only the Boolean values True and False indicate whether an event occurs), and the data scale keeps growing, so data mining methods that used to extract valuable information now face huge challenges with massive data. At present, dividing structured data reports into transaction...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/2458
CPC: G06F2216/03, G06F16/2465
Inventors: 沈毅, 赵虹博, 张淼
Owner HARBIN INST OF TECH