A big data classification method and system

A classification method and classification-system technology, applied in the field of big data analysis, that addresses problems such as poor interpretability, unstable prediction accuracy, and high model complexity, and achieves the effect of reducing memory pressure and computation pressure.

Active Publication Date: 2017-04-12
INST OF COMPUTING TECH CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

[0024] To solve the above problems, the purpose of the present invention is to provide a hypersurface-based big data classification method and system that addresses the shortcomings of the above-mentioned prior art: unstable prediction accuracy, high computational cost, slow speed, high model complexity, difficulty of interpretation, and inability to handle massive data.

Examples

Embodiment

[0062] The big data classification method of the present invention is illustrated below, taking as an example training samples composed of the corresponding pixels of pictures of the 26 English letters.

[0063] In the present invention, the input data takes the following form. The training data contains class labels and the test data does not (depending on specific requirements, data of various kinds can be preprocessed to conform to this input format):

[0064]

[0065] ...

[0066] The training step includes multiple cycles of first map/reduce steps (n cycles, where n is a positive integer), and each first map/reduce step consists of a first map step and a reduce step. The input training data enters the map side and is divided into multiple input data blocks of fixed size (usually 64 MB); each input data block is then read and processed in parallel. Each piece of d...
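The block-splitting and map/reduce cycle described in [0066] can be sketched roughly as follows. This is a minimal single-machine illustration, not the Hadoop job from the patent: the function names (split_into_blocks, map_block, reduce_pairs, run_cycle), the "key,label" record format, and the use of a thread pool in place of distributed map tasks are all assumptions made for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(lines, block_bytes):
    """Group lines into blocks of at most block_bytes bytes.

    A single line is never split, so a line longer than block_bytes
    ends up alone in its own block.
    """
    block, size = [], 0
    for line in lines:
        n = len(line.encode("utf-8"))
        if block and size + n > block_bytes:
            yield block
            block, size = [], 0
        block.append(line)
        size += n
    if block:
        yield block


def map_block(block):
    """Hypothetical map step: parse 'key,label' records into (key, label) pairs."""
    pairs = []
    for line in block:
        key, label = line.rsplit(",", 1)
        pairs.append((key, label))
    return pairs


def reduce_pairs(mapped_blocks):
    """Reduce step: collect the set of labels observed for each key."""
    grouped = defaultdict(set)
    for pairs in mapped_blocks:
        for key, label in pairs:
            grouped[key].add(label)
    return dict(grouped)


def run_cycle(lines, block_bytes=64 * 1024 * 1024):
    """One map/reduce cycle: split, map the blocks in parallel, then reduce."""
    blocks = list(split_into_blocks(lines, block_bytes))
    with ThreadPoolExecutor() as pool:  # stand-in for parallel map tasks
        mapped = list(pool.map(map_block, blocks))
    return reduce_pairs(mapped)
```

Calling run_cycle on a list of records with a small block_bytes value forces several blocks, which makes it easy to check that the grouped result does not depend on where the block boundaries fall.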

Abstract

The invention discloses a big data classification method and system. The method comprises two steps. Training: divide the input data into input data blocks, generate classification rules (pattern string => class label) from the input data blocks, and write the rules into an HBase rule table. Testing: read the input data blocks, construct the pattern strings to be classified, look up the matching classification rules in the HBase rule table, and output the classification results. The invention thus provides a hypersurface-based big data classification method and system: built on the Hadoop MapReduce programming framework and the HBase distributed non-relational database, a hypersurface-based covering algorithm performs the classification, an easily interpretable rule model is constructed at low computational cost, big data is processed quickly and efficiently, and the need to classify the explosively growing data of the real world is met.
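The train-then-look-up flow in the abstract can be illustrated with a small sketch. Several things here are assumptions, not taken from the patent text: the way a pattern string is built (bucketing each feature into a grid cell), the choice to emit a rule only for cells whose samples all share one class, and the plain Python dict that stands in for the HBase rule table. The names pattern_string, train, and classify are hypothetical.

```python
from collections import defaultdict


def pattern_string(features, cells_per_dim=4, lo=0.0, hi=1.0):
    """Assumed encoding: bucket each feature into a grid cell index.

    Values outside [lo, hi] are clamped to the first or last cell.
    """
    width = (hi - lo) / cells_per_dim
    idx = [
        max(0, min(int((x - lo) / width), cells_per_dim - 1))
        for x in features
    ]
    return "-".join(str(i) for i in idx)


def train(samples):
    """Build a rule table {pattern string: class label} from labeled samples.

    Only cells whose samples all carry the same label yield a rule
    (an assumption for this sketch); mixed cells are left without a rule.
    """
    labels_by_pattern = defaultdict(set)
    for features, label in samples:
        labels_by_pattern[pattern_string(features)].add(label)
    return {
        pattern: next(iter(labels))
        for pattern, labels in labels_by_pattern.items()
        if len(labels) == 1
    }


def classify(rule_table, features, default=None):
    """Look up the rule matching the sample's pattern string."""
    return rule_table.get(pattern_string(features), default)
```

In this sketch a test sample that falls in a cell with no rule simply returns the default; how the patented method treats such cells is not stated in the text above.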

Description

Technical field

[0001] The invention relates to the field of big data analysis, in particular to a hypersurface-based big data classification method and system.

Background technique

[0002] Classification is an important form of data analysis used to extract models that characterize important classes of data. Such a model is called a classifier, and it is used to predict class labels. Data classification is a two-stage process consisting of a learning stage and a classification stage. The learning stage builds the classification model, and the classification stage uses the model to predict the class label of given data. For example, we can build a classification model that classifies bank loan applications as safe or dangerous. Such analysis can help us understand the data more comprehensively. Many classification and prediction methods come from machine learning, pattern recognition, and statistics. Most of the algorithms ar...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
Inventor: 何清, 吴新宇, 庄福振, 敖翔
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI