Non-uniform big data classifying method

A classification method for big data, applicable to text database clustering/classification, unstructured text data retrieval, electronic digital data processing, etc. It addresses the high complexity of big data classification algorithms, reducing complexity and improving classification results.
CN103500205A · Active · Publication Date: 2014-01-08 · GUANGXI NORMAL UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
GUANGXI NORMAL UNIV
Publication Date
2014-01-08


Abstract

The invention relates to a method for classifying non-uniform big data, intended for data sets that are too large to be classified in computer memory and whose categories are non-uniformly distributed. First, the sample size is determined on a theoretical basis using a downsampling method, and the number of classifiers is determined from the number of samples. An integrated (ensemble) classifier is then built for each category of the big data. To classify a test case, the integrated classifiers of all categories are applied to it, and the category whose integrated classifier yields the highest classification rate is taken as the category of the test case. The method has linear time complexity for big data classification and reduces the polarization of classification results on non-uniform big data. The integrated classifiers also improve accuracy. The method is easy to implement, and coding it involves only a few simple mathematical models.
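
The steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented procedure: the abstract does not give the formula for the sample size or for the number of classifiers, so `sample_size` is passed in as a plain parameter and the member count `n_members` uses an assumed rule. The nearest-centroid base learner is also an assumption, as is reading the "classification rate" as the fraction of ensemble members that accept the test case. All function names are hypothetical, and the data must contain at least two categories.

```python
import math
import random
from collections import defaultdict


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_one_vs_rest_ensembles(X, y, sample_size, rng):
    """Build one ensemble per category from balanced downsamples.

    Each member is a (positive centroid, negative centroid) pair fitted on
    up to `sample_size` points of the category and up to `sample_size`
    points drawn from all other categories.
    """
    by_label = defaultdict(list)
    for features, label in zip(X, y):
        by_label[label].append(features)

    # Assumed rule (not from the source): enough members that the
    # downsamples could, in total, cover the data set about once.
    n_members = max(1, math.ceil(len(X) / (2 * sample_size)))

    ensembles = {}
    for label, positives in by_label.items():
        negatives = [f for f, l in zip(X, y) if l != label]
        members = []
        for _ in range(n_members):
            pos = rng.sample(positives, min(sample_size, len(positives)))
            neg = rng.sample(negatives, min(sample_size, len(negatives)))
            members.append((centroid(pos), centroid(neg)))
        ensembles[label] = members
    return ensembles


def classify(ensembles, x):
    """Return the category whose ensemble accepts `x` at the highest rate."""
    def vote_rate(members):
        accepted = sum(
            1 for pos_c, neg_c in members
            if sq_dist(x, pos_c) < sq_dist(x, neg_c)
        )
        return accepted / len(members)

    return max(ensembles, key=lambda label: vote_rate(ensembles[label]))


# Synthetic, deliberately imbalanced demo data: 200 points of "a", 10 of "b".
rng = random.Random(0)
X = [[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(200)]
X += [[rng.gauss(5.0, 0.5), rng.gauss(5.0, 0.5)] for _ in range(10)]
y = ["a"] * 200 + ["b"] * 10

ensembles = train_one_vs_rest_ensembles(X, y, sample_size=10, rng=rng)
```

Because every member is trained on an equal number of positive and negative points, the minority category "b" is not outvoted by the majority category, which is the sense in which balanced downsampling can reduce skew in the results. Calling `classify(ensembles, [5.0, 4.9])` returns the label of the nearer cluster.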

Description

Technical field

[0001] The present invention relates to the fields of computer science and technology and information technology, in particular to big data, and more specifically to a method for classifying non-uniform big data.

Background art

[0002] Big data refers to collections of data that cannot be captured, managed, and processed with conventional software tools under existing physical conditions and within the allowed time. Big data has the following characteristics: Volume (large amount of data), Variety (many data types), Value (low value density), and Velocity (fast processing speed), collectively referred to as the 4Vs.

[0003] At present, big data research usually falls into two categories. First, big data challenges system architecture. The raw data held in the Hadoop clusters of many well-known websites now reaches dozens of PB, with redundancy, and needs to be scanned and updated every day. Then, in order to ensure that the failure of a single node o...
