Cluster mining method for large-scale data sets on a single machine

A cluster mining technology for large-scale data, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses problems such as insufficient memory on a single physical machine and improves mining efficiency.

Active Publication Date: 2015-06-24
江苏爱星信息科技有限公司

AI Technical Summary

Problems solved by technology

[0006] The method provided by the present invention uses a virtual memory and block processing mechanism to address the severe memory shortage of a single physical machine, and improves the mining efficiency for large-scale data sets.


Examples

Example Embodiment

[0017] The present invention is further described below with reference to specific examples. These examples only illustrate the present invention and do not limit its scope. After reading the present invention, modifications of various equivalent forms made by those skilled in the art all fall within the scope defined by the appended claims of this application.

[0018] The method provided by the present invention roughly divides cluster mining of large-scale data sets on a single machine into three steps. As shown in Figure 1, a cluster mining model is built first, and the memory leakage and operating efficiency problems are then solved separately. As shown in Figure 2, the implementation steps of the method are: solving the memory leakage problem, that is, reading the large data into memory in blocks based on the virtual memory and block processing mechanism; the ...
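The block-wise reading step described above could be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: it assumes the data set is a flat binary file of little-endian 64-bit floats, and it uses Python's standard mmap module as one possible way to lean on the operating system's virtual memory. The names write_doubles, iter_blocks, and block_sum are hypothetical helpers introduced for this example.

```python
import mmap
import os
import struct
import tempfile


def write_doubles(path, values):
    """Write values to path as little-endian 64-bit floats (demo helper)."""
    with open(path, "wb") as f:
        for v in values:
            f.write(struct.pack("<d", float(v)))


def iter_blocks(path, block_bytes):
    """Yield successive byte blocks of a file through a read-only memory map.

    The file is mapped into virtual memory, so the operating system pages
    data in on demand; only one block is copied into a bytes object at a time.
    """
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return  # mmap cannot map an empty file
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for start in range(0, size, block_bytes):
                yield mm[start:start + block_bytes]


def block_sum(path, block_values=1024):
    """Aggregate a large file block by block; returns (sum, count)."""
    total, count = 0.0, 0
    for block in iter_blocks(path, block_values * 8):  # 8 bytes per float
        values = struct.unpack(f"<{len(block) // 8}d", block)
        total += sum(values)
        count += len(values)
    return total, count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "values.bin")
        write_doubles(demo_path, range(10000))
        print(block_sum(demo_path))
```

The block size is the tuning knob here: smaller blocks lower peak memory use, while larger blocks reduce per-block overhead.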

Abstract

The invention discloses a cluster mining method for large-scale data sets on a single machine. The method mainly comprises three steps: first, the memory leakage problem that easily occurs when a large-scale data set is read is solved; second, the big data problem is converted into an easily solved small data problem by using hardware advantages and the algorithmic idea of 'divide and conquer'; third, a suitable mining model is built based on a cluster mining algorithm, the cluster mining of the small data is completed sequentially, and the mining results are finally combined to obtain the final result. Through measures such as designing a storage mode and expanding virtual devices, the common problems of memory limitation and low operating efficiency in big data mining are effectively resolved, and cluster mining of a GB-scale data set can be completed on an independent, normally working physical machine without using a network cluster.
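The 'divide and conquer' flow in the abstract (cluster each small block, then combine the results) could be sketched as follows. The abstract does not name a clustering algorithm or a merge rule, so everything concrete here is an assumption made for illustration: k-means (Lloyd's algorithm) with farthest-first seeding is used per block, and the block-level centers are merged by a second, size-weighted k-means pass. All function names are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point that is farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    return centers


def weighted_kmeans(points, weights, k, iters=10):
    """Weighted Lloyd iterations; returns (centers, weight per cluster)."""
    centers = farthest_first(points, k)
    dim = len(points[0])
    mass = [0.0] * k
    for _ in range(iters):
        sums = [[0.0] * dim for _ in range(k)]
        mass = [0.0] * k
        for p, w in zip(points, weights):
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            mass[j] += w
            for d in range(dim):
                sums[j][d] += w * p[d]
        # Keep the previous center when a cluster received no weight.
        centers = [
            tuple(s / mass[j] for s in sums[j]) if mass[j] > 0 else centers[j]
            for j in range(k)
        ]
    return centers, mass


def cluster_in_blocks(blocks, k):
    """Cluster each block separately, then cluster the block-level centers,
    weighting each center by the number of points it represents."""
    partial_centers, partial_mass = [], []
    for block in blocks:
        centers, mass = weighted_kmeans(block, [1.0] * len(block), k)
        for c, m in zip(centers, mass):
            if m > 0:  # drop empty clusters before merging
                partial_centers.append(c)
                partial_mass.append(m)
    return weighted_kmeans(partial_centers, partial_mass, k)


if __name__ == "__main__":
    group_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
    group_b = [(x + 10.0, y + 10.0) for x, y in group_a]
    demo_blocks = [group_a + group_b for _ in range(5)]
    print(cluster_in_blocks(demo_blocks, k=2))
```

Only one block and a short list of block-level centers need to be held in memory at any time, which is the point of the decomposition. The merged result approximates, but in general does not equal, what a single pass over the full data set would produce.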

Description

Technical field

[0001] The invention relates to a single-machine cluster mining method for large-scale data sets and belongs to the technical field of data mining.

Background

[0002] In recent years, with the rapid development and popularization of computer and information technology, the scale of industrial application systems has expanded rapidly, and the data generated by industrial applications has grown explosively. Industry and enterprise big data, which can easily reach hundreds of GB or even tens to hundreds of TB, far exceeds the processing capabilities of existing traditional computing technologies and information systems.

[0003] Big data brings many new challenges to traditional computing technology. Many traditional serial algorithms that are effective on small data sets cannot complete their calculations in an acceptable time when faced with big data; at the same time, big data contains many characteris...

Application Information

IPC(8): G06F17/30
Inventors: 范仕良, 张雪洁, 骆融臻
Owner: 江苏爱星信息科技有限公司