Web data mining system on basis of Hadoop platform

A data mining technology, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc., that addresses the limited data processing ability of existing mining systems.

Status: Inactive
Publication Date: 2013-09-18
句容智恒安全设备有限公司

AI Technical Summary

Problems solved by technology

However, current research on Web data mining focuses mainly on improving the mining algorithm, which improves only the effectiveness of the mining system and does not improve its data processing ability.


Embodiment Construction

[0025] As shown in Figure 1, the Web data mining system based on the Hadoop platform includes a user interaction layer, a business application layer, a Web data mining platform layer and a distributed storage computing layer;

[0026] The user interaction layer is used for the interaction between the user and the system, including: a user management module, a business module and a display module;

[0027] The business application layer includes: a business response module and a workflow module;

[0028] The Web data mining platform layer includes: a data loading module, a result storage module, a mode evaluation module, a parallel ETL module and a parallel data mining algorithm module;

[0029] The distributed storage computing layer uses Hadoop to provide distributed file storage and parallel computing, and includes: an HDFS module, a MapReduce module and a distributed management module;
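
The four-layer decomposition in paragraphs [0025] to [0029] can be written down as a simple lookup structure. The following Python sketch is only an illustration: the layer and module names come from the text, while the identifiers `ARCHITECTURE` and `layer_of` are hypothetical and are not part of the patent.

```python
from typing import Dict, List, Optional

# Layers and their modules as listed in paragraphs [0025]-[0029], top to bottom.
# Representing them as a dictionary is an illustrative assumption only.
ARCHITECTURE: Dict[str, List[str]] = {
    "user interaction layer": [
        "user management module",
        "business module",
        "display module",
    ],
    "business application layer": [
        "business response module",
        "workflow module",
    ],
    "Web data mining platform layer": [
        "data loading module",
        "result storage module",
        "mode evaluation module",
        "parallel ETL module",
        "parallel data mining algorithm module",
    ],
    "distributed storage computing layer": [
        "HDFS module",
        "MapReduce module",
        "distributed management module",
    ],
}


def layer_of(module: str) -> Optional[str]:
    """Return the name of the layer containing `module`, or None if unknown."""
    for layer, modules in ARCHITECTURE.items():
        if module in modules:
            return layer
    return None


if __name__ == "__main__":
    print(layer_of("parallel ETL module"))
```

A structure like this makes the layering explicit: every module belongs to exactly one layer, and only the bottom layer refers to Hadoop components directly.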

[0030] In the above user interaction layer:

[0031] User management module, which is used to ...


Abstract

The invention discloses a Web data mining system based on the Hadoop platform and relates to data mining systems. The system comprises a user interaction layer, a business application layer, a Web data mining platform layer and a distributed storage computing layer. The user interaction layer is used for interaction between a user and the system and comprises a user management module, a business module and a display module. The business application layer comprises a business response module and a workflow module. The Web data mining platform layer comprises a data loading module, a result storage module, a mode evaluation module, a parallel ETL (Extract, Transform and Load) module and a parallel data mining algorithm module. The distributed storage computing layer uses Hadoop to implement distributed file storage and parallel computing and comprises an HDFS (Hadoop Distributed File System) module, a MapReduce module and a distributed management module. According to the invention, the computing and storage demands of each module that requires large computing capacity are spread across the nodes of a Hadoop cluster, and the related data mining work is carried out using the parallel computing and storage capacity of the cluster.
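
As a rough illustration of how a mining task can be split so that its work is spread across cluster nodes, the sketch below simulates the map, shuffle and reduce steps of a MapReduce job in plain Python. It is not the patent's algorithm: the page-hit counting task, the two-field log format and all function names are assumptions made for this example, and the code runs sequentially in one process, whereas on Hadoop the map and reduce functions would be executed in parallel on different nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(log_line: str) -> List[Tuple[str, int]]:
    """Emit (url, 1) for one log line; assumes the URL is the second field."""
    fields = log_line.split()
    return [(fields[1], 1)] if len(fields) >= 2 else []


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Sum the counts collected for one URL."""
    return key, sum(values)


def count_page_hits(partitions: List[List[str]]) -> Dict[str, int]:
    """Run map, shuffle and reduce over all partitions (sequentially here)."""
    mapped: List[Tuple[str, int]] = []
    # Each partition stands in for a block of log data stored on one node.
    for partition in partitions:
        for line in partition:
            mapped.extend(map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample_partitions = [
        ["10.0.0.1 /index.html", "10.0.0.2 /about.html"],
        ["10.0.0.3 /index.html"],
    ]
    print(count_page_hits(sample_partitions))
```

Because `map_phase` looks at one line at a time and `reduce_phase` looks at one key at a time, both steps can be distributed without changing the result, which is the property the system relies on when it moves heavy computation onto the cluster.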

Description

Technical field

[0001] The invention relates to a data mining system, in particular to a Web data mining system based on the Hadoop platform.

Background technique

[0002] Web data mining refers to the use of data mining techniques to discover potential, useful patterns or information in WWW data. Based on the analysis of a large amount of network data, it applies appropriate data mining algorithms to extract, screen, convert, mine and analyze data for specific application models, and finally performs inductive reasoning. However, current research on Web data mining focuses mainly on improving the mining algorithm, which improves only the effectiveness of the mining system and does not improve its data processing ability. With the rapid development of network technology, the data on the Web is growing exponentially, and a single data mining platform has reached a bottleneck in computing power. The present invention has invented a web data min...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 黄玉明, 李伟
Owner: 句容智恒安全设备有限公司