Mass web data mining method based on Hadoop
Patent Information
- Authority / Receiving Office
- CN · China
- Current Assignee / Owner
- INSPUR GROUP CO LTD
- Publication Date
- 2015-07-29
- Estimated Expiration
- Not applicable · inactive patent
Abstract
Description
Technical field
[0001] The invention discloses a method for mining massive web data. It belongs to the field of computer data processing and relates in particular to a method for mining massive web data based on Hadoop.
Background technique
[0002] With the rapid growth in the scale of Web data, the computing power of a single node is no longer adequate for analyzing and processing large-scale data. In recent years, with the rise of "cloud computing", attention has turned to this emerging technology for storing and processing massive data. The biggest advantage of the Hadoop "cloud computing" platform is that it implements the idea of "moving computation close to storage". The traditional model of "moving data close to computation" incurs too much system overhead once the data scale becomes massive, whereas "moving computation close to storage" saves the large overhead of transmitting massive data over the network, and processing time can be greatly re...