Internet data analysis system

An Internet data analysis system, applied in the field of data mining, that addresses problems such as deviations between data analysis results and the resulting difficulty of decision-making for upper-level business applications built on data analysis.

Inactive Publication Date: 2014-11-12
SHANGHAI CHRUST INFORMATION TECH
4 Cites, 11 Cited by

AI Technical Summary

Problems solved by technology

[0003] Existing data mining processes build a data analysis system around one or several fixed data analysis algorithms. Since each algorithm has its own advantages and disadvantages...

Examples

Embodiment Construction

[0013] Referring to Figure 1, the Internet data analysis system of the present application includes a data preprocessing module 10 and a data analysis module 20.

[0014] The data preprocessing module 10 extracts the main content from Internet webpages, filters out useless information such as markup tags, and obtains the text corresponding to each webpage. The extracted text is first segmented by a tokenizer into multiple tokens. Unimportant tokens are then filtered out through feature dimensionality reduction, so that only the tokens that clearly characterize the text are retained.
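The preprocessing steps in [0014] can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names (`extract_text`, `tokenize`, `reduce_features`) are hypothetical, the regex tokenizer is a stand-in for whatever word segmenter the system actually uses, and TF-IDF ranking is assumed as one possible way to perform the feature reduction, since the source does not name a specific method.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def extract_text(html):
    """Strip tags and return the page text with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def tokenize(text):
    """Assumed stand-in tokenizer: lowercase alphanumeric runs."""
    return re.findall(r"[a-z0-9]+", text.lower())


def reduce_features(docs_tokens, keep=3):
    """Keep, per document, the `keep` distinct tokens with the highest TF-IDF.

    TF-IDF is an assumption here; the source only says unimportant tokens
    are filtered out through feature reduction.
    """
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))

    reduced = []
    for tokens in docs_tokens:
        term_freq = Counter(tokens)
        scores = {
            t: c * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
            for t, c in term_freq.items()
        }
        # Sort by descending score, then alphabetically for a stable result.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        kept = set(ranked[:keep])
        reduced.append([t for t in tokens if t in kept])
    return reduced
```

A caller would chain the three steps per page, for example `reduce_features([tokenize(extract_text(p)) for p in pages])`. Tokens that occur in every page (such as common function words) receive the lowest weight and are dropped first.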

[0015] According to the analysis requirements, the data analysis module 20 selects one or more categories of algorithms from four categories: classification algorithms, clustering algorithms, association rule algorithms, and special rule algorithms. Within each selected category, one or more algorithms are used. The algorithm is...
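One way to picture the selection described in [0015] is a registry that groups algorithms by category and runs only those named in an analysis plan. The sketch below is an assumption-laden illustration: `REGISTRY`, `run_analysis`, and the three toy algorithms are hypothetical names and logic, not taken from the patent. A clustering algorithm would be registered under its own category in the same way.

```python
import re
from collections import Counter
from itertools import combinations


def keyword_classifier(docs):
    """Toy classification: label a page using a hypothetical keyword set."""
    travel = {"flight", "hotel", "paris"}
    return ["travel" if travel & set(d) else "other" for d in docs]


def cooccurrence_pairs(docs, min_support=2):
    """Toy association rules: token pairs found together in >= min_support pages."""
    counts = Counter()
    for d in docs:
        counts.update(combinations(sorted(set(d)), 2))
    return sorted(pair for pair, c in counts.items() if c >= min_support)


def phone_rule(docs):
    """Toy special rule: flag pages containing a token of 7 or more digits."""
    pattern = re.compile(r"\d{7,}")
    return [any(pattern.fullmatch(t) for t in d) for d in docs]


# Hypothetical registry: category name -> algorithm name -> callable.
REGISTRY = {
    "classification": {"keyword": keyword_classifier},
    "association": {"cooccurrence": cooccurrence_pairs},
    "special_rule": {"phone": phone_rule},
}


def run_analysis(docs, plan):
    """Run only the algorithms listed in `plan` (category -> list of names)."""
    results = {}
    for category, names in plan.items():
        for name in names:
            results[(category, name)] = REGISTRY[category][name](docs)
    return results
```

Here `docs` is the list of reduced token lists produced by preprocessing, one per webpage. Adding an algorithm means adding one registry entry, which reflects the stated goal of avoiding secondary system development when a different algorithm is needed.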

Abstract

The invention discloses an Internet data analysis system comprising a data preprocessing module and a data analysis module. The data preprocessing module extracts the main content from Internet webpage information and filters it to obtain the text corresponding to each webpage. The text is first segmented by a tokenizer into a plurality of tokens, and the tokens that highlight the characteristics of the text are retained through feature dimensionality reduction. The data analysis module selects one or more categories of algorithms from classification algorithms, clustering algorithms, association rule algorithms, and special rule matching algorithms according to the analysis requirements. Each selected category applies one or more algorithms to the reduced tokens that the data preprocessing module outputs for each webpage, and the analysis result is stored. The system overcomes the inaccurate analysis results caused by relying on a single data mining algorithm, and avoids the time cost of secondary system development when further algorithms must be added on top of an existing one, thereby improving the efficiency and accuracy of data analysis.

Description

Technical field

[0001] This application relates to data mining technology, in particular to a method for analyzing Internet data.

Background

[0002] Data mining refers to the process of revealing hidden, previously unknown, and potentially valuable information from large amounts of data. It relies mainly on artificial intelligence, machine learning, pattern recognition, statistics, databases, data retrieval, and related technologies to achieve this goal.

[0003] Existing data mining processes build a data analysis system around one or several fixed data analysis algorithms. Since each algorithm has its own advantages and disadvantages, the data analysis results often deviate from one another, which makes it difficult for upper-level business applications that depend on data analysis to make decisions.

Contents of the invention

[0004] The technical problem to be solved in this application is to provide an Internet data ana...

Application Information

IPC(8): G06F17/30
CPC: G06F16/35, G06F16/9535
Inventor: 顾青, 倪庆洋, 谢超, 梁佐泉, 冯四风, 梁艳敏, 张士鹏, 田文晋, 贾伟峰, 田肖
Owner: SHANGHAI CHRUST INFORMATION TECH