Netuser behavior data real-time processing method based on distributed computation

A technology of real-time data processing and distributed computing, applied in network data retrieval, network data indexing, electrical digital data processing, etc. It addresses the inability of existing approaches to meet the needs of real-time statistics and analysis, and achieves high accuracy, fast response to requests, and strong usability.

Inactive Publication Date: 2015-01-28
SHANGHAI JIAO TONG UNIV

AI Technical Summary

Problems solved by technology

The existing approach uses distributed batch processing to carry out statistical analysis of big data effectively, but it cannot meet the statistical analysis requirements of real-time computing.

Examples

Embodiment 1

[0033] As shown in Figures 1 and 2, this embodiment provides a method for real-time processing of network user behavior data based on distributed computing. In Figure 1, a denotes the startup process and b denotes the real-time calculation process. In this embodiment, the core business of the website is an online education platform, and user access and interaction behaviors include: browsing pages, watching online videos, completing short online exercises and receiving results, chapter exercises, major assignments, forum posts and replies, and the user's own attribute data (gender, age, place of origin, etc.). The website has three main types of statistical data: 1. The distribution characteristics of the platform's users (ratio of men to women, age ranges (10-20, 20-30, 30-40, etc.), distribution of hometowns, etc.); 2. Records of users' browsing and video watching, forum post and reply statistic...
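The first type of statistic (user distribution characteristics) can be illustrated with a minimal Python sketch. The field names `gender` and `age`, the helper names, and the treatment of the age ranges as half-open intervals (so that age 20 falls in 20-30) are assumptions made for illustration; the source does not specify them.

```python
from collections import Counter

def age_bucket(age, width=10):
    """Map an age to a range label such as '20-30' (assumed half-open: 20 <= age < 30)."""
    low = (age // width) * width
    return f"{low}-{low + width}"

def user_distribution(users):
    """Count users by gender and by age range."""
    gender = Counter(u["gender"] for u in users)
    ages = Counter(age_bucket(u["age"]) for u in users)
    return {"gender": dict(gender), "age_range": dict(ages)}

# Hypothetical user attribute records.
users = [
    {"gender": "F", "age": 19},
    {"gender": "M", "age": 24},
    {"gender": "F", "age": 31},
    {"gender": "M", "age": 20},
]
stats = user_distribution(users)
```

Ratios such as the proportion of men to women can be derived from these counts by dividing by the total number of users.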

Embodiment 2

[0041] As shown in Figures 1 and 3, in this embodiment the data source end distributes the event data stream to multiple distributed nodes in proportion for parallel computation, and performs reduction as required. The rest is the same as Embodiment 1.
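One way to picture proportional distribution followed by combining the per-node results is the Python sketch below. It is not the patent's implementation: the weights, the event field `type`, and all function names are hypothetical, and the per-node work runs sequentially here where a real deployment would run it in parallel on separate nodes.

```python
from bisect import bisect_right
from collections import Counter
from functools import reduce
from itertools import accumulate

def split_by_proportion(events, weights):
    """Assign events so that node k receives about weights[k] / sum(weights) of them."""
    bounds = list(accumulate(weights))      # e.g. [2, 1, 1] -> [2, 3, 4]
    total = bounds[-1]
    shares = [[] for _ in weights]
    for i, event in enumerate(events):
        node = bisect_right(bounds, i % total)  # slot within the repeating cycle
        shares[node].append(event)
    return shares

def node_count(events):
    """Work done on one node: count events by type."""
    return Counter(e["type"] for e in events)

def merge(a, b):
    """Combine two partial results into one."""
    return a + b

types = ["view", "video", "view", "post", "view", "video", "view", "view"]
events = [{"type": t} for t in types]

shares = split_by_proportion(events, [2, 1, 1])   # node 0 gets half the stream
partials = [node_count(s) for s in shares]
totals = reduce(merge, partials, Counter())
```

Because counting is associative and commutative, the merged totals do not depend on how the stream was split, which is what makes this kind of statistic suitable for multi-node computation.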

Abstract

The invention relates to a method for real-time processing of netuser behavior data based on distributed computation. The method includes a startup process and a real-time computing process, which run in sequence. The startup process extracts data from a netuser behavior database, performs distributed batch computation on the data to obtain initial values of the statistical results, and stores these initial values in a statistical result cache region and a result database. In the real-time computing process, a data source end continuously receives event data streams generated from user access and interaction behavior data and stores them in the netuser behavior database; all current event data streams at the data source end are distributed across multiple nodes; incremental computation and reduction are performed on the event data streams starting from the initial values of the statistical results; and the final result is saved in the statistical result cache region. Compared with the prior art, the method supports distributed multi-node parallel computing, uses an incremental computation algorithm for the event stream designed around actual needs, is loosely coupled with other functions of the network platform, and effectively guarantees real-time computing.
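The two phases described in the abstract can be sketched in a few lines of Python. This is a single-process illustration under stated assumptions, not the patented system: the class `RealtimeStats`, the event field `type`, and the in-memory stand-ins for the behavior database, cache region, and result database are all hypothetical, and the distribution across nodes is omitted.

```python
from collections import Counter

class RealtimeStats:
    def __init__(self):
        self.behavior_db = []     # stand-in for the netuser behavior database
        self.cache = Counter()    # stand-in for the statistical result cache region
        self.result_db = {}       # stand-in for the result database

    def startup(self):
        """Startup process: one batch pass over stored data yields the initial values."""
        initial = Counter(e["type"] for e in self.behavior_db)
        self.cache = Counter(initial)
        self.result_db = dict(initial)

    def on_event(self, event):
        """Real-time process: persist the event, then update the cached result incrementally."""
        self.behavior_db.append(event)
        self.cache[event["type"]] += 1

stats = RealtimeStats()
stats.behavior_db.extend([{"type": "view"}, {"type": "view"}, {"type": "post"}])
stats.startup()                       # batch pass over the three stored events
stats.on_event({"type": "view"})      # incremental update, no recomputation
stats.on_event({"type": "video"})
```

The point of the design is that each new event costs a constant-time update to the cached result, so the historical data is scanned only once, during startup.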

Description

Technical field

[0001] The invention relates to the field of network data processing, in particular to a method for real-time processing of network user behavior data based on distributed computing.

Background art

[0002] With the development of information services on the Internet, many government departments, companies, universities, research institutes, etc. already own or are building their own websites. A web server runs behind each website. Managing a website requires not only attention to the daily throughput of the server, but also an understanding of the access status of each page, so that the content and quality of the web pages can be improved according to the click frequency of each page, and the readability of the content can be improved based on data such as the browsing and interaction behavior on each page. Data related to all users, institutions, etc. must be tracked and statistically analyzed.

[0003] This is especially...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/2462, G06F16/27, G06F16/951
Inventors: 王加俊, 徐礼爽, 周文峰
Owner SHANGHAI JIAO TONG UNIV