Data collection method and system based on streaming real-time distributed big data

A distributed data collection technology, applied in transmission systems, electrical components, and similar fields. It addresses the problems that real-time feedback cannot be obtained and that development trends cannot be tracked well. Its effects are to ensure throughput and stability, avoid data accumulation and loss, and improve throughput.

Status: Inactive. Publication Date: 2017-11-24
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

However, in the era of big data, multi-source data makes the timeliness of data increasingly important. Traditional batch data collection methods cannot effectivel...



Examples


Embodiment Construction

[0028] The present invention will be further described below in conjunction with specific examples.

[0029] The data collection method based on streaming real-time distributed big data provided by this embodiment works as follows. First, various clients obtain service support by accessing Web services, and this access generates new business data. The collection system must perform collection operations on the newly generated business data. The processing flow is shown in Figure 1. The data collection process runs in a distributed cloud cluster consisting of a main server and multiple sub-servers. The sub-servers divide the collection partitions according to the configured partition rules. Different business data correspond to different business types, and each business type is associated with multiple partitions to form a corresponding task queue. Real-time collection is performed through queued multi-partition pa...
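The partition-associated task queue described in [0029] can be illustrated with a minimal Python sketch. It is not the patented implementation: the CRC32-based partition rule, the class name PartitionedTaskQueues, and the use of threads in place of sub-servers are all assumptions made for illustration. It shows only the idea that one business type maps to several partitions, which together form a task queue that is collected in parallel.

```python
import zlib
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def assign_partition(record_key: str, num_partitions: int) -> int:
    """Hypothetical partition rule: stable CRC32 hash of the key, modulo the partition count."""
    return zlib.crc32(record_key.encode("utf-8")) % num_partitions


class PartitionedTaskQueues:
    """One task queue per business type, made of several partition queues (illustrative only)."""

    def __init__(self, partitions_per_type: dict[str, int]):
        # Each business type is associated with its own list of partition queues.
        self.queues = {
            business_type: [deque() for _ in range(count)]
            for business_type, count in partitions_per_type.items()
        }

    def submit(self, business_type: str, record_key: str, payload: dict) -> int:
        """Queue a newly generated business record and return the partition it landed in."""
        partitions = self.queues[business_type]
        index = assign_partition(record_key, len(partitions))
        partitions[index].append((record_key, payload))
        return index

    @staticmethod
    def _drain(partition: deque) -> list:
        # Empty one partition in FIFO order.
        collected = []
        while partition:
            collected.append(partition.popleft())
        return collected

    def drain_parallel(self, business_type: str) -> list:
        """Collect all partitions of one business type in parallel (threads stand in for sub-servers)."""
        partitions = self.queues[business_type]
        with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
            batches = list(pool.map(self._drain, partitions))
        return [record for batch in batches for record in batch]
```

In this sketch, records with the same key always land in the same partition, so per-key ordering is preserved while different partitions are drained concurrently.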


Abstract

The invention discloses a data collection method and system based on streaming real-time distributed big data. Data collection runs on a distributed cloud cluster, which improves collection performance and provides a degree of scalability. A partition-associated task queue is established, so data does not have to be accumulated and written to disk first; changes in business data are detected in real time. Incrementally collected data is stored efficiently using an in-memory model, which reduces the space taken by local temporary files and avoids data accumulation and loss. On top of the in-memory model, data blocks are processed as streams: data streams are processed in parallel directly in memory and updated into the analysis data set in real time. The method and system make full use of the processing performance of the cloud cluster, collect and classify data with an efficient memory-based storage model, and provide the data basis for subsequent real-time data analysis, so that data collection stays real-time and analysis results can be fed back in real time.
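As a rough illustration of the incremental, memory-based collection that the abstract describes, the following Python sketch keeps new records in an in-memory buffer and folds each full block into an analysis data set without writing temporary files. The class name InMemoryCollector, the use of a sequence number to detect new records, and the per-type counter used as the "analysis data set" are assumptions for illustration only; the source does not specify these details.

```python
from collections import Counter


class InMemoryCollector:
    """Illustrative incremental collector: buffers new records in memory and
    streams each full block into an analysis data set (here, counts per business type)."""

    def __init__(self, block_size: int):
        self.block_size = block_size
        self._buffer = []          # in-memory block, no local temporary file
        self._last_seen = 0        # assumed: records carry increasing sequence numbers
        self.analysis = Counter()  # stand-in for the analysis data set

    def collect(self, records) -> int:
        """Accept only records newer than the last one seen; return how many were accepted."""
        accepted = 0
        for seq, business_type in records:
            if seq <= self._last_seen:
                continue  # already collected, skip (incremental collection)
            self._last_seen = seq
            self._buffer.append((seq, business_type))
            accepted += 1
            if len(self._buffer) >= self.block_size:
                self._process_block()
        return accepted

    def _process_block(self) -> None:
        # Swap the buffer out and update the analysis data set from the block.
        block, self._buffer = self._buffer, []
        self.analysis.update(business_type for _, business_type in block)

    def flush(self) -> None:
        """Process whatever is left in the buffer as a final, partial block."""
        if self._buffer:
            self._process_block()
```

A caller would feed each polling result to collect() and call flush() before reading analysis if partial blocks should be included.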

Description

Technical field

[0001] The present invention relates to the technical field of big data collection, in particular to a data collection method and system based on streaming real-time distributed big data.

Background

[0002] As the trend toward Internet adoption strengthens and policy guidelines actively promote the "Internet +" project, Internet applications are presented to users in a variety of ways, and the number of Internet application users has increased sharply. This has produced a large amount of user Internet application data, including application business data, user behavior data, and other valuable data. If the rapid development of big data technology can be used to conduct data mining and statistical analysis of this user Internet application data, it can provide a reference for the promotion of the "Internet +" project and the improvement of user services, and will be of great help in creating m...


Application Information

IPC(8): H04L29/08
CPC: H04L67/10, H04L67/12
Inventors: 张星明, 梁桂煌, 林育蓓, 陈霖, 古振威, 吴世豪
Owner: SOUTH CHINA UNIV OF TECH