Method and system for data processing

A data processing technique for raw data that arrives in real time, applied in the field of data processing. It addresses the inability of prior approaches to perform incremental updates and the inconvenience of processing real-time data. The stated effects are the absence of transmission delay, convenient processing of real-time data, and ease of use.

Active Publication Date: 2014-10-08
SHENZHEN TENCENT COMP SYST CO LTD

AI Technical Summary

Problems solved by technology

[0004] In the course of realizing the present invention, the inventor found that the prior art has at least the following problem. MapReduce is a batch computing method: the data processed by Map is first stored on disk and then transmitted to the Reduce stage for processing. This computing mode must collect the data first and then process it in full in a single pass. It cannot process data as it arrives in real time, so incremental updates cannot be achieved. Processing real-time data is therefore very inconvenient and incurs a large transmission delay.
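The contrast described above can be illustrated with a minimal sketch. It is not MapReduce itself and not the patented method; the names `batch_count` and `IncrementalCounter` are hypothetical, chosen only to show the difference between counting after the full collection is available and updating a running total per arriving event.

```python
from collections import Counter


def batch_count(events):
    """Batch style: the complete collection must exist before any result."""
    return Counter(events)


class IncrementalCounter:
    """Incremental style: each arriving event updates the running totals."""

    def __init__(self):
        self.totals = Counter()

    def update(self, event):
        # Apply one event immediately and return the new running total.
        self.totals[event] += 1
        return self.totals[event]


# Hypothetical event stream used only for illustration.
events = ["click", "exposure", "click"]

inc = IncrementalCounter()
running = [inc.update(e) for e in events]  # totals are usable after every event

batch = batch_count(events)  # totals are usable only after all events arrived
```

Both approaches end with the same totals; the difference is that the incremental counter has an up-to-date answer after every event rather than only at the end.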

Examples

Embodiment Construction

[0025] To make the object, technical solution, and advantages of the present invention clearer, embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

[0026] Figure 1 is a flowchart of a data processing method provided by an embodiment of the present invention. The method in this embodiment may be executed by a data processing system. The system may include multiple components, with different components executing the following steps respectively. As shown in Figure 1, the data processing method of this embodiment may specifically include the following steps:

[0027] 100. Analyzing the raw data arriving in real time;

[0028] The data processing method in this embodiment is used to collect statistics on raw data obtained in real time, where "real time" may be at a granularity such as 1 day, 1 hour, 1 minute, or 1 second, according to...
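One way to picture the granularities mentioned above is to map each acquisition timestamp to the start of its time window. The sketch below is an assumption for illustration only: the function name `window_start`, the `WINDOW_SECONDS` table, and the choice of UTC-aligned epoch seconds are not taken from the patent text.

```python
from datetime import datetime, timezone

# Hypothetical granularity table: window length in seconds.
WINDOW_SECONDS = {"day": 86400, "hour": 3600, "minute": 60, "second": 1}


def window_start(epoch_seconds: int, granularity: str) -> int:
    """Return the start (in epoch seconds, UTC-aligned) of the enclosing window."""
    size = WINDOW_SECONDS[granularity]
    return epoch_seconds - (epoch_seconds % size)


# Example timestamp chosen for illustration: 2014-10-08 13:45:30 UTC.
ts = int(datetime(2014, 10, 8, 13, 45, 30, tzinfo=timezone.utc).timestamp())

hour_window = datetime.fromtimestamp(window_start(ts, "hour"), tz=timezone.utc)
day_window = datetime.fromtimestamp(window_start(ts, "day"), tz=timezone.utc)
```

With this alignment, every event acquired between 13:00:00 and 13:59:59 UTC falls into the same hourly window, so its statistics can be accumulated under one window value.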

Abstract

The invention discloses a method and a system for data processing. The method comprises the following steps: analyzing raw data arriving in real time; extracting, according to configuration information, at least one field required for data statistics and the value of each field from the analyzed raw data; performing data statistics over a preset time window, according to the configuration information, the at least one field, the value of each field, and the acquisition time corresponding to the value of each field, to obtain a key, a key value corresponding to the key, and the current time window value corresponding to the key value; and incrementally updating, by time window, the key value corresponding to the key in a key-value storage system, according to the key, the key value corresponding to the key, and the current time window value corresponding to the key value. With this technical scheme, data arriving in real time can be incrementally updated in a sliding-time-window manner. This overcomes the defect of the prior art, in which data must be fully collected and then processed in bulk in a single pass, so that real-time data cannot be processed in time. Processing of real-time data becomes very convenient.
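The steps in the abstract can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the JSON record format, the field names `ad_id`, `event`, and `ts`, the `CONFIG` layout, and the in-memory `KeyValueStore` class are all hypothetical stand-ins.

```python
import json
from collections import defaultdict

# Hypothetical configuration: fields used for statistics and the window length.
CONFIG = {"fields": ["ad_id", "event"], "window_seconds": 60}


def parse(raw_line: str) -> dict:
    """Step 1: analyze (parse) one raw record; JSON is assumed here."""
    return json.loads(raw_line)


def extract(record: dict, config: dict) -> tuple:
    """Step 2: pull out the configured fields and their values."""
    return tuple(record[name] for name in config["fields"])


def to_window(acquired_at: int, config: dict) -> int:
    """Step 3: map the acquisition time to the start of its time window."""
    size = config["window_seconds"]
    return acquired_at - acquired_at % size


class KeyValueStore:
    """In-memory stand-in for a key-value storage system."""

    def __init__(self):
        self._data = defaultdict(int)

    def increment(self, key: str, window: int, amount: int = 1) -> int:
        # Step 4: incremental update of the value for (key, window).
        self._data[(key, window)] += amount
        return self._data[(key, window)]

    def get(self, key: str, window: int) -> int:
        return self._data.get((key, window), 0)


def process(raw_line: str, config: dict, store: KeyValueStore) -> int:
    record = parse(raw_line)
    key = "|".join(str(v) for v in extract(record, config))
    window = to_window(record["ts"], config)
    return store.increment(key, window)


store = KeyValueStore()
lines = [
    '{"ad_id": "a1", "event": "click", "ts": 120}',
    '{"ad_id": "a1", "event": "click", "ts": 150}',
    '{"ad_id": "a1", "event": "click", "ts": 185}',
]
for line in lines:
    process(line, CONFIG, store)
```

Here the first two records fall into the window starting at 120 and the third into the window starting at 180, so each record updates only the counter for its own key and window, without reprocessing earlier data.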

Description

Technical field

[0001] The present invention relates to the technical field of data processing, and in particular to a data processing method and system.

Background

[0002] In many fields, such as precise recommendation, real-time monitoring, real-time marketing, and data mining, statistics must be computed in real time, by time window, on data as it arrives, for example real-time statistics of clicks, exposures, or monitoring indicators.

[0003] In the prior art, time-window-based data statistics generally rely on Hive and Hadoop data warehouses. The more common mode is to first transfer the data offline to the Hadoop Distributed File System (HDFS) of a Hadoop cluster, where the data is partitioned and stored according to Hive's two-dimensional table format and time windows such as days or hours. Then use the structured query language (Structured Query Language; SQL) language Hibernate query language (Hibernate Query Langu...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F3/06
CPC: G06F16/23
Inventor: 张文郁
Owner: SHENZHEN TENCENT COMP SYST CO LTD