Data processing method and system

A data processing method and system, applied in the field of communications, that addresses the low data throughput performance of the Storm system and achieves improved data throughput performance, reduced storage pressure, and savings in computing resources.

Active Publication Date: 2016-04-06
TENCENT TECH (BEIJING) CO LTD
2 Cites, 21 Cited by

AI Technical Summary

Problems solved by technology

[0002] Storm is a free, open-source distributed real-time computing system that is widely used because its data processing is simple, reliable, and efficient. The Storm system provided by related technologies, however, suffers from low data throughput performance. For example, when the Storm system needs to process business data to obtain monthly business information, it must obtain the daily business data from the data source and process each day's data separately to obtain one month's business information. Because Storm is used to process massive amounts of data, business data that is not processed in time inevitably places considerable storage pressure on the Storm system, which degrades its data throughput performance.



Examples


Embodiment 1

[0073] This embodiment describes a data processing method that can be applied to a data processing system (such as a Storm system). As shown in Figure 6a, the data processing method of this embodiment includes the following steps:

[0074] Step 101, the data processing system acquires business data from a data source according to preset data source configuration information.

[0075] As an example, the Spout component of the Storm system can obtain business data from external data sources.

[0076] Step 102, distribute the acquired business data to the N data processing layers of the data processing system.

[0077] N is the number of different granularities indicated by the business processing policy corresponding to the business data; a granularity is the basic unit in which the business data is processed to obtain business information; and N is an integer greater than 1.

[0078] Step 103, at the i-th processing layer of the data processing system, combine the receive...
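Steps 101 and 102 above can be illustrated with a minimal sketch in plain Python rather than in Storm itself. The names `POLICY`, `acquire`, and `distribute`, the record layout, and the choice of "day" and "month" as the two granularities are all assumptions made for illustration; the source does not specify them.

```python
from datetime import datetime

# Hypothetical business processing policy: it names N = 2 granularities.
POLICY = {"granularities": ["day", "month"]}


def acquire(source):
    """Step 101 (sketch): read business data from a configured source."""
    return list(source)


def distribute(records, policy):
    """Step 102 (sketch): hand every record to each of the N layers."""
    granularities = policy["granularities"]
    if len(granularities) < 2:
        raise ValueError("N must be an integer greater than 1")
    # One entry per processing layer; each layer receives all records.
    return {g: list(records) for g in granularities}


# Stand-in for an external data source (assumed record shape).
source = [
    {"ts": datetime(2016, 3, 1, 9, 0), "value": 5},
    {"ts": datetime(2016, 3, 2, 8, 15), "value": 3},
]
layers = distribute(acquire(source), POLICY)
print({g: len(rs) for g, rs in layers.items()})
```

Each key of `layers` stands in for one data processing layer, so a record is read from the source once and is available to every granularity.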

Embodiment 2

[0095] This embodiment describes a Storm system. As shown in Figure 7, the data processing system includes:

[0096] The data source unit (corresponding to the component Spout in the Storm system) is used to obtain business data from the data source according to the preset data source configuration information;

[0097] The data source unit is further configured to distribute the acquired business data to N data processing layers of the data processing system, where N is the number of different granularities indicated by the business processing policy corresponding to the business data, a granularity is the basic unit in which the business data is processed to obtain business information, and N is an integer greater than 1;

[0098] The data merging unit (corresponding to the component Boltb1 in the Storm system) is used to merge the received business data at the i-th granularity at the i-th processing layer of the data processing system to obtain the i-th granula...

Embodiment 3

[0114] The Storm system described in this embodiment supports access by multiple kinds of business data; that is, it can process business data from multiple connected data sources. This avoids the low Bolt code reuse rate found in related technologies and saves computing resources of the Storm cluster. As shown in Figure 8a, the Storm system of this embodiment includes:

[0115] Spout is used to transmit data of at least two services to the Bolt. The data is transferred between the Spout and the Bolt in the form of a stream, and the basic transmission unit of the stream is a tuple;

[0116] Bolta is used to clean data from multiple businesses, including filling data with default values, aligning special values, and judging legality, and to filter out unnecessary data (that is, “dirty data”) according to Bolta’s own business logic; the processed data is sent to Boltb in the form of a stream (the basic unit is a tuple);

[0117] Boltb is used for norma...
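The Spout and Bolta roles described above can be sketched with plain Python generators standing in for Storm streams. This is an illustration only: the field names, the `DEFAULTS` table, the treatment of `""` and `"N/A"` as special values, and the rule that a legal record has a numeric `value` are all assumptions, and Boltb is not modelled because its description is cut off in the source.

```python
# Hypothetical defaults; the source does not say which fields are filled.
DEFAULTS = {"region": "unknown"}


def spout(sources):
    """Emit (service, payload) tuples for data of at least two services."""
    for service, payloads in sources.items():
        for payload in payloads:
            yield (service, payload)


def bolta(stream):
    """Clean tuples: fill defaults, align special values, drop dirty data."""
    for service, payload in stream:
        cleaned = {**DEFAULTS, **payload}        # fill missing fields
        if cleaned.get("value") in ("", "N/A"):  # align special values (assumed)
            cleaned["value"] = None
        if not isinstance(cleaned.get("value"), (int, float)):
            continue                             # illegal record: filter it out
        yield (service, cleaned)


# Two illustrative services sharing the same cleaning code.
sources = {
    "video": [{"value": 3, "region": "cn"}, {"value": "N/A"}],
    "news": [{"value": 8}],
}
cleaned = list(bolta(spout(sources)))
print(cleaned)
```

Because `bolta` depends only on the tuple shape and not on which service produced it, the same cleaning code serves both services, which is the reuse the embodiment describes.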


Abstract

The present invention discloses a data processing method and system. The method comprises: a data processing system acquiring business data from a data source according to preset data source configuration information; distributing the acquired business data to N data processing layers of the data processing system; at the i-th processing layer of the data processing system, merging the received business data at the i-th granularity to obtain business data of the i-th granularity; and processing the business data of the i-th granularity by using a business processing strategy corresponding to the business data to obtain the i-th business information. By adopting the technical scheme of the present invention, the data throughput performance of the data processing system can be improved, access by multiple services is supported, and computing resources of a Storm cluster are conserved.
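The merge-then-process flow summarised above can be sketched in plain Python. This is a sketch under stated assumptions, not the patented implementation: the two granularities ("day" and "month"), the `BUCKET_FORMAT` table, the record layout, and the summing strategy `total_value` are hypothetical choices made for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical bucket formats for two granularities (N = 2).
BUCKET_FORMAT = {"day": "%Y-%m-%d", "month": "%Y-%m"}


def merge(records, granularity):
    """Merge the records a layer received into buckets of its granularity."""
    buckets = defaultdict(list)
    for record in records:
        key = record["ts"].strftime(BUCKET_FORMAT[granularity])
        buckets[key].append(record)
    return dict(buckets)


def total_value(bucket):
    """Assumed processing strategy: sum the 'value' field of one bucket."""
    return sum(record["value"] for record in bucket)


def run(records, strategy):
    """For each layer i: merge at the i-th granularity, then apply the strategy."""
    info = {}
    for granularity in BUCKET_FORMAT:         # one iteration per layer
        merged = merge(records, granularity)  # business data of this granularity
        info[granularity] = {k: strategy(b) for k, b in merged.items()}
    return info


records = [
    {"ts": datetime(2016, 3, 1, 9, 0), "value": 5},
    {"ts": datetime(2016, 3, 1, 18, 30), "value": 7},
    {"ts": datetime(2016, 3, 2, 8, 15), "value": 3},
]
info = run(records, total_value)
print(info)
```

In this sketch the monthly result is computed by its own layer from the incoming records, so it does not depend on first storing and then reprocessing the daily results.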

Description

Technical field

[0001] The present invention relates to communication technology, in particular to a data processing method and system.

Background

[0002] Storm is a free, open-source distributed real-time computing system that is widely used because its data processing is simple, reliable, and efficient. The Storm system provided by related technologies, however, suffers from low data throughput performance. For example, when the Storm system needs to process business data to obtain monthly business information, it must obtain the daily business data from the data source and process each day's data separately to obtain one month's business information. Because Storm is used to process massive amounts of data, business data that is not processed in time inevitably places considerable storage pressure on the Storm system, which degrades its data throughput performance.

Summary of the invention

[00...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 张超
Owner: TENCENT TECH (BEIJING) CO LTD