Method and device for processing bulk data

A batch data processing method and technology, applied in the field of multi-programming devices, that solves problems such as the inability to dynamically adjust the processing order, and achieves the effects of reducing waiting time and improving utilization.

Status: Inactive; Publication Date: 2011-05-11
中国移动通信集团甘肃有限公司
Cites: 3; Cited by: 24

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to overcome a defect of prior-art batch data processing methods, namely that the processing order cannot be dynamically adjusted according to the size of the data to be processed, and to propose a batch data processing method and device that improve thread processing efficiency.

Examples

Embodiment 1

[0033] Figure 2 is a flowchart of a batch data processing method according to an embodiment of the present invention. As shown in Figure 2, this embodiment includes:

[0034] Step S102: Read the data to be processed into the cache, and obtain the occupied space of the data to be processed;

[0035] Step S104: Calculate the estimated weight of the data to be processed according to the preset unit weight and the occupied space;

[0036] Step S106: Insert the data to be processed into the sequence of data to be processed according to the estimated weight;

[0037] Step S108: Put the data in the sequence of data to be processed into a thread for processing.
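Steps S102 to S108 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the estimated weight is assumed to be the unit weight multiplied by the occupied space in bytes, the sequence is assumed to be ordered smallest weight first (the source says only that the order follows the estimated weight), and `UNIT_WEIGHT`, `PendingSequence`, `process`, and `run_batch` are hypothetical names introduced here.

```python
import heapq
import itertools
from concurrent.futures import ThreadPoolExecutor

UNIT_WEIGHT = 0.001  # assumed preset unit weight: estimated cost per byte


def estimate_weight(payload: bytes, unit_weight: float = UNIT_WEIGHT) -> float:
    """S104: estimated weight = unit weight * occupied space (assumed formula)."""
    return unit_weight * len(payload)


class PendingSequence:
    """S106: sequence of data to be processed, kept ordered by estimated weight."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def insert(self, payload: bytes, unit_weight: float = UNIT_WEIGHT) -> float:
        weight = estimate_weight(payload, unit_weight)
        heapq.heappush(self._heap, (weight, next(self._counter), payload))
        return weight

    def pop(self) -> bytes:
        _, _, payload = heapq.heappop(self._heap)
        return payload

    def __len__(self) -> int:
        return len(self._heap)


def process(payload: bytes) -> int:
    """Placeholder for the real per-item business logic (hypothetical)."""
    return len(payload)


def run_batch(items, workers: int = 4):
    """S102 to S108: cache items, order them by weight, hand them to threads."""
    sequence = PendingSequence()
    for item in items:  # S102: item is in memory; len(item) is its occupied space
        sequence.insert(item)
    ordered = [sequence.pop() for _ in range(len(sequence))]
    with ThreadPoolExecutor(max_workers=workers) as pool:  # S108
        results = list(pool.map(process, ordered))
    return ordered, results
```

A heap keeps each insertion at logarithmic cost, which matters when items keep arriving while others are already being processed. Reversing the order (largest first) only requires negating the weight in `insert`.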

[0038] In this embodiment, the method also includes, before step S102, reading the batch data from the batch data source into the cache. The batch data source can be a file whose fields are separated in a defined format, or a database table. The weight value can be determined in...

Embodiment 2

[0043] This embodiment builds on the first embodiment by dynamically adjusting the unit weight so that it more accurately reflects the size of the data to be processed, which reduces the waiting time of the data to be processed and improves processing efficiency. An initial value of the unit weight is set first, and the unit weight then gradually approaches a reasonable value according to the actual time taken to process the data, that is, the execution weight. In this embodiment, the following steps are performed after step S108 of the first embodiment:

[0044] Step S202: Obtain the execution weight of the data to be processed;

[0045] Step S204: Correct the unit weight according to the execution weights of multiple items of data to be processed;

[0046] Step S206: Calculate the estimated weight of subsequent data to be processed according to the corrected unit weight.

[0047] In this embodiment, the execution weight is the actual time for processing the data t...
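Steps S202 to S206 can be illustrated with the short sketch below. The source text available here does not give the correction formula, so the code assumes one plausible rule: the corrected unit weight is the total execution weight (measured processing time) divided by the total occupied space over a sliding window of recent items. The class name `UnitWeightEstimator` and the `window` parameter are hypothetical.

```python
from collections import deque


class UnitWeightEstimator:
    """Tracks a unit weight and corrects it from measured execution weights."""

    def __init__(self, initial_unit_weight: float, window: int = 100):
        self.unit_weight = initial_unit_weight      # initial preset value
        self._samples = deque(maxlen=window)        # (occupied space, execution weight)

    def estimate(self, occupied_space: int) -> float:
        """S206: estimated weight of a later item, using the current unit weight."""
        return self.unit_weight * occupied_space

    def record(self, occupied_space: int, execution_weight: float) -> None:
        """S202 and S204: store one measurement and correct the unit weight.

        Assumed rule: unit weight = sum(execution weights) / sum(occupied space)
        over the most recent `window` items.
        """
        if occupied_space <= 0:
            return  # ignore empty items to avoid dividing by zero
        self._samples.append((occupied_space, execution_weight))
        total_space = sum(space for space, _ in self._samples)
        total_time = sum(time for _, time in self._samples)
        self.unit_weight = total_time / total_space
```

In use, a worker thread would time each item (for example with `time.perf_counter()`) and call `record` once the item finishes; later calls to `estimate` then use the corrected value.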

Embodiment 3

[0087] On the basis of the second embodiment, this embodiment describes further aspects of the batch data processing method: submission of processed data, rollback after processing errors, dynamic adjustment of the cache, and dynamic adjustment of the number of execution threads.

[0088] 1. Submission of processed data

[0089] A commit count is preset for each thread; when the number of processed data items in a thread reaches the preset commit count, the processed data is written to the database or file system.

[0090] Specifically, CommitCount is set, the processed data is held in memory, and it is written to the database or file system when the commit count is reached or when the entire processing run is completed, so as to avoid frequent I/O operations. CommitCount should not be set too large, for two reasons: a large value occupies a large amount of memory, and it makes a rollback take too long. Every time a submission is made, a statistical thread is triggered...
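The CommitCount behaviour described above can be sketched as a small buffer that writes in batches. This is an illustration only: `CommitBuffer`, `sink`, and the `rollback` method are hypothetical names, and rollback is assumed here to mean discarding the records that have not yet been committed, which the truncated source text does not confirm.

```python
class CommitBuffer:
    """Holds processed records in memory and writes them in batches."""

    def __init__(self, commit_count: int, sink):
        if commit_count < 1:
            raise ValueError("commit_count must be at least 1")
        self.commit_count = commit_count
        self._sink = sink        # callable that persists a list of records
        self._pending = []

    def add(self, record) -> None:
        """Buffer one processed record; commit when CommitCount is reached."""
        self._pending.append(record)
        if len(self._pending) >= self.commit_count:
            self.flush()

    def flush(self) -> None:
        """Write buffered records in one operation (also called at end of run)."""
        if self._pending:
            self._sink(list(self._pending))
            self._pending.clear()

    def rollback(self) -> int:
        """Assumed rollback: drop uncommitted records and report how many."""
        dropped = len(self._pending)
        self._pending.clear()
        return dropped
```

For example, with `commit_count=3` and seven records, two batches of three are written during processing and the final `flush()` writes the remaining record. A smaller CommitCount bounds both the memory held in `_pending` and the amount of work lost on rollback, at the price of more frequent writes.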

Abstract

The invention discloses a method and a device for processing bulk data. The method comprises the following steps: reading data to be processed into a cache, and acquiring the occupied space of the data to be processed; calculating an estimated weight of the data to be processed according to a preset unit weight and the occupied space; inserting the data to be processed into a sequence of data to be processed according to the estimated weight; and placing the data in the sequence of data to be processed into a thread for processing. In all embodiments of the invention, the data to be processed is ordered according to its required processing time, and the bulk data can then be processed according to the user's settings. Therefore, the utilization rate of each thread is improved, the waiting time of the data to be processed is reduced, and the processing efficiency is improved.

Description

Technical field

[0001] The invention relates to the technical field of business support in the communication industry, and in particular to a batch data processing method and device.

Background technique

[0002] Telecom operators usually process batch data centrally at the beginning or end of the month, for example batch processing of bills, batch generation of bills, batch write-off of expenses, and batch reconciliation with various business platforms, and they generally adopt a single-threaded method. In the single-threaded method, data is read record by record, or read into memory all at once, and then processed record by record. Reading and processing are serial, and results are generally committed one record at a time. In this single-threaded processing mode, however, the execution time is long and the resource utilization rate is the lowest.

[0003] In the prior art, there is also a direct modulo method, that is, to take a modulus of a certain field (column), and assign it to d...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F9/46
Inventor: 贾琨
Owner: 中国移动通信集团甘肃有限公司