Rapid statistics task generation system and method suitable for big data

A big data statistics task technology, applied in the field of data statistics, that addresses low code reuse, low development efficiency, complex debugging, and poor fit with big data development processes, and thereby improves development efficiency.

Active Publication Date: 2015-09-16
DINGLI COMM

AI Technical Summary

Problems solved by technology

In current technology, statistical indicators over big data are mostly computed by developing MapReduce jobs in Java and similar approaches, but the development and debugging...



Examples


Specific Embodiment

[0096] Referring to Figure 1, the statistical task is: for each city in Guangdong Province, calculate every day the number of successfully sent SMS messages, the number of failed SMS messages, the total number of SMS messages sent, the SMS success rate, and the SMS failure rate. This is achieved through the following steps:

[0097] (1) Define the data source adapter. First define the attributes of the input SMS data source: for example, the table name is bssap; the field cdr_type has type int, where cdr_type=10 means an SMS was sent; cdr_result=1 means the SMS was sent successfully, and any other value means failure; the field city_name has type string and holds the name of the city; and so on.

[0098] If a data source adapter matching the SMS data source attributes already exists in the data source adapter warehouse, it is called directly from the warehouse. If not, a new data source adapter is created and saved in the data source adapter warehouse.
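Step (1) can be sketched as follows. This is a minimal illustration, not the patented implementation: the class names `DataSourceAdapter` and `AdapterWarehouse`, the method `get_or_create`, and the in-memory dictionary are all assumptions made for the example, and the type of cdr_result is assumed to be int because the text does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSourceAdapter:
    """Describes one input table: its name and (field name, type) pairs."""
    table: str
    fields: tuple  # e.g. (("cdr_type", "int"), ...)


class AdapterWarehouse:
    """Hypothetical in-memory stand-in for the data source adapter warehouse."""

    def __init__(self):
        self._adapters = {}

    def get_or_create(self, table, fields):
        """Return (adapter, created); reuse a stored adapter when one matches."""
        key = (table, tuple(fields))
        if key in self._adapters:
            return self._adapters[key], False
        adapter = DataSourceAdapter(table=table, fields=tuple(fields))
        self._adapters[key] = adapter
        return adapter, True


sms_fields = [
    ("cdr_type", "int"),      # 10 means an SMS was sent
    ("cdr_result", "int"),    # 1 means success; the int type is an assumption
    ("city_name", "string"),  # name of the city
]

warehouse = AdapterWarehouse()
first, created_first = warehouse.get_or_create("bssap", sms_fields)
second, created_second = warehouse.get_or_create("bssap", sms_fields)
```

The first call creates and stores the adapter; the second call finds the stored adapter and returns it unchanged, which mirrors the reuse behaviour described in [0098].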

[0099] (2) Define ato...
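The statistical task stated in [0096] can be illustrated with a short hand-written sketch that uses the field conventions from [0097]. It is not the code the system generates: the function name `sms_stats`, the `day` field, and the sample records are invented for illustration only.

```python
from collections import defaultdict


def sms_stats(records):
    """Aggregate SMS counters per (day, city) and derive the two rates."""
    counters = defaultdict(lambda: {"success": 0, "failure": 0})
    for rec in records:
        if rec["cdr_type"] != 10:  # keep only SMS-sending records
            continue
        key = (rec["day"], rec["city_name"])
        outcome = "success" if rec["cdr_result"] == 1 else "failure"
        counters[key][outcome] += 1

    report = {}
    for key, c in counters.items():
        total = c["success"] + c["failure"]  # at least 1 for every stored key
        report[key] = {
            "success": c["success"],
            "failure": c["failure"],
            "total": total,
            "success_rate": c["success"] / total,
            "failure_rate": c["failure"] / total,
        }
    return report


# Invented sample records; "day" is a hypothetical field name.
sample = [
    {"day": "2015-01-01", "city_name": "Guangzhou", "cdr_type": 10, "cdr_result": 1},
    {"day": "2015-01-01", "city_name": "Guangzhou", "cdr_type": 10, "cdr_result": 1},
    {"day": "2015-01-01", "city_name": "Guangzhou", "cdr_type": 10, "cdr_result": 1},
    {"day": "2015-01-01", "city_name": "Guangzhou", "cdr_type": 10, "cdr_result": 0},
    {"day": "2015-01-01", "city_name": "Shenzhen", "cdr_type": 10, "cdr_result": 1},
    {"day": "2015-01-01", "city_name": "Shenzhen", "cdr_type": 11, "cdr_result": 1},
]

report = sms_stats(sample)
```

In this sketch the two counts play the role of atomic counters, the total and the two rates are indicators derived from them, and the (day, city) key is the grouping dimension.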


Abstract

The invention discloses a rapid statistics task generation system and method suitable for big data. In the method, once a data source adapter, an atomic counter, a statistical indicator generator, a dimension selector, a report generator, a scheduler and a code generator have been defined, the code of the statistics task is generated automatically, and the code is executed automatically when the scheduler's preset scheduling condition is satisfied. The statistics task is decomposed into components: the data source adapter, the atomic counter, the statistical indicator generator, the dimension selector, the report generator and the scheduler. When creating a task, the user drags in the indicators needed, and the statistics code is then generated automatically according to the standard model the user has configured. This simplifies the complex cloud computing process, makes the modules reusable components, generates statistics task code quickly, and greatly improves development efficiency. The system and method can be widely applied in the big data statistics industry.
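The idea of generating statistics code from configured components can be sketched as below. This is only an assumption-laden illustration: the excerpt does not say what form the generated code takes, so SQL is chosen here for convenience, and the function `generate_sql`, the task dictionary layout, and the counter and indicator names are all hypothetical.

```python
def generate_sql(task):
    """Build one SQL statement from a declarative task description."""
    dims = ", ".join(task["dimensions"])
    # Each atomic counter becomes a conditional sum over the source table.
    counters = ",\n         ".join(
        f"SUM(CASE WHEN {cond} THEN 1 ELSE 0 END) AS {name}"
        for name, cond in task["atomic_counters"].items()
    )
    inner = (
        f"  SELECT {dims},\n         {counters}\n"
        f"  FROM {task['table']}\n"
        f"  GROUP BY {dims}"
    )
    # Indicators are expressions over the counters computed in the subquery.
    indicators = ", ".join(
        f"{expr} AS {name}" for name, expr in task["indicators"].items()
    )
    return f"SELECT {dims}, {indicators}\nFROM (\n{inner}\n) AS counters"


# Hypothetical task description for the SMS example.
sms_task = {
    "table": "bssap",
    "dimensions": ["city_name"],
    "atomic_counters": {
        "sms_success": "cdr_type = 10 AND cdr_result = 1",
        "sms_total": "cdr_type = 10",
    },
    "indicators": {
        "success_rate": "sms_success / sms_total",
    },
}

sql = generate_sql(sms_task)
print(sql)
```

Keeping the task as plain data means the same generator can serve any combination of counters, indicators and dimensions, which is the reuse benefit the abstract attributes to componentization.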

Description

Technical field

[0001] The invention relates to the field of data statistics, in particular to a system and method for rapidly generating statistical tasks suitable for big data.

Background technique

[0002] To facilitate the description below, the following terms are first explained:

[0003] Hadoop: a distributed system infrastructure; users can develop distributed programs without knowing the underlying details of the distribution;

[0004] Parquet: a columnar storage file format for Hadoop;

[0005] MapReduce: a programming model for parallel computing over large-scale data sets;

[0006] Impala: a query system developed by Cloudera. It provides SQL semantics and can query PB-level big data stored in Hadoop's HDFS and HBase. Its biggest advantage is speed.

[0007] Spark: a distributed project for rapid data analysis developed at the University of California, Berkeley. Its core technology is resilient distributed datasets, which provide...


Application Information

IPC(8): G06F17/30
CPC: G06F16/2462
Inventor 别志铭, 张健明, 张勇鹏, 王旭, 吴楠, 王耘, 喻大发
Owner DINGLI COMM