Concurrent computational system and non-repetition counting method

A parallel computing and data processing technology, applied in the fields of computing, digital data processing, and special data processing applications. It addresses the slow statistical speed of large databases, avoiding counting errors and improving statistical speed at a good cost-performance ratio.

Inactive Publication Date: 2010-12-15
上海云数信息科技有限公司
2 Cites, 10 Cited by

AI Technical Summary

Problems solved by technology

[0004] One object of the present invention is to provide a parallel computing system that solves the problem of slow statistics on large databases.
[0005] Another object of the present invention is to provide a deduplication counting method that solves the same problem.



Examples


Embodiment Construction

[0020] The main idea of the present invention is to partition the massive data in the source database and distribute it across multiple node databases, then compute in parallel on multiple independent node servers, so that multi-machine, multi-core computing capacity is fully used. In addition, the invention chooses the statistical method according to whether the counted content is the key field used for partitioning, which avoids errors caused by counting the same data more than once.
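The partition-then-compute idea in [0020] can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the `user_id` key field, the CRC32 hash placement, and the use of threads in place of independent node servers are all assumptions made for the example. It shows why partitioning on the key field matters: every key value lands on exactly one node, so the per-node distinct counts can simply be added.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib


def partition_by_key(rows, key, num_nodes):
    """Distribute rows so that all rows sharing a key value land on one node.

    Hash placement is an assumption; the source only says data is split
    according to a key field.
    """
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        # crc32 is stable across runs; Python's built-in hash() is salted for str.
        idx = zlib.crc32(str(row[key]).encode("utf-8")) % num_nodes
        nodes[idx].append(row)
    return nodes


def local_distinct_count(node_rows, key):
    """Distinct count computed by one node over its own partition only."""
    return len({row[key] for row in node_rows})


def parallel_distinct_count(nodes, key):
    """Run the local counts concurrently and add up the partial results.

    Threads stand in for independent node servers (hypothetical setup).
    Adding partials is only valid when `key` is the partition key.
    """
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda part: local_distinct_count(part, key), nodes))
    return sum(partials)


# Hypothetical sample data: three distinct users, one of them seen twice.
sample_rows = [
    {"user_id": "u1", "city": "Shanghai"},
    {"user_id": "u1", "city": "Beijing"},
    {"user_id": "u2", "city": "Shanghai"},
    {"user_id": "u3", "city": "Shanghai"},
]
sample_nodes = partition_by_key(sample_rows, "user_id", 3)
total_users = parallel_distinct_count(sample_nodes, "user_id")
```

Adding partial counts this way would not be safe for `city`, because the same city can appear on several nodes; that is the double-counting error the paragraph refers to.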

[0021] The data statistics of the present invention are performed through SQL commands and are especially suitable for BI (Business Intelligence) systems. The present invention is described in detail below with reference to the accompanying drawings.

[0022] See Figure 1, which is an architecture diagram of the parallel computing system of the present invention. This system comprises data segmentation ...


Abstract

The invention provides a concurrent computational system and a method. The method comprises the following steps: (1) setting up a plurality of node databases; (2) partitioning the mass data in a source database according to a key field and distributing the data among the node databases; (3) judging whether the counted content is the partitioning key field; (4) if so, carrying out the non-repetition counting computation of the key field; and (5) if not, carrying out the grouped counting computation of the non-key field. The invention can greatly improve the counting speed of a large database and ensure statistical accuracy.
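The five steps of the abstract might look like the sketch below, with in-memory SQLite databases standing in for the node databases. The table name `visits`, its columns, the hash placement, and especially the merge rule for the non-key case are assumptions: the abstract does not say how grouped results are combined, so this is one plausible reading in which each node returns its local groups and the coordinator counts the union.

```python
import sqlite3
import zlib

KEY_FIELD = "user_id"            # hypothetical partition key
ALLOWED_FIELDS = ("user_id", "city")


def build_nodes(rows, num_nodes):
    """Steps 1-2: create node databases and distribute rows by the key field."""
    nodes = [sqlite3.connect(":memory:") for _ in range(num_nodes)]
    for db in nodes:
        db.execute("CREATE TABLE visits (user_id TEXT, city TEXT)")
    for user_id, city in rows:
        idx = zlib.crc32(user_id.encode("utf-8")) % num_nodes
        nodes[idx].execute("INSERT INTO visits VALUES (?, ?)", (user_id, city))
    return nodes


def distinct_count(nodes, field):
    """Steps 3-5: pick the counting strategy based on the counted field."""
    # Column names cannot be bound as SQL parameters, so validate them first.
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"unknown field: {field}")

    if field == KEY_FIELD:
        # Step 4: each key value lives on exactly one node, so the
        # per-node distinct counts can be added without double counting.
        return sum(
            db.execute(f"SELECT COUNT(DISTINCT {field}) FROM visits").fetchone()[0]
            for db in nodes
        )

    # Step 5: a non-key value may appear on several nodes. Each node groups
    # locally, and the coordinator counts the union of the group values
    # (assumed merge rule; not spelled out in the source).
    groups = set()
    for db in nodes:
        query = f"SELECT {field} FROM visits GROUP BY {field}"
        groups.update(row[0] for row in db.execute(query))
    return len(groups)


# Hypothetical sample data: 3 distinct users, 2 distinct cities.
visit_rows = [
    ("u1", "Shanghai"),
    ("u1", "Beijing"),
    ("u2", "Shanghai"),
    ("u3", "Shanghai"),
    ("u3", "Shanghai"),
]
visit_nodes = build_nodes(visit_rows, 3)
user_total = distinct_count(visit_nodes, "user_id")
city_total = distinct_count(visit_nodes, "city")
```

In the non-key branch the nodes still do the expensive grouping in parallel, and only the much smaller set of group values travels to the coordinator.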

Description

Technical field

[0001] The invention relates to a statistical method for databases, in particular to a parallel computing system and a deduplication counting method.

Background

[0002] With the development and popularization of computer technology, large-scale databases have rapidly entered industries such as telecommunications and finance. SQL (Structured Query Language) is a set of operation commands established specifically for databases; it is a database language. The main function of SQL is to connect to various databases and to enable communication between different types of databases. According to ANSI (American National Standards Institute), SQL is the standard language for relational database management systems. When using SQL, you only need to state "what to do" without considering "how to do it". SQL statements can be used to perform various operations on the database, such as up...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 李晓华
Owner: 上海云数信息科技有限公司