Distributed scheduling method of multi-source data analysis engine, computing node and distributed scheduling system of the multi-source data analysis engine

A computing node and analysis engine technology, applied in the database field, that addresses problems of job scheduling systems such as limitations, single points of failure, and poor scalability.

Active Publication Date: 2020-09-04
BEIJING SHENGXIN NETWORK TECH CO LTD

AI Technical Summary

Problems solved by technology

[0006] Yarn solves the problems of Hadoop's first-generation job scheduling system: poor scalability, a single point of failure, and restriction to the MR (MapReduce) computing framework.
For data management, however, it still uses the same HDFS as the first generation. For massive data above the PB (petabyte) level, this system is currently the only solution and plays an irreplaceable role, but there is still room for optimization in TB (terabyte) level data scenarios.


Examples

Embodiment Construction

[0050] Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present invention are shown in the drawings, it should be understood that the invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided for a more thorough understanding of the present invention and to fully convey its scope to those skilled in the art.

[0051] Figure 1 is a block diagram of an example computing device 100 arranged to implement a method of distributed scheduling of a multi-source data analysis engine according to the present invention. In a basic configuration 102, computing device 100 typically includes system memory 106 and one or more processors 104. A memory bus 108 may be used for communication between the processor 104 and the system memory 106.

[0052] Depending on...

Abstract

The embodiments of the invention provide a distributed scheduling method for a multi-source data analysis engine, a computing node device, a readable storage medium, a computing device, and a distributed scheduling system for the multi-source data analysis engine. The method comprises the following steps: the computing node with the highest scheduling index in the multi-source data analysis engine receives a query task; the computing node determines a sub-query task that involves an intermediate result set of the query task, and determines the storage node of that intermediate result set; the computing node computes a first time overhead of migrating the intermediate result set to the local node, and a second time overhead of having the storage node execute the sub-query task; and, according to a comparison of the first time overhead and the second time overhead, the computing node chooses either to migrate the intermediate result set to the local node and execute the sub-query task itself, or to have the storage node execute the sub-query task. This saves a large amount of communication cost and improves the distributed processing efficiency of the multi-source data analysis engine.
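
The decision described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the excerpt does not say how either time overhead is estimated, so the size-over-bandwidth migration model, the pre-supplied remote execution estimate, the tie-breaking rule, and all names (Node, SubQuery, pick_receiver, choose_executor) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    scheduling_index: float  # hypothetical score; higher means preferred


@dataclass
class SubQuery:
    result_set_bytes: int       # size of the intermediate result set
    storage_node: Node          # node that currently holds the result set
    remote_exec_seconds: float  # assumed estimate of running on the storage node


def pick_receiver(nodes):
    """Return the computing node with the highest scheduling index."""
    return max(nodes, key=lambda n: n.scheduling_index)


def migration_seconds(sub, bandwidth_bytes_per_s):
    """First time overhead, assumed here to be result set size / bandwidth."""
    return sub.result_set_bytes / bandwidth_bytes_per_s


def choose_executor(receiver, sub, bandwidth_bytes_per_s):
    """Compare the two overheads and return the node that should run the sub-query."""
    first = migration_seconds(sub, bandwidth_bytes_per_s)  # migrate, then run locally
    second = sub.remote_exec_seconds                       # run on the storage node
    # Assumption: on a tie, leave the data in place and use the storage node.
    return receiver if first < second else sub.storage_node
```

With a 100 MB intermediate result set, a 10 MB/s link, and a 2-second remote estimate, migration would take about 10 seconds, so the sketch leaves the sub-query on the storage node. On a much faster link the receiving node pulls the data and runs the sub-query itself.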

Description

Technical field

[0001] The present invention relates to the technical field of databases, in particular to a distributed scheduling method of a multi-source data analysis engine, a computing node device, a readable storage medium, a computing device, and a distributed scheduling system of a multi-source data analysis engine.

Background

[0002] The multi-source data analysis engine is a query-language engine that connects to various data sources such as ElasticSearch, Mongo, and Mysql. Taking the Qingteng Structured Query Language (QSL) engine as an example, it also provides pipeline syntax for filtering query results, which enables continuous data analysis. The engine node saves the intermediate result sets produced during computation in a local Sqlite3 database. The engine itself has no distributed capability and can only call the computing resources of a single node, which greatly limits the improvement of the computing power assembly and the concurrent ...

Application Information

IPC(8): G06F16/903
CPC: G06F16/903, G06F16/90335
Inventor: 李一哲, 程度, 张福
Owner BEIJING SHENGXIN NETWORK TECH CO LTD