
A Memory-Based Data Fast Collision Subsystem for Data Mining Systems

A data mining and data collision technology, applied in the field of data processing, that addresses the problems of high I/O overhead, poor cost-effectiveness, and reduced timeliness.

Active Publication Date: 2021-06-08
KAITUO INFORMATION SYST

AI Technical Summary

Problems solved by technology

Therefore, in real application scenarios, business personnel cannot process the various data sources directly, and timeliness is greatly reduced.
At the same time, if the original data has no practical value once the analysis is finished, it has to be removed. For data with a short life cycle, the import operation is therefore not cost-effective.
[0003] In response to these problems, the Kettle tool is currently used for data extraction. Although Kettle can extract data stored in various forms efficiently and stably, it still has to clean and convert the data during extraction (the ETL process), and the processed data still has to be written to storage ("landed"), so the I/O overhead is large.

Method used



Examples


Embodiment Construction

[0022] With reference to Figures 1 and 2, a specific embodiment of the present invention is described in detail below; it does not limit the claims of the present invention in any way.

[0023] As shown in Figure 1, a memory-based fast data collision system for data mining systems includes a task control module, a resource monitoring/flow control module, an external interface module, a routing module, a memory computing domain module, and a data driver module. The system communicates asynchronously: when a task submitter submits a task for computation, it does not need to wait for the computation to run, and the system immediately returns the reception status of the submitted task.
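The asynchronous submission described in [0023] can be sketched as follows. This is a minimal illustration, not the patented implementation: the class `AsyncTaskQueue`, the status strings, and the use of a single worker thread are all assumptions made for the example.

```python
import queue
import threading
import uuid


class AsyncTaskQueue:
    """Hypothetical stand-in for the asynchronous submission path."""

    def __init__(self):
        self._pending = queue.Queue()
        self.results = {}  # task_id -> (status, value)
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, func, *args):
        """Queue the task and return its reception status immediately."""
        task_id = uuid.uuid4().hex
        self._pending.put((task_id, func, args))
        return {"task_id": task_id, "status": "RECEIVED"}

    def _run(self):
        # Background worker: executes tasks after the submitter has returned.
        while True:
            task_id, func, args = self._pending.get()
            try:
                self.results[task_id] = ("DONE", func(*args))
            except Exception as exc:
                self.results[task_id] = ("FAILED", repr(exc))
            finally:
                self._pending.task_done()

    def wait_all(self):
        """Block until every queued task has finished (for demos and tests)."""
        self._pending.join()
```

A submitter calls `submit(...)`, receives the `RECEIVED` acknowledgement without waiting for the computation, and looks up the result later by `task_id`.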

[0024] Wherein:

[0025] 1. Based on the current system status information provided by the resource monitoring/flow control module, the task control module handles the reception, start, and execution of data collision tasks by interacting with...
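One way to picture the relationship in [0025] is an admission check: the task control module asks the resource monitoring/flow control module for the current status before accepting a task. The sketch below is illustrative only. The source text is truncated at this point and gives no thresholds, so the limits on running tasks and memory usage, and all class and method names, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SystemStatus:
    running_tasks: int = 0
    memory_used_fraction: float = 0.0  # 0.0 to 1.0


class ResourceMonitor:
    """Hypothetical resource monitoring / flow control module."""

    def __init__(self, max_tasks=4, max_memory_fraction=0.8):
        self.max_tasks = max_tasks
        self.max_memory_fraction = max_memory_fraction
        self.status = SystemStatus()

    def can_accept(self):
        # Assumed policy: admit only while both limits are respected.
        return (self.status.running_tasks < self.max_tasks
                and self.status.memory_used_fraction < self.max_memory_fraction)


class TaskController:
    """Hypothetical task control module that consults the monitor."""

    def __init__(self, monitor):
        self.monitor = monitor

    def receive(self, task_name):
        if not self.monitor.can_accept():
            return "REJECTED"
        self.monitor.status.running_tasks += 1
        return "ACCEPTED"

    def finish(self):
        # Release the slot held by a completed task.
        self.monitor.status.running_tasks -= 1
```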


Abstract

The present invention relates to the technical field of data processing, in particular to a memory-based fast data collision subsystem for a data mining system, comprising a task control module, a resource monitoring/flow control module, an external interface module, a routing module, a memory computing domain module, and a data driver module. The task control module receives, starts, and executes data collision tasks and returns the results of task request processing, according to the current system status information; the resource monitoring/flow control module provides that status information to the task control module. The interface module exposes its services over HTTP. The routing module parses the data collision task request submitted by the task submitter to obtain the task parameters and routes them to an algorithm unit. The memory computing domain module includes multiple algorithm units, and the data driver module includes multiple data drivers. The invention extracts data efficiently and avoids the I/O overhead caused by writing data to storage.
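The routing path in the abstract (parse the request, pick an algorithm unit, pull the data in through data drivers, compute in memory) can be sketched as below. This is an assumption-laden illustration: "data collision" is taken here to mean matching records from two sources on a shared key, the request layout and the names `route`, `ALGORITHM_UNITS`, and `DATA_DRIVERS` are invented for the example, and the HTTP layer is omitted.

```python
import csv
import io
import json


# Hypothetical data drivers: each turns one source format into in-memory rows.
def csv_driver(text):
    return list(csv.DictReader(io.StringIO(text)))


def json_driver(text):
    return json.loads(text)


DATA_DRIVERS = {"csv": csv_driver, "json": json_driver}


# Hypothetical algorithm units operating purely on in-memory rows.
def intersect_unit(left, right, key):
    right_keys = {row[key] for row in right}
    return [row for row in left if row[key] in right_keys]


def difference_unit(left, right, key):
    right_keys = {row[key] for row in right}
    return [row for row in left if row[key] not in right_keys]


ALGORITHM_UNITS = {"intersect": intersect_unit, "difference": difference_unit}


def route(request):
    """Parse task parameters and dispatch them to the chosen algorithm unit."""
    unit = ALGORITHM_UNITS[request["algorithm"]]
    left, right = (
        DATA_DRIVERS[src["driver"]](src["payload"]) for src in request["sources"]
    )
    return unit(left, right, request["key"])
```

Because each driver hands rows straight to the algorithm unit, nothing is written to an intermediate store, which is the property the abstract emphasizes.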

Description

Technical field

[0001] The invention relates to the technical field of data processing, in particular to a memory-based fast data collision subsystem for a data mining system.

Background

[0002] With the development of big data applications, more and more data has analyzable value, but historically generated data comes in different formats and storage methods. As a result, to analyze these data, they must first be imported into a unified processing environment. Small, scattered, and highly random data are difficult to import in a unified way, and even where import is possible, importing first and analyzing afterwards causes a certain delay. This is because existing data analysis technology usually first extracts data into a unified environment, such as a relational database or a big data distributed platform, and only then proceeds to data analysis and processing. Durin...

Claims


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/2458, G06F16/25, G06F9/50
CPC: G06F9/5016
Inventor: 陈华郁东风葛永山
Owner: KAITUO INFORMATION SYST