Behavior-consistent cluster-wide data wrangling based on locally processed sample data

A data wrangling technology for raw data, applied in the field of data processing. It addresses the problem that a first and a second execution engine are not guaranteed to behave identically, so their outputs may differ even for the same requested operation.

Active Publication Date: 2020-10-16
BUSINESS OBJECTS SOFTWARE
Cites: 14; Cited by: 0

AI Technical Summary

Problems solved by technology

In other words, although one execution engine may be suitable for large data sets and a second execution engine may be suitable for small data sets, there is no guarantee that the two engines behave identically.
Therefore, even when the requested data wrangling operation is the same, the output of the first execution engine may differ from the output of the second execution engine.



Embodiment Construction

[0015] Reference will now be made in detail to specific exemplary embodiments for carrying out the subject matter of the invention. Examples of these specific embodiments are illustrated in the drawings, and details are set forth in the following description in order to provide a thorough understanding of the present subject matter. It will be understood that these exemplary embodiments are not intended to limit the scope of the claims to the illustrated embodiments. On the contrary, they are intended to cover such alternatives, modifications and equivalents as may be included within the scope of the present disclosure.

[0016] Aspects of the present disclosure include methods for planning various data wrangling operations in a local client device, previewing the planned data wrangling operations locally, and applying the planned data wrangling operations in a remote device such that the resulting structured data in the remote device is structured as planned (e.g., previewed) in the l...


Abstract

Exemplary embodiments include a system for behavior-consistent data wrangling, a computer-readable storage medium storing at least one program, and a computer-implemented method. The local client device selects a raw sample data set from the remote database. The local execution engine then applies one or more local data wrangling operations to the raw sample data. If the results of the local data wrangling operations are satisfactory, the operations may be transferred to the remote data wrangling cluster. A remote execution engine run by the remote data wrangling cluster then applies the data wrangling operations to the larger raw data set from which the raw sample data was obtained. Because the remote execution engine and the local execution engine are of the same type, the data wrangling behavior exhibited by the local execution engine is reflected in the data wrangling behavior of the remote execution engine.
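
The flow described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration only: the `ExecutionEngine` class, the two example operations, and the in-memory rows stand in for the engines, wrangling operations, and remote database, none of which the source specifies in code. The only point it tries to show is that a plan previewed on a sample by one engine behaves the same way on the full data set when the second engine is of the same type.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, str]
Operation = Callable[[List[Row]], List[Row]]


@dataclass
class ExecutionEngine:
    """Hypothetical engine; local and remote instances share this one implementation."""
    name: str

    def apply(self, rows: List[Row], operations: List[Operation]) -> List[Row]:
        # Apply each planned wrangling operation in order.
        for operation in operations:
            rows = operation(rows)
        return rows


def trim_names(rows: List[Row]) -> List[Row]:
    # Illustrative operation: strip surrounding whitespace from the name field.
    return [{**row, "name": row["name"].strip()} for row in rows]


def drop_missing_city(rows: List[Row]) -> List[Row]:
    # Illustrative operation: discard rows whose city field is empty.
    return [row for row in rows if row["city"]]


# Stand-in for the large raw data set held by the remote database.
raw_data: List[Row] = [
    {"name": " Ada ", "city": "London"},
    {"name": "Grace", "city": ""},
    {"name": " Linus", "city": "Helsinki"},
    {"name": "Edsger ", "city": "Austin"},
]

# The local client selects a small raw sample and plans operations on it.
sample = raw_data[:2]
plan: List[Operation] = [trim_names, drop_missing_city]

local_engine = ExecutionEngine("local")
remote_engine = ExecutionEngine("remote-cluster")  # same type as the local engine

preview = local_engine.apply(sample, plan)         # previewed locally
full_result = remote_engine.apply(raw_data, plan)  # applied to the full data set
```

Because both engines run the same code path, the rows that appear in `preview` appear unchanged at the start of `full_result`. With two different engine implementations that guarantee would not follow automatically, which is the problem the disclosure describes.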

Description

Technical Field

[0001] Exemplary embodiments of the present application relate generally to data processing, and more specifically, to large-scale data wrangling based on data wrangling performed on smaller subsets.

Background

[0002] Data wrangling is the process of transforming or mapping data from a "raw" form into a form that allows for more convenient use of the data. Such uses may include further wrangling, data visualization, data aggregation, and training statistical models, among many other possible uses. Data wrangling sometimes follows a collection of basic steps that begin with extracting data in raw form from a data source, "cleaning" the raw data using various hardware and/or software modules, parsing the data into predetermined data structures, and storing the resulting structured content in an accessible database for future use.

[0003] Data wrangling is typically performed on large data sets and can be performed by usin...
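
The basic steps listed in paragraph [0002] (extract, clean, parse, store) can be sketched with the Python standard library. This is only an illustration under assumed inputs: the CSV text, the `people` table, and the four function names are hypothetical and do not come from the source.

```python
import csv
import io
import sqlite3

# Hypothetical raw data; a real source might be a file or a remote database.
RAW_TEXT = "name,age\n Ada ,36\nGrace,\n Linus,28\n"


def extract(text):
    # Step 1: extract rows in raw form from the data source.
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    # Step 2: "clean" the raw rows by trimming whitespace and
    # dropping rows with a missing field.
    cleaned = []
    for row in rows:
        name = row["name"].strip()
        age = row["age"].strip()
        if name and age:
            cleaned.append({"name": name, "age": age})
    return cleaned


def parse(rows):
    # Step 3: parse into a predetermined structure, here (name, age) tuples.
    return [(row["name"], int(row["age"])) for row in rows]


def store(records, conn):
    # Step 4: store the structured content in an accessible database.
    conn.execute("CREATE TABLE IF NOT EXISTS people (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
store(parse(clean(extract(RAW_TEXT))), conn)
stored = conn.execute("SELECT name, age FROM people ORDER BY name").fetchall()
```

Each step is a separate function, so the same sequence could in principle be planned on a sample and then replayed on a larger data set.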

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/21
CPC: G06F16/21, G06F16/258, G06F16/285, G06F16/955, G06F16/986, H04L67/10
Inventor: M.楚穆拉, I.伊万诺夫, V.库马
Owner: BUSINESS OBJECTS SOFTWARE