Matching untagged data sources to untagged data analysis applications

A data source and data analysis technology, applied in the field of information processing, that addresses poor result accuracy and the resulting inability to eliminate human involvement.

Status: Inactive
Publication Date: 2016-10-12
INT BUSINESS MASCH CORP
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

However, the utility of existing solutions is still domain-specific, and the accuracy of their results is often not good enough to completely eliminate human involvement, to say nothing of the costs involved in developing such solutions.

Examples

Embodiment Construction

[0019] The present principles are directed to matching unlabeled data sources with unlabeled data analysis applications.

[0020] In one embodiment, there is provided a method that uses cloud computing technology to identify, from a collection of analytics solutions, an analytics solution that matches a given data source. The approach does not assume the availability of predefined meta-information about data sources or analytics solutions for matching tasks, nor does it require preprocessing tasks to analyze, profile or discover data sources.

[0021] Thus, the present principles advantageously address the problem of matching unlabeled analytics solutions to unlabeled data sources. In one embodiment, the present principles involve testing a collection of candidate analytics solutions against a given data source to quickly determine whether each analytics solution is capable of consuming the given data source without creating serious problems. Each analysis solution is foun...
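The trial-execution idea in paragraph [0021] can be illustrated with a minimal sketch. It is not the patented implementation: it assumes each candidate analytics solution is a plain Python callable and the data source is a small list of records. The names `probe_candidates`, `mean_temperature` and `word_count` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


def probe_candidates(
    candidates: Dict[str, Callable[[List[Record]], object]],
    rows: List[Record],
) -> Dict[str, Tuple[bool, str]]:
    """Run every candidate against the rows and record whether it could consume them.

    Returns {candidate name: (consumed_ok, error detail)}.
    """
    results: Dict[str, Tuple[bool, str]] = {}
    for name, solution in candidates.items():
        try:
            solution(rows)
        except Exception as exc:  # any failure counts as "could not consume"
            results[name] = (False, f"{type(exc).__name__}: {exc}")
        else:
            results[name] = (True, "")
    return results


# Two hypothetical analytics solutions with different input expectations.
def mean_temperature(rows: List[Record]) -> float:
    return sum(float(r["temperature"]) for r in rows) / len(rows)


def word_count(rows: List[Record]) -> int:
    return sum(len(r["text"].split()) for r in rows)


# A hypothetical data source carrying no metadata about its layout.
rows = [{"temperature": "21.5"}, {"temperature": "19.0"}]

results = probe_candidates(
    {"mean_temperature": mean_temperature, "word_count": word_count},
    rows,
)
for name, (ok, detail) in results.items():
    print(name, "compatible" if ok else f"incompatible ({detail})")
```

No schema or tag is consulted in this sketch. Compatibility is inferred only from whether the run succeeds, which mirrors the no-metadata assumption stated in paragraph [0020].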

Abstract

A method and system are provided. The method includes identifying a set of applications compatible with a set of data. The applications and the data are untagged by corresponding metadata. The identifying step includes executing, by an execution platform, at least some of the applications in the set against at least some of the data in the set. The identifying step further includes analyzing, by a log analyzer, execution logs for executions of the at least some of the applications against the at least some of the data. The identifying step also includes indicating, by the log analyzer, a compatibility of the at least some of the applications to the at least some of the data by detecting compatibility relevant errors using the execution logs.
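The log-analysis step described in the abstract might look like the following sketch. It assumes execution logs are available as plain text keyed by (application, dataset) pairs, and that compatibility-relevant errors can be recognized by a few regular expressions. The pattern list, the log contents and the names `compatibility_errors` and `compatibility_matrix` are illustrative assumptions, not taken from the patent.

```python
import re
from typing import Dict, List, Tuple

# Assumed signatures of compatibility-relevant errors (illustrative only).
COMPATIBILITY_PATTERNS = [
    re.compile(r"KeyError|no such column|unknown field", re.IGNORECASE),
    re.compile(r"could not convert|invalid literal", re.IGNORECASE),
    re.compile(r"schema mismatch|unexpected format", re.IGNORECASE),
]


def compatibility_errors(log_text: str) -> List[str]:
    """Return the log lines that match a compatibility-relevant pattern."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(p.search(line) for p in COMPATIBILITY_PATTERNS)
    ]


def compatibility_matrix(
    logs: Dict[Tuple[str, str], str]
) -> Dict[Tuple[str, str], bool]:
    """Mark a pair compatible when its log has no compatibility-relevant error."""
    return {pair: not compatibility_errors(text) for pair, text in logs.items()}


# Hypothetical execution logs produced by an execution platform.
logs = {
    ("app_a", "data_1"): "INFO start\nINFO processed 120 records\nINFO done",
    ("app_a", "data_2"): (
        "INFO start\nERROR ValueError: could not convert string to float: 'n/a'"
    ),
    ("app_b", "data_1"): "INFO start\nWARN out of memory, retrying\nINFO done",
}

matrix = compatibility_matrix(logs)
for (app, data), ok in matrix.items():
    print(app, data, "compatible" if ok else "incompatible")
```

In this sketch the out-of-memory warning does not count against compatibility, because it says nothing about whether the application can read the data. That is one possible reading of the abstract's "compatibility relevant errors".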

Description

Technical field

[0001] The present invention relates generally to information processing, and more particularly to matching unlabeled data sources with unlabeled data analysis applications.

Background art

[0002] Data analysis applications and algorithms are generally written on the assumption that data sources are organized in some format (e.g., a database schema, a specific key-value structure, etc.). For an analysis to consume a given (arbitrary) dataset, the first step is to analyze the dataset to determine whether it is compatible with the given analysis job. If it is not, some kind of data transformation process, i.e., Extract, Transform, and Load (ETL), needs to be performed before analytical work can be performed on the dataset. Despite continuous advances in various data analysis techniques in fields such as data mining, big data analysis, machine learning, etc., these preprocessing steps of data analysis are still time-consuming and in most cases very labo...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G06F16/254, G06F11/366, G06F8/00, G06F9/445, G06F11/3476, G06F17/40, G06F11/0766
Inventor: K·W·格伦伯格, 高凤晙, J·J·奥尔蒂斯, T·萨罗尼迪斯, R·厄高恩卡, D·C·弗玛, 王西平
Owner: INT BUSINESS MASCH CORP