System and method for analyzing data sources to generate metadata

A data source and metadata technology, applied in the field of systems and methods for generating metadata, that addresses the following problems: the primary users of BI tools are often unable to access the data directly, are not knowledgeable in the SQL language, and are not familiar with the data model.

Inactive Publication Date: 2008-06-12
PANTHEON SYST
Cites: 59; Cited by: 233

AI Technical Summary

Benefits of technology

"The present invention provides systems and methods for inferring implicit elements of a data model within a data source and across multiple data sources. This allows for the generation of metadata that describes the data source, including information on the constraints and relationships between tables and columns. The invention also includes methods for identifying potential keys and relationships between tables based on the similarity of alternate names and calculating probabilities for each table. The technical effects of the invention include improved data analysis and improved data management."

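As a rough illustration of matching columns by name similarity, the sketch below scores pairs of column names from two tables and keeps the pairs above a threshold. It is not the patented method: the table dictionaries, the `name_similarity` and `candidate_links` helpers, the normalization rule, and the 0.8 threshold are all assumptions chosen for this example, and the score is a plain string-similarity ratio rather than a probability.

```python
from difflib import SequenceMatcher
from itertools import product


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two normalized column names."""
    # Assumption: case and underscores carry no meaning when comparing names.
    norm_a = a.lower().replace("_", "")
    norm_b = b.lower().replace("_", "")
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def candidate_links(table_a: dict, table_b: dict, threshold: float = 0.8) -> list:
    """Pair up columns from two tables whose names look alike.

    Each table is a hypothetical dict: {"name": str, "columns": [str, ...]}.
    """
    links = []
    for col_a, col_b in product(table_a["columns"], table_b["columns"]):
        score = name_similarity(col_a, col_b)
        if score >= threshold:
            links.append(
                (f"{table_a['name']}.{col_a}", f"{table_b['name']}.{col_b}", round(score, 2))
            )
    # Strongest matches first.
    return sorted(links, key=lambda link: link[2], reverse=True)


# Hypothetical tables used only for illustration.
orders = {"name": "orders", "columns": ["order_id", "cust_id", "order_date"]}
customers = {"name": "customers", "columns": ["CUST_ID", "cust_name"]}
links = candidate_links(orders, customers)
```

Here only `orders.cust_id` and `customers.CUST_ID` clear the threshold. A name match alone is weak evidence, so in practice it would be combined with checks on the data itself.
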
Problems solved by technology

However, most business analysts, managers and corporate decision makers are not knowledgeable of the SQL language and are also not familiar with the data model.
As a result, the primary users of such BI tools are often unable to access the data directly.
In addition, information technology staffs are often unable to successfully use SQL to retrieve the desired data because the user must have an intimate knowledge of the data model used to create the database in order to include the correct commands in a query to retrieve the required data from the tables.
Such problems are also commonly encountered with the use of commercial off-the-shelf products (COTS), as well as BI software applications that have been custom developed for an organization.
In the instance of custom developed software, once the original designers have departed from the organization, it is frequently the case that no one remains with intimate knowledge of the data model used to create the custom software and often either associated documentation is missing or there is minimal informative documentation available for the customized software.
BI tool manufacturers face the challenge of enabling business users to query the data by hiding the complexities of SQL and allowing users to access information without an intimate knowledge of the underlying data model.
However, constraints need not be explicitly defined as part of the data model.
However, for security reasons, the creators of the new tables may not be granted privileges to create a referential integrity constraint on these existing tables.
However, most RDBMS generally do not allow creation of referential integrity constraints across multiple database instances.
Most constraints, such as the NOT NULL constraint and referential integrity constraints, are implemented at the source of the data itself.
Certain constraints are not implemented consistently across RDBMS vendors.
Thus, to maintain portability across database vendors or because the implemented semantics may not be ideally suited to the actual requirement, such constraints may not be explicitly defined as part of the data model, but rather are implemented instead at the sources of the data.
In the case of very large tables, explicit definition of referential integrity constraints may slow the insert, update or delete operations.
Most COTS packages have predefined data models that cannot be modified.
As applications get larger and more complex and interface with a large number of other applications, the explicit data model defined in catalog tables becomes less sufficient to populate the metadata layer for a BI system.
This is a commonly encountered problem during BI implementations.
A large amount of time and effort is spent in obtaining a global view of the data model that is not necessarily available through the relational catalogs.
However, such techniques are not sufficient and the capture of the enterprise data model is a largely manual process and involves the BI implementers obtaining information from various sources within the organization.
A lack of knowledge about the implicit data model within an organization often hinders this effort.
Consequently, populating the metadata layer of BI tools becomes a time consuming, expensive and onerous process.
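
When a referential integrity constraint is not declared in the catalog, one common way to look for it in the data is a value-containment check: a child column is a plausible foreign key if its values all appear in a unique column of another table. The sketch below shows that idea only. The table layout (a dict of column name to values), the `inclusion_ratio` and `infer_foreign_keys` helpers, and the `min_ratio` parameter are assumptions for illustration and are not taken from the patent.

```python
def inclusion_ratio(child_values, parent_values) -> float:
    """Fraction of distinct non-null child values that also appear in the parent column."""
    child = {v for v in child_values if v is not None}
    parent = {v for v in parent_values if v is not None}
    if not child:
        return 0.0
    return len(child & parent) / len(child)


def infer_foreign_keys(child_table: dict, parent_table: dict, min_ratio: float = 1.0) -> list:
    """Suggest child column -> parent column links based on value containment.

    Tables are hypothetical dicts mapping column name -> list of values.
    A parent column only qualifies if its non-null values are unique.
    """
    suggestions = []
    for p_col, p_vals in parent_table.items():
        non_null = [v for v in p_vals if v is not None]
        if not non_null or len(non_null) != len(set(non_null)):
            continue  # not a plausible key on the parent side
        for c_col, c_vals in child_table.items():
            ratio = inclusion_ratio(c_vals, p_vals)
            if ratio >= min_ratio:
                suggestions.append((c_col, p_col, ratio))
    return suggestions


# Hypothetical sample data used only for illustration.
customers = {"cust_id": [1, 2, 3, 4], "region": ["N", "S", "N", "E"]}
orders = {"order_id": [10, 11, 12], "cust_id": [1, 1, 3], "amount": [5.0, 7.5, 9.0]}
fks = infer_foreign_keys(orders, customers)
```

With this sample data the only suggestion is `orders.cust_id` referencing `customers.cust_id`. Lowering `min_ratio` below 1.0 would tolerate a few orphaned child values, at the cost of more false positives.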

Embodiment Construction

[0048] While the present invention may be embodied in many different forms, a number of illustrative embodiments are described herein with the understanding that the present disclosure is to be considered as providing examples of the principles of the invention and such examples are not intended to limit the invention to the embodiments described and/or illustrated herein.

[0049] Referring to FIG. 1, a method for analyzing data sources to generate metadata for the data sources according to an embodiment of the present invention will be described. Exemplary sub-flows are illustrated for each of the steps of the method in FIGS. 2-20.

[0050] According to embodiments of the invention, some or all of the following steps can be automated programmatically and some or all of the results of each step can be stored in a metadata repository. One skilled in the art will recognize that the execution of some steps results in the generation of metadata while the execution of other steps results in data...

Abstract

A system and method are provided for generating metadata relating to an enterprise management system including at least one data source having one or more of tables and columns. Constraints existing on at least one of the tables and columns in the data source are inferred based on data in the tables and columns. Metadata that includes information on the inferred constraints is generated.
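
To make the idea of inferring constraints from the data concrete, here is a minimal sketch that scans sample rows and records, per column, whether the values are consistent with NOT NULL and UNIQUE constraints. The function name `infer_column_constraints`, the row layout, and the shape of the resulting metadata dictionary are assumptions for this example and do not reflect the claimed implementation. A constraint inferred this way only says the observed data does not contradict it.

```python
def infer_column_constraints(rows: list, columns: list) -> dict:
    """Infer NOT NULL / UNIQUE hints for each column from observed rows.

    rows is a list of tuples; columns gives the column name for each position.
    """
    metadata = {}
    for idx, name in enumerate(columns):
        values = [row[idx] for row in rows]
        non_null = [v for v in values if v is not None]
        not_null = bool(values) and len(non_null) == len(values)
        unique = bool(non_null) and len(set(non_null)) == len(non_null)
        metadata[name] = {
            "not_null": not_null,
            "unique": unique,
            # A column that is both complete and unique could serve as a key.
            "candidate_key": not_null and unique,
        }
    return metadata


# Hypothetical sample data used only for illustration.
columns = ["emp_id", "email", "dept"]
rows = [
    (1, "a@example.com", "ops"),
    (2, None, "ops"),
    (3, "c@example.com", "dev"),
]
meta = infer_column_constraints(rows, columns)
```

In this sample, `emp_id` is flagged as a candidate key, `email` is unique but nullable, and `dept` is complete but not unique.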

Description

BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The present invention relates generally to systems and methods for analyzing data sources to generate metadata (i.e., data about the underlying data). More particularly, the present invention relates to systems and methods for generating metadata from disparate data sources by inferring data relationships and rules based on the data in the disparate data sources and/or query and procedure code.

[0003] 2. Description of the Related Art

[0004] Databases are well known and are used to store and maintain information for almost every conceivable application. The use of databases has become increasingly prevalent over the last few decades; and, today, the vast majority of business records are maintained in databases.

[0005] For large businesses, database information can include thousands of different kinds of information and often will range from millions to billions of records. Such businesses seek to leverage their data to make ...

Application Information

Patent Type & Authority: Applications (United States)
IPC(8): G06F17/30
CPC: G06F17/30315, G06F16/221
Inventor: MATHURIA, JANAK
Owner: PANTHEON SYST