Managing data profiling operations related to data type

A data type technology, applied in the fields of electrical digital data processing, special data processing applications, database querying, etc., that addresses problems such as the characteristics of a data set not being known in advance.

Active Publication Date: 2016-10-26
INITIO TECH

AI Technical Summary

Problems solved by technology

For example, the range or typical values of a data set, the relationship between different fields within the data set, or the functional correlation between the values of different fields may not be known.

Examples

Embodiment Construction

[0036] Figure 1 shows an example of a data processing system 100 in which techniques for managing data type information for efficient data profiling may be applied. The system 100 includes a data source 102, which may include one or more sources of data, such as storage devices or connections to online data streams, each of which may store or provide data in any of a variety of formats (e.g., database tables, spreadsheet files, plain text files, or a native format used by a host). An execution environment 104 includes a profiling module 106 and an execution module 112. The profiling module 106 executes data profiling programs on data from the data source 102, or on intermediate or output data generated by data processing programs executed by the execution module 112. The storage devices providing the data source 102 may be local to the execution environment 104, for example, stored on a storage medium (e.g., hard disk 108) connected to the computer ...

Abstract

Data type information (208) associates each of one or more data types with an identifier. Processing data values in fields of records includes: generating data units (203), each including a field identifier and a binary value from one of the records extracted from the field of that record identified by the field identifier; aggregating information about binary values from multiple data units; generating a list of entries for each of one or more of the fields (205), at least some of the entries each including one of the binary values and information about that binary value aggregated from multiple data units; retrieving a data type from the data type information, and associating the retrieved data type with at least one binary value included in an entry; and generating profile information (114) based on a retrieved data type of a particular binary value appearing in a field, after the aggregating.
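
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `DATA_TYPE_INFO`, `DECODERS`, `generate_data_units`, `aggregate`, and `profile` are hypothetical, and the sketch assumes that the data type information maps a field identifier to a type name and that records are tuples of raw byte strings. The point it shows is that binary values are counted as opaque bytes first, and the data type is looked up and applied only once per distinct value, after aggregation.

```python
import struct
from collections import Counter, defaultdict

# Hypothetical data type information: field identifier -> data type name.
DATA_TYPE_INFO = {0: "int32_le", 1: "utf8"}

# Hypothetical decoders that turn a binary value into a typed value.
DECODERS = {
    "int32_le": lambda raw: struct.unpack("<i", raw)[0],
    "utf8": lambda raw: raw.decode("utf-8"),
}


def generate_data_units(records):
    """Yield (field identifier, binary value) pairs, one per field per record."""
    for record in records:
        for field_id, raw in enumerate(record):
            yield field_id, raw


def aggregate(data_units):
    """Build a per-field list of entries: distinct binary value -> count."""
    census = defaultdict(Counter)
    for field_id, raw in data_units:
        census[field_id][raw] += 1  # values are still untyped bytes here
    return census


def profile(census, type_info=DATA_TYPE_INFO):
    """Retrieve each field's data type after aggregation and summarize."""
    result = {}
    for field_id, counts in census.items():
        type_name = type_info[field_id]
        decode = DECODERS[type_name]
        typed = Counter()
        for raw, n in counts.items():
            typed[decode(raw)] += n  # decode once per distinct value
        result[field_id] = {
            "type": type_name,
            "distinct": len(typed),
            "min": min(typed),
            "max": max(typed),
            "most_common": max(typed, key=typed.get),
        }
    return result


records = [
    (struct.pack("<i", 7), b"a"),
    (struct.pack("<i", -2), b"b"),
    (struct.pack("<i", 7), b"a"),
]
info = profile(aggregate(generate_data_units(records)))
```

Deferring the type lookup means that a value repeated many times in a field is decoded once rather than once per record, which is the efficiency the abstract appears to target.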

Description

[0001] Cross References to Related Applications

[0002] This application claims priority to US Application Serial No. 61/949,477, filed March 7, 2014.

Technical Field

[0003] This specification relates to the management of data profiling operations in relation to data types.

Background

[0004] Databases or other information management systems often include data sets for which many characteristics may not be known. For example, the range or typical values of a data set, the relationship between different fields within that data set, or the functional correlation between values of different fields may not be known. Data profiling may include examining data sets to determine the aforementioned characteristics. Some data profiling techniques include receiving information about a data profiling job, running the data profiling job, and then returning the results of the run after a delay based on how long it takes to perform the various processing steps involved ...

Application Information

IPC(8): G06F17/30
CPC: G06F16/2228, G06F16/24, G06F16/215
Inventor: M. A. Khan
Owner: INITIO TECH