Apparatus program and method for data property recognition

Data property recognition technology, applied in the field of data science, addresses the problems that not all data providers publish their data according to standards, that semantic interoperability problems arise when data from different sources are merged, and that identifying data properties in a dataset is complex and time-consuming. It achieves the effect of improving the quality of recognition.

Active Publication Date: 2019-10-01
FUJITSU LTD

AI Technical Summary

Benefits of technology

[0017] Advantageously, the updating of the feature vectors improves the quality of recognition with use of the reference set of feature vectors for recognition.

Problems solved by technology

However, not all data providers publish their data according to standards.
This often results in semantic interoperability problems when data from different sources are exchanged and merged, e.g. when two datasets refer to the same data property using different names.
The identification of data properties in a dataset is complex and time-consuming when proper metadata is not available.



Embodiment Construction

[0039] FIG. 1 illustrates a data property recognition apparatus 10 of an embodiment. The data property recognition apparatus 10 comprises a model data acquisition processor 12, a feature vector generation processor 14, and a storage unit 16.

[0040] The model data acquisition processor 12 is configured to acquire a plurality of model sets of data entries, each individual model set of data entries being a plurality of data entries individually representing an identified property common to the model set of data entries and being of a data type common to the model set of data entries. The plurality of model sets of data entries may be from a single data source, with the schema of the single data source identifying the property represented by the individual data values in each set. The illustrated line in FIG. 1 from the exterior of the data property recognition apparatus 10 to the model data acquisition processor 12 represents the import (acquiring) of model sets of data entries from a data s...
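As a rough illustration of paragraph [0040], the following Python sketch builds one model set per column of a small table whose column headers play the role of the source schema. The names `ModelSet`, `common_type`, and `acquire_model_sets` are hypothetical, and the rows are made-up illustrative values; the patent text does not prescribe this data structure.

```python
from dataclasses import dataclass


@dataclass
class ModelSet:
    """Hypothetical container for one model set of data entries."""
    property_name: str  # property identified by the source schema
    data_type: str      # data type common to every entry in the set
    entries: list


def common_type(values):
    """Return the single type name shared by all values, or raise."""
    names = {type(v).__name__ for v in values}
    if len(names) != 1:
        raise ValueError(f"mixed data types: {sorted(names)}")
    return names.pop()


def acquire_model_sets(rows):
    """Build one ModelSet per column of a schema-labelled table.

    Assumption: each row is a dict whose keys are the property names
    given by the schema of a single data source.
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return [
        ModelSet(name, common_type(values), values)
        for name, values in columns.items()
    ]


# Illustrative rows only; not data from the patent.
rows = [
    {"city": "A-town", "population": 1000},
    {"city": "B-ville", "population": 2500},
]
model_sets = acquire_model_sets(rows)
```

Each resulting set carries the property name from the schema together with the shared data type, which is the information a later recognition step would need as reference.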


Abstract

A data property recognition apparatus includes a storage unit; a model data acquisition processor acquiring a plurality of model sets of data entries, each data entry individually representing an identified property common to the model set and being of a data type common to the model set; a feature vector generation processor receiving an input set of data entries, recognizing a data type common to the input set of data entries from among a plurality of supported data types, selecting a set of statistical characteristics representing the input set of data entries in dependence upon the recognized data type, generating a value of each of the selected set of statistical characteristics from the input set of data entries, and outputting a feature vector composed of the generated values of statistical characteristics.
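The feature vector generation steps in the abstract (recognize the data type, select statistical characteristics for that type, compute their values, output the vector) can be sketched in Python as follows. This is a minimal sketch under assumptions: the two supported types ("number" and "string") and the particular characteristics chosen for each are illustrative choices, not the ones specified by the patent, and all function names are hypothetical.

```python
import math
import statistics


def recognise_type(entries):
    """Recognize the data type common to the input set (assumed types)."""
    if all(isinstance(e, (int, float)) and not isinstance(e, bool) for e in entries):
        return "number"
    if all(isinstance(e, str) for e in entries):
        return "string"
    raise ValueError("unsupported or mixed data type")


# Assumed mapping from data type to the statistical characteristics
# that represent a set of entries of that type.
CHARACTERISTICS = {
    "number": [
        ("mean", statistics.fmean),
        ("population_stdev", statistics.pstdev),
        ("min", min),
        ("max", max),
    ],
    "string": [
        ("mean_length", lambda xs: statistics.fmean([len(x) for x in xs])),
        ("distinct_ratio", lambda xs: len(set(xs)) / len(xs)),
    ],
}


def feature_vector(entries):
    """Return (recognized type, list of characteristic values)."""
    data_type = recognise_type(entries)
    selected = CHARACTERISTICS[data_type]
    return data_type, [float(fn(entries)) for _, fn in selected]


numeric_type, numeric_vector = feature_vector([1, 2, 3, 4])
string_type, string_vector = feature_vector(["ab", "abcd"])
```

Because the selected characteristics depend on the recognized type, vectors for different types have different lengths and meanings, so a comparison against reference vectors would presumably be restricted to sets of the same data type.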

Description

CROSS-REFERENCE TO RELATED APPLICATION

[0001] The application is based upon and claims the benefit of priority of prior German patent application no. 102016220771.7, filed Oct. 21, 2016, the entirety of which is herein incorporated by reference.

FIELD

[0002] This application lies in the field of data science, and in particular relates to the automation of reconciliation of data entries from multiple data sources.

BACKGROUND

[0003] Data scientists spend time organizing and cleaning data, which time could be better-spent on procedures such as modelling or data mining. Standardization bodies, such as the World Wide Web Consortium (W3C), have worked for many years on proposing formats and best practices to facilitate data publication and sharing. However, not all data providers publish their data according to standards. Moreover, most standards focus on the syntax of the data model and forget about the data semantics. This often results in semantic interoperability problems when data from diffe...


Application Information

Patent Type & Authority: Patents (United States)
IPC: IPC(8): G06F16/28, G06F16/25, G06F16/215, G06K9/00, H04N5/765, G06F16/683, G06F9/30, G06F16/901, G06N5/04
CPC: G06F16/9024, G06N5/04, G06F16/215, G06F16/256, G06F16/284, G06F9/30192, H04N5/765, G06K9/00006, G06F16/683, G06V40/12
Inventor: LLAVES, ALEJANDRO; PEÑA MUÑOZ, MANUEL; DE LA TORRE, VICTOR
Owner: FUJITSU LTD