Attribute alignment method and apparatus

An attribute alignment method and apparatus, applied in the field of data analysis, that address problems such as high computational complexity.

Active Publication Date: 2018-02-16
HUAWEI TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] Embodiments of the present invention provide an attribute alignment method and device that address a problem in the prior art: matching one attribute of one data source against multiple attributes of another data source results in high computational complexity.



Embodiment Construction

[0037] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0038] The attribute alignment method and device provided by the present invention vectorize N data sources and then cluster them according to similarity, so that the attribute names of data sources in the same cluster differ only slightly. Attribute alignment is first performed on the data sources within the same cluster, and then on the data sources of different clusters. Since the attributes of data sources in th...
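The vectorization and similarity step can be sketched as follows. This is a minimal illustration, not the patented procedure: the excerpt does not say how the internal dictionary is structured or which similarity measure is used, so the sketch assumes the dictionary maps attribute-name variants to a canonical name, assumes binary presence vectors over the attribute name set, and assumes cosine similarity. The function names and sample data are hypothetical.

```python
import math

def canonical(name, dictionary):
    """Map an attribute name to a canonical form (assumed role of the dictionary)."""
    key = name.strip().lower()
    return dictionary.get(key, key)

def attribute_vectors(sources, dictionary):
    """Return the attribute name set and one 0/1 vector per data source."""
    vocab = sorted({canonical(a, dictionary) for attrs in sources for a in attrs})
    vectors = []
    for attrs in sources:
        present = {canonical(a, dictionary) for a in attrs}
        vectors.append([1 if v in present else 0 for v in vocab])
    return vocab, vectors

def cosine(u, v):
    """Cosine similarity; an assumed choice, the source names no measure."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similarity_matrix(vectors):
    """Pairwise similarity between every two data sources."""
    return [[cosine(u, v) for v in vectors] for u in vectors]

# Hypothetical data sources, each given as a list of attribute names.
sources = [
    ["user_id", "name", "email"],
    ["uid", "name", "e-mail"],
    ["sku", "price", "stock"],
]
dictionary = {"uid": "user_id", "e-mail": "email"}

vocab, vectors = attribute_vectors(sources, dictionary)
sim = similarity_matrix(vectors)
```

In this toy input the first two sources share all canonical attribute names and the third shares none, so a clustering step would be expected to group the first two together.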


Abstract

The invention discloses an attribute alignment method and apparatus, relates to the field of data analysis, and aims to solve the problem of high computational complexity caused by matching one attribute of a data source against multiple attributes of another data source. The attribute alignment method comprises the steps of: obtaining N data sources; according to the N data sources, obtaining an attribute name set and generating an internal dictionary; according to the attribute name set, the internal dictionary and the attribute names of each data source in the N data sources, obtaining an attribute feature vector of each data source; according to the attribute feature vector of each data source, calculating the similarity between any two data sources in the N data sources to obtain a similarity matrix; according to the similarity matrix, clustering the N data sources to obtain k clusters; and according to the internal dictionary, performing attribute alignment on the data sources of the same cluster in the k clusters to obtain k data sources, and performing attribute alignment on the k data sources to obtain a result data source. The attribute alignment method and apparatus are applied to data analysis.
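The clustering and two-stage alignment steps might look like the sketch below. It rests on several assumptions that are not in the abstract: the clustering algorithm is unspecified, so a simple threshold-based single-link grouping is swapped in; "attribute alignment" is reduced to merging attribute names after mapping them through a dictionary of name variants; and the similarity matrix is hard-coded for illustration. All names and the threshold value are hypothetical.

```python
def cluster_by_threshold(sim, threshold=0.5):
    """Group source indices whose similarity reaches the threshold.

    Assumed stand-in for the unspecified clustering step (single-link, union-find).
    """
    n = len(sim)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

def align(schemas, dictionary):
    """Merge schemas into one list of canonical attribute names, first-seen order."""
    merged = []
    for attrs in schemas:
        for a in attrs:
            key = a.strip().lower()
            name = dictionary.get(key, key)  # assumed dictionary role
            if name not in merged:
                merged.append(name)
    return merged

def two_stage_align(sources, sim, dictionary, threshold=0.5):
    """Align within each cluster first, then align the k per-cluster results."""
    clusters = cluster_by_threshold(sim, threshold)
    per_cluster = [align([sources[i] for i in c], dictionary) for c in clusters]
    return clusters, align(per_cluster, dictionary)

# Hypothetical inputs: three data sources and their similarity matrix.
sources = [
    ["user_id", "name", "email"],
    ["uid", "name", "e-mail"],
    ["sku", "price", "stock"],
]
dictionary = {"uid": "user_id", "e-mail": "email"}
sim = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

clusters, result = two_stage_align(sources, sim, dictionary)
```

The point of the two stages is that sources inside one cluster have nearly identical attribute names, so the within-cluster pass is cheap, and the cross-cluster pass then only has to reconcile k merged schemas rather than all N sources pairwise.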

Description

Technical field

[0001] The invention relates to the field of data analysis, in particular to an attribute alignment method and device.

Background

[0002] As shown in figure 1, data analysis mainly includes three stages: data collection, data integration (data curation) and data analysis (data analytics). Data collection refers to the collection, modeling and storage of data generated by various businesses. Data integration refers to performing data profiling, data cleansing, attribute alignment (schema mapping), data transformation and data deduplication on the collected data sources to form a unified data source. Data analysis is performed on the integrated data source to produce the corresponding business analysis report...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/258
Inventor: 陈庆玉
Owner: HUAWEI TECH CO LTD