
Identification database deduplication method and system for MES system

A deduplication technique for identification databases that addresses problems such as time-consuming data transmission and data owners' reluctance to disclose their data.

Active Publication Date: 2021-04-20
BEIJING INSTITUTE OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

[0004] Most traditional deduplication methods operate directly on the data: they transfer it to a unified temporary database and compute similarity there. This has two problems. First, transferring a large volume of data takes a long time. Second, some companies want to protect their data and are unwilling to disclose it.



Examples


Embodiment Construction

[0076] The purpose of the present invention is to provide a method and system for deduplication of the identification database of the MES system, so as to reduce the time consumption caused by data transmission in the deduplication process and ensure the privacy of the data.

[0077] In order to make the above objects, features and advantages of the present invention more comprehensible, the invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0078] As shown in Figures 1 and 2, the present invention provides a deduplication method for an identification database of an MES system, and the deduplication method includes the following steps:

[0079] In step 101, each identification database to be processed is regarded as a slave node, an idle computing node is regarded as a master node, and the slave nodes are sequentially numbered according to the importance of the data stored in the identification dat...



Abstract

The invention discloses an identification database deduplication method and system for an MES system. The method comprises the following steps: treating each identification database to be processed as a slave node, treating an idle computing node as the master node, and numbering the slave nodes sequentially; dividing the data in all the slave nodes into N parts (data sets) using the SNM algorithm; calculating the minimum signature matrix of each data set; calculating the similarity between every two data elements in each data set from that data set's minimum signature matrix; and deduplicating the data in each data set according to those pairwise similarities. Dividing the data into data sets shortens the time spent on data transmission. Because only the minimum signature matrices are exchanged and similarity is computed on them, the data elements themselves never need to be exchanged, which protects data privacy and further shortens the transmission time.
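The signature-based similarity step in the abstract can be sketched roughly as follows. This is a minimal illustration and not the patented procedure: it assumes that the "minimum signature matrix" is a MinHash signature matrix, that each data element is a text string broken into character 3-grams, and that 64 seeded hash functions are used. The function names (`shingles`, `minhash_signature`, `signature_matrix`, `estimated_similarity`) are hypothetical and do not appear in the source.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of character k-grams of a whitespace-normalised string."""
    s = " ".join(text.lower().split())
    if len(s) < k:
        return {s}
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token; the seed selects the hash function."""
    digest = hashlib.blake2b(
        token.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(text, num_hashes=64):
    """One column of the signature matrix: the minimum hash value per hash function."""
    toks = shingles(text)
    return [min(_hash(seed, t) for t in toks) for seed in range(num_hashes)]


def signature_matrix(records, num_hashes=64):
    """Rows are hash functions, columns are records (assumed layout)."""
    columns = [minhash_signature(r, num_hashes) for r in records]
    return [list(row) for row in zip(*columns)]


def estimated_similarity(matrix, i, j):
    """Fraction of rows on which columns i and j agree (estimates Jaccard similarity)."""
    agree = sum(1 for row in matrix if row[i] == row[j])
    return agree / len(matrix)
```

Under these assumptions, a node would only need to share its signature columns, not the underlying strings, for another node to estimate pairwise similarity with `estimated_similarity`.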

Description

Technical field

[0001] The invention relates to the technical field of data processing, in particular to a method and system for deduplicating an identification database of an MES system.

Background

[0002] Similar duplicate data refers to two records R1 and R2 in a database whose content is the same or similar and which both correspond to the same real-world entity; such a pair R1, R2 is called similar duplicate data. An actual database may contain many such pairs. Their existence reduces the quality of the data, may hinder the normal operation of the system, and can even affect the correctness of decisions made by the enterprise information management system.

[0003] The MES-oriented industrial Internet unified identification database is the database that stores the unified data element identifiers of the MES system. It is composed of many database servers, and a large number of unified data elements are stored in it. ...
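The notion of similar duplicate data in [0002] can be made concrete with a small sketch. The source does not say how "the same or similar" is measured, so this example assumes exact Jaccard similarity over lowercase word tokens and an arbitrary threshold of 0.6; the names `tokens`, `jaccard`, and `is_similar_duplicate` are hypothetical.

```python
def tokens(record):
    """Split a record into a set of lowercase word tokens."""
    return set(record.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two records: |intersection| / |union| of their tokens."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty records are treated as identical
    return len(ta & tb) / len(ta | tb)


def is_similar_duplicate(a, b, threshold=0.6):
    """Flag a pair as similar duplicate data when similarity reaches the threshold (assumed value)."""
    return jaccard(a, b) >= threshold
```

Whether two flagged records really describe the same real-world entity still has to be decided by the application; a similarity score only identifies candidates.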


Application Information

IPC(8): G06F16/215
Inventors: 柴森春, 王昭洋, 黄经纬, 张百海, 崔灵果, 李慧芳, 姚分喜
Owner: BEIJING INSTITUTE OF TECHNOLOGY