Multi-party record linkage method and system, electronic device and storage medium

A multi-party record linkage technology, applied in electrical digital data processing, special data processing applications, instruments, etc. It addresses problems such as inapplicability to multi-party data sources, inability to protect privacy, and poor scalability and fault tolerance, with the effects of improved recall, good scalability, and improved fault tolerance.

Inactive Publication Date: 2018-12-14
ZHEJIANG JIESHANG ARTIFICIAL INTELLIGENCE RES & DEV CO LTD

AI Technical Summary

Problems solved by technology

[0003] In order to overcome the deficiencies of the prior art, one of the objectives of the present invention is to provide a multi-party record linking method, which solve...



Embodiment Construction

[0027] The present invention is further described below with reference to the accompanying drawings and specific embodiments. It should be noted that, provided they do not conflict, the embodiments or technical features described below may be combined arbitrarily to form new embodiments.

[0028] A multi-party record linking method, as shown in Figure 1, includes the following steps:

[0029] Data preprocessing: the data sources of several participants are each divided into blocks, and the records in the data sources are converted into bit arrays. In this embodiment, the participants are denoted Pi and the data sources Di. Preferably, the data preprocessing step uses a Bloom filter: the value of attribute A in each of the Ni records of a data source is split into q-grams, and k hash functions map the q-grams into a bit array of length m, giving Ni bit arrays per data source.
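The Bloom filter encoding described in [0029] can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names `qgrams` and `bloom_encode`, the default values of m, k and q, and the use of seeded SHA-256 digests as the k hash functions are all assumptions made for the example.

```python
import hashlib


def qgrams(value, q=2):
    """Split a string into overlapping q-grams (lower-cased, no padding)."""
    value = value.lower()
    if len(value) < q:
        return [value] if value else []
    return [value[i:i + q] for i in range(len(value) - q + 1)]


def bloom_encode(value, m=64, k=3, q=2):
    """Encode one attribute value as a bit array of length m.

    Assumption: the k hash functions are simulated by prefixing each
    q-gram with a seed 0..k-1 before hashing with SHA-256.
    """
    bits = [0] * m
    for gram in qgrams(value, q):
        for seed in range(k):
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).digest()
            position = int.from_bytes(digest[:8], "big") % m
            bits[position] = 1
    return bits


# One bit array per record: Ni records yield Ni arrays of length m.
records = ["Smith", "Smyth", "Jones"]
encoded = [bloom_encode(name) for name in records]
```

Values that share many q-grams, such as "Smith" and "Smyth", set many of the same bit positions, which is what makes approximate comparison on the bit arrays possible.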

[0030] In one embodiment, in order to prote...


Abstract

The invention provides a multi-party record linkage method comprising the following steps. Data preprocessing: the data sources of a plurality of participants are divided into blocks, and the records in the data sources are converted into bit arrays. Approximate record matching: for each corresponding position of the bit arrays, the ratio of arrays whose bit is 1 is calculated, and a position is taken as a candidate matching position when its ratio reaches a dynamic threshold. Similarity calculation: the similarity over the candidate matching positions is calculated and compared with a global threshold; if the similarity reaches the global threshold, the match succeeds, otherwise it fails. The invention also relates to an electronic device, a storage medium and a multi-party record linkage system. Because the ratio indicates how similar the records are at a given position, fault tolerance is improved. Because the dynamic threshold and the candidate matching positions are used to check and confirm the successfully matched positions, both recall and precision are high, and the similarity between records with quality problems can be calculated effectively. The invention realizes multi-party record linkage, can effectively protect privacy, and has good scalability and fault tolerance.
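The matching and similarity steps summarized in the abstract might look something like the sketch below. It is an illustration under stated assumptions only: the abstract does not say how the dynamic threshold is derived or which similarity measure is used, so here the dynamic threshold is simply passed in as a parameter, and the similarity is assumed to be the number of candidate positions divided by the number of positions set in at least one array. The names `position_ratios` and `link` are hypothetical.

```python
def position_ratios(bit_arrays):
    """For each bit position, the fraction of arrays that have a 1 there."""
    parties = len(bit_arrays)
    return [sum(column) / parties for column in zip(*bit_arrays)]


def link(bit_arrays, dynamic_threshold, global_threshold):
    """Decide whether one record per party (as bit arrays) match.

    Assumptions: dynamic_threshold is given rather than computed, and
    similarity = candidate positions / positions set in any array.
    """
    ratios = position_ratios(bit_arrays)
    candidates = [i for i, r in enumerate(ratios) if r >= dynamic_threshold]
    any_set = [i for i, r in enumerate(ratios) if r > 0]
    similarity = len(candidates) / len(any_set) if any_set else 0.0
    return similarity >= global_threshold, similarity


# Three parties, one bit array each (toy length of 4 bits).
arrays = [
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
]
matched, score = link(arrays, dynamic_threshold=0.6, global_threshold=0.8)
```

With a dynamic threshold below 1, a position still counts as a candidate when a minority of parties have a 0 there, which is one way to read the fault tolerance the abstract claims for records with spelling errors.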

Description

Technical field

[0001] The invention relates to the technical field of record linking, in particular to a multi-party record linking method, electronic equipment, storage medium and system.

Background technique

[0002] With the continuous advancement of technology, data is growing and accumulating rapidly, and the era of big data has arrived. How to organize and analyze these data is the key to realizing the value of data. However, the existing record linking methods do not consider the privacy protection of recorded information when the recorded information involves personal privacy or sensitive information. Existing record linking methods are only applicable to two data sources, but many applications in reality often have more than two data sources for record linking. With the ever-increasing amount of data and the existence of real-world data quality problems, such as spelling mistakes, order reversal, etc., existing record linking methods are poor in scalability and f...


Application Information

IPC(8): G06F17/30
Inventor 尚凌辉, 陈鑫, 叶淑阳
Owner ZHEJIANG JIESHANG ARTIFICIAL INTELLIGENCE RES & DEV CO LTD