Relationship-data attribute-value identity judgment method based on WEB information

A method for judging attribute-value identity in relational data, applied in the fields of electrical digital data processing, special data processing applications, and natural language data processing. It addresses the poor accuracy of existing methods and improves the accuracy of the judgment.

Active Publication Date: 2017-12-15
NORTHWESTERN POLYTECHNICAL UNIV

AI Technical Summary

Problems solved by technology

[0004] To overcome the poor accuracy of existing attribute-value identity determination methods, the present invention provides a method for determining attribute-value identity in relational data based on WEB information.



Examples


Embodiment Construction

[0038] Refer to Figures 1-11. The specific steps of the method of the present invention for determining attribute-value identity in relational data based on WEB information are as follows:

[0039] Step 1, extracting query keywords.

[0040] Taking Relational Table 1 as an example, for tuple 1, tuple 2, tuple 3 and tuple 4 the rule-based algorithm and the genetic algorithm each yield the query keywords with the highest FITNESS: paper title and conference time. These query keywords are used to search the WEB, and the expanded query fragments are obtained.

[0041] Relational Table 1

[0042]

[0043] A WEB search engine is used to obtain WEB information that expands the entities, and two algorithms are used to generate effective query keywords. In the rule-based query algorithm, the functional dependency rule fd: X→Y means that the attribute set X uniquely determines the attribute set Y. The attribute values in the attribute set X are used as the query key to retrieve relevant information through ...
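The rule-based keyword generation described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function name `build_query_keywords`, the representation of a functional dependency as a pair of attribute lists, the quoting of each value as a phrase, and the attribute names in the example tuple are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation of fd: X -> Y as (X attributes, Y attributes).
FunctionalDependency = Tuple[List[str], List[str]]


def build_query_keywords(row: Dict[str, object], fd: FunctionalDependency) -> str:
    """Build a search query from the values of the determinant set X.

    Because X uniquely determines Y, the values of X identify the entity,
    so they are used as the query key. Empty or missing values are skipped.
    """
    lhs, _rhs = fd
    values = [
        str(row[attr]).strip()
        for attr in lhs
        if row.get(attr) not in (None, "")
    ]
    # Quote each value so a search engine would treat it as a phrase
    # (an assumption; the source does not specify the query syntax).
    return " ".join(f'"{v}"' for v in values)


# Illustrative tuple; attribute names and values are invented for the example.
example_row = {"title": "Entity Resolution on the Web", "year": 2015, "venue": "VLDB"}
example_fd = (["title", "year"], ["venue"])
query = build_query_keywords(example_row, example_fd)
```

The resulting string would then be submitted to a WEB search engine, and the returned fragments would serve as the expanded information for the tuple.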


Abstract

The invention discloses a method for judging attribute-value identity in relational data based on WEB information, which solves the technical problem that existing attribute-value identity judgment methods have poor accuracy. According to the technical scheme, query keywords are generated with a query algorithm, the information in the database is extended through the WEB, and related entities are extracted with natural language processing and named entity recognition methods. Frequent item sets are extracted from the retrieved fragments with the FPTree algorithm and serve as the nodes of a graph. Relationships between entity keys are extracted with the co-occurrence relationship method and the semantic relationship method, and edges are established accordingly. A maximum common sub-graph containing the attributes to be judged is extracted from the established entity relationship graphs with the Durand-Pasari algorithm, and a common mode of the maximum common sub-graph is extracted with the Durand-Pasari algorithm. The similarity of the attribute values is judged according to the matching result of the relationship mode, which increases the accuracy of the attribute-value identity judgment method.
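The graph-construction step in the abstract (frequent items as nodes, co-occurrence as edges) can be sketched as follows. This is a simplified illustration only: it counts item and pair frequencies directly instead of using the FPTree algorithm named in the source, it covers only the co-occurrence relationship and not the semantic relationship, and the function name `build_cooccurrence_graph`, the `min_support` threshold, and the sample fragments are assumptions.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Set, Tuple


def build_cooccurrence_graph(
    fragments: List[List[str]], min_support: int = 2
) -> Tuple[Set[str], Dict[Tuple[str, str], int]]:
    """Return (nodes, edges) of a co-occurrence graph over search fragments.

    Nodes are items that appear in at least `min_support` fragments.
    An edge links two nodes that appear together in at least `min_support`
    fragments; its weight is that co-occurrence count.
    """
    # Count in how many fragments each item appears.
    item_counts: Counter = Counter()
    for fragment in fragments:
        item_counts.update(set(fragment))
    nodes = {item for item, count in item_counts.items() if count >= min_support}

    # Count co-occurrences of frequent items within the same fragment.
    pair_counts: Counter = Counter()
    for fragment in fragments:
        kept = sorted(set(fragment) & nodes)  # sorted so each pair has one key
        pair_counts.update(combinations(kept, 2))
    edges = {pair: count for pair, count in pair_counts.items() if count >= min_support}
    return nodes, edges


# Illustrative tokenized fragments; the contents are invented for the example.
sample_fragments = [
    ["vldb", "2015", "entity"],
    ["vldb", "2015"],
    ["vldb", "entity", "web"],
]
nodes, edges = build_cooccurrence_graph(sample_fragments, min_support=2)
```

Two graphs built this way, one per candidate attribute value, would be the inputs to the maximum common sub-graph extraction that the abstract describes.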

Description

Technical field

[0001] The invention relates to a method for judging attribute-value identity, and in particular to a method for judging attribute-value identity in relational data based on WEB information.

Background technique

[0002] Entity identity determination, also known as duplicate entity detection or record matching, is an important technique for improving data quality. The identity judgment of attribute values is an important basis for the identity judgment of relational data entities.

[0003] The literature "Journal of Computer Science, Vol. 38, No. 10, 2015: Page 2028-2040" uses a method of identity determination that relies only on attribute features. First, the problem of attribute identity determination is formally described. The attribute features are then divided into two categories, intuitive features and comparative features, and the probability distribution of the attribute features is quantitatively analyzed to determine attribute identity. But in the actual attribute identity judgmen...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F17/30
CPC: G06F16/367, G06F40/295
Inventor: 刘海龙, 成阿茹, 李战怀, 张陶然, 张国荣, 刘文洁
Owner: NORTHWESTERN POLYTECHNICAL UNIV