Address matching algorithm based on interest point knowledge graph pre-training

A technology combining address matching and knowledge graphs, applied to geographic information databases, text database queries, and unstructured text data retrieval. It addresses the poor generalization performance of deep learning models and the poor results of exact character matching, and achieves faster convergence, better generalization performance, and a smaller required training corpus.

Active Publication Date: 2020-07-24
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

Because addresses are expressed in many different ways and geographic information is complex, exact character matching based on the traditional bag-of-words model cannot achieve good results. Deep learning models, in turn, require a large training corpus and are prone to poor generalization performance.
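
The bag-of-words limitation can be illustrated with a minimal sketch. It is not part of the patent: the Jaccard overlap measure, the function names, and the two comparison addresses (`short` and `other`) are assumptions chosen for illustration. Only the `full` address comes from the embodiment.

```python
import re

def bag_of_words(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard overlap between the token sets of two strings."""
    sa, sb = bag_of_words(a), bag_of_words(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

# Address text used in the embodiment.
full = ("Huanglong Sports Center, No. 1 Huanglong Road, Xihu District, "
        "Hangzhou City, Zhejiang Province")
# Hypothetical shorter expression of the same place.
short = "Huanglong Sports Center, Hangzhou"
# Hypothetical different place that shares the administrative prefix.
other = ("Example Hotel, No. 9 Huanglong Road, Xihu District, "
         "Hangzhou City, Zhejiang Province")

same_place_score = jaccard(full, short)
different_place_score = jaccard(full, other)
print(same_place_score, different_place_score)
```

Under these assumptions the different place scores higher than the same place, because shared administrative tokens dominate the overlap. This is the kind of failure that motivates a model aware of administrative-region structure.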


Examples

Embodiment Construction

[0062] In order to describe the present invention more specifically, the technical solutions of the present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0063] As shown in Figure 1, Step 1 distinguishes the administrative-division labels of the address text and outputs the label for each character of the address text. Table 1 shows the result for the input address text "Huanglong Sports Center, No. 1 Huanglong Road, Xihu District, Hangzhou City, Zhejiang Province".

[0064] Table 1 List of administrative division results

[0065]
Character:  Zhejiang  Jiang  Province  Hangzhou  State  city  West      lake      Area      yellow
Label:      Prov      Prov   Prov      City      City   City  District  District  District  road

Character:  dragon  road  1     No    yellow  dragon  body  education  middle  Heart
Label:      road    road  road  road  name    name    name  name       name    name
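
The per-character labels in Table 1 can be collapsed into one span per administrative level. The sketch below is an illustration only: the patent text shown here does not say how the labels are produced or stored, so the list representation and the `group_spans` helper are assumptions. The character glosses and labels are copied from Table 1.

```python
from itertools import groupby

# Character glosses and labels as listed in Table 1.
chars = ["Zhejiang", "Jiang", "Province", "Hangzhou", "State", "city",
         "West", "lake", "Area", "yellow", "dragon", "road", "1", "No",
         "yellow", "dragon", "body", "education", "middle", "Heart"]
labels = (["Prov"] * 3 + ["City"] * 3 + ["District"] * 3
          + ["road"] * 5 + ["name"] * 6)

def group_spans(chars, labels):
    """Merge consecutive characters that share a label into (label, chars) spans."""
    if len(chars) != len(labels):
        raise ValueError("chars and labels must have the same length")
    spans = []
    # groupby only merges adjacent items, so a label that reappears later
    # in the address would start a new span rather than join the old one.
    for label, group in groupby(zip(chars, labels), key=lambda pair: pair[1]):
        spans.append((label, [char for char, _ in group]))
    return spans

spans = group_spans(chars, labels)
for label, span_chars in spans:
    print(label, span_chars)
```

Grouping by adjacency rather than by label value keeps the spans in address order, which matters if later steps operate on whole administrative regions.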

[0066] Step 2. Use the results of administr...


Abstract

The invention discloses an address matching algorithm based on interest point knowledge graph pre-training, comprising the following steps: obtaining interest point addresses and distinguishing administrative regions of different granularities to obtain labeled interest point addresses; randomly masking part of the administrative regions in the labeled interest point addresses, inputting the labeled, masked addresses into a language model, outputting predicted interest point addresses, calculating a loss function from the interest point addresses and the predicted interest point addresses, and, after multiple iterations, obtaining a language model that outputs accurate interest point addresses; attaching a fully connected layer after the language model and fine-tuning all parameters of the model and the fully connected layer on a labeled address matching task data set to obtain a fine-tuned language model and a fine-tuned fully connected layer; and inputting the labeled original interest point address to be predicted into the fine-tuned language model and fully connected layer to obtain a predicted address for that interest point, then performing a similarity calculation between the original address and the predicted address to complete address matching.
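
The masking step of the pre-training stage might look like the following sketch. Several details are assumptions, not taken from the patent: masking a whole administrative span at once, the `[MASK]` token string, the `mask_prob` parameter, and the `mask_regions` helper. The language model and loss function are left out.

```python
import random

MASK = "[MASK]"  # hypothetical mask token; the patent does not name one

# A labeled address as (label, characters) spans, using glosses from Table 1.
spans = [
    ("Prov", ["Zhejiang", "Jiang", "Province"]),
    ("City", ["Hangzhou", "State", "city"]),
    ("District", ["West", "lake", "Area"]),
    ("road", ["yellow", "dragon", "road", "1", "No"]),
    ("name", ["yellow", "dragon", "body", "education", "middle", "Heart"]),
]

def mask_regions(spans, mask_prob=0.3, rng=None):
    """Randomly hide whole labeled spans.

    Returns (masked_tokens, targets), where targets holds the original
    token at each masked position and None everywhere else.
    """
    rng = rng or random.Random()
    masked, targets = [], []
    for _label, tokens in spans:
        hide = rng.random() < mask_prob  # one decision per span
        for token in tokens:
            masked.append(MASK if hide else token)
            targets.append(token if hide else None)
    return masked, targets

masked, targets = mask_regions(spans, mask_prob=0.3, rng=random.Random(0))
print(masked)
```

A model trained on such pairs would be asked to predict the entries of `targets` that are not `None`, and the loss would compare its predictions with the original characters.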

Description

Technical Field

[0001] The present invention relates to the fields of knowledge graphs and natural language processing, and in particular to an address matching algorithm based on interest point knowledge graph pre-training.

Background Technique

[0002] Natural language processing tasks under the traditional network training approach require a large amount of labeled data, and labeling this data requires a great deal of manual effort.

[0003] Semantic text matching means judging whether two pieces of natural language express the same meaning. The traditional bag-of-words model cannot handle the ambiguity of natural language well: the same meaning can have many names and expressions, while the same expression can have different meanings in different contexts. Classic semantic matching models include the traditional bag-of-words-based TF-IDF and BM25 algorithms and deep learning-based models such as DSSM and MatchPyramid.

[0004] As a natural lang...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/29, G06F16/33, G06F16/36
CPC: G06F16/29, G06F16/3347, G06F16/3346, G06F16/367
Inventor: 陈华钧, 叶志权
Owner: ZHEJIANG UNIV