An address similarity measurement method based on hierarchical labeling

A measurement method and similarity measurement technology applied in the field of address similarity measurement based on hierarchical labeling. It addresses the problem that address similarity analysis has not achieved breakthrough progress, overcomes large differences in similarity measurement, and achieves strong generalization ability and high accuracy.

Active Publication Date: 2019-04-26
北京惠盈金科技术有限公司 +1
Cites: 5, Cited by: 18

AI Technical Summary

Problems solved by technology

However, quickly and accurately determining the similarity of two addresses remains an open problem, and no breakthrough progress has yet been achieved.

Embodiment Construction

[0042] The technical scheme of the present invention is described in detail below with reference to the accompanying drawings:

[0043] The idea of the present invention is to write code that cleans, completes, standardizes, manually splits, hierarchically labels, and splices a small amount of original data to obtain standard training data, and to use this training data to train the Address-LSTM model. The trained model then performs named entity recognition and labeling on the address strings to be compared, yielding the address entities at each level. These hierarchically labeled address strings are then input into the similarity calculation module containing relational rules, the similarity calculation module based on edit distance, the similarity calculation module based on pinyin edit distance, the similarity calculation module based on Word2vec, and the similarity calculation module based on the Baidu search index, to calculate the similarity vec...
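As an illustration of the edit-distance module named in [0043], the following is a minimal sketch, not the patent's implementation. It assumes each address has already been labeled into levels and is held as a dictionary; the level names in `LEVELS`, the normalization by the longer string's length, and the treatment of missing levels as empty strings are all assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Map the distance into [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(a, b) / longest


# Hypothetical level names; the source does not list its label set.
LEVELS = ("province", "city", "district", "road", "number")


def level_similarities(addr_a: dict, addr_b: dict) -> list:
    """Return one edit-distance similarity per level, in LEVELS order.

    A level missing from an address is treated as an empty string.
    """
    return [edit_similarity(addr_a.get(lv, ""), addr_b.get(lv, ""))
            for lv in LEVELS]
```

Comparing level by level rather than on the whole string means a difference in the house number does not mask agreement at the province, city, and district levels. The resulting list is one possible form for a per-module similarity vector.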

Abstract

The invention discloses an address similarity calculation method based on hierarchical labeling. Sufficient samples are generated from a small amount of existing hierarchical annotation data to train an Address-LSTM model, which serves as the core of the system and supports address similarity calculation based on automatic hierarchical annotation of addresses. In actual operation, the input address data is cleaned and completed, abnormal symbols are removed, and other preprocessing is applied. The original character string is decomposed into appropriate substring sequences through word segmentation, regular expressions, re-splicing, and similar steps. The trained Address-LSTM model then marks each substring with its address hierarchy label, and a multi-similarity calculation module together with a comprehensive integration method provides a comprehensive similarity index between addresses. By running the program on a computer, manual checking can be greatly shortened or even avoided, and the efficiency of entity identification in financial data is improved while accuracy is maintained.
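The cleaning and decomposition steps described in the abstract can be sketched roughly as follows. This is a rule-based illustration under stated assumptions, not the patented pipeline: the suffix set `UNIT_SUFFIXES` and both function names are hypothetical, only the regular-expression step is shown (no word segmentation or re-splicing), and the substrings it produces would still need to be labeled by the trained Address-LSTM model.

```python
import re

# Hypothetical characters that typically end an address unit.
# The source does not specify which rules it uses.
UNIT_SUFFIXES = "省市区县路街号"


def clean_address(raw: str) -> str:
    """Remove whitespace and abnormal symbols, keeping CJK characters,
    ASCII letters, and digits."""
    return re.sub(r"[^\u4e00-\u9fffA-Za-z0-9]", "", raw)


def split_address(cleaned: str) -> list:
    """Split after each suffix character, keeping the suffix attached.

    Any trailing text without a suffix is returned as a final substring.
    """
    pattern = rf"[^{UNIT_SUFFIXES}]*[{UNIT_SUFFIXES}]|[^{UNIT_SUFFIXES}]+$"
    return re.findall(pattern, cleaned)


cleaned = clean_address(" 北京市#海淀区 中关村大街1号! ")
substrings = split_address(cleaned)
```

A fixed suffix list is brittle: it cannot tell a suffix character that is part of a place name from one that ends an address unit. That limitation is one reason a learned labeler such as the Address-LSTM model is used for the labeling step itself.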

Description

Technical field

[0001] The present invention relates to the field of computer technology, specifically to the field of natural language processing, and its application target is address recognition (toponym recognition). The techniques involved include vectors, edit distance, LSTM and BP neural network modules in machine learning, and word2vector in natural language processing.

Background technique

[0002] In recent years, with the development of the Internet and the economy, some banks and financial companies have begun to provide more and more loan services to ordinary users. When these companies provide services, they collect some of the users' personal information as a basis for deciding whether to serve them. Among this personal information, the user's address information is especially important. Because there are many connections among many financial fraud groups at present, and there are often great similarities in certain addresses (such as residential addresse...

Application Information

IPC(8): G06F16/33, G06F17/27, G06K9/62
CPC: G06F40/295, G06F18/22
Inventor: 陈清华王建斌张常青刘晶南晓杰杨秀波张江朱瑞鹤邓建博李本继
Owner: 北京惠盈金科技术有限公司