Address similarity calculation method and device, equipment and storage medium

A similarity calculation and address technology, applied in computing, neural learning methods, instruments, etc. It addresses the low retrieval speed and accuracy of existing matching algorithms, the low matching success rate for unregistered words, and the complexity of address matching rules, with the effect of improving first-choice accuracy, improving retrieval speed and accuracy, and increasing the matching success rate.

Pending Publication Date: 2020-10-16
SHANGHAI DONGPU INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0012] The object of the present invention is to provide a method, device, equipment and storage medium for calculating address similarity, so as to solve the following problems of existing address matching technology: when the amount of address data is large, the matching success rate for unregistered words is low; address matching rules are complex; the retrieval speed and accuracy of existing matching algorithms are not high; and address matching efficiency is low.


Examples


Embodiment 1

[0058] This embodiment provides a method for calculating address similarity. Referring to Figure 1, the method includes:

[0059] S1: Extract the text information of the input address, filter and segment the text information, and obtain multiple candidate words;

[0060] S2: Input multiple candidate words into an address vector generation model to obtain an initial address vector of the input address;

[0061] S3: Input the initial address vector into an address similarity calculation model based on a Siamese (twin) neural network, combined with gradient descent on a triplet (ternary) loss function, to obtain the feature vector of the initial address vector;

[0062] S4: Calculate the cosine distance or L2 distance between the feature vector and the address vectors in the standard address data set, to obtain the known address vector closest to the input address.
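
Step S4 can be illustrated with a minimal sketch in plain Python. The function names, the dictionary layout of the standard address data set, and the toy three-dimensional vectors are assumptions made for illustration only; the patent text does not specify an implementation, and a real system would compare feature vectors produced by the trained model.

```python
import math

def cosine_distance(u, v):
    """Cosine distance, defined here as 1 - cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # convention chosen for this sketch when a vector is all zeros
    return 1.0 - dot / (norm_u * norm_v)

def l2_distance(u, v):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_address(feature, standard_vectors, distance=cosine_distance):
    """Return (address_key, distance) of the closest standard address vector.

    standard_vectors is a hypothetical dict mapping an address key to its vector.
    """
    key, vec = min(standard_vectors.items(),
                   key=lambda item: distance(feature, item[1]))
    return key, distance(feature, vec)

# Toy data standing in for the standard address data set (illustrative only).
standard = {
    "addr_a": [1.0, 0.0, 0.0],
    "addr_b": [0.0, 1.0, 0.0],
    "addr_c": [0.6, 0.8, 0.0],
}
query = [0.9, 0.1, 0.0]

best_key, best_dist = nearest_address(query, standard)
```

This linear scan compares the query against every stored vector, which is adequate for a small example; for a large standard address data set an index structure would normally replace it.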

[0063] When the amount of address information data is large, the problem of low matching success rate of unregistered...

Embodiment 2

[0134] The above embodiment described the address similarity calculation method; the address similarity calculation device is described below. Referring to Figure 2, the address similarity calculation device in this embodiment of the present invention includes:

[0135] The input address preprocessing module 1 is used to extract the text information of the input address, filter and segment the text information, and obtain a plurality of candidate words;

[0136] The initial vector forming module 2 is used to input a plurality of candidate words into the address vector generation model to obtain the initial address vector of the input address;

[0137] The feature vector extraction module 3 is used to input the initial address vector into the deep adaptation network model based on the triplet (ternary) loss to obtain the feature vector of the initial address vector;

[0138] The s...
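
The ternary (triplet) loss that the feature vector extraction module relies on can be sketched as follows. This is the standard hinge form of the triplet loss, not necessarily the exact formulation used in the patent; the margin value of 0.2 and the use of plain (non-squared) L2 distance are assumptions, and some formulations use squared distances instead.

```python
import math

def l2_distance(u, v):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    anchor and positive are vectors for the same address; negative is a
    vector for a different address. The margin default is an assumption.
    """
    d_pos = l2_distance(anchor, positive)
    d_neg = l2_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# The loss is zero when the positive is already closer than the negative
# by at least the margin, and positive otherwise.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

Minimizing this quantity by gradient descent pulls vectors of the same address together and pushes vectors of different addresses apart, which is what makes the distance comparison against the standard address data set meaningful.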

Embodiment 3

[0148] Embodiment 2 above describes the address similarity calculation device in detail from the perspective of modular functional entities; the following describes the address similarity calculation device in detail from the perspective of hardware processing.

[0149] Referring to Figure 3, the address similarity calculation device 500 may differ considerably depending on configuration or performance, and may include one or more processors (central processing units, CPU) 510, a memory 520, and one or more storage media 530 (such as one or more mass storage devices) storing application programs 533 or data 532. The memory 520 and the storage medium 530 may provide temporary or persistent storage. The program stored in the storage medium 530 may include one or more modules (not shown in the figure)...


Abstract

The invention discloses an address similarity calculation method and device, equipment and a storage medium. It proposes a solution to the problems that address matching rules are complex, that existing matching algorithms have low retrieval speed and accuracy, and that address matching efficiency is low. The input address information is represented by a suitable initial vector. An address similarity calculation model based on a Siamese (twin) neural network, combined with gradient descent on a triplet (ternary) loss function, is then used to obtain a feature vector of the initial address vector. Finally, the cosine distance or L2 distance between the feature vector and the address vectors in a standard address data set is calculated to obtain the known address vector closest to the input address vector. As a result, the address matching rules are simplified, the first-choice accuracy for the same address is improved, and the retrieval speed and accuracy of the matching algorithm are further improved.

Description

Technical field

[0001] The invention belongs to the field of address matching, and in particular relates to an address similarity calculation method, device, equipment and storage medium.

Background

[0002] The commonly used approach to address matching is to apply a Chinese word segmentation algorithm, set dictionary rules to extract the various place nouns, and then combine the word segmentation features to calculate the similarity between addresses. Specifically, there are three situations:

[0003] 1. Dictionary-based word segmentation algorithm

[0004] Also known as the string matching word segmentation algorithm. The algorithm matches the character string to be segmented against the words in an established "sufficiently large" dictionary according to a certain strategy. If an entry is found, the match succeeds and the word is recognized. Common dictionary-based word segmentation algorithms are divided into the following types: forward maximum ma...
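
As a minimal sketch of the dictionary-based segmentation described in the background, the following scans the string from the left and, at each position, takes the longest dictionary entry that matches. The function name, the toy dictionary, and the fallback of emitting a single character when nothing matches are assumptions made for illustration; the fallback also shows why words missing from the dictionary (unregistered words) segment poorly under this approach.

```python
def segment_longest_match(text, dictionary, max_word_len=None):
    """Greedy left-to-right segmentation using the longest dictionary match.

    dictionary is a set of known words. Characters not covered by any
    entry are emitted one at a time (a fallback chosen for this sketch).
    """
    if max_word_len is None:
        max_word_len = max((len(word) for word in dictionary), default=1)
    max_word_len = max(1, max_word_len)  # guard against non-positive lengths

    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a dictionary hit.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens

# Toy dictionary of place nouns (illustrative only).
place_words = {"上海", "上海市", "青浦区", "华新镇"}
tokens = segment_longest_match("上海市青浦区华新镇", place_words)
```

Because the scan is greedy, "上海市" is preferred over the shorter entry "上海" at the first position.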


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/205, G06F40/279, G06N3/04, G06N3/08
CPC: G06F40/279, G06F40/205, G06N3/08, G06N3/045
Inventor: 杨天宇, 李斯
Owner: SHANGHAI DONGPU INFORMATION TECH CO LTD