A method, apparatus and device for address search

By segmenting and splitting fuzzy address text and performing multi-dimensional encoding matching, the problem of low search accuracy caused by non-standard user input addresses is solved, achieving efficient and accurate address matching.

CN122240860A · Pending · Publication Date: 2026-06-19 · 吕国晖


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: 吕国晖
Filing Date: 2026-03-14
Publication Date: 2026-06-19

IPC: G06F16/387; G06F40/289; G06F40/126; G06F16/31; G06F16/338

AI Tagging

Application Domain

Natural language data processing · Text database indexing

AI Technical Summary

Technical Problem

The address text entered by users is highly arbitrary, vague, and non-standard, resulting in low accuracy of address search and failing to meet user needs.

Method used

By segmenting and encoding the fuzzy address search text, and using multi-dimensional database indexes for matching and combination, accurate query term combinations are generated, and matching results are retrieved from standard databases.

Benefits of technology

It improves the accuracy and efficiency of address search, effectively handles noise such as spelling errors, homophones, and similar-looking characters, and achieves highly robust and accurate address matching.


Abstract

This application provides a method, apparatus, and device for address search, relating to the field of address search. The method includes: performing word segmentation on an acquired fuzzy address search text to obtain at least one search segment word corresponding to the fuzzy address search text; performing extended encoding on the at least one search segment word to obtain search segment word groups corresponding to each search segment word, each search segment word group including one search segment word and multiple search extended words; matching each word in the search segment word group against a corresponding database index to obtain at least one candidate word; combining the candidate words corresponding to the at least one search segment word to obtain a precise query word combination corresponding to the fuzzy address search text; and using the precise query word combination to retrieve search results corresponding to the fuzzy address search text from a standard database. Embodiments of this application can improve the matching accuracy of user-input addresses.

Description

Technical Field

[0001] This application relates to the field of address processing, and more specifically, to a method, apparatus, and device for address search.

Background Technology

[0002] Address data serves as a crucial link between online information and offline entities, playing a vital role in numerous industries such as logistics, transportation, tourism, retail, and public services.

[0003] To standardize management, the industry typically uses hierarchical standards to classify addresses, such as China's 11-level address standard, which refines addresses from provincial administrative divisions down to specific room numbers. However, in practice, user-input address text is often highly arbitrary, vague, and non-standardized, resulting in low accuracy of address searches based on user-input addresses, failing to meet user needs.

[0004] Therefore, improving the matching accuracy of user-input addresses has become a problem that needs to be solved.

Summary of the Invention

[0005] The purpose of one embodiment of this application is to provide a method, apparatus, and device for address search, which can improve the matching accuracy of user-input addresses.

[0006] In a first aspect, embodiments of this application provide a method for address search, comprising: performing word segmentation on an acquired fuzzy address search text to obtain at least one search segment word corresponding to the fuzzy address search text; performing extended encoding on the at least one search segment word to obtain search segment word groups corresponding to each search segment word, each search segment word group including one search segment word and multiple search extended words; matching each word in the search segment word group on a corresponding database index to obtain at least one candidate word; combining the candidate words corresponding to the at least one search segment word to obtain a precise query word combination corresponding to the fuzzy address search text; and using the precise query word combination to retrieve search results corresponding to the fuzzy address search text from a standard database.

[0007] This application embodiment first splits the obtained search text, i.e., the fuzzy address search text, and then matches the split search segment words with the segment words in the standard database to find the standard segment words. In this way, the search text can be converted into a standard segment word combination, and the accuracy of the search can be improved by using the converted search text for searching.

[0008] In one embodiment, the method further includes: performing word segmentation on each address in the standard address database to obtain a standard segmented word set corresponding to each address, wherein the standard address database includes multiple addresses, and each standard segmented word set includes at least one segmented word; performing deduplication on the segmented words in the standard segmented word sets corresponding to all addresses in the standard address database to obtain a segmented address word table corresponding to the standard address database, wherein the segmented address word table includes all deduplicated segmented words, and the segmented words in the segmented address word table are all different; performing extended encoding on each segmented word in the segmented address word table to obtain a segmented word group corresponding to each segmented word, wherein each segmented word group includes one segmented word and multiple extended words, and creating a database index for words of the same type in all segmented word groups.

[0009] In one implementation, the plurality of extended words includes one or more of pinyin, initials of pinyin, glyphs, strokes, and numbers.

[0010] In one implementation, creating a database index for words of the same type in all segmented word groups includes: constructing the database index after removing address level identifiers from words of the same type.

[0011] In one implementation, the step of matching each word in the search segment word group with its corresponding database index to obtain at least one candidate word includes: matching each word in the search segment word group with its corresponding database index to obtain at least one initial candidate word; and selecting the at least one candidate word from the at least one initial candidate word based on the similarity between the at least one initial candidate word and its corresponding search segment.

[0012] In one embodiment, the at least one search segment term includes multiple search segment terms, and the step of combining the candidate terms corresponding to the at least one search segment term to obtain the precise query term combination corresponding to the fuzzy address search text includes: cross-combining the candidate terms corresponding to the multiple search segment terms to obtain multiple precise query term combinations corresponding to the fuzzy address search text.

[0013] In one implementation, a fuzzy address search text corresponds to multiple precise query term combinations. The step of using the precise query term combinations to retrieve search results corresponding to the fuzzy address search text from a standard database includes: using the precise query term combinations to retrieve addresses corresponding to a standard segmented word set including the precise query term combinations from the standard database; and determining the search results from the multiple addresses based on the similarity between the multiple addresses corresponding to the fuzzy address search text and the fuzzy address search text.

[0014] In one implementation, the search results include multiple addresses sorted by similarity from high to low.

[0015] In a second aspect, one embodiment of this application provides an address search apparatus, comprising: a splitting unit, configured to perform word segmentation on the acquired fuzzy address search text to obtain at least one search segment word corresponding to the fuzzy address search text; an expansion unit, configured to perform extended encoding on the at least one search segment word to obtain search segment word groups corresponding to each search segment word, each search segment word group including one search segment word and multiple search extended words; a matching unit, configured to match each word in the search segment word group on a corresponding database index to obtain at least one candidate word; a combination unit, configured to combine the candidate words corresponding to the at least one search segment word to obtain a precise query word combination corresponding to the fuzzy address search text; and a detection unit, configured to use the precise query word combination to retrieve the search results corresponding to the fuzzy address search text from a standard database.

[0016] In a third aspect, one embodiment of this application provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, can implement the method described in the first aspect and any embodiment of the first aspect.

[0017] In a fourth aspect, one embodiment of this application provides an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, can implement the method described in the first aspect and any embodiment of the first aspect.

[0018] In a fifth aspect, one embodiment of this application provides a computer program product including a computer program, wherein the computer program, when executed by a processor, can implement the method described in the first aspect and any embodiment of the first aspect.

Brief Description of the Drawings

[0019] To more clearly illustrate the technical solution of one embodiment of this application, the accompanying drawings used in one embodiment of this application will be briefly described below. It should be understood that the following drawings only show some embodiments of this application and should not be regarded as a limitation of the scope. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.

[0020] Figure 1 is a flowchart of an address search method provided in one embodiment of this application; Figure 2 is a schematic diagram of an address search apparatus provided in one embodiment of this application; Figure 3 is a schematic diagram of an electronic device provided in one embodiment of this application.

Detailed Description of the Embodiments

[0021] To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present application.

[0022] In practical applications, the address text input by users often has a high degree of randomness, ambiguity, and non-standardization, resulting in low accuracy in searching based on the user-input address and being unable to meet the needs of users.

[0023] For example, the address text input by users often has the following problems. Spelling mistakes: wrongly written or similar-looking characters, such as writing "Weiming" as "Moming". Pronunciation mistakes: homophones or near-homophones, such as writing "Daoli District" with a different character that has the same pronunciation. Structural omission: incomplete address level information, such as omitting the district or county. Structural disorder: the order of address elements is reversed and does not follow the standard hierarchy. Format mixture: mixed content such as pinyin, Chinese characters, foreign languages, and abbreviations. Therefore, how to search on such non-standard user input and accurately match the address the user intends has become a problem to be solved.

[0024] In view of the above problems, the present application provides a method for address search. First, the obtained search text, that is, the fuzzy address search text, is split, and then the split search segment words are matched with the segment words in the standard database to find the standard segment words, so as to be able to convert the search text into a combination of standard segment words. Furthermore, using the converted search text for further search can improve the accuracy of the search.

[0025] Hereinafter, for the convenience of understanding and illustration, by way of example and not limitation, the execution process and actions in the method for address search of the present application will be described.

[0026] It should be understood that in the embodiments of the present application, in order to improve the accuracy of the search, two improvements have been made. One is to improve the address database (corresponding to the first stage below). The purpose of this first stage is to perform offline or online processing on the existing standard address database to construct an index system that supports multi-dimensional rapid retrieval. The other is to improve the search process (corresponding to the second stage below). The purpose of this second stage is to process the fuzzy address query request input by the user and return the most relevant standard address result through multi-level matching and intelligent sorting. In practical applications, generally, the first stage is completed in advance. After completing the first stage, usually, for each search, the actions of the second stage are directly performed.

[0027] It should be understood that in the embodiments of this application, the first stage can be completed in advance, that is, only the second stage is performed when performing address search, and the first stage does not need to be executed again. Optionally, in the embodiments of this application, the first stage can be executed first, and then the second stage can be executed. Optionally, the first stage and the second stage in the embodiments of this application can be executed by the same execution entity, or they can be executed by separate execution entities. For example, the second stage can be executed by the address search device, which may not execute the first stage. The first stage can be executed by other devices, and the address search device directly uses the result of the first stage. The embodiments of this application are not limited to this.

[0028] A method for address search according to an embodiment of this application is described below, by way of example, with reference to Figure 1.

[0029] The method shown in Figure 1 can be executed by an address search server, or by any third party with access to the address database; however, the embodiments of this application are not limited thereto. The method shown in Figure 1 corresponds to the second stage of the embodiments of this application, and includes:

110. Perform word segmentation on the obtained fuzzy address search text to obtain at least one search segment word corresponding to the fuzzy address search text.

[0030] It should be understood that the fuzzy address search text can be text entered by the user. Of course, it can also be text from other scenarios that require address searching, and the embodiments of this application are not limited thereto.

[0031] Specifically, in the second stage, this embodiment first performs word segmentation on the fuzzy address search text. This segmentation process is similar to that of the first stage. Because some actions in the second stage are similar to or depend on the first stage, the actions of the first stage are described first below for ease of understanding. For the segmentation process of the second stage, refer to the first-stage segmentation process described below.

[0032] Optionally, as an embodiment, the method of this application embodiment may further include the following four actions in the first stage: 1. Standard address segmentation; 2. Constructing a deduplicated segmented address word list; 3. Multi-dimensional intelligent encoding of segmented address words; 4. Constructing a database index.

[0033] 1. Standard address segmentation, including: segmenting each address in the standard address database into words to obtain a set of standard segmented words for each address. The standard address database contains multiple addresses, and each set of standard segmented words includes at least one segmented word. It should be understood that the standard address database in this embodiment can be an address database in a commonly used format, for example one based on China's 11-level address standard, which refines addresses from provincial administrative divisions down to specific room numbers. For each address record in the standard address database, a word segmentation tool performs structured splitting to obtain an ordered set of address segment words. For example, the address "AA Province BB City CC District DD Street EE No. FF Unit GG Floor HH Room" can be split into the segment word set: {"AA Province", "BB City", "CC District", "DD Street", "EE No.", "FF Unit", "GG Floor", "HH Room"}. The word segmentation tool may be rule-based segmentation logic keyed on address keywords such as "province", "city", "district", "street", "road", "number", and "unit"; a dedicated address segmentation model, such as a sequence labeling model based on a Conditional Random Field (CRF) or on a Bidirectional Long Short-Term Memory network with a CRF layer (BiLSTM-CRF); or the address parsing capability provided by a Large Language Model (LLM).
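The rule-based option can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it works on the English rendering of the example address, and the keyword set `LEVEL_KEYWORDS` and the function name `segment_address` are assumptions introduced here.

```python
# Hypothetical level keywords, taken from the English rendering of the example address.
LEVEL_KEYWORDS = {"Province", "City", "District", "Street", "Road",
                  "No.", "Unit", "Floor", "Room"}


def segment_address(address: str) -> list[str]:
    """Split an address into ordered segment words, closing a segment at each level keyword."""
    segments: list[str] = []
    current: list[str] = []
    for token in address.split():
        current.append(token)
        if token in LEVEL_KEYWORDS:
            # A level keyword ends the current segment word.
            segments.append(" ".join(current))
            current = []
    if current:
        # Trailing tokens without a level keyword are kept as a final segment.
        segments.append(" ".join(current))
    return segments


print(segment_address(
    "AA Province BB City CC District DD Street EE No. FF Unit GG Floor HH Room"
))
```

A production segmenter for Chinese text would operate on characters rather than whitespace-separated tokens; the token loop is used here only because the example address is given in translation.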

[0034] Furthermore, in this embodiment of the application, the word segmentation results can also be associated with the original address records, for example, by expanding new fields in the original address table or creating an association table for storage.

[0035] 2. Constructing a deduplicated segmented address word list, including: deduplicating the segmented words in the standard segmented word sets corresponding to all addresses in the standard address database to obtain the segmented address word list corresponding to the standard address database. The segmented address word list includes all deduplicated segmented words, and the segmented words in the list are all different.

[0036] Specifically, all address segment terms generated in the previous step are deduplicated to form a unique segmented address term list with a significantly reduced size. This embodiment of the application transforms the task of processing tens of millions of address records into processing segmented term lists of millions or even hundreds of thousands, greatly improving the efficiency of subsequent searches.
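As a rough sketch of the deduplication step, the following builds a word table that maps each distinct segment word to the address records containing it. The function name `build_word_table` and the record ids are illustrative assumptions, and keeping the record ids alongside each word is one possible way to preserve the association with the original address records.

```python
def build_word_table(segmented_db: dict[int, list[str]]) -> dict[str, set[int]]:
    """Map each distinct segment word to the ids of the addresses that contain it."""
    word_to_ids: dict[str, set[int]] = {}
    for addr_id, words in segmented_db.items():
        for word in words:
            # setdefault deduplicates: each word gets exactly one entry.
            word_to_ids.setdefault(word, set()).add(addr_id)
    return word_to_ids


# Two hypothetical records sharing the same province and city.
segmented_db = {
    1: ["AA Province", "BB City", "CC District"],
    2: ["AA Province", "BB City", "DD District"],
}
word_table = build_word_table(segmented_db)
print(sorted(word_table))  # six segment words collapse to four distinct entries
```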

[0037] 3. Multi-dimensional intelligent encoding of segmented address words, including: performing extended encoding on each segmented word in the segmented address word list to obtain a segmented word group corresponding to each segmented word, wherein each segmented word group includes one segmented word and multiple extended words.

[0038] Optionally, as an example, the plurality of extended words includes one or more of the following: pinyin, initials of pinyin, glyphs, strokes, and numbers.

[0039] For example, for each word in the segmented address word list, encoded representations are generated in multiple dimensions to capture its different fuzzy features. These encodings can include but are not limited to the following. Original text code: the segmented word itself, such as "Daoli District", for exact matching.

[0040] Pinyin code: generate the toneless full pinyin of the segmented word, such as "daoliqu", for matching homophones or pinyin input errors.

[0041] Initials of pinyin code (phonetic and graphic code): generate the abbreviation formed by the pinyin initials of the segmented word, such as "dlq", for matching cases where the pronunciation is similar but the character form is different.

[0042] Graphic code: use a Chinese character decomposition tool to break a character into its components, optionally removing the radical to generate a main-structure code. For example, for a character read "Bin", the radical-free component, also read "Bin", can be generated. This code is used for matching similar-looking character errors.

[0043] Stroke code: convert a Chinese character into a stroke sequence using a font file or a stroke dataset. For example, the stroke code of the character "Wei" is "horizontal, horizontal, vertical, left-falling stroke, right-falling stroke". This code is used for a deeper comparison of character form similarity.

[0044] Numeric code: extract Arabic numerals or Chinese numerals from the segmented word. For example, extract "216" from "No. 216" and "6" from "Unit 6", for quick matching of the numeric part.

[0045] It should be understood that other extended encoding methods can also be adopted, and the embodiments of this application are not limited in this respect. For example, finer-grained encodings of pinyin initials and finals can be adopted to handle pronunciation errors caused by dialects or accents. Traditional Chinese character encoding schemes such as the four-corner number or Wubi can also be used as additional graphic codes, supplementing the graphic codes based on component decomposition and improving recognition of different types of similar-looking characters. For addresses involving multiple languages (such as mixed Chinese and English), a multilingual transliteration library (for example, one that converts English into Chinese characters or pinyin with similar pronunciation) can also be integrated and indexed as a special encoding.
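A few of the encodings listed above can be illustrated with a short sketch. It covers only the original text, pinyin, pinyin-initials, and numeric codes; glyph and stroke codes need external character data and are omitted. The table `TOY_PINYIN` is a hypothetical stand-in for a real pinyin library, the Chinese form 道里区 of "Daoli District" is supplied here for illustration, and only Arabic numerals are extracted.

```python
import re

# Toy pinyin table: a hypothetical stand-in for a real pinyin conversion library.
TOY_PINYIN = {"道": "dao", "里": "li", "区": "qu"}


def encode_word(word: str) -> dict[str, str]:
    """Build a small word group: the word itself plus several extended codes."""
    # Characters missing from the toy table are skipped.
    syllables = [TOY_PINYIN[ch] for ch in word if ch in TOY_PINYIN]
    return {
        "original": word,                                # exact matching
        "pinyin": "".join(syllables),                    # toneless full pinyin
        "initials": "".join(s[0] for s in syllables),    # first letter of each syllable
        "numeric": "".join(re.findall(r"\d+", word)),    # Arabic numerals only
    }


print(encode_word("道里区"))
print(encode_word("No. 216"))
```

Each key of the returned dictionary corresponds to one "type" of word, which is the unit for which a separate database index is later created.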

[0046] 4. Constructing a database index, including: creating a database index for words of the same type in all segmented word groups.

[0047] Optionally, as an embodiment, the creating a database index for words of the same type in all segmented word groups includes: Removing the address level identifiers in the words of the same type and then constructing the database index.

[0048] Specifically, the multiple encoding results generated in the previous step can be stored in a database, and an index can be created for each encoding to support fast retrieval. When creating the index, to improve the generalization ability of the matching, common address-level identifiers (such as "province", "city", "district", "street", "number", etc.) can be removed, preventing these high-frequency but low-information words from participating in index matching. For example, when indexing the original text code of "Daoli District", the actual index content is "Daoli". In this way, the B-Tree index or GIN index (in particular with the `gin_trgm_ops` operator class provided by the pg_trgm extension) of traditional databases such as PostgreSQL can be fully utilized to achieve efficient prefix, suffix, and inclusion matching.
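The effect of removing level identifiers before indexing can be sketched with an in-memory dictionary standing in for the database index. This is only an illustration under stated assumptions: the identifier list `LEVEL_IDENTIFIERS`, the helper names, and the use of a Python dictionary instead of a PostgreSQL index are all choices made here, not details from the application.

```python
# Hypothetical level identifiers, in the English rendering used by the examples.
LEVEL_IDENTIFIERS = ("Province", "City", "District", "Street", "No.")


def strip_level_identifier(word: str) -> str:
    """Drop level identifier tokens so that only the informative part is indexed."""
    return " ".join(t for t in word.split() if t not in LEVEL_IDENTIFIERS)


def build_index(vocabulary: list[str]) -> dict[str, set[str]]:
    """Map each stripped key to the standard segment words it came from."""
    index: dict[str, set[str]] = {}
    for word in vocabulary:
        index.setdefault(strip_level_identifier(word), set()).add(word)
    return index


index = build_index(["Daoli District", "No. 216"])
print(index)
```

A lookup for the bare user input "Daoli" then reaches the standard word "Daoli District" even though the user omitted the level identifier.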

[0049] 120. The at least one search segment term is extended and encoded to obtain the search segment term group corresponding to each search segment term.

[0050] Each search segment term group includes one search segment term and multiple search expansion terms.

[0051] The extended encoding of search segment terms here can refer to the extended encoding process in step 3 of the first stage above, which will not be repeated here.

[0052] 130. Match each word in the search segmented word group with its corresponding database index to obtain at least one candidate word.

[0053] Optionally, as another embodiment, the step of matching each word in the search segment word group with the corresponding database index to obtain at least one candidate word includes: matching each word in the search segment word group with the corresponding database index to obtain at least one initial candidate word; and selecting the at least one candidate word from the at least one initial candidate word based on the similarity between the at least one initial candidate word and the corresponding search segment.

[0054] Specifically, for each search segment term, a fast search can be performed in the segmented address word list using the multi-dimensional index built in the first stage. That is, the search segment term is itself multi-dimensionally encoded, and each encoding is matched against the corresponding index. In one implementation, as long as any encoding (original text, pinyin, pinyin initials, etc.) matches, the corresponding standard segment term is selected as a candidate term. This step yields a set of one or more candidate standard segment terms for each search segment term. For each search segment term and its recalled candidates, a text similarity algorithm (such as normalized Levenshtein distance or Jaro-Winkler distance) can be used to calculate a similarity score. Candidate terms are sorted by score, and a threshold can be set for preliminary screening, retaining candidate terms with higher similarity.
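The screening of initial candidates by similarity can be sketched as follows, using normalized Levenshtein distance as the similarity measure. The threshold value of 0.6, the function names, and the sample strings are assumptions chosen for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def select_candidates(search_word, initial_candidates, threshold=0.6):
    """Keep candidates at or above the threshold, best first."""
    scored = [(c, similarity(search_word, c)) for c in initial_candidates]
    kept = [(c, s) for c, s in scored if s >= threshold]
    return sorted(kept, key=lambda cs: cs[1], reverse=True)


# Made-up strings: an exact match, a near match, and an unrelated word.
print(select_candidates("daoli", ["xyz", "daolin", "daoli"]))
```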

[0055] 140. Combine the candidate words corresponding to the at least one search segment word to obtain the precise query word combination corresponding to the fuzzy address search text.

[0056] Optionally, as an embodiment, the at least one search segment term includes multiple search segment terms, and the step of combining the candidate terms corresponding to the at least one search segment term to obtain the precise query term combination corresponding to the fuzzy address search text includes: cross-combining the candidate terms corresponding to the multiple search segment terms to obtain multiple precise query term combinations corresponding to the fuzzy address search text.

[0057] Specifically, the candidate standard segmentation terms from all search segmentation terms can be cross-combined according to their similarity scores from high to low. Each combination forms a query condition, and then address records containing all the standard segmentation terms in that combination can be retrieved from the standard address database. This step ensures that information from all search segmentation terms participates in the final address matching.
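A minimal sketch of the cross-combination and the follow-up retrieval is shown below. Ranking combinations by the sum of their candidates' similarity scores is an assumption made here (the text only says the combination proceeds from high to low scores), and the names `cross_combine` and `retrieve` are illustrative.

```python
from itertools import product


def cross_combine(candidates_per_segment, limit=None):
    """candidates_per_segment: one list of (word, score) pairs per search segment word.

    Returns (words, total_score) tuples, highest total first.
    """
    combos = []
    for combo in product(*candidates_per_segment):
        words = tuple(word for word, _ in combo)
        total = sum(score for _, score in combo)  # assumed ranking key
        combos.append((words, total))
    combos.sort(key=lambda c: c[1], reverse=True)
    return combos[:limit] if limit else combos


def retrieve(combo_words, segmented_db):
    """Return ids of addresses whose segment words contain every word in the combination."""
    needed = set(combo_words)
    return [addr_id for addr_id, words in segmented_db.items()
            if needed.issubset(words)]


# Placeholder candidates for two search segment words.
candidates = [[("A1", 1.0), ("A2", 0.8)], [("B1", 0.9)]]
for words, total in cross_combine(candidates):
    print(words, total, retrieve(words, {1: ["A1", "B1", "C1"], 2: ["A2", "B1"]}))
```

Because the number of combinations grows as the product of the candidate counts, the optional `limit` parameter keeps only the best-scoring combinations.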

[0058] 150. Using the precise query term combination, retrieve search results corresponding to the fuzzy address search text from the standard database.

[0059] Optionally, as an embodiment, a fuzzy address search text corresponds to multiple precise query term combinations. The step of using the precise query term combinations to retrieve search results corresponding to the fuzzy address search text from a standard database includes: using the precise query term combinations to retrieve addresses corresponding to a standard segmented word set including the precise query term combinations from the standard database; and determining the search results from the multiple addresses based on the similarity between the multiple addresses corresponding to the fuzzy address search text and the fuzzy address search text.

[0060] Specifically, the system retrieves address records from the standard address database that contain all the standard segment words in the combination, and computes a comprehensive similarity score to determine how well each record matches the original fuzzy query address. In this embodiment, the comprehensive score can be a weighted combination of multiple dimensions. For example, it may include overall text similarity: the text similarity between the original fuzzy address and the full name of the candidate standard address (e.g., Jaro-Winkler similarity). It may also include overall pinyin similarity: the text similarity of the full pinyin spelling of both, to improve tolerance for homophone errors. Furthermore, it may include a segment length bonus: assigning higher scores to candidate addresses with fewer segments.

[0061] In one optional implementation, the comprehensive score S in this embodiment can be calculated as follows:

S = (w1 × name similarity + w2 × pinyin similarity) × f(segment length)

Here, w1 and w2 are configurable weight coefficients, and f(segment length) is a reward function that decreases as the segment length increases (e.g., `1 / log(segment length)`). In practical applications, this can be simplified to first sorting by the number of segments from smallest to largest, and then sorting by weighted similarity within groups with the same number of segments.
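The scoring formula can be sketched as below. Two points are assumptions rather than details from the application: the default weights 0.6 and 0.4 are arbitrary, and the reward uses `1 / log(segment length + 1)` instead of the quoted `1 / log(segment length)`, because the logarithm of 1 is zero and a single-segment address would otherwise divide by zero.

```python
import math


def comprehensive_score(name_sim, pinyin_sim, num_segments, w1=0.6, w2=0.4):
    """S = (w1 * name_sim + w2 * pinyin_sim) * f(num_segments)."""
    if num_segments < 1:
        raise ValueError("num_segments must be at least 1")
    # Assumed guard: the +1 avoids division by zero when num_segments == 1.
    bonus = 1.0 / math.log(num_segments + 1)
    return (w1 * name_sim + w2 * pinyin_sim) * bonus


def rank(candidates, top_n=10, w1=0.6, w2=0.4):
    """candidates: iterable of (address, name_sim, pinyin_sim, num_segments) tuples."""
    scored = [(addr, comprehensive_score(n, p, k, w1, w2))
              for addr, n, p, k in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Two hypothetical candidates with equal similarity but different segment counts.
print(rank([("long address", 0.9, 0.9, 8), ("short address", 0.9, 0.9, 3)]))
```

With equal similarities, the candidate with fewer segments ranks first, which is the behavior the segment length bonus is meant to produce.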

[0062] It should be understood that the similarity calculation method in the embodiments of this application can also take other forms. For example, it can be determined based on only a part of the parameters or more other parameters. In addition, the formula for similarity calculation can also take other forms. The embodiments of this application are not limited to this.

[0063] Optionally, as another embodiment, the search results include multiple addresses sorted by similarity from high to low.

[0064] For example, based on the comprehensive score calculated above, all candidate address records can be sorted in descending order, and the top-N addresses with the highest scores can be returned to the user as the final search results.

[0065] It should be understood that in steps 130 and 150 above, a hybrid similarity model can also be used for similarity calculation, and the embodiments of this application are not limited in this respect. For example, character-based algorithms (such as Levenshtein distance) can be combined with bag-of-words or TF-IDF-based algorithms to evaluate text similarity comprehensively. Optionally, machine learning can also assist scoring: for example, a lightweight machine learning model (such as logistic regression or gradient-boosted trees) can be trained with the various similarity scores (text similarity, pinyin similarity, segment length, etc.) as input features, so that the model learns the optimal weight combination and outputs the final comprehensive score in place of the weighted formula in the above example. Furthermore, in steps 130 and 150, the embodiments of this application can also adopt a two-stage recall process, that is, coarse matching followed by fine matching. For example, in step 130, the first stage uses a low-cost index (such as the pinyin initial-letter code) for coarse recall to obtain a large candidate set, and the second stage uses a more expensive similarity algorithm to fine-rank the candidate set, balancing efficiency and accuracy.
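
The two-stage recall idea can be illustrated with a short Python sketch. It is a sketch under assumptions: the vocabulary is a hypothetical toy table mapping romanized segment words to their pinyin syllables, the coarse index is keyed by the pinyin initial-letter code, and the fine stage uses a normalized Levenshtein similarity on the full pinyin. None of the names below come from the application.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical toy vocabulary: segment word -> pinyin syllables.
VOCAB: Dict[str, List[str]] = {
    "Zhongguancun": ["zhong", "guan", "cun"],
    "Zhangguocun": ["zhang", "guo", "cun"],
    "Haidian": ["hai", "dian"],
}


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance normalized to a similarity in [0, 1]."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def initials(syllables: List[str]) -> str:
    """Initial-letter code, e.g. ['hai', 'dian'] -> 'hd'."""
    return "".join(s[0] for s in syllables if s)


def build_initial_index(vocab: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Cheap coarse index: initial-letter code -> segment words."""
    index: Dict[str, List[str]] = defaultdict(list)
    for word, syllables in vocab.items():
        index[initials(syllables)].append(word)
    return dict(index)


def two_stage_match(query_syllables: List[str], vocab: Dict[str, List[str]],
                    index: Dict[str, List[str]], top_k: int = 3) -> List[str]:
    # Stage 1: coarse recall by initial-letter code (one dictionary lookup).
    coarse = index.get(initials(query_syllables), [])
    # Stage 2: fine ranking of the recalled set by full-pinyin similarity.
    query_full = "".join(query_syllables)
    ranked = sorted(coarse,
                    key=lambda w: similarity(query_full, "".join(vocab[w])),
                    reverse=True)
    return ranked[:top_k]
```

With the toy data, a query typed as `["zhong", "guan", "chun"]` recalls both words coded `zgc` in stage one, and stage two ranks `Zhongguancun` first because its full pinyin is one edit away from the query.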

[0066] The above-mentioned alternative solutions do not deviate from the core idea of this application, namely, to solve the problem of fuzzy address search through "multi-dimensional feature extraction and indexing" and "multi-level similarity evaluation", and therefore should fall within the protection scope of this invention.

[0067] Compared with the prior art, the embodiments of this application can achieve some or all of the following effects.

[0068] Efficiency and Scalability: This application's embodiments transform the matching problem of a large-scale address database into processing a significantly reduced segmented vocabulary by segmenting and deduplicating standard addresses. Furthermore, through database indexing technology, complex machine learning model inference processes are avoided, resulting in high query performance, easy horizontal scaling, and the ability to handle massive data and high-concurrency business scenarios.

[0069] High robustness and accuracy: By constructing a multi-dimensional intelligent index (original text, pinyin, initial letter, glyph, number, etc.), the embodiments of this application can capture fuzzy features of addresses from multiple dimensions, effectively dealing with various noises such as spelling errors, homophones, similar-looking characters, pinyin input, and abbreviations. Combined with the subsequent multi-level similarity scoring model, high accuracy and recall of search results are ensured.

[0070] Low cost and easy to implement: The technical solution of this embodiment is relatively ingenious and requires neither expensive commercial software nor large-scale GPU clusters for model training, so it is easy to implement at relatively low cost.

[0071] High flexibility and interpretability: Each aspect of the embodiments of this application (such as word segmentation rules, encoding methods, similarity algorithms, and weight coefficients) is designed as a "white box," allowing for flexible configuration and optimization based on specific business needs and data characteristics. The matching process and scoring logic are clear, and the results have good interpretability, facilitating debugging and continuous improvement.

[0072] This application addresses the unique challenges of address matching by fully considering the structured and hierarchical nature of address text. For example, through segmentation and a reward mechanism based on segment length, it effectively distinguishes address matching at different granularities, avoiding the limitations of traditional semantic models in address matching tasks.

[0073] Please refer to Figure 2, which shows a block diagram of an address search apparatus provided in one embodiment of this application. The device 200 shown in Figure 2 can be the execution end of the address search. It should be understood that the device 200 corresponds to the execution end in the above method embodiment and can perform the various steps involved in the above method embodiment. The specific functions of the device 200 can be found in the description above. To avoid repetition, detailed descriptions are omitted here.

[0074] The device 200 shown in Figure 2 includes at least one software function module that can be stored in a memory or embedded in the device in the form of software or firmware. The device 200 includes: a splitting unit 210, used to perform word segmentation processing on the acquired fuzzy address search text to obtain at least one search segment word corresponding to the fuzzy address search text; an expansion unit 220, used to expand and encode the at least one search segment word to obtain search segment word groups corresponding to each search segment word, each search segment word group including one search segment word and multiple search expansion words; a matching unit 230, used to match each word in the search segment word group on the corresponding database index to obtain at least one candidate word; a combination unit 240, used to combine the candidate words corresponding to the at least one search segment word to obtain a precise query word combination corresponding to the fuzzy address search text; and a detection unit 250, used to use the precise query word combination to detect the search results corresponding to the fuzzy address search text from a standard database.
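
How the five units could fit together is sketched below in Python. This is a deliberately simplified illustration under stated assumptions: addresses are romanized and whitespace-separated, the only expansion code is a hypothetical vowel-stripped "skeleton" that stands in for the pinyin, glyph, and other encodings, and the standard database is an in-memory dictionary. The function names mirror the units but are not taken from the application.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

# Hypothetical standard address database: address ID -> standard segment words.
STANDARD_ADDRESSES: Dict[str, List[str]] = {
    "A1": ["beijing", "haidian", "zhongguancun"],
    "A2": ["beijing", "chaoyang", "wangjing"],
}


def split(text: str) -> List[str]:
    """Splitting unit 210 (assumption: whitespace-separated segments)."""
    return text.lower().split()


def expand(word: str) -> Dict[str, str]:
    """Expansion unit 220: the word plus one illustrative expansion code."""
    skeleton = "".join(c for c in word if c not in "aeiou")
    return {"text": word, "skeleton": skeleton}


def build_indexes(addresses: Dict[str, List[str]]) -> Dict[str, Dict[str, Set[str]]]:
    """One index per code type, built over the deduplicated segment vocabulary."""
    indexes: Dict[str, Dict[str, Set[str]]] = {"text": {}, "skeleton": {}}
    vocabulary = {w for words in addresses.values() for w in words}
    for word in vocabulary:
        for kind, code in expand(word).items():
            indexes[kind].setdefault(code, set()).add(word)
    return indexes


def match(group: Dict[str, str], indexes: Dict[str, Dict[str, Set[str]]]) -> Set[str]:
    """Matching unit 230: look up each code in the index of its own type."""
    candidates: Set[str] = set()
    for kind, code in group.items():
        candidates |= indexes[kind].get(code, set())
    return candidates


def combine(candidate_sets: List[Set[str]]) -> List[Tuple[str, ...]]:
    """Combination unit 240: cross-combine the candidates of each segment."""
    return list(product(*[sorted(s) for s in candidate_sets]))


def detect(combinations: List[Tuple[str, ...]],
           addresses: Dict[str, List[str]]) -> List[str]:
    """Detection unit 250: addresses containing every word of some combination."""
    return sorted(aid for aid, words in addresses.items()
                  if any(set(combo) <= set(words) for combo in combinations))


def search(text: str, addresses: Dict[str, List[str]] = STANDARD_ADDRESSES) -> List[str]:
    groups = [expand(w) for w in split(text)]
    if not groups:
        return []
    indexes = build_indexes(addresses)
    candidate_sets = [match(g, indexes) for g in groups]
    return detect(combine(candidate_sets), addresses)
```

In this toy setup the misspelled query `"haidan zhongguancn"` still resolves to address `A1`, because both misspellings share a skeleton code with a standard segment word. A fuller version would add the similarity scoring step to rank the detected addresses.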

[0075] Those skilled in the art will understand that, for the sake of convenience and brevity, the specific working process of the device described above can be referred to the corresponding process in the aforementioned method, and will not be elaborated further here.

[0076] As shown in Figure 3, one embodiment of this application provides an electronic device 300, which includes a memory 310, a processor 320, and a computer program stored in the memory 310 and executable on the processor 320. When the processor 320 reads the program from the memory 310 via a bus 330 and executes it, the method shown in Figure 1 can be implemented. Optionally, the device shown in Figure 3 may also include a transceiver that can be used to send and/or receive data streams.

[0077] Processor 320 can process digital signals and may include various computing architectures. For example, it may be a complex instruction set computer architecture, a reduced instruction set computer architecture, or an architecture that implements multiple instruction set combinations. In some examples, processor 320 may be a microprocessor.

[0078] The memory 310 can be used to store instructions executed by the processor 320 or data related to the execution of instructions. These instructions and/or data may include code for implementing some or all of the functions of one or more modules described in the embodiments of this application. The processor 320 of this embodiment can be used to execute the instructions in the memory 310 to implement the above-described methods. The memory 310 includes dynamic random access memory, static random access memory, flash memory, optical memory, or other memories well known to those skilled in the art.

[0079] One embodiment of this application also provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, can implement the methods described in the above embodiments.

[0080] An embodiment of this application also provides a computer program product, which includes a computer program, wherein the computer program, when executed by a processor, can implement the methods provided in the above embodiments.

[0081] It should be noted that the processor in the embodiments of the present invention (e.g., the processor in Figure 3) can be an integrated circuit chip with signal processing capabilities. In implementation, each step of the above method embodiments can be completed by integrated logic circuits in the processor's hardware or by software instructions. The processor can be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this invention. The general-purpose processor can be a microprocessor or any conventional processor. A software module can reside in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other storage media well established in the art. The storage medium is located in the memory, and the processor reads information from the memory and, in conjunction with its hardware, completes the steps of the above method.

[0082] It can be understood that the memory in the embodiments of the present invention (e.g., the memory in Figure 3) can be volatile or non-volatile, or it can include both. Non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. Volatile memory can be random access memory (RAM), which is used as an external cache. By way of example, but not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct rambus RAM (DR RAM). It should be noted that the memory used in the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.

[0083] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this invention.

[0084] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the systems, devices, and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.

[0085] In the embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, or they may be electrical, mechanical, or other forms of connection.

[0086] In summary, the above description is merely a preferred embodiment of the technical solution of the present invention and is not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.

Claims

1. A method for address search, characterized in that it comprises: performing word segmentation on an acquired fuzzy address search text to obtain at least one search segment word corresponding to the fuzzy address search text; extending and encoding the at least one search segment word to obtain a search segment word group corresponding to each search segment word, wherein each search segment word group includes one search segment word and multiple search extended words; matching each word in the search segment word group against its corresponding database index to obtain at least one candidate word; combining the candidate words corresponding to the at least one search segment word to obtain a precise query word combination corresponding to the fuzzy address search text; and using the precise query word combination to retrieve search results corresponding to the fuzzy address search text from a standard database.

2. The method according to claim 1, characterized in that the method further comprises: performing word segmentation on each address in a standard address database to obtain a standard segment word set corresponding to each address, wherein the standard address database includes multiple addresses and each standard segment word set includes at least one segment word; deduplicating the segment words in the standard segment word sets corresponding to all addresses in the standard address database to obtain a segment address vocabulary corresponding to the standard address database, wherein the segment address vocabulary includes all deduplicated segment words and the segment words in the segment address vocabulary are all different; extending and encoding each segment word in the segment address vocabulary to obtain a segment word group corresponding to each segment word, wherein each segment word group includes one segment word and multiple extended words; and creating a database index for words of the same type in all segment word groups.

3. The method according to claim 2, characterized in that the extended words include one or more of the following: pinyin, initials of pinyin, glyphs, strokes, and numbers.

4. The method according to claim 2, characterized in that the step of creating a database index for words of the same type in all segment word groups includes: constructing the database index after removing address hierarchy identifiers from the words of the same type.

5. The method according to any one of claims 1 to 4, characterized in that the step of matching each word in the search segment word group against its corresponding database index to obtain at least one candidate word includes: matching each word in the search segment word group against its corresponding database index to obtain at least one initial candidate word; and selecting the at least one candidate word from the at least one initial candidate word based on the similarity between the at least one initial candidate word and the corresponding search segment word.

6. The method according to any one of claims 1 to 4, characterized in that the at least one search segment word includes multiple search segment words, and the step of combining the candidate words corresponding to the at least one search segment word to obtain the precise query word combination corresponding to the fuzzy address search text includes: cross-combining the candidate words corresponding to the multiple search segment words to obtain multiple precise query word combinations corresponding to the fuzzy address search text.

7. The method according to any one of claims 1 to 4, characterized in that one fuzzy address search text corresponds to multiple precise query word combinations, and the step of using the precise query word combination to retrieve search results corresponding to the fuzzy address search text from a standard database includes: using the precise query word combinations to retrieve, from the standard database, the addresses whose standard segment word sets include a precise query word combination; and determining the search results from the multiple addresses corresponding to the fuzzy address search text based on the similarity between the fuzzy address search text and each of those addresses.

8. The method according to claim 7, characterized in that the search results include multiple addresses sorted by similarity from high to low.

9. An address search device, characterized in that it comprises: a splitting unit, used to perform word segmentation on an acquired fuzzy address search text to obtain at least one search segment word corresponding to the fuzzy address search text; an extension unit, used to extend and encode the at least one search segment word to obtain a search segment word group corresponding to each search segment word, wherein each search segment word group includes one search segment word and multiple search extended words; a matching unit, used to match each word in the search segment word group against the corresponding database index to obtain at least one candidate word; a combination unit, used to combine the candidate words corresponding to the at least one search segment word to obtain a precise query word combination corresponding to the fuzzy address search text; and a detection unit, used to retrieve search results corresponding to the fuzzy address search text from a standard database using the precise query word combination.

10. An electronic device, characterized in that it comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, performs the method according to any one of claims 1 to 8.