A method, apparatus and device for matching address information
By training the target model and then adjusting its model structure, the method addresses the difficulty that traditional address matching methods have in judging the consistency of rich and diverse address information, and achieves more accurate semantic matching of address information.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ALIPAY (HANGZHOU) INFORMATION TECH CO LTD
- Filing Date
- 2022-08-18
- Publication Date
- 2026-06-16
Figure CN115345174B_ABST
Description
Technical Field
[0001] This document relates to the field of computer technology, and in particular to a method, apparatus and device for matching address information.
Background Technology
[0002] Semantic matching of text information is an application with significant practical implications, especially address matching. Address matching is a crucial task used to verify the authenticity and consistency of address information in various scenarios. Address matching is a highly challenging task because specific address information contains rich expressive forms. In practical applications, address matching mechanisms are needed to correlate and match different forms of address information. These mechanisms can standardize addresses of different input formats, thereby resolving many business problems related to addresses.
[0003] Address matching mechanisms have been studied for a long time. Traditional address matching methods are mainly based on certain rules and approximate string matching. For example, address element-based methods can be used to match addresses. That is, the address information is first parsed into different address elements, and then, based on whether the corresponding address elements are consistent, a set threshold is used to determine whether two pieces of address information are consistent. However, the above methods can only handle addresses with very standardized formats. In most actual business scenarios, address formats are rich and diverse, and it is difficult to judge the consistency of address information directly from literal differences in the text. Therefore, a technical solution is needed that can accurately and effectively perform semantic matching on rich and diverse address information.
Summary of the Invention
[0004] The purpose of the embodiments in this specification is to provide a technical solution that can accurately and effectively perform semantic matching on a wide variety of address information.
[0005] To achieve the above technical solution, the embodiments in this specification are implemented as follows:
[0006] This specification provides an address information matching method, comprising: acquiring a target model, the target model being a model used for semantic matching of text information; training the target model based on a preset training strategy and first sample data corresponding to the preset training strategy to obtain a trained first model, the preset training strategy including one or more of a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances; training the first model based on a preset model structure adjustment strategy and second sample data corresponding to the preset model structure adjustment strategy to obtain a trained second model, the second model being a model used for semantic matching of address information; and performing semantic matching processing on two acquired pieces of address information based on the second model to determine whether the two pieces of address information are the same, thereby obtaining a matching result for the two pieces of address information.
[0007] This specification provides an address information matching apparatus, comprising: an initial model acquisition module for acquiring a target model, the target model being a model used for semantic matching of text information; a first training module for training the target model based on a preset training strategy and first sample data corresponding to the preset training strategy to obtain a trained first model, the preset training strategy including one or more of a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances; a second training module for training the first model based on a preset model structure adjustment strategy and second sample data corresponding to the preset model structure adjustment strategy to obtain a trained second model, the second model being a model used for semantic matching of address information; and an address information matching module for performing semantic matching processing on two acquired pieces of address information based on the second model to determine whether the two pieces of address information are the same, thereby obtaining a matching result for the two pieces of address information.
[0008] This specification provides an address information matching device, comprising: a processor; and a memory configured to store computer-executable instructions which, when executed, cause the processor to: acquire a target model, the target model being a model used for semantic matching of text information; train the target model based on a preset training strategy and first sample data corresponding to the preset training strategy to obtain a trained first model, the preset training strategy including one or more of a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances; train the first model based on a preset model structure adjustment strategy and second sample data corresponding to the preset model structure adjustment strategy to obtain a trained second model, the second model being a model used for semantic matching of address information; and perform semantic matching processing on two acquired pieces of address information based on the second model to determine whether the two pieces of address information are the same, thereby obtaining a matching result for the two pieces of address information.
[0009] This specification also provides a storage medium for storing computer-executable instructions. When executed by a processor, these instructions implement the following process: acquiring a target model, the target model being a model used for semantic matching of text information; training the target model based on a preset training strategy and first sample data corresponding to the preset training strategy to obtain a trained first model, the preset training strategy including one or more of a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances; training the first model based on a preset model structure adjustment strategy and second sample data corresponding to the preset model structure adjustment strategy to obtain a trained second model, the second model being a model used for semantic matching of address information; and performing semantic matching processing on two acquired pieces of address information based on the second model to determine whether the two pieces of address information are the same, thereby obtaining a matching result for the two pieces of address information.
Attached Figure Description
[0010] To more clearly illustrate the technical solutions in the embodiments or prior art of this specification, the drawings used in the description of the embodiments or prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments recorded in this specification. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0011] Figure 1 shows an embodiment of an address information matching method described in this specification;
[0012] Figure 2 shows another embodiment of the address information matching method described in this specification;
[0013] Figure 3 is a schematic diagram of a sentence feature determination process described in this specification;
[0014] Figure 4 is a schematic diagram of an address information fine-tuning process described in this specification;
[0015] Figure 5 shows an embodiment of an address information matching apparatus described in this specification;
[0016] Figure 6 shows an embodiment of an address information matching device described in this specification.
Detailed Implementation
[0017] This specification provides an embodiment of an address information matching method, apparatus, and device.
[0018] To enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this specification, and not all embodiments. Based on the embodiments in this specification, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of this specification.
[0019] Embodiment 1
[0020] As shown in Figure 1, an embodiment of this specification provides an address information matching method. The execution subject of this method can be a terminal device or a server. The terminal device can be a mobile phone, a tablet computer, a computer such as a laptop or desktop computer, or an IoT device (for example, a smartwatch or an in-vehicle device). The server can be a single server or a server cluster composed of multiple servers. The server can be a backend server for financial services or online shopping services, or a backend server for an application. This embodiment uses a server as an example for detailed description. For the execution process of the terminal device, please refer to the relevant content below, which will not be repeated here. The method specifically includes the following steps:
[0021] In step S102, a target model is obtained, which is a model used to match the semantics of text information.
[0022] The target model can be a model used to match the semantics of text information, that is, a model that can determine whether two pieces of text information express the same semantics. The text information can be information presented in a preset language and characters, for example information composed of Chinese characters, digits, and specified symbols (such as “#”, “-”, “+”, etc.). This can be set according to the actual situation, and the embodiments in this specification do not limit this.
[0023] In practice, semantic matching of text information is an application of significant practical importance, especially address matching. Address matching is a crucial task used to verify the authenticity and consistency of address information in various scenarios, such as delivery addresses and business addresses. Address matching is a highly challenging task because specific address information contains rich expressive forms. For example, Chinese address information in China typically consists of elements such as province, city, district, county, street, name, house number, alias, and remarks. Even when using the same address, different users may express it differently. In practical applications, to correlate and match address information in different forms, an address matching mechanism is needed. This mechanism can standardize addresses with different input formats, thereby resolving many business problems related to address issues.
[0024] Address matching mechanisms have been studied for a long time. Traditional address matching methods are mainly based on certain rules and approximate string matching. For example, a distance between two address strings can be calculated to determine whether they are similar. Some researchers use address element-based methods to match addresses. That is, the address information is first parsed into different address elements, and then, based on whether the corresponding address elements are consistent, a set threshold is used to determine whether two pieces of address information are consistent. However, the above methods can only handle addresses with very standardized formats. In most actual business scenarios, address formats are rich and diverse, making it difficult to determine the consistency of address information directly from literal differences in the text. Therefore, a technical solution is needed that can accurately and effectively perform semantic matching on rich and diverse address information. The embodiments of this specification provide a feasible technical solution, which may include the following:
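As an illustration of the traditional approach criticized above, the following is a minimal sketch of approximate string matching with a set threshold. The choice of Levenshtein edit distance, the normalization, and the 0.8 threshold are assumptions made for illustration; the specification does not name a particular distance.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca with cb
        prev = curr
    return prev[-1]


def is_similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Hypothetical rule: normalized similarity at or above a set threshold."""
    longest = max(len(a), len(b))
    if longest == 0:
        return True
    return 1 - edit_distance(a, b) / longest >= threshold
```

A rule of this kind tolerates small spelling variations but treats the same address written in a different element order as dissimilar, which is the weakness that motivates semantic matching.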
[0025] Target models for semantic matching of text information can be obtained in various ways. For example, an algorithm for semantic matching of text information can be selected, and a target model can be constructed using this algorithm. In this case, the target model can be an untrained model. Alternatively, a publicly available or open target model for semantic matching of text information can be obtained, where the target model has already been trained. In this case, the target model (such as the BERT model) can be directly applied to the corresponding scenario or business. The target model can be obtained by training the model using publicly available sample data (such as specified data that can be searched online, or data from a specified publicly available database). Yet another example is obtaining a publicly available or open target model for semantic matching of text information, where the target model has not been trained. In this case, the target model includes a certain model structure and initialized model parameters, which can be set according to the actual situation. This specification does not limit this in the embodiments.
[0026] In step S104, the target model is trained based on the preset training strategy and the first sample data corresponding to the preset training strategy to obtain the trained first model. The preset training strategy includes one or more of the following: a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances.
[0027] The preset training strategy can be a pre-defined strategy for training the target model so that the trained model possesses a certain function. In this embodiment, the preset training strategy can be a strategy that transfers the target model, which is capable of semantic matching of text information, into a model capable of matching address information; that is, it transfers the target model from a general text semantic space to an address semantic space. The fake address identification sub-strategy can be a strategy that pre-sets a certain amount of fake address information and trains the target model with it, so that the target model can distinguish real address information from fake address information. The sub-strategy for predicting preset geographical regions in addresses can be a strategy that pre-sets or selects a certain amount of address information, masks or removes one or more pieces of geographical region information contained in a suitable portion of that address information, and then trains the target model with the processed address information together with the remaining unprocessed address information, so that the target model can complete or restore missing geographical region information in address information. The address distance prediction sub-strategy can be a strategy that pre-sets or selects a certain number of address information pairs and trains the target model with these pairs, so that the target model can determine the distance between the two pieces of address information in a pair. The first sample data can be of various types: it can be a certain amount of fake address information pre-set according to the fake address identification sub-strategy; or the processed address information and the remaining unprocessed address information obtained from the pre-set or selected address information under the sub-strategy for predicting preset geographical regions in addresses; or a certain number of pre-set or selected address information pairs corresponding to the address distance prediction sub-strategy. The specific data can be set according to the actual situation, and the embodiments of this specification do not limit this.
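The masking step of the sub-strategy for predicting preset geographical regions can be sketched as follows. This is only an illustration under stated assumptions: the address is assumed to be already split into elements, the caller is assumed to know which positions hold geographical region elements, and the `[MASK]` token and the function name `mask_region` are hypothetical.

```python
import random

MASK = "[MASK]"  # hypothetical mask token, not named in the specification


def mask_region(elements, region_indices, rng):
    """Mask one randomly chosen region element.

    Returns the masked element list and the original value that a model
    would be trained to restore. The input list is left unchanged.
    """
    idx = rng.choice(region_indices)
    masked = list(elements)
    target = masked[idx]
    masked[idx] = MASK
    return masked, target


# Example address split into elements (largest region first); positions 0-2
# are assumed to be the geographical region elements.
elements = ["Jiangsu Province", "Nanjing City", "Gulou District",
            "Hubei Road", "Wuyue Plaza"]
masked, target = mask_region(elements, [0, 1, 2], random.Random(0))
```

The pair `(masked, target)` then plays the role of one processed training sample, while unmasked addresses are kept as the unprocessed remainder.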
[0028] In implementation, considering that the acquired target model may not accurately represent the data characteristics of short text information, especially address information, a training task related to address information can be set for the target model, so that the trained model generates a better semantic representation of address information. To this end, one or more model training tasks can be designed, i.e., a training strategy can be preset. This strategy can include one or more sub-strategies, such as a fake address identification sub-strategy, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances. Taking the fake address identification sub-strategy as an example, a certain amount of fake address information can be obtained as first sample data and input into the target model. Based on the fake address identification sub-strategy and the target model, it can be determined whether each piece of first sample data is real or fake address information. The model parameters of the target model can then be adjusted based on the judgment result, ultimately obtaining the trained model, i.e., the first model. The training strategy can instead include one of the other sub-strategies, any two of the three sub-strategies mentioned above, or all three sub-strategies. For the specific processing procedures, please refer to the meaning of each sub-strategy and the processing in the example above; they will not be repeated here. Through the above method, the trained model can generate a better semantic representation of address information.
[0029] In step S106, the first model is trained based on the preset model structure adjustment strategy and the second sample data corresponding to the preset model structure adjustment strategy to obtain the trained second model. The second model is used to match the semantics of address information.
[0030] The preset model structure adjustment strategy can be to select a certain number of address information and adjust the model structure during the model training process using this address information, so that the final model can better learn the interaction process of semantic information of different granularities between the two address information. The second sample data can be address information, specifically a certain number of address information selected in the preset model structure adjustment strategy. The specific details can be set according to the actual situation, and this embodiment does not limit this.
[0031] In implementation, after the processing in step S104, the first model already possesses a certain address information matching capability. Compared with the original target model, the model parameters of the first model have been adjusted to a certain extent, that is, the model parameters have changed. A model, however, includes not only model parameters but also a model structure, and the model structure of the first model is almost unchanged from that of the target model. For this reason, on top of the processing in step S104, the model structure of the first model (or the target model) can be appropriately adjusted. Specifically, a model structure adjustment strategy can be preset, and based on it a certain amount of address information can be obtained as second sample data. The second sample data can be input into the first model. Considering that the address information corresponding to the second sample data carries both geographical region information (or administrative region information) and detailed address information, and in order to prevent forgetting in the deeply nested Transformer structure of the first model, the character features and word features of the address information corresponding to the second sample data can be explicitly fused, through a gating mechanism, with the sentence features obtained after deep interaction encoding. The fused result is then connected to the classification layer of the first model, so that the first model is trained and its model structure adjusted, finally yielding the trained second model. The sentence features after deep interaction encoding can be obtained as follows: the character features and word features of the address information are processed by a multi-head attention mechanism, and the result is fused and normalized to obtain a processed result; the processed result is then forward propagated and again fused and normalized to obtain the sentence features.
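The gating mechanism is not specified in detail here. One common form, shown below purely as an assumed sketch, computes a per-dimension gate from the concatenated features and uses it to interpolate between the shallow (character or word) feature vector and the deeply encoded sentence feature vector. The gate formula, the parameter names `weights` and `bias`, and the function name `gated_fusion` are all hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(token_feat, sentence_feat, weights, bias):
    """Assumed gate: g = sigmoid(W [t; s] + b), fused = g * t + (1 - g) * s.

    token_feat and sentence_feat are equal-length lists of floats;
    weights has one row per output dimension, each row as long as the
    concatenation [t; s]; bias has one entry per output dimension.
    """
    concat = list(token_feat) + list(sentence_feat)
    fused = []
    for i, (t, s) in enumerate(zip(token_feat, sentence_feat)):
        z = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        g = sigmoid(z)
        # g near 1 keeps the shallow feature, g near 0 keeps the deep one.
        fused.append(g * t + (1.0 - g) * s)
    return fused
```

In a trained model the gate parameters would be learned together with the classification layer, so the network can decide per dimension how much of the shallow character and word information to carry past the deep encoder.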
[0032] It should be noted that the above processing method is only one optional processing method. In practical applications, there may be many other different processing methods. For example, by using multiple sub-tasks to jointly constrain each other, it is possible to effectively capture the spatial semantic relationship between address information, learn the administrative level elements of address information and the subordinate relationship between administrative divisions without relying on external information, thereby semantically representing the single characters and their contextual information in address information, and maintaining the true relationship between two address information in high-dimensional space. The specific settings can be set according to the actual situation, and the embodiments in this specification do not limit this.
[0033] In step S108, the semantics of the two acquired address information are matched based on the second model to determine whether the two address information are the same, and the matching result of the two address information is obtained.
[0034] In implementation, the model parameters and structure of the second model have been adjusted to some extent through the above method, enabling the second model to be used for semantic matching of address information, so the second model can be deployed in relevant business applications. When a matching request for two pieces of address information is received, the two pieces of address information can be obtained and input into the second model. The second model performs semantic matching on the two pieces of address information to determine whether they are the same. If they are the same, a matching result indicating that the two match is output; if they are different, a matching result indicating that the two do not match is output. The specific settings can be configured according to actual circumstances, and this embodiment does not limit this.
[0035] This specification provides an embodiment of an address information matching method. The method acquires a target model, which is a model used for semantic matching of text information. The target model is then trained based on a preset training strategy and corresponding first sample data to obtain a trained first model, the preset training strategy including one or more of a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances. The first model is then trained based on a preset model structure adjustment strategy and corresponding second sample data to obtain a trained second model, which is a model used for semantic matching of address information. Finally, semantic matching is performed on two acquired pieces of address information based on the second model to determine whether they are the same, thus obtaining a matching result. In this way, a targeted processing method for the target model is designed based on the geographical and semantic characteristics of the strings in address information. Pre-training is performed on a large amount of unsupervised address information sample data, enabling the target model to acquire relevant address knowledge to a certain extent. Furthermore, the model structure is adjusted according to the special element composition of the strings in address information, allowing the final model to better learn the interaction of semantic information at different granularities between two pieces of address information. The model thus learns in finer detail how to determine whether two pieces of address information match, which improves the accuracy of the second model's semantic matching of address information.
[0036] Embodiment 2
[0037] As shown in Figure 2, an embodiment of this specification provides an address information matching method. The execution subject of this method can be a terminal device or a server. The terminal device can be a mobile phone, a tablet computer, a computer such as a laptop or desktop computer, or an IoT device (for example, a smartwatch or an in-vehicle device). The server can be a single server or a server cluster composed of multiple servers. The server can be a backend server for financial services or online shopping services, or a backend server for an application. This embodiment uses a server as an example for detailed description. For the execution process of the terminal device, please refer to the relevant content below, which will not be repeated here. The method specifically includes the following steps:
[0038] In step S202, a target model trained based on a preset text corpus is obtained. The preset text corpus includes a corpus composed of texts from online encyclopedias presented in a preset language. The target model is a model used to match the semantics of text information.
[0039] The target model is the BERT model, which is a Transformer-based encoder. The main model structure is a stack of Transformers. The encoder of each Transformer layer in the BERT model obtains a corresponding number of latent vectors and passes them to the next Transformer layer. This process is repeated layer by layer until the final output result is obtained.
[0040] In implementation, the model structure of the BERT model used for semantic matching of text information can be selected, and the BERT model can be constructed using corresponding algorithms. The BERT model can be trained using a pre-defined text corpus composed of text from an online encyclopedia presented in a pre-defined language, resulting in the trained BERT model, i.e., the target model. Alternatively, the trained target model can be directly obtained. The target model can be obtained by training the model using a pre-defined text corpus composed of text from an online encyclopedia presented in a pre-defined language, etc. The specific details can be set according to the actual situation, and this specification does not limit this aspect in the embodiments.
[0041] Based on the language model knowledge of the general preset language learned by the target model, deep pre-training can be performed on the address information dataset so that the target model can learn domain knowledge related to address information. For details, please refer to the processing in step S204 below.
[0042] In step S204, the target model is trained based on a preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model. The preset training strategy includes one or more of the following: a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances.
[0043] The specific processing of step S204 above can be found in the relevant content of Embodiment 1 above, and will not be repeated here.
[0044] In practical applications, the specific processing of step S204 above can also be implemented in a variety of different ways. The following provides three types of specific processing methods, which can be found in Type 1 to Type 3 below.
[0045] Type 1: If the preset training strategy includes a sub-strategy for identifying fake addresses, then the specific processing of step S204 above can be found in the processing of steps A2 to A6 below.
[0046] In step A2, multiple different pieces of first pre-selected address information are obtained, and a preset number of pieces are selected from them.
[0047] In practice, the first pre-selected address information can be obtained in various ways. For example, address information can be collected by purchasing it from users and used as the first pre-selected address information. Alternatively, address information can be crawled from the Internet by web crawlers and used as the first pre-selected address information. The specific method can be set according to the actual situation, and the embodiments in this specification do not limit this.
[0048] A preset number of first pre-selected address information can be selected from multiple different first pre-selected address information according to a preset selection rule, or a preset number of first pre-selected address information can be selected randomly from multiple different first pre-selected address information, or a preset number of first pre-selected address information can be selected from multiple different first pre-selected address information based on a preset selection probability, etc. The preset number can be set according to the actual situation. For example, the preset number can be 50% of the total number of multiple different first pre-selected address information, or the preset number can be 60% of the total number of multiple different first pre-selected address information, etc. The specific setting can be set according to the actual situation, and the embodiments in this specification do not limit it.
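The random-selection variant described above can be sketched as follows, assuming the preset number is given as a fraction of the total (50% in the example). The function name `select_preset_number` and its parameters are hypothetical.

```python
import random


def select_preset_number(addresses, fraction=0.5, seed=None):
    """Randomly pick a preset share of the addresses.

    Returns (selected, remaining); both lists keep the original order.
    """
    rng = random.Random(seed)
    k = int(len(addresses) * fraction)
    chosen = set(rng.sample(range(len(addresses)), k))
    selected = [a for i, a in enumerate(addresses) if i in chosen]
    remaining = [a for i, a in enumerate(addresses) if i not in chosen]
    return selected, remaining
```

The selected addresses would go on to the region-replacement step, while the remaining ones stay untouched and later serve as real samples.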
[0049] In step A4, the first preset geographical region information is removed from each of the preset number of pieces of first pre-selected address information, and the remaining part is concatenated with second preset geographical region information to obtain a preset number of pieces of second pre-selected address information.
[0050] The first preset geographical region information can be information about administrative regions. Administrative regions can be areas divided into levels by a country or region for the convenience of administration, such as provinces, cities, counties, and towns. The second preset geographical region information can also be information about administrative regions.
[0051] In implementation, the first preset geographical region information can be removed from the preset number of pieces of first pre-selected address information to obtain the same number of pieces of address information lacking administrative region information (i.e., the remaining parts). Each remaining part can then be concatenated with the second preset geographical region information to generate address information that is complete in form, namely the preset number of pieces of second pre-selected address information.
[0052] It should be noted that the second preset geographical region information can come from address information different from the first pre-selected address information, or it can be pre-set administrative region information, etc.; this can be set according to the actual situation. For example, 50% of the total can be selected from the multiple different pieces of first pre-selected address information, and two pieces of first pre-selected address information can then be taken from that 50%. For instance, the two pieces may be "Wuyue Plaza opposite Shiziqiao Pedestrian Street, Hubei Road, Gulou District, Nanjing City, Jiangsu Province" and "Hami Hotel next to the Wetland Park, Yingbin Road, Yizhou District, Hami City, Xinjiang Uygur Autonomous Region". The first preset geographical region information is removed from each of the two pieces. The remaining part of the second piece is then concatenated with the first preset geographical region information of the first piece to generate one new piece of address information, and the remaining part of the first piece is concatenated with the first preset geographical region information of the second piece to generate another new piece of address information. Specifically, removing the first preset geographical region information "Hubei Road, Gulou District, Nanjing City, Jiangsu Province" from the first piece leaves "Wuyue Plaza opposite Shiziqiao Pedestrian Street", and removing "Yingbin Road, Yizhou District, Hami City, Xinjiang Uygur Autonomous Region" from the second piece leaves "Hami Hotel next to the Wetland Park". Concatenating "Hubei Road, Gulou District, Nanjing City, Jiangsu Province" with "Hami Hotel next to the Wetland Park" generates "Hami Hotel next to the Wetland Park, Hubei Road, Gulou District, Nanjing City, Jiangsu Province". Similarly, concatenating "Yingbin Road, Yizhou District, Hami City, Xinjiang Uygur Autonomous Region" with "Wuyue Plaza opposite Shiziqiao Pedestrian Street" generates "Wuyue Plaza opposite Shiziqiao Pedestrian Street, Yingbin Road, Yizhou District, Hami City, Xinjiang Uygur Autonomous Region". This process can be repeated to obtain the preset number of pieces of second pre-selected address information.
[0053] In step A6, the preset number of second pre-selected address information, together with the first pre-selected address information that remains among the multiple different first pre-selected address information after excluding the preset number selected above, is used as the first sample data. The first sample data is used to train the target model so that the target model predicts whether the address information corresponding to the first sample data is real, and the trained first model is obtained.
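The sample construction of steps A2 to A6 can be sketched as follows. This is only an illustrative sketch under stated assumptions: each address is assumed to be already split into a remaining part and an administrative region, the function name `build_fake_address_samples`, the 50% ratio default, the label convention (1 = real, 0 = fake) and the two "Example"/"Sample" addresses are hypothetical, and regions are exchanged by rotating them among the selected addresses, which is one possible way to pick the second preset geographical region information.

```python
import random
from typing import List, Tuple

# Hypothetical representation: (remaining part, administrative region).
# How an address is split into these two parts is not shown here.
Address = Tuple[str, str]
Sample = Tuple[str, int]  # (address text, label): 1 = real, 0 = fake


def build_fake_address_samples(
    addresses: List[Address], fake_ratio: float = 0.5, seed: int = 0
) -> List[Sample]:
    """Turn a share of the addresses into fakes by exchanging regions."""
    rng = random.Random(seed)
    indices = list(range(len(addresses)))
    rng.shuffle(indices)
    n_fake = int(len(addresses) * fake_ratio)
    chosen, kept = indices[:n_fake], indices[n_fake:]

    samples: List[Sample] = []
    # Rotate regions among the chosen addresses: each remaining part is
    # concatenated with the region of the next chosen address.
    for pos, idx in enumerate(chosen):
        detail, region = addresses[idx]
        _, other_region = addresses[chosen[(pos + 1) % len(chosen)]]
        # If the borrowed region happens to be identical, the address is still real.
        label = 0 if other_region != region else 1
        samples.append((f"{detail}, {other_region}", label))
    # Addresses that were not chosen are kept unchanged as real samples.
    for idx in kept:
        detail, region = addresses[idx]
        samples.append((f"{detail}, {region}", 1))
    return samples


addresses = [
    ("Wuyue Plaza opposite Shiziqiao Pedestrian Street",
     "Hubei Road, Gulou District, Nanjing City, Jiangsu Province"),
    ("Hami Hotel next to the Wetland Park",
     "Yingbin Road, Yizhou District, Hami City, Xinjiang Uygur Autonomous Region"),
    # The two entries below are made-up placeholders for illustration only.
    ("Building A", "Example Road, Example District, Example City, Example Province"),
    ("Building B", "Sample Road, Sample District, Sample City, Sample Province"),
]
samples = build_fake_address_samples(addresses)
for text, label in samples:
    print(label, text)
```

The resulting (text, label) pairs correspond to the first sample data for a binary real-versus-fake classification task.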
[0054] Type 2: The preset training strategy includes a sub-strategy for predicting the preset geographical area in the address. The specific processing of step S204 above can be found in the processing of steps B2 and B4 below.
[0055] In step B2, multiple different third pre-selected address information are obtained.
[0056] The third pre-selected address information may be entirely different from the second pre-selected address information, may partially overlap with it, or may be identical to it. The specific settings can be determined according to the actual situation, and the embodiments in this specification are not limited in this regard.
[0057] In practice, third pre-selected address information can be obtained in various ways. For example, address information can be collected by purchasing it from users and used as the third pre-selected address information. Alternatively, address information can be crawled from the Internet by web crawlers and used as the third pre-selected address information. The specific method can be set according to the actual situation, and the embodiments in this specification do not limit this.
[0058] In step B4, the preset geographical area information is removed from multiple different third pre-selected address information, and the remaining part is used as the first sample data. The first sample data is used to train the target model so that the target model can predict the preset geographical area information removed from the third pre-selected address information, and the trained first model is obtained.
[0059] The preset geographic region information removed from the third pre-selected address information can be one administrative region or multiple administrative regions. For example, the province-level geographic region information can be removed from the third pre-selected address information, such as removing "Jiangsu Province" from the address information "Wuyue Plaza opposite Shiziqiao Pedestrian Street, Hubei Road, Gulou District, Nanjing City, Jiangsu Province"; alternatively, "Nanjing City" can be removed, and so on. This can be set according to the actual situation. The preset geographic region information can be information of administrative regions.
[0060] In practice, one or more pieces of geographic region information can be randomly masked in the multiple different third pre-selected address information. The target model is then used to predict the masked geographic region information based on the remaining part, thereby restoring the third pre-selected address information. In this way, the target model can learn the hierarchical relationship of administrative regions and, to a certain extent, the ability to locate different address information within administrative regions.
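The random masking described above can be sketched as follows. This is a minimal sketch, assuming the address has already been split into components and that the positions of the administrative regions are known; the `[MASK]` placeholder and the function name `mask_region` are assumptions for illustration and are not specified in this document.

```python
import random
from typing import List, Tuple

MASK = "[MASK]"  # placeholder token; an assumption borrowed from BERT-style masking


def mask_region(
    components: List[str], region_positions: List[int], rng: random.Random
) -> Tuple[str, str]:
    """Hide one administrative-region component and return (model input, label)."""
    target = rng.choice(region_positions)
    masked = [MASK if i == target else part for i, part in enumerate(components)]
    return ", ".join(masked), components[target]


components = [
    "Wuyue Plaza opposite Shiziqiao Pedestrian Street",
    "Hubei Road",
    "Gulou District",
    "Nanjing City",
    "Jiangsu Province",
]
# Positions of the district, city and province components are assumed known.
masked_text, label = mask_region(components, [2, 3, 4], random.Random(0))
print(masked_text, "->", label)
```

The masked text serves as the model input and the removed region as the prediction target; masking several regions at once would follow the same pattern.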
[0061] Type 3: If the preset training strategy includes a sub-strategy for address distance prediction, then the specific processing of step S204 above can be found in the processing of steps C2 and C4 below.
[0062] In step C2, multiple different pre-selected address information pairs are obtained, and the distance between the two pre-selected address information in each pair is obtained.
[0063] In implementation, considering that the main task of the target model is to determine whether two address information match, the pre-training task can also be designed to take two address information as input so that it shares a common model structure with the main task. Multiple different pre-selected address information pairs are therefore obtained. For each pair, the relative distance between the two address information can be calculated from their respective latitude and longitude information.
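One common way to compute a distance from latitude and longitude is the haversine great-circle formula. The sketch below assumes that formula and a spherical Earth with a mean radius of 6,371 km; this document does not state which distance formula is used, and the function name `haversine_m` is hypothetical.

```python
import math


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two (latitude, longitude) points in degrees."""
    radius_m = 6_371_000.0  # mean Earth radius (spherical approximation)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * radius_m * math.asin(math.sqrt(a))


# One degree of longitude along the equator, as a sanity check.
one_degree_m = haversine_m(0.0, 0.0, 0.0, 1.0)
print(round(one_degree_m))
```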
[0064] In step C4, the multiple different pre-selected address information pairs are input into the target model. The target model predicts the distance interval between the two pre-selected address information in each pair, and the distance between the two pre-selected address information in each pair is used as the training label to train the target model and obtain the trained first model.
[0065] The distance intervals can be set according to the actual situation, such as within 100 meters, 100 meters to 1 kilometer, 1 kilometer to 5 kilometers, 5 kilometers to 10 kilometers, and more than 10 kilometers. The above is only one optional division; in actual application, many different divisions are possible and can be set according to the actual situation.
[0066] In practice, if the target model were trained directly on a regression task to predict the relative distance value mentioned above, the task could be quite difficult for the target model. The task can therefore be simplified by binning the distances into multiple different distance intervals. The multiple different pre-selected address information pairs are then input into the target model, which predicts the distance interval between the two pre-selected address information in each pair. The distance between the two pre-selected address information in each pair is used as the training label to train the target model and obtain the trained first model. In this way, the prediction target of the target model is transformed into a relatively easy classification task.
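The binning of distances into interval classes can be sketched as follows, using the example intervals listed above. The names `BOUNDARIES_M`, `INTERVAL_NAMES` and `distance_to_class` are hypothetical, and assigning a distance that falls exactly on a boundary to the upper interval is an assumption, since the text does not say which side a boundary belongs to.

```python
from bisect import bisect_right

# Interval boundaries in meters, taken from the example division in the text.
BOUNDARIES_M = [100, 1_000, 5_000, 10_000]
INTERVAL_NAMES = ["within 100 m", "100 m to 1 km", "1 km to 5 km", "5 km to 10 km", "more than 10 km"]


def distance_to_class(distance_m: float) -> int:
    """Map a distance in meters to the index of its distance interval."""
    # bisect_right counts how many boundaries are <= distance_m,
    # so a value exactly on a boundary falls into the upper interval.
    return bisect_right(BOUNDARIES_M, distance_m)


for d in (50, 500, 3_000, 7_000, 20_000):
    print(d, "->", INTERVAL_NAMES[distance_to_class(d)])
```

The class index can then serve as the label of a five-way classification task in place of the raw distance value.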
[0067] After the above processing, the first model has a certain address information recognition capability. Next, the model structure (or network structure) of the first model can be modified on this basis, and the first model can be fine-tuned on the specified sample data. For details, please refer to the processing of steps S206 to S212 below.
[0068] In step S206, the semantic entities contained in the address information pairs corresponding to the second sample data are identified based on the pre-trained address entity recognition model, so as to obtain the semantic entities contained in the address information pairs corresponding to each second sample data.
[0069] In implementation, as shown in Figure 3, address information can be divided into three parts based on its hierarchical structure. Each part can be considered a semantic entity (or address entity). These three parts can be administrative region information, detailed house number information, and other information (such as aliases). An address entity recognition model can be pre-trained using sample data to identify the different address entities within the address information. Different address entities have varying degrees of influence on determining whether two address information are consistent. Through this method, the target model can learn different discriminative abilities for different semantic blocks at a finer granularity.
[0070] In step S208, max pooling is performed on the different types of semantic entities contained in each address information pair. The max-pooled data of semantic entities of the same type contained in the address information pair is then subjected to fusion and classification processing, and the processed data is input into the fully connected layer in the first model to obtain the statement features obtained by interactively encoding the statements contained in the second sample data.
[0071] In implementation, as shown in Figure 3, max pooling is performed on the different types of semantic entities contained in each address information pair, and the max-pooled data of semantic entities of the same type contained in the address information pair is then subjected to fusion and classification processing. Specifically, after the aggregated encodings of the three semantic entities of each of the two address information have been calculated, the fusion process is performed, in which ... and ... denote the max-pooled encodings of the same semantic entity in the two address information. The concatenation of the two original encodings, their element-wise difference, and their element-wise product can be used as the fused representation of the interaction of the same semantic entity in the different addresses. Finally, the processed data is input into the fully connected layer in the first model to obtain the encoding information of the two address information after deep interaction, that is, the sentence features obtained after the interaction encoding of the sentences contained in the second sample data.
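The max pooling and fusion of one semantic entity type can be sketched as follows. This is a minimal sketch using plain Python lists in place of tensors; the function names `max_pool` and `fuse` and the toy two-dimensional encodings are assumptions for illustration, and the subsequent fully connected layer is omitted.

```python
from typing import List

Vector = List[float]


def max_pool(token_vectors: List[Vector]) -> Vector:
    """Element-wise maximum over the token encodings of one semantic entity."""
    return [max(column) for column in zip(*token_vectors)]


def fuse(u: Vector, v: Vector) -> Vector:
    """Concatenate the two encodings, their element-wise difference and product."""
    diff = [a - b for a, b in zip(u, v)]
    prod = [a * b for a, b in zip(u, v)]
    return u + v + diff + prod


# Toy token encodings of the same entity type (e.g. administrative region)
# in the two addresses of a pair; the values are made up for illustration.
entity_a = max_pool([[1.0, 4.0], [3.0, 2.0]])  # element-wise max of two tokens
entity_b = max_pool([[2.0, 5.0]])              # a single token
fused = fuse(entity_a, entity_b)
print(fused)
```

Repeating this for each of the three entity types yields three fused vectors, which would then be passed to the fully connected layer.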
[0072] In step S210, the features of the characters contained in the second sample data are fused with the features of the statements obtained by interactive encoding of the statements contained in the second sample data through a preset gating mechanism to obtain the fused features.
[0073] In implementation, as shown in Figure 4, a preset gating mechanism (i.e., the Input Gate in Figure 4) can be used to fuse the character features of the second sample data with the sentence features obtained by interactive encoding of the sentences contained in the second sample data, thereby obtaining the fused features.
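The exact form of the gating mechanism is not given in the text. One common formulation, assumed here purely for illustration, computes a sigmoid gate from both feature vectors and uses it to interpolate between them. The function name `gated_fusion`, the per-dimension weights `w_char` and `w_sent`, and the `bias` are all hypothetical; a real gate would typically use full learned weight matrices.

```python
import math
from typing import List

Vector = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(
    char_feat: Vector, sent_feat: Vector, w_char: Vector, w_sent: Vector, bias: Vector
) -> Vector:
    """Blend character and sentence features with a per-dimension sigmoid gate."""
    fused = []
    for c, s, wc, ws, b in zip(char_feat, sent_feat, w_char, w_sent, bias):
        gate = sigmoid(wc * c + ws * s + b)  # value in (0, 1)
        # gate close to 1 keeps the character feature, close to 0 the sentence feature
        fused.append(gate * c + (1.0 - gate) * s)
    return fused


# With all-zero parameters the gate is 0.5, so the result is the plain average.
balanced = gated_fusion([1.0, 3.0], [3.0, 5.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
print(balanced)
```

In training, the gate parameters would be learned together with the rest of the model so that the balance between character-level and sentence-level information is adjusted per dimension.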
[0074] In step S212, the fused features are input into the network layer of the first model for classification, so as to adjust the model structure of the first model and train the first model to obtain the trained second model.
[0075] In implementation, as shown in Figure 4, the fused features are fed into the network layer of the first model used for classification to perform classification prediction, so as to adjust the model structure of the first model and train the first model to obtain the trained second model.
[0076] In step S214, the semantics of the two acquired address information are matched based on the second model to determine whether the two address information are the same, and the matching result of the two address information is obtained.
[0077] This specification provides an embodiment of an address information matching method. The method acquires a target model, which is a model used for semantic matching of text information. The target model is then trained based on a preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model, where the preset training strategy includes one or more of the following: a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances. The first model is then trained based on a preset model structure adjustment strategy and the second sample data corresponding to the preset model structure adjustment strategy to obtain a trained second model, which is used for semantic matching of address information. Finally, the two acquired address information are semantically matched based on the second model to determine whether they are the same, and a matching result is obtained. In this way, a targeted processing method for the target model is designed based on the geographical and semantic characteristics of the strings in address information. Pre-training is performed on a large amount of unsupervised address information sample data, enabling the target model to acquire relevant address information knowledge to a certain extent. Furthermore, the model structure is adjusted according to the special element composition of the strings in address information, allowing the final model to better learn how semantic information of different granularities in the two address information interacts. The model thus learns at a finer granularity how to determine whether two address information match, thereby ensuring the accuracy of the second model's semantic matching of address information.
[0078] Example 3
[0079] The above describes the address information matching method provided in the embodiments of this specification. Based on the same idea, the embodiments of this specification also provide an address information matching device, as shown in Figure 5.
[0080] The address information matching device includes: an initial model acquisition module 501, a first training module 502, a second training module 503, and an address information matching module 504, wherein:
[0081] The initial model acquisition module 501 acquires the target model, which is a model used for semantic matching of text information;
[0082] The first training module 502 trains the target model based on a preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model. The preset training strategy includes one or more of the following: a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances.
[0083] The second training module 503 trains the first model based on a preset model structure adjustment strategy and the second sample data corresponding to the preset model structure adjustment strategy to obtain a trained second model. The second model is used to match the semantics of address information.
[0084] The address information matching module 504 performs semantic matching processing on the two acquired address information based on the second model to determine whether the two address information are the same, and obtains the matching result of the two address information.
[0085] In this embodiment of the specification, the initial model acquisition module 501 acquires a target model trained based on a preset text corpus, wherein the preset text corpus includes a corpus composed of texts from online encyclopedias presented in a preset language.
[0086] In this embodiment of the specification, the preset training strategy includes a sub-strategy for identifying fake addresses, and the first training module 502 includes:
[0087] The first information selection unit acquires multiple different first pre-selected address information and selects a preset number of first pre-selected address information from the multiple different first pre-selected address information;
[0088] The information splicing unit removes the first preset geographical area information from the preset number of first pre-selected address information, and splices the remaining part with the second preset geographical area information to obtain the preset number of second pre-selected address information.
[0089] The first training unit takes a preset number of second pre-selected address information and the first pre-selected address information other than the preset number of first pre-selected address information from the plurality of different first pre-selected address information as first sample data, and uses the first sample data to train the target model, so as to predict whether the address information corresponding to the first sample data is real through the target model, and obtain the trained first model.
[0090] In the embodiments described in this specification, the first preset geographical area information is information about an administrative region.
[0091] In this embodiment of the specification, the preset training strategy includes a sub-strategy for predicting preset geographical regions in the address, and the first training module 502 includes:
[0092] The second information acquisition unit acquires multiple different third pre-selected address information;
[0093] The second training unit removes preset geographical area information from multiple different third pre-selected address information, uses the remaining part as first sample data, and uses the first sample data to train the target model, so as to predict the preset geographical area information removed from the third pre-selected address information through the target model, and obtains the trained first model.
[0094] In this embodiment of the specification, the preset training strategy includes a sub-strategy for address distance prediction, and the first training module 502 includes:
[0095] The third information acquisition unit acquires multiple different pairs of pre-selected address information and acquires the distance between the two pre-selected address information in each pair.
[0096] The third training unit inputs the multiple different pre-selected address information pairs into the target model, predicts the distance interval between the two pre-selected address information in each pair through the target model, and uses the distance between the two pre-selected address information in each pair as the training label to train the target model and obtain the trained first model.
[0097] In this embodiment of the specification, the second training module 503 includes:
[0098] The fusion unit, through a preset gating mechanism, fuses the features of the characters contained in the second sample data with the features of the statements obtained after interactive encoding of the statements contained in the second sample data, to obtain the fused features.
[0099] The training unit inputs the fused features into the network layer of the first model for classification, so as to adjust the model structure of the first model and train the first model to obtain the trained second model.
[0100] In the embodiments described in this specification, the device further includes:
[0101] The entity determination module identifies the semantic entities contained in the address information pairs corresponding to the second sample data based on a pre-trained address entity recognition model, thereby obtaining the semantic entities contained in each address information pair corresponding to the second sample data.
[0102] The feature determination module performs max pooling on the different types of semantic entities contained in each address information pair, and performs fusion and classification processing on the same type of semantic entities contained in the address information pair. The processed data is then input into the fully connected layer in the first model to obtain the statement features obtained by interactively encoding the statements contained in the second sample data.
[0103] In the embodiments described in this specification, the target model is the BERT model.
[0104] This specification provides an address information matching device. The device acquires a target model, which is a model used for semantic matching of text information. The target model is then trained based on a preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model, where the preset training strategy includes one or more of the following: a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances. The first model is then trained based on a preset model structure adjustment strategy and the second sample data corresponding to the preset model structure adjustment strategy to obtain a trained second model, which is used for semantic matching of address information. Finally, the two acquired address information are semantically matched based on the second model to determine whether they are the same, and a matching result is obtained. In this way, a targeted processing method for the target model is designed based on the geographical and semantic characteristics of the strings in address information. Pre-training is performed on a large amount of unsupervised address information sample data, enabling the target model to acquire relevant address information knowledge to a certain extent. Furthermore, the model structure is adjusted according to the special element composition of the strings in address information, allowing the final model to better learn how semantic information of different granularities in the two address information interacts. The model thus learns at a finer granularity how to determine whether two address information match, thereby ensuring the accuracy of the second model's semantic matching of address information.
[0105] Example 4
[0106] The above describes the address information matching device provided in the embodiments of this specification. Based on the same idea, the embodiments of this specification also provide an address information matching device, as shown in Figure 6.
[0107] The address information matching device can be a terminal device or server, as described in the above embodiments.
[0108] Address information matching devices can vary significantly due to differences in configuration or performance. They may include one or more processors 601 and a memory 602, where one or more application programs or data may be stored. The memory 602 can be temporary or persistent storage. The application programs stored in the memory 602 may include one or more modules (not shown in the figures), each module including a series of computer-executable instructions in the address information matching device. Furthermore, the processor 601 may be configured to communicate with the memory 602 and execute the series of computer-executable instructions in the memory 602 on the address information matching device. The address information matching device may also include one or more power supplies 603, one or more wired or wireless network interfaces 604, one or more input / output interfaces 605, and one or more keyboards 606.
[0109] Specifically, in this embodiment, the address information matching device includes a memory and one or more programs, wherein one or more programs are stored in the memory, and one or more programs may include one or more modules, and each module may include a series of computer-executable instructions in the address information matching device, and is configured to be executed by one or more processors. The one or more programs include computer-executable instructions for performing the following:
[0110] Obtain a target model, which is used to match the semantics of text information;
[0111] The target model is trained based on a preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model. The preset training strategy includes one or more of the following: a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances.
[0112] The first model is trained based on a preset model structure adjustment strategy and the second sample data corresponding to the preset model structure adjustment strategy to obtain a trained second model. The second model is used to match the semantics of address information.
[0113] Based on the second model, the semantics of the two acquired address information are matched to determine whether the two address information are the same, and the matching result of the two address information is obtained.
[0114] In the embodiments of this specification, obtaining the target model includes:
[0115] Obtain a target model trained based on a preset text corpus, wherein the preset text corpus includes a corpus composed of texts from online encyclopedias presented in a preset language.
[0116] In this embodiment of the specification, the preset training strategy includes a sub-strategy for identifying fake addresses. The step of training the target model based on the preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model includes:
[0117] Obtain multiple different first pre-selected address information, and select a preset number of first pre-selected address information from the multiple different first pre-selected address information;
[0118] Remove the first preset geographical region information from the preset number of first pre-selected address information, and concatenate the remaining part with the second preset geographical region information to obtain the preset number of second pre-selected address information.
[0119] A preset number of second pre-selected address information and the first pre-selected address information other than the preset number of first pre-selected address information from the plurality of different first pre-selected address information are used as first sample data. The first sample data is used to train the target model so as to predict whether the address information corresponding to the first sample data is real through the target model, and thus obtain the trained first model.
[0120] In the embodiments described in this specification, the first preset geographical area information is information about an administrative region.
[0121] In this embodiment of the specification, the preset training strategy includes a sub-strategy for predicting preset geographical regions in addresses. The step of training the target model based on the preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model includes:
[0122] Obtain multiple different third pre-selected address information;
[0123] The preset geographical area information is removed from multiple different third pre-selected address information, and the remaining part is used as the first sample data. The first sample data is used to train the target model so as to predict the preset geographical area information removed from the third pre-selected address information through the target model, and thus obtain the trained first model.
[0124] In this embodiment of the specification, the preset training strategy includes a sub-strategy for address distance prediction. The step of training the target model based on the preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model includes:
[0125] Obtain multiple different pre-selected address information pairs, and obtain the distance between the two pre-selected address information in each pair;
[0126] The multiple different pre-selected address information pairs are respectively input into the target model. The target model predicts the distance interval between the two pre-selected address information in each pair. The distance between the two pre-selected address information in each pair is used as the training label to train the target model and obtain the trained first model.
[0127] In this embodiment of the specification, the step of training the first model based on a preset model structure adjustment strategy and second sample data corresponding to the preset model structure adjustment strategy to obtain a trained second model includes:
[0128] By using a preset gating mechanism, the features of the characters contained in the second sample data are fused with the features of the statements obtained after interactive encoding of the statements contained in the second sample data, and the fused features are obtained.
[0129] The fused features are input into the network layer of the first model for classification to adjust the model structure of the first model, and the first model is trained to obtain the trained second model.
[0130] The embodiments in this specification also include:
[0131] The semantic entities contained in the address information pairs corresponding to the second sample data are identified based on the pre-trained address entity recognition model, so as to obtain the semantic entities contained in each address information pair corresponding to the second sample data.
[0132] Max pooling is performed on the different types of semantic entities contained in each address information pair. The max-pooled data of semantic entities of the same type contained in the address information pair is then subjected to fusion and classification processing, and the processed data is input into the fully connected layer in the first model to obtain the statement features obtained by interactively encoding the statements contained in the second sample data.
[0133] In the embodiments described in this specification, the target model is the BERT model.
[0134] This specification provides an address information matching device. The device acquires a target model, which is a model used for semantic matching of text information. The target model is then trained based on a preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model, where the preset training strategy includes one or more of the following: a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances. The first model is then trained based on a preset model structure adjustment strategy and the second sample data corresponding to the preset model structure adjustment strategy to obtain a trained second model, which is used for semantic matching of address information. Finally, the two acquired address information are semantically matched based on the second model to determine whether they are the same, and a matching result is obtained. In this way, a targeted processing method for the target model is designed based on the geographical and semantic characteristics of the strings in address information. Pre-training is performed on a large amount of unsupervised address information sample data, enabling the target model to acquire relevant address information knowledge to a certain extent. Furthermore, the model structure is adjusted according to the special element composition of the strings in address information, allowing the final model to better learn how semantic information of different granularities in the two address information interacts. The model thus learns at a finer granularity how to determine whether two address information match, thereby ensuring the accuracy of the second model's semantic matching of address information.
[0135] Example 5
[0136] Furthermore, based on the methods shown in Figures 1 to 4 above, one or more embodiments of this specification also provide a storage medium for storing computer-executable instruction information. In one specific embodiment, the storage medium can be a USB flash drive, optical disc, hard disk, etc. When the computer-executable instruction information stored in the storage medium is executed by a processor, the following process can be implemented:
[0137] Obtain a target model, which is used to match the semantics of text information;
[0138] The target model is trained based on a preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model. The preset training strategy includes one or more of the following: a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances.
[0139] The first model is trained based on a preset model structure adjustment strategy and the second sample data corresponding to the preset model structure adjustment strategy to obtain a trained second model. The second model is used to match the semantics of address information.
[0140] Based on the second model, the semantics of two acquired pieces of address information are matched to determine whether the two pieces of address information are the same, and a matching result for the two pieces of address information is obtained.
[0141] In the embodiments of this specification, obtaining the target model includes:
[0142] Obtain a target model trained based on a preset text corpus, wherein the preset text corpus includes a corpus composed of texts from online encyclopedias presented in a preset language.
[0143] In this embodiment of the specification, the preset training strategy includes a sub-strategy for identifying fake addresses. The step of training the target model based on the preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model includes:
[0144] Obtain multiple different first pre-selected address information, and select a preset number of first pre-selected address information from the multiple different first pre-selected address information;
[0145] Remove the first preset geographical region information from each of the preset number of first pre-selected address information, and concatenate the remaining part with second preset geographical region information to obtain a preset number of second pre-selected address information.
[0146] The preset number of second pre-selected address information, together with the first pre-selected address information that was not selected from the multiple different first pre-selected address information, is used as the first sample data. The target model is trained on the first sample data to predict whether the address information corresponding to the first sample data is real, thereby obtaining the trained first model.
[0147] In the embodiments described in this specification, the first preset geographical area information is information about an administrative region.
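The sample construction described for the fake address identification sub-strategy can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of this specification: the function name `build_fake_address_samples`, the `(region, remainder)` representation of an address, the label convention (1 for real, 0 for fake), and the example addresses are all hypothetical.

```python
import random


def build_fake_address_samples(addresses, regions, num_fake, seed=0):
    """Build (text, label) samples for fake-address identification.

    addresses: list of (region, remainder) tuples; a hypothetical representation
               of an address already split into its administrative region and
               the rest of the string.
    regions:   pool of candidate administrative regions (at least two distinct).
    num_fake:  the "preset number" of addresses whose region is replaced.
    Label convention (an assumption): 1 = real address, 0 = fake address.
    """
    rng = random.Random(seed)
    fake_idx = set(rng.sample(range(len(addresses)), num_fake))
    samples = []
    for i, (region, rest) in enumerate(addresses):
        if i in fake_idx:
            # Remove the original region and splice in a different one.
            other = rng.choice([r for r in regions if r != region])
            samples.append((other + rest, 0))
        else:
            # Unselected addresses are kept unchanged as real samples.
            samples.append((region + rest, 1))
    return samples


# Made-up addresses, for illustration only.
addresses = [
    ("Hangzhou", " Xihu District, Example Road 1"),
    ("Shanghai", " Pudong New Area, Example Avenue 2"),
    ("Beijing", " Haidian District, Example Street 3"),
    ("Shenzhen", " Nanshan District, Example Road 4"),
]
regions = ["Hangzhou", "Shanghai", "Beijing", "Shenzhen"]
samples = build_fake_address_samples(addresses, regions, num_fake=2)
```

The resulting `(text, label)` pairs would then serve as first sample data for a binary real/fake classification objective on top of the target model.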
[0148] In this embodiment of the specification, the preset training strategy includes a sub-strategy for predicting preset geographical regions in addresses. The step of training the target model based on the preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model includes:
[0149] Obtain multiple different third pre-selected address information;
[0150] The preset geographical area information is removed from multiple different third pre-selected address information, and the remaining part is used as the first sample data. The first sample data is used to train the target model so as to predict the preset geographical area information removed from the third pre-selected address information through the target model, and thus obtain the trained first model.
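As a rough sketch of this sub-strategy, the removed region can be kept as the prediction target while the remaining string becomes the model input. The function names, the `(region, remainder)` address representation, and the integer class-id encoding below are assumptions made for illustration; the specification does not prescribe them.

```python
def build_region_prediction_samples(addresses):
    """Return (input_text, target_region) pairs.

    addresses: list of (region, remainder) tuples (hypothetical representation).
    The region is removed from the input text and kept as the target to predict.
    """
    return [(rest.strip(), region) for region, rest in addresses]


def region_label_ids(samples):
    """Map each distinct target region to an integer class id (an assumed encoding)."""
    vocab = {r: i for i, r in enumerate(sorted({region for _, region in samples}))}
    return [(text, vocab[region]) for text, region in samples], vocab


# Made-up addresses, for illustration only.
addresses = [
    ("Hangzhou", " Xihu District, Example Road 1"),
    ("Shanghai", " Pudong New Area, Example Avenue 2"),
]
samples = build_region_prediction_samples(addresses)
labeled, vocab = region_label_ids(samples)
```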
[0151] In this embodiment of the specification, the preset training strategy includes a sub-strategy for address distance prediction. The step of training the target model based on the preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model includes:
[0152] Obtain multiple different pairs of pre-selected address information, and obtain the distance between the two pieces of pre-selected address information in each pair;
[0153] The multiple different pairs of pre-selected address information are respectively input into the target model, which predicts the distance interval between the two pieces of pre-selected address information in each pair. The distance between the two pieces of pre-selected address information in each pair is used as the training label to train the target model and obtain the trained first model.
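One way to turn a measured distance into a distance-interval label is sketched below. The specification does not state how the distance is obtained or where the interval boundaries lie, so both are assumptions here: the sketch computes a great-circle (haversine) distance from latitude/longitude coordinates, and `BOUNDARIES_KM` is a hypothetical set of interval boundaries.

```python
import math
from bisect import bisect_right

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Hypothetical interval boundaries in kilometres; not taken from the specification.
BOUNDARIES_KM = [0.1, 1.0, 10.0, 100.0]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_interval(distance_km, boundaries=BOUNDARIES_KM):
    """Return the index of the interval that contains distance_km.

    Interval 0 is [0, boundaries[0]); the last interval is open-ended.
    """
    return bisect_right(boundaries, distance_km)


# Label for a pair of (made-up) coordinates one degree of longitude apart on the equator.
label = distance_interval(haversine_km(0.0, 0.0, 0.0, 1.0))
```

Under these assumptions, the interval index would act as the class label that the target model is trained to predict for each address pair.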
[0154] In this embodiment of the specification, the step of training the first model based on a preset model structure adjustment strategy and second sample data corresponding to the preset model structure adjustment strategy to obtain a trained second model includes:
[0155] Through a preset gating mechanism, the character features contained in the second sample data are fused with the sentence features obtained by interactively encoding the sentences contained in the second sample data, to obtain fused features.
[0156] The fused features are input into a network layer of the first model for classification, so as to adjust the model structure of the first model, and the first model is trained to obtain the trained second model.
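The specification does not give the exact form of the gating mechanism. A commonly used form, assumed here purely for illustration, computes a gate g = sigmoid(W_c c + W_s s + b) from the character feature c and the sentence feature s, and outputs g * c + (1 - g) * s element-wise. The names `gated_fusion`, `w_char`, `w_sent`, and `bias` are hypothetical, and the weights would normally be learned during training.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gated_fusion(char_feat, sent_feat, w_char, w_sent, bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    gate  = sigmoid(w_char @ char_feat + w_sent @ sent_feat + bias)
    fused = gate * char_feat + (1 - gate) * sent_feat
    """
    zc = matvec(w_char, char_feat)
    zs = matvec(w_sent, sent_feat)
    gate = [sigmoid(a + b + c) for a, b, c in zip(zc, zs, bias)]
    return [g * c + (1.0 - g) * s for g, c, s in zip(gate, char_feat, sent_feat)]


dim = 3
zeros = [[0.0] * dim for _ in range(dim)]
# With all-zero weights and bias the gate is 0.5, so the result is the plain average.
fused = gated_fusion([1.0, 2.0, 3.0], [3.0, 2.0, 1.0], zeros, zeros, [0.0] * dim)
```

In a trained model the gate would vary per dimension, letting the classifier lean on character-level or sentence-level evidence as appropriate.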
[0157] The embodiments in this specification also include:
[0158] The semantic entities contained in the address information pairs corresponding to the second sample data are identified based on the pre-trained address entity recognition model, so as to obtain the semantic entities contained in each address information pair corresponding to the second sample data.
[0159] Max pooling is performed separately on the different types of semantic entities contained in each address information pair. The max-pooled data are then fused and classified according to the semantic entities of the same type contained in the address information pair, and the processed data are input into the fully connected layer of the first model to obtain the sentence features produced by interactively encoding the sentences contained in the second sample data.
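The pooling and same-type fusion steps might look roughly like the sketch below. It rests on several assumptions that the specification does not fix: entity vectors arrive as `(entity_type, vector)` pairs from the entity recognition step, pooling is element-wise per entity type, and fusion concatenates the two pooled vectors with their absolute difference. The entity type names and vector values are made up.

```python
def max_pool(vectors):
    """Element-wise max over a non-empty list of equal-length vectors."""
    return [max(column) for column in zip(*vectors)]


def pool_by_entity_type(tagged_vectors):
    """Group (entity_type, vector) pairs by type and max-pool each group."""
    groups = {}
    for entity_type, vector in tagged_vectors:
        groups.setdefault(entity_type, []).append(vector)
    return {t: max_pool(vs) for t, vs in groups.items()}


def fuse_same_type(pooled_a, pooled_b):
    """For each entity type present in both addresses, concatenate the two
    pooled vectors and their absolute difference (an assumed fusion scheme)."""
    fused = {}
    for t in sorted(pooled_a.keys() & pooled_b.keys()):
        a, b = pooled_a[t], pooled_b[t]
        fused[t] = a + b + [abs(x - y) for x, y in zip(a, b)]
    return fused


# Made-up entity vectors for the two addresses of one pair.
addr_a = [("city", [0.1, 0.9]), ("road", [0.4, 0.2]), ("road", [0.7, 0.1])]
addr_b = [("city", [0.3, 0.5]), ("road", [0.6, 0.6]), ("poi", [0.2, 0.2])]
fused = fuse_same_type(pool_by_entity_type(addr_a), pool_by_entity_type(addr_b))
```

The fused per-type vectors would then be passed to a fully connected layer; entity types that appear in only one address (such as "poi" above) are simply skipped in this sketch.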
[0160] In the embodiments described in this specification, the target model is the BERT model.
[0161] This specification provides a storage medium. A target model is acquired, which is a model for semantic matching of text information. The target model is then trained based on a preset training strategy and the first sample data corresponding to that strategy to obtain a trained first model, where the preset training strategy includes one or more of the following: a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances. Next, the first model is trained based on a preset model structure adjustment strategy and the second sample data corresponding to that strategy to obtain a trained second model, which is a model for semantic matching of address information. Finally, the semantics of the two acquired pieces of address information are matched to determine whether they are identical, thus obtaining a matching result. In this way, a processing method targeted at the target model is designed around the geographical and semantic characteristics of the strings in address information. Pre-training on a large amount of unsupervised address information sample data enables the target model to acquire relevant address knowledge to a certain extent. Furthermore, the model structure is adjusted according to the particular element composition of the strings in address information, so that the final model better learns how semantic information of different granularities in the two pieces of address information interacts. The model therefore learns at a finer granularity how to determine whether two pieces of address information match, thereby ensuring the accuracy of the second model's semantic matching of address information.
[0162] The foregoing has described specific embodiments of this specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in a different order than that shown in the embodiments and may still achieve the desired result. Furthermore, the processes depicted in the drawings do not necessarily require the specific or sequential order shown to achieve the desired result. In some embodiments, multitasking and parallel processing are possible or may be advantageous.
[0163] In the 1990s, improvements to a technology could be clearly distinguished as either hardware improvements (e.g., improvements to the circuit structure of diodes, transistors, switches, etc.) or software improvements (improvements to the methodology). However, with technological advancements, many methodological improvements today can be considered direct improvements to the hardware circuit structure. Designers almost always obtain the corresponding hardware circuit structure by programming the improved methodology into the hardware circuit. Therefore, it cannot be said that a methodological improvement cannot be implemented using hardware physical modules. For example, a Programmable Logic Device (PLD) (such as a Field Programmable Gate Array (FPGA)) is such an integrated circuit whose logic function is determined by the user programming the device. Designers can program and "integrate" a digital system onto a PLD themselves, without needing chip manufacturers to design and manufacture dedicated integrated circuit chips. Furthermore, nowadays, instead of manually manufacturing integrated circuit chips, this programming is mostly implemented using "logic compiler" software. Similar to the software compiler used in program development, the original code before compilation must also be written in a specific programming language, called a Hardware Description Language (HDL). There are many HDLs, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language). Currently, the most commonly used are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. 
Those skilled in the art should also understand that by simply performing some logic programming on the method flow using one of these hardware description languages and programming it into an integrated circuit, the hardware circuit implementing the logical method flow can be easily obtained.
[0164] The controller can be implemented in any suitable manner. For example, it can take the form of a microprocessor or processor and a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, application-specific integrated circuits (ASICs), programmable logic controllers, and embedded microcontrollers. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller can also be implemented as part of the control logic of the memory. Those skilled in the art will also recognize that, in addition to implementing the controller in purely computer-readable program code form, the same functionality can be achieved by logically programming the method steps to make the controller take the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, and embedded microcontrollers. Therefore, such a controller can be considered a hardware component, and the means included therein for implementing various functions can also be considered as structures within the hardware component. Alternatively, the means for implementing various functions can be considered as both software modules implementing the method and structures within the hardware component.
[0165] The systems, devices, modules, or units described in the above embodiments can be implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a computer. Specifically, a computer can be, for example, a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, email device, game console, tablet computer, wearable device, or any combination of these devices.
[0166] For ease of description, the above apparatus is described by dividing it into various functional units. Of course, when implementing one or more embodiments of this specification, the functions of the units can be implemented in one or more pieces of software and / or hardware.
[0167] Those skilled in the art will understand that the embodiments of this specification can be provided as methods, systems, or computer program products. Therefore, one or more embodiments of this specification may take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0168] Embodiments in this specification are described with reference to flowcharts and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this specification. It will be understood that each flow and / or block of the flowcharts and / or block diagrams, and combinations of flows and / or blocks in the flowcharts and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing device, produce a means for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0169] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which implement the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0170] These computer program instructions can also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0171] In a typical configuration, a computing device includes one or more processors (CPUs), input / output interfaces, network interfaces, and memory.
[0172] Memory may include volatile memory in computer-readable media, such as random access memory (RAM), and / or non-volatile memory, such as read-only memory (ROM) or flash memory. Memory is an example of computer-readable media.
[0173] Computer-readable media includes both permanent and non-permanent, removable and non-removable media that can store information using any method or technology. Information can be computer-readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.
[0174] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0176] One or more embodiments of this specification can be described in the general context of computer-executable instructions, such as program modules, that are executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform a particular task or implement a particular abstract data type. One or more embodiments of this specification can also be practiced in distributed computing environments where tasks are performed by remote processing devices connected via a communication network. In distributed computing environments, program modules can reside in local and remote computer storage media, including storage devices.
[0177] The various embodiments in this specification are described in a progressive manner. For parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding descriptions in the method embodiments.
[0178] The above description is merely an embodiment of this specification and is not intended to limit this application. Various modifications and variations can be made to this specification by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this specification should be included within the scope of the claims of this specification.
Claims
1. A method for matching address information, the method comprising: obtaining a target model, the target model being a model for semantic matching of text information; training the target model based on a preset training strategy and first sample data corresponding to the preset training strategy to obtain a trained first model, wherein the preset training strategy includes a plurality of the following: a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances, or includes one of the sub-strategy for predicting preset geographical regions in addresses and the sub-strategy for predicting address distances, and wherein the first model is a model that can generate a better semantic representation of address information; fusing, through a preset gating mechanism, the character features contained in second sample data with the sentence features obtained by interactively encoding the sentences contained in the second sample data, to obtain fused features; inputting the fused features into a network layer of the first model for classification so as to adjust the model structure of the first model, and training the first model to obtain a trained second model, the second model being a model for semantic matching of multiple address information; and matching, based on the second model, the semantics of two acquired pieces of address information to determine whether the two pieces of address information are the same, and obtaining a matching result for the two pieces of address information.
2. The method according to claim 1, wherein obtaining the target model comprises: obtaining a target model trained based on a preset text corpus, wherein the preset text corpus includes a corpus composed of texts from online encyclopedias presented in a preset language.
3. The method according to claim 1 or 2, wherein the preset training strategy includes a sub-strategy for identifying fake addresses, and the step of training the target model based on the preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model includes: obtaining multiple different first pre-selected address information, and selecting a preset number of first pre-selected address information from the multiple different first pre-selected address information; removing the first preset geographical region information from each of the preset number of first pre-selected address information, and concatenating the remaining part with second preset geographical region information to obtain a preset number of second pre-selected address information; and using the preset number of second pre-selected address information, together with the first pre-selected address information that was not selected from the multiple different first pre-selected address information, as the first sample data, and training the target model on the first sample data to predict whether the address information corresponding to the first sample data is real, thereby obtaining the trained first model.
4. The method according to claim 3, wherein the first preset geographical area information is information of an administrative region.
5. The method according to claim 1 or 2, wherein the preset training strategy includes a sub-strategy for predicting preset geographical regions in addresses, and the step of training the target model based on the preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model includes: obtaining multiple different third pre-selected address information; and removing the preset geographical area information from the multiple different third pre-selected address information, using the remaining part as the first sample data, and training the target model on the first sample data to predict the preset geographical area information removed from the third pre-selected address information, thereby obtaining the trained first model.
6. The method according to claim 1 or 2, wherein the preset training strategy includes a sub-strategy for address distance prediction, and the step of training the target model based on the preset training strategy and the first sample data corresponding to the preset training strategy to obtain a trained first model includes: obtaining multiple different pairs of pre-selected address information, and obtaining the distance between the two pieces of pre-selected address information in each pair; and inputting the multiple different pairs of pre-selected address information respectively into the target model, predicting through the target model the distance interval between the two pieces of pre-selected address information in each pair, and using the distance between the two pieces of pre-selected address information in each pair as the training label to train the target model and obtain the trained first model.
7. The method according to claim 1, further comprising: identifying, based on a pre-trained address entity recognition model, the semantic entities contained in the address information pairs corresponding to the second sample data, so as to obtain the semantic entities contained in each address information pair corresponding to the second sample data; and performing max pooling separately on the different types of semantic entities contained in each address information pair, fusing and classifying the max-pooled data according to the semantic entities of the same type contained in the address information pair, and inputting the processed data into the fully connected layer of the first model to obtain the sentence features produced by interactively encoding the sentences contained in the second sample data.
8. The method according to claim 1, wherein the target model is a BERT model.
9. An address information matching device, the device comprising: an initial model acquisition module, which acquires a target model, the target model being a model for semantic matching of text information; a first training module, which trains the target model based on a preset training strategy and first sample data corresponding to the preset training strategy to obtain a trained first model, wherein the preset training strategy includes a plurality of the following: a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances, or includes one of the sub-strategy for predicting preset geographical regions in addresses and the sub-strategy for predicting address distances, and wherein the first model is a model that can generate a better semantic representation of address information; a second training module, which fuses, through a preset gating mechanism, the character features contained in second sample data with the sentence features obtained by interactively encoding the sentences contained in the second sample data to obtain fused features, inputs the fused features into a network layer of the first model for classification so as to adjust the model structure of the first model, and trains the first model to obtain a trained second model, the second model being a model for semantic matching of multiple address information; and an address information matching module, which performs semantic matching on two acquired pieces of address information based on the second model to determine whether the two pieces of address information are the same, and obtains a matching result for the two pieces of address information.
10. An address information matching device, the address information matching device comprising: a processor; and a memory configured to store computer-executable instructions which, when executed, cause the processor to: obtain a target model, the target model being a model for semantic matching of text information; train the target model based on a preset training strategy and first sample data corresponding to the preset training strategy to obtain a trained first model, wherein the preset training strategy includes a plurality of the following: a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances, or includes one of the sub-strategy for predicting preset geographical regions in addresses and the sub-strategy for predicting address distances, and wherein the first model is a model that can generate a better semantic representation of address information; fuse, through a preset gating mechanism, the character features contained in second sample data with the sentence features obtained by interactively encoding the sentences contained in the second sample data, to obtain fused features; input the fused features into a network layer of the first model for classification so as to adjust the model structure of the first model, and train the first model to obtain a trained second model, the second model being a model for semantic matching of multiple address information; and match, based on the second model, the semantics of two acquired pieces of address information to determine whether the two pieces of address information are the same, and obtain a matching result for the two pieces of address information.
11. A storage medium for storing computer-executable instructions which, when executed by a processor, implement the following process: obtaining a target model, the target model being a model for semantic matching of text information; training the target model based on a preset training strategy and first sample data corresponding to the preset training strategy to obtain a trained first model, wherein the preset training strategy includes a plurality of the following: a sub-strategy for identifying fake addresses, a sub-strategy for predicting preset geographical regions in addresses, and a sub-strategy for predicting address distances, or includes one of the sub-strategy for predicting preset geographical regions in addresses and the sub-strategy for predicting address distances, and wherein the first model is a model that can generate a better semantic representation of address information; fusing, through a preset gating mechanism, the character features contained in second sample data with the sentence features obtained by interactively encoding the sentences contained in the second sample data, to obtain fused features; inputting the fused features into a network layer of the first model for classification so as to adjust the model structure of the first model, and training the first model to obtain a trained second model, the second model being a model for semantic matching of multiple address information; and matching, based on the second model, the semantics of two acquired pieces of address information to determine whether the two pieces of address information are the same, and obtaining a matching result for the two pieces of address information.