Entity recognition method and system considering text semantic information
An entity recognition method and system that take text semantic information into account. The technology addresses problems in existing approaches, such as increased computation, similar content not being fully covered, and insufficient use of semantic information. It achieves reduced time complexity, good entity recognition quality, and high entity recognition efficiency.
Examples
Embodiment 1
[0101] The entity recognition method based on the inverted index and the Sentence-BERT (SBERT for short) model provided by the embodiment of the present invention comprises the following steps:
[0102] Given the record sets A and B to be identified:
[0103] (1) Data reading and preprocessing:
[0104] Read the contents of record sets A and B respectively, and apply preprocessing operations such as word segmentation, spelling correction, lemmatization, and stop word removal to the data in each record, generating record sets A* and B* composed of individual words;
[0105] (2) Create an inverted index:
[0106] Deduplicate the words in A* to generate a word dictionary, and use the dictionary words as index terms to create an inverted index of record set A;
[0107] (3) Load the SBERT model:
[0108] Load the pretrained SBERT model so that it is ready for use in later steps;
[0109] (4) Calculate the IDF value:
[0110] Calculate the I...
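Steps (1), (2), and (4) above can be illustrated with a minimal Python sketch. It is not the patented implementation: the stop-word list, the sample records, the function names, and the IDF formula log(N / df) are all assumptions made for illustration, and spelling correction and lemmatization are left out.

```python
import math
import re
from collections import defaultdict

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "in"}


def preprocess(record: str) -> list[str]:
    """Lowercase, split into word tokens, and drop stop words.

    Spelling correction and lemmatization are omitted in this sketch.
    """
    tokens = re.findall(r"[a-z0-9]+", record.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_inverted_index(records: list[list[str]]) -> dict[str, set[int]]:
    """Map each distinct word to the set of record ids that contain it."""
    index = defaultdict(set)
    for rec_id, tokens in enumerate(records):
        for tok in tokens:
            index[tok].add(rec_id)
    return dict(index)


def idf_values(index: dict[str, set[int]], n_records: int) -> dict[str, float]:
    """IDF(w) = log(N / df(w)); the exact formula is an assumption."""
    return {w: math.log(n_records / len(ids)) for w, ids in index.items()}


# Illustrative record set A (made-up data).
record_set_a = [
    "The Apple iPhone 12 in black",
    "Samsung Galaxy S21 phone",
    "Apple MacBook Pro",
]
a_star = [preprocess(r) for r in record_set_a]
index_a = build_inverted_index(a_star)
idf_a = idf_values(index_a, len(a_star))
```

Words that appear in many records, such as "apple" here, receive a lower IDF value than words that appear in only one record, which is what makes IDF useful for ranking shared words.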
Embodiment 2
[0126] The high-efficiency entity recognition method provided by the present invention fully considers text semantic information and is based on the inverted index and the SBERT model. First, the inverted index and the IDF values of the words in the data source are used to quickly generate the pairs of records to be matched, which improves recognition efficiency. Then the SBERT model extracts the semantic information in the text records, and cosine similarity is used to calculate the similarity between records, which improves recognition accuracy. Together these achieve efficient and accurate entity recognition.
[0127] The entity recognition method based on the inverted index and the SBERT model provided by the embodiment of the present invention takes two record sets A and B to be recognized as an example and includes the following steps:
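The two-phase idea in this paragraph can be sketched in Python as follows. This is a hedged illustration, not the method as claimed: the scoring rule (sum of the IDF values of shared words), the thresholds, the function names, and the toy index, IDF values, and embeddings are all assumptions. In a real pipeline the embeddings would come from an SBERT model rather than being written by hand.

```python
import math


def candidate_pairs(b_records, index_a, idf, min_score):
    """Pair each record in B with records in A that share enough rare words.

    The score of a pair is the summed IDF of the words they share
    (an assumed rule; the source does not spell out the scoring).
    """
    pairs = []
    for b_id, tokens in enumerate(b_records):
        scores = {}
        for tok in set(tokens):
            for a_id in index_a.get(tok, ()):
                scores[a_id] = scores.get(a_id, 0.0) + idf.get(tok, 0.0)
        pairs.extend(
            (a_id, b_id) for a_id, s in sorted(scores.items()) if s >= min_score
        )
    return pairs


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy inverted index and IDF values for record set A (made-up numbers).
index_a = {"apple": {0, 2}, "iphone": {0}, "samsung": {1}, "macbook": {2}}
idf = {"apple": 0.41, "iphone": 1.10, "samsung": 1.10, "macbook": 1.10}
b_star = [["apple", "iphone"], ["samsung", "tablet"]]

# Phase 1: cheap candidate generation through the inverted index.
pairs = candidate_pairs(b_star, index_a, idf, min_score=1.0)

# Phase 2: semantic check on candidates only. These 2-d vectors stand in
# for SBERT sentence embeddings, which are much higher-dimensional.
emb_a = {0: [0.9, 0.1], 1: [0.1, 0.9], 2: [0.5, 0.5]}
emb_b = {0: [0.8, 0.2], 1: [0.9, 0.1]}
matches = [
    (a_id, b_id)
    for a_id, b_id in pairs
    if cosine_similarity(emb_a[a_id], emb_b[b_id]) >= 0.8
]
```

Only the candidate pairs reach the embedding comparison, so the more expensive similarity computation is not run on every pair in A × B.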
[0128] 1. Data reading and preprocessing. Read the record collection into the model, and combine the fields o...
Embodiment 3
[0137] The present invention divides the entity recognition algorithm into three main stages: the preparation stage, the processing stage, and the verification stage. The detailed processing steps of each stage are as follows.
[0138] (1) Preparation stage:
[0139] The preparation stage mainly consists of preprocessing the data and building the related indexes. First, check whether a cache file exists; if it does, load it. Then read the original data file, perform field merging and spelling correction on the information to be processed, preload the SBERT model, and create the inverted index (including dictionary files, location files, etc.). Finally, write the processing results and generated content to cache files for storage. The main considerations for merging the fields in the file set are as follows. One is that the algorithm can be flexibly applied to all data records mainly...
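A minimal Python sketch of the cache-then-build flow in the preparation stage is given below. Several points are assumptions rather than details from the source: that a cache hit lets the stage skip rebuilding, that the cache is a single JSON file, and the names `build_artifacts`, `prepare`, and `prep_cache.json`. Spelling correction and SBERT preloading are omitted.

```python
import json
import os
import tempfile


def build_artifacts(raw_records):
    """Merge the fields of each record into one string and index its words."""
    merged = [
        " ".join(str(v) for v in rec.values()).lower() for rec in raw_records
    ]
    index = {}
    for rec_id, text in enumerate(merged):
        for word in set(text.split()):
            # Record ids are appended in increasing order.
            index.setdefault(word, []).append(rec_id)
    return {"merged": merged, "index": index}


def prepare(raw_records, cache_path):
    """Load cached results if present; otherwise build them and write the cache."""
    if os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as fh:
            return json.load(fh)
    artifacts = build_artifacts(raw_records)
    with open(cache_path, "w", encoding="utf-8") as fh:
        json.dump(artifacts, fh)
    return artifacts


# Made-up records with two fields each.
records = [
    {"name": "Apple iPhone", "color": "black"},
    {"name": "Apple MacBook", "color": "silver"},
]
with tempfile.TemporaryDirectory() as tmp:
    cache_file = os.path.join(tmp, "prep_cache.json")
    first = prepare(records, cache_file)   # builds and writes the cache
    second = prepare(records, cache_file)  # served from the cache
```

Merging all fields into one string before indexing means the same code works regardless of how many fields a record has or what they are called.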