An intelligent retrieval method, device, electronic device and storage medium for computing patent document similarity based on word frequency and semantics
The technology relates to semantic computing and patent documents, and is applied in the field of intelligent retrieval, electronic devices, and storage media. It addresses the strong subjectivity of review opinions, the low accuracy of results, and the reliance on a single method, and thereby narrows the scope of review, saves manpower and time, and improves accuracy.
Examples
Embodiment 1
[0032] Referring to Figure 1, this embodiment provides an intelligent retrieval method for calculating the similarity of patent documents based on word frequency and semantics. The examples given serve only to explain the present invention and do not limit its scope. The method includes the following steps:
[0033] S101. For all patent data in the question bank, extract the text information related to the content of the test questions, organize it into structured data, and produce a word segmentation result;
[0034] S102. Perform bag-of-words statistics and word vector conversion on the word segmentation results of all the patent data above, and obtain the weight value of each word as preloaded data for model prediction;
[0035] S103. Load all of the bag-of-words, word vector, and vocabulary data above, perform a full matching query according to the test question publication number, compare the similarity predicted by...
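The bag-of-words weighting and similarity comparison in S102 and S103 can be illustrated with a minimal sketch. The source does not state which weighting formula is used, so the smoothed TF-IDF weight below is an assumption, as are the function names `tfidf_weights`, `cosine`, and `rank_similar` and the toy publication numbers.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Compute a weight for each word in each document.

    docs maps a publication number to its list of segmented words.
    Smoothed TF-IDF is an assumption; the source gives no formula.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each word once per document
    idf = {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}
    weights = {}
    for pub_no, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens) or 1
        weights[pub_no] = {w: (c / total) * idf[w] for w, c in counts.items()}
    return weights


def cosine(a, b):
    """Cosine similarity between two sparse word-weight dictionaries."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_similar(query_pub_no, weights, top_k=3):
    """Rank all other documents by similarity to the queried publication number."""
    query = weights[query_pub_no]
    scores = [(pub, cosine(query, vec))
              for pub, vec in weights.items() if pub != query_pub_no]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_k]


# Toy data with hypothetical publication numbers.
docs = {
    "CN0001": ["patent", "retrieval", "similarity", "word", "frequency"],
    "CN0002": ["patent", "retrieval", "semantic", "similarity"],
    "CN0003": ["battery", "electrode", "coating"],
}
weights = tfidf_weights(docs)
ranking = rank_similar("CN0001", weights)
```

Here `weights` plays the role of the preloaded data from S102, and `rank_similar` looks up a document by publication number and compares it with every other document, in the spirit of S103.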
Embodiment 2
[0074] Referring to Figure 2, this embodiment provides an intelligent data retrieval method based on a single server. The examples given serve only to explain the present invention and do not limit its scope. The method includes the following steps:
[0075] S201. Extract patent information and content from the XML files of the question bank and store them; the extracted content is preliminarily cleaned and sorted in the patent database, then exported to a CSV file with specified fields;
[0076] S202. After segmenting the full content into words, removing stop words, and screening high-frequency words, construct a vector model;
[0077] S203. Load the vector model data, and combine multiple fused results from the literal (bag-of-words) algorithm and the semantic algorithm to predict the top-ranked patents.
[0078] S203 further includes:
[0079] S2031. Perform word segmentat...
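As a rough sketch of S202 and S203, the code below removes stop words, screens words by frequency, and fuses a literal bag-of-words score with a semantic score. Several choices are assumptions not found in the source: whitespace splitting stands in for a real word segmenter, "screening high-frequency words" is read as keeping words that reach a minimum count, the semantic score is the cosine of averaged word vectors, and the fusion is a linear blend controlled by a hypothetical `alpha`. The stop-word list and the two-dimensional word vectors are toy values.

```python
import math
from collections import Counter

STOP_WORDS = {"a", "the", "of", "for", "and", "is"}  # hypothetical list


def preprocess(text, stop_words=STOP_WORDS):
    """Split on whitespace (stand-in for a segmenter) and drop stop words."""
    return [t for t in text.lower().split() if t not in stop_words]


def high_frequency_vocab(token_lists, min_count=2):
    """Keep words seen at least min_count times across the corpus (assumed reading)."""
    counts = Counter(t for tokens in token_lists for t in tokens)
    return {w for w, c in counts.items() if c >= min_count}


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def bow_vector(tokens, vocab_order):
    """Literal representation: word counts over the screened vocabulary."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab_order]


def mean_vector(tokens, word_vectors, dim):
    """Semantic representation: average of the available word vectors."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def fused_score(literal, semantic, alpha=0.5):
    """Linear blend of the two scores; alpha is a hypothetical tuning weight."""
    return alpha * literal + (1 - alpha) * semantic


texts = {
    "A": "the retrieval of patent documents",
    "B": "patent documents retrieval and search",
    "C": "coating for a battery electrode",
}
# Toy 2-D word vectors; a real system would load trained embeddings.
word_vectors = {
    "retrieval": [1.0, 0.0], "search": [0.9, 0.1], "patent": [0.8, 0.2],
    "documents": [0.7, 0.3], "battery": [0.0, 1.0], "electrode": [0.1, 0.9],
    "coating": [0.2, 0.8],
}
tokens = {k: preprocess(v) for k, v in texts.items()}
vocab = high_frequency_vocab(tokens.values())
vocab_order = sorted(vocab)


def score_pair(a, b):
    literal = cosine(bow_vector(tokens[a], vocab_order), bow_vector(tokens[b], vocab_order))
    semantic = cosine(mean_vector(tokens[a], word_vectors, 2),
                      mean_vector(tokens[b], word_vectors, 2))
    return fused_score(literal, semantic)


score_ab = score_pair("A", "B")
score_ac = score_pair("A", "C")
```

With this toy data, document B shares both vocabulary and meaning with A, so `score_ab` exceeds `score_ac`. Blending the two scores lets a semantically close document rank well even when its literal overlap is small.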
Embodiment 3
[0086] Referring to Figure 3, this embodiment provides an intelligent retrieval device 210 for calculating the similarity of patent documents based on word frequency and semantics. The examples given serve only to explain the present invention and do not limit its scope. The device includes the following components:
[0087] Data processing module 211: extracts all patent text content from the question bank according to fields and importance, and produces the standard data format used for modeling;
[0088] Intelligent calculation module 212: performs various calculations on the extracted standard data to obtain model data reflecting frequency, semantics, and weight in the text;
[0089] Model building module 213: models and computes on the model data, combines and optimizes the calculation results, and builds an intelligent retrieval model in combination with business requirements;
[0090] Model prediction module 214: for encapsulati...
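One way to picture how modules 211 to 214 fit together is the sketch below, which chains them as interchangeable functions. The class name `RetrievalDevice`, its method names, and the toy Jaccard-overlap predictor are all hypothetical. Because the description of module 214 is truncated in the source, its behavior is left to a caller-supplied function rather than assumed.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class RetrievalDevice:
    """Hypothetical wiring of the four modules of device 210."""
    data_processing: Callable[[Any], Any]          # module 211
    intelligent_calculation: Callable[[Any], Any]  # module 212
    model_building: Callable[[Any], Any]           # module 213
    model_prediction: Callable[[Any, Any], Any]    # module 214 (truncated in source)

    def build(self, raw):
        """Run modules 211 to 213 in order and return the retrieval model."""
        standard = self.data_processing(raw)
        model_data = self.intelligent_calculation(standard)
        return self.model_building(model_data)

    def predict(self, model, query):
        """Delegate the query to the caller-supplied prediction function."""
        return self.model_prediction(model, query)


def jaccard(a, b):
    """Set overlap used by the toy predictor below (an assumption)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


device = RetrievalDevice(
    data_processing=lambda raw: {k: v.lower().split() for k, v in raw.items()},
    intelligent_calculation=lambda std: {k: set(v) for k, v in std.items()},
    model_building=lambda data: data,
    model_prediction=lambda model, q: max(
        (k for k in model if k != q),
        key=lambda k: jaccard(model[q], model[k]),
    ),
)
model = device.build({
    "P1": "word frequency similarity",
    "P2": "semantic similarity word",
    "P3": "battery electrode",
})
best = device.predict(model, "P1")
```

Keeping each module behind a plain function boundary means any one stage, such as the scoring in module 214, can be swapped without touching the others.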