Lightweight general-purpose web page topic crawler method based on a search engine
A search engine and topic crawler technology, applied in the field of information retrieval, that addresses the problems of web crawling accuracy and high implementation cost, and is low-cost and easy to implement.
Embodiment Construction
[0020] The preferred embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings, so that the advantages and features of the present invention can be more easily understood by those skilled in the art, so as to define the protection scope of the present invention more clearly.
[0021] Referring to Figure 1, the present invention provides a novel search engine-based lightweight web page topic crawling method, comprising the following steps:
[0022] (1) Provide a small vocabulary that describes a specific topic as seeds, such as the abbreviation and full name of a commodity, and construct seed expansion rules for the field. For example, the seed of a commodity can be expanded into a series of seeds through brand rules, and the seed of an academic conference can be expanded into a series of seeds by year;
[0023] (2) Convert the expanded seeds into query words, and obtain several candidate websites rela...
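The seed expansion in step (1) and the conversion to query words in step (2) can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names (`expand_by_brand`, `expand_by_year`, `to_queries`), the seed and brand values, and the choice to wrap each query in quotation marks are all assumptions made for illustration, and no search engine is actually called.

```python
def expand_by_brand(seed, brands):
    """Hypothetical brand rule: combine each brand with a commodity seed."""
    return [f"{brand} {seed}" for brand in brands]


def expand_by_year(seed, years):
    """Hypothetical year rule: append each year to a conference seed."""
    return [f"{seed} {year}" for year in years]


def to_queries(seeds):
    """Turn expanded seeds into quoted query strings.

    Duplicates are dropped while the original order is preserved.
    Quoting the phrase is an assumption, not something the source specifies.
    """
    seen = set()
    queries = []
    for seed in seeds:
        query = f'"{seed.strip()}"'
        if query not in seen:
            seen.add(query)
            queries.append(query)
    return queries


# Placeholder seeds; real seeds would come from the topic vocabulary.
commodity_seeds = expand_by_brand("laptop", ["BrandA", "BrandB"])
conference_seeds = expand_by_year("ExampleConf", range(2010, 2013))
queries = to_queries(commodity_seeds + conference_seeds)
```

Each string in `queries` would then be submitted to a search engine to collect candidate websites. How that request is issued depends on the engine used and is left out here.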