Subject-oriented web page collection system
A subject-oriented web page collection system, applied in the field of network communication. It addresses problems such as the difficulty of accurately identifying the pages of a target website, the difficulty of delimiting a page, and the collection of large numbers of off-topic pages, and is intended to improve collection results.
Embodiment Construction
[0026] As shown in Figure 1, the system of the present invention consists of three modules: a sample training module, a strategy search module and a collection module. The sample training module analyzes a manually assembled library of sample web pages to compute the theme feature vector and its values, and calculates the page similarity threshold. The strategy search module controls the set of URL addresses that the system retrieves, restricting the search range to the candidate seed websites. The collection module accepts the URL addresses sent by the strategy search module and performs page purification, feature extraction, analysis, and collection and storage.
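The interaction of the three modules can be illustrated with a minimal Python sketch. It is not the patented implementation: the source does not specify how the feature vector, similarity measure, or threshold are computed, so the sketch assumes normalized term frequencies, cosine similarity, and a threshold equal to the lowest similarity of any sample page to the theme vector. All function names (`purify`, `train`, `in_search_range`, `collect`) and the example hosts are hypothetical.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse


def purify(html):
    """Crude page purification (assumption): strip anything that looks like a tag."""
    return re.sub(r"<[^>]+>", " ", html)


def term_vector(text):
    """Feature extraction (assumption): normalized term frequencies."""
    counts = Counter(w for w in text.lower().split() if w.isalpha())
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def train(samples):
    """Sample training: average the sample vectors into a theme vector and
    take the lowest sample-to-theme similarity as the threshold (assumption)."""
    vectors = [term_vector(purify(s)) for s in samples]
    theme = Counter()
    for v in vectors:
        for t, w in v.items():
            theme[t] += w / len(vectors)
    threshold = min(cosine(v, theme) for v in vectors)
    return dict(theme), threshold


def in_search_range(url, seed_hosts):
    """Strategy search: accept only URLs hosted on a candidate seed website."""
    host = urlparse(url).hostname or ""
    return any(host == s or host.endswith("." + s) for s in seed_hosts)


def collect(pages, theme, threshold, seed_hosts):
    """Collection: keep on-seed pages whose similarity reaches the threshold.
    `pages` maps URL -> already-fetched HTML; fetching is out of scope here."""
    kept = []
    for url, html in pages.items():
        if not in_search_range(url, seed_hosts):
            continue
        if cosine(term_vector(purify(html)), theme) >= threshold:
            kept.append(url)
    return kept
```

In this sketch the page content is passed in as a dictionary rather than fetched over the network, so that the filtering logic of the strategy search and collection modules can be read in isolation.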
[0027] The specific functions and interaction process of several main modules are described below.
[0028] 1. Strategy search module
[0029] The functional design of the strategy search module is based on Internet information search, which is a technology based on ...