A system and method for detecting the key content of a web page based on visual characteristics
A key content detection technology based on visual features, applied in the Internet field. It addresses problems such as poor identification of key content, lack of self-learning algorithms, and complex process implementation.
Embodiment Construction
[0030] The present invention is implemented as follows. First, the sample processing module obtains the DOM components from which features can be extracted. Next, the feature extraction module obtains the features of all experimental samples, which form the feature table supplied to the machine learning module. The machine learning module then measures the accuracy of the candidate models, selects a model with high accuracy, and tunes its parameters to obtain the best model. Finally, the web crawler submits DOM component information, and the key content detection module returns the detection result.
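The accuracy-based selection step in the machine learning module can be illustrated with a minimal sketch. The patent text does not name the models or features used, so everything here is an assumption made for illustration: the `Stump` one-feature threshold classifier stands in for a real model, and the feature names `area_ratio` and `font_size` and all sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

# A sample is (feature table row, label); label 1 = key content, 0 = other.
Sample = Tuple[Dict[str, float], int]


@dataclass
class Stump:
    """One-feature threshold classifier (hypothetical stand-in for a real model)."""
    feature: str
    threshold: float

    def predict(self, features: Dict[str, float]) -> int:
        return 1 if features[self.feature] >= self.threshold else 0


def accuracy(model: Stump, samples: Sequence[Sample]) -> float:
    """Fraction of samples whose predicted label matches the manual label."""
    correct = sum(model.predict(f) == y for f, y in samples)
    return correct / len(samples)


def fit_stump(feature: str, train: Sequence[Sample]) -> Stump:
    """Parameter tuning: pick the threshold with the best training accuracy."""
    thresholds = sorted({f[feature] for f, _ in train})
    return max((Stump(feature, t) for t in thresholds),
               key=lambda m: accuracy(m, train))


def select_model(train: Sequence[Sample],
                 validation: Sequence[Sample]) -> Tuple[Stump, float]:
    """Fit one candidate per feature, keep the one most accurate on validation data."""
    candidates: List[Stump] = [fit_stump(name, train) for name in train[0][0]]
    scored = [(accuracy(m, validation), m) for m in candidates]
    best_acc, best = max(scored, key=lambda pair: pair[0])
    return best, best_acc


# Hypothetical feature table rows, invented for this sketch.
TRAIN: List[Sample] = [
    ({"area_ratio": 0.60, "font_size": 16}, 1),
    ({"area_ratio": 0.45, "font_size": 14}, 1),
    ({"area_ratio": 0.05, "font_size": 14}, 0),
    ({"area_ratio": 0.10, "font_size": 18}, 0),
]
VALIDATION: List[Sample] = [
    ({"area_ratio": 0.50, "font_size": 12}, 1),
    ({"area_ratio": 0.08, "font_size": 16}, 0),
]

if __name__ == "__main__":
    model, acc = select_model(TRAIN, VALIDATION)
    print(model, acc)
```

Scoring candidates on samples held out from training keeps the selection from simply rewarding the model that memorized the training set. A real system would likely swap `Stump` for a proper learning library.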
[0031] 1. The implementation process of the present invention is as follows:
[0032] (1) Collect a library of web page samples, use chrome-headless to dynamically render the HTML files, and manually mark the key content in each web page to form the initial sample set for the scheme.
[0033] (2) Traverse the DOM components in the current page according to the dynamic rende...
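Steps (1) and (2) might be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the function names, the `google-chrome` binary name, the record fields, and the use of `id` attributes to carry manual labels are all hypothetical. The sketch walks the serialized DOM with Python's standard `html.parser`, so it records only structural properties. Visual features such as position and size would have to be read from the live browser.

```python
import subprocess
from html.parser import HTMLParser
from typing import Dict, List, Set

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def render_dom(url: str, chrome_binary: str = "google-chrome") -> str:
    """Return the DOM after scripts have run, using headless Chrome's --dump-dom.

    The binary name is an assumption and varies by platform.
    """
    result = subprocess.run(
        [chrome_binary, "--headless", "--dump-dom", url],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


class ComponentCollector(HTMLParser):
    """Record one entry per element: tag, id, nesting depth, contained text length."""

    def __init__(self) -> None:
        super().__init__()
        self.stack: List[Dict] = []       # currently open elements
        self.components: List[Dict] = []  # every element seen, in document order

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "id": dict(attrs).get("id"),
                "depth": len(self.stack), "text_len": 0, "label": None}
        self.components.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; tolerate unclosed children.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Text counts toward every element that is currently open.
        length = len(data.strip())
        for node in self.stack:
            node["text_len"] += length


def collect_components(rendered_html: str) -> List[Dict]:
    collector = ComponentCollector()
    collector.feed(rendered_html)
    collector.close()
    return collector.components


def label_components(components: List[Dict], key_ids: Set[str]) -> List[Dict]:
    """Apply manual marks: label 1 for elements whose id was marked as key content."""
    for node in components:
        node["label"] = 1 if node["id"] in key_ids else 0
    return components


SAMPLE_HTML = ('<html><body><div id="nav"><a href="/">Home</a></div>'
               '<div id="article"><p>Main story text</p></div></body></html>')

if __name__ == "__main__":
    for row in label_components(collect_components(SAMPLE_HTML), {"article"}):
        print(row)
```

In practice the string passed to `collect_components` would come from `render_dom`, so that content injected by scripts is included in the sample set.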