Web page information extraction system and method
A web page information extraction technology, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses the problems of dissimilar page structures, fast website update speed, and long running time, with the effect of improving the accuracy rate and the recall rate and of fast extraction.
Embodiment Construction
[0072] The present invention will be further described in detail below in conjunction with the accompanying drawings.
[0073] The system structure of the present invention is shown in Figure 1 and includes:
[0074] The template generation module 101 is configured to select webpages to be automatically marked from the webpage collection, classify the webpages to be automatically marked according to the training webpages marked by the user, and generate webpage templates corresponding to the categories of the training webpages.
[0075] The webpage homogenization module 102 is used to mask, according to the webpage template of a category, the differences between that template and each webpage to be automatically marked that belongs to the category.
[0076] The automatic labeling module 103 is used to parse the training webpage of the category, generate a first wrapper file, and automatically label the webpage to be automatically labe...
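The flow through modules 101 to 103 can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent: pages are reduced to the tag paths of their text nodes, classification uses Jaccard similarity between path sets, a template is the set of paths shared by the pages of a category, and a wrapper is a plain mapping from tag path to field name. All function names (`parse`, `classify`, `build_template`, `homogenize`, `build_wrapper`, `apply_wrapper`) and the sample pages are hypothetical.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class PathCollector(HTMLParser):
    """Collects (tag path, text) pairs for every non-empty text node."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop back to the matching open tag, tolerating unclosed children.
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.items.append(("/".join(self.stack), text))


def parse(html):
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    return collector.items


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def classify(items, training):
    """Module 101 (assumed): pick the training category with the most similar paths."""
    paths = {p for p, _ in items}
    return max(training, key=lambda c: jaccard(paths, {p for p, _ in training[c]}))


def build_template(pages):
    """Module 101 (assumed): the template is the set of paths shared by all pages."""
    return set.intersection(*[{p for p, _ in items} for items in pages])


def homogenize(items, template):
    """Module 102 (assumed): mask page content that the template does not cover."""
    return [(p, t) for p, t in items if p in template]


def build_wrapper(training_items, labels):
    """Module 103 (assumed): map the path of each user-labeled text to its field."""
    return {p: labels[t] for p, t in training_items if t in labels}


def apply_wrapper(wrapper, items):
    return {wrapper[p]: t for p, t in items if p in wrapper}


# Hypothetical training pages, one per category, and the user's labels.
TRAINING = {
    "product": parse("<html><body><h1>Widget</h1><div><span>$5</span></div></body></html>"),
    "news": parse("<html><body><h2>Headline</h2><p>Story text</p></body></html>"),
}
USER_LABELS = {"Widget": "name", "$5": "price"}
UNLABELED = "<html><body><h1>Gadget</h1><div><span>$9</span></div><p>Ad banner</p></body></html>"

page = parse(UNLABELED)
category = classify(page, TRAINING)
template = build_template([TRAINING[category], page])
wrapper = build_wrapper(TRAINING[category], USER_LABELS)
record = apply_wrapper(wrapper, homogenize(page, template))
print(category, record)
```

In this sketch the unlabeled page is assigned to the product category, the extra paragraph that the training page lacks is masked during homogenization, and the wrapper derived from the user's labels then assigns the remaining text to the name and price fields. A real implementation would need a richer page representation, since tag paths alone cannot distinguish repeated sibling elements.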