Intelligent extraction system and intelligent extraction method for article type web pages
The technology relates to extraction systems for web pages and is applied in special-purpose data processing, instruments, and electrical digital data processing. It addresses three problems: articles cannot be extracted accurately, captured articles have low usability, and extraction relies heavily on manual work.
Embodiment Construction
[0215] The real-time intelligent crawling system consists of five modules or subsystems, as shown in Figure 1: a real-time crawling module, the article-type web page intelligent extraction system, a document near-duplicate removal module, an automatic document classification module, and an article publishing module.
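The five modules listed in [0215] can be read as stages of a pipeline. The following is only a minimal sketch under stated assumptions: the source does not give interfaces for the modules, so the names `Page`, `Article`, `run_pipeline`, and the `toy_*` stand-ins are all hypothetical, and exact-match deduplication is used here purely as a placeholder for the approximate deduplication the patent names.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Page:
    """A crawled page (hypothetical shape, not from the patent)."""
    url: str
    html: str


@dataclass
class Article:
    """An extracted article (hypothetical shape, not from the patent)."""
    url: str
    title: str
    body: str
    category: Optional[str] = None


def run_pipeline(
    pages: Iterable[Page],
    extract: Callable[[Page], Optional[Article]],
    is_duplicate: Callable[[Article, List[Article]], bool],
    classify: Callable[[Article], str],
    publish: Callable[[Article], None],
) -> List[Article]:
    """Chain the five stages; each stage is injected as a callable."""
    published: List[Article] = []
    for page in pages:                        # 1. output of the crawling module
        article = extract(page)               # 2. article extraction
        if article is None:
            continue                          # page was not an article
        if is_duplicate(article, published):  # 3. duplicate removal
            continue
        article.category = classify(article)  # 4. automatic classification
        publish(article)                      # 5. publishing
        published.append(article)
    return published


# Toy stand-ins so the sketch runs end to end (all assumptions).
def toy_extract(page: Page) -> Optional[Article]:
    if not page.html.strip():
        return None
    # Treat the first line as the title and the rest as the body.
    title, _, body = page.html.partition("\n")
    return Article(page.url, title.strip(), body.strip())


def toy_is_duplicate(article: Article, seen: List[Article]) -> bool:
    # Exact body match; a real system would use an approximate measure.
    return any(a.body == article.body for a in seen)


def toy_classify(article: Article) -> str:
    return "sports" if "match" in article.body.lower() else "general"
```

Passing the stages in as callables keeps the order of the modules explicit while leaving each one replaceable, which is one plausible way to organize such a system, not necessarily the one the patent uses.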
[0216] Detailed technical solution of the article-type web page intelligent extraction system of the present invention
[0217] There are many technical solutions in the field of information extraction. The core issue in all of them is how to generate and maintain extraction wrappers. Technically, they fall into two main categories:
[0218] 1) Extraction systems that use machine-generated extraction wrappers can capture a large number of articles, but they cannot extract articles accurately, and the usability of the captured articles is low;
[0219] 2) The extraction system adopts artificially generated extraction wrapper techn...
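To make the term "extraction wrapper" from [0217] concrete, here is a minimal sketch of what a hand-written wrapper for a single page template might look like. The source does not specify any wrapper format, so the regular-expression rules, the `WRAPPER` table, and the `apply_wrapper` function are assumptions made only for illustration.

```python
import re
from typing import Dict, Optional, Pattern

# Hypothetical hand-written wrapper: one regex rule per field,
# valid only for one assumed page template.
WRAPPER: Dict[str, Pattern[str]] = {
    "title": re.compile(r"<h1[^>]*>(.*?)</h1>", re.S),
    "body": re.compile(r'<div class="content">(.*?)</div>', re.S),
}

TAG = re.compile(r"<[^>]+>")  # used to strip markup left inside a field


def apply_wrapper(html: str, wrapper: Dict[str, Pattern[str]]) -> Optional[Dict[str, str]]:
    """Apply every rule; return None if any rule fails to match."""
    record: Dict[str, str] = {}
    for field, pattern in wrapper.items():
        match = pattern.search(html)
        if match is None:
            # The template no longer matches, so the wrapper needs maintenance.
            return None
        record[field] = TAG.sub("", match.group(1)).strip()
    return record
```

A rule set like this is precise for the template it was written for but has to be revised whenever the site layout changes, which illustrates why generating and maintaining wrappers is the central cost.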