Webpage information extraction method and device based on the HTTP protocol
A web page information extraction method, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses the difficulty of acquiring target information from web pages, and achieves strongly targeted extraction with improved efficiency and accuracy.
Embodiment Construction
[0026] The principles and features of the present invention are described below with reference to the accompanying drawings. The examples given serve only to explain the present invention and are not intended to limit its scope.
[0027] As shown in Figure 1, the present embodiment provides a method for extracting webpage information based on the HTTP protocol, including:
[0028] Template generation step: customize a page parsing template for the target page from which information is to be extracted, and predefine the target fields and verification rules in the page parsing template;
[0029] Web page address parsing step: parse the web page address of the target page to obtain the HTML source file of the target page;
[0030] Information extraction step: read and parse the HTML source file of the target page, and extract the page information matching the predefined target field of the page parsing template from the HTML sourc...
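The three steps above can be illustrated with a minimal sketch in Python. It is not the patented implementation: the `FieldRule` structure, the choice to locate a field by tag name and `class` attribute, and the use of regular expressions as verification rules are all assumptions made for illustration, since the text does not specify how templates or rules are encoded. Only the standard library is used.

```python
import re
from dataclasses import dataclass
from html.parser import HTMLParser
from urllib.request import urlopen


@dataclass
class FieldRule:
    """One target field of a page parsing template (hypothetical layout)."""
    name: str       # target field name, e.g. "price"
    tag: str        # HTML tag expected to hold the value
    css_class: str  # class attribute value identifying the element
    pattern: str    # verification rule, assumed here to be a regex


class TemplateExtractor(HTMLParser):
    """Collects the text of the first element matching each FieldRule."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules
        self.active = None  # rule whose element is currently open
        self.depth = 0      # nesting depth of same-named tags
        self.buffer = []
        self.raw = {}

    def handle_starttag(self, tag, attrs):
        if self.active is not None:
            if tag == self.active.tag:
                self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for rule in self.rules:
            if (rule.tag == tag and rule.css_class in classes
                    and rule.name not in self.raw):
                self.active, self.depth, self.buffer = rule, 1, []
                break

    def handle_endtag(self, tag):
        if self.active is not None and tag == self.active.tag:
            self.depth -= 1
            if self.depth == 0:
                self.raw[self.active.name] = "".join(self.buffer).strip()
                self.active = None

    def handle_data(self, data):
        if self.active is not None:
            self.buffer.append(data)


def fetch_html(url, timeout=10.0):
    """Address parsing step: request the page over HTTP and decode it."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract(html, rules):
    """Extraction step: keep only fields that pass their verification rule."""
    parser = TemplateExtractor(rules)
    parser.feed(html)
    parser.close()
    result = {}
    for rule in rules:
        value = parser.raw.get(rule.name)
        if value is not None and re.fullmatch(rule.pattern, value):
            result[rule.name] = value
    return result
```

In this sketch a template is simply a list of `FieldRule` objects, and a page would be processed with `extract(fetch_html(url), rules)`. A field whose extracted text fails its pattern is dropped rather than returned, which is one possible reading of how "verification rules" might be applied.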