Method and device for extracting page information
A technology for extracting page information using positioning information, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses the low efficiency of extracting page information and achieves efficient and accurate extraction.
Examples
Embodiment 1
[0070] Referring to Figure 1A, this embodiment of the present invention provides a method for extracting page information.
[0071] In this embodiment, before the page information of the webpage to be processed is extracted, an algorithm library must be preset offline and the transcoding configuration information of the webpage to be processed must be configured. The transcoding configuration information includes the positioning information and data structure type of each service block of the webpage to be processed. The preset algorithm library includes data structure types and their corresponding recognition algorithms. As shown in Figure 1B, the offline configuration operation proceeds as follows:
[0072] S1: Obtain the DOM (Document Object Model) tree of the webpage to be processed, and divide the DOM tree according to business type to obtain each bus...
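The relationship between the transcoding configuration information and the preset algorithm library described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Node` class is a toy stand-in for a DOM node, the positioning information is assumed to be a path of child indices from the root, and the data structure types `"text"` and `"list"` with their recognition functions are hypothetical. The source does not specify any of these details.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Node:
    """Toy stand-in for a DOM node (hypothetical, not a real DOM API)."""
    tag: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


@dataclass
class BlockConfig:
    """One service block's entry in the transcoding configuration."""
    name: str
    position: Tuple[int, ...]  # assumed positioning info: child indices from the root
    structure_type: str        # key into the algorithm library


def locate(root: Node, position: Tuple[int, ...]) -> Node:
    """Walk the tree along the child-index path to reach a service block."""
    node = root
    for index in position:
        node = node.children[index]
    return node


def recognize_text(block: Node) -> str:
    """Hypothetical recognition algorithm: join all text found in the block."""
    parts = [block.text] + [recognize_text(child) for child in block.children]
    return " ".join(part for part in parts if part)


def recognize_list(block: Node) -> List[str]:
    """Hypothetical recognition algorithm: one text entry per child node."""
    return [recognize_text(child) for child in block.children]


# Preset algorithm library: data structure type -> recognition algorithm.
ALGORITHM_LIBRARY: Dict[str, Callable[[Node], Any]] = {
    "text": recognize_text,
    "list": recognize_list,
}


def extract(root: Node, config: List[BlockConfig]) -> Dict[str, Any]:
    """Locate each configured block and apply the algorithm for its type."""
    return {
        block.name: ALGORITHM_LIBRARY[block.structure_type](locate(root, block.position))
        for block in config
    }
```

With this layout, supporting a new data structure type only requires registering one more function in `ALGORITHM_LIBRARY`; the per-page configuration stays a plain list of block entries.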
Embodiment 2
[0124] Referring to Figure 2, an embodiment of the present invention provides a device for extracting page information, configured to execute the method for extracting page information provided in Embodiment 1. The device includes:
[0125] A first obtaining module 201, configured to obtain the source code and the Document Object Model (DOM) tree of the webpage to be processed, and to obtain the transcoding configuration information of the webpage to be processed from the server. The transcoding configuration information includes the positioning information and data structure type of each service block of the webpage to be processed;
[0126] When a user browses a webpage through a terminal, the terminal sends a webpage acquisition request to the device, and the webpage acquisition request carries the webpage address of the webpage and the terminal identifier. The device receives the webpage acquisition request sent ...
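The request handled by the first obtaining module can be sketched as follows. This is only an illustration under stated assumptions: the class and field names (`WebPageRequest`, `ObtainedPage`, `FirstObtainingModule`) are hypothetical, the two fetch functions are injected placeholders rather than real network calls, the configuration is assumed to be looked up by webpage address, and parsing the source into a DOM tree is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class WebPageRequest:
    """Webpage acquisition request sent by the terminal."""
    url: str          # webpage address carried by the request
    terminal_id: str  # terminal identifier carried by the request


@dataclass
class ObtainedPage:
    """What the first obtaining module hands on to later modules."""
    source: str              # source code of the webpage to be processed
    config: Dict[str, dict]  # transcoding configuration information


class FirstObtainingModule:
    """Hypothetical sketch of module 201; fetchers are injected placeholders."""

    def __init__(
        self,
        fetch_source: Callable[[str], str],
        fetch_config: Callable[[str], Dict[str, dict]],
    ) -> None:
        self._fetch_source = fetch_source
        self._fetch_config = fetch_config

    def obtain(self, request: WebPageRequest) -> ObtainedPage:
        # Assumption: both the page source and its configuration are
        # looked up by the webpage address carried in the request.
        return ObtainedPage(
            source=self._fetch_source(request.url),
            config=self._fetch_config(request.url),
        )
```

Injecting the fetchers keeps the sketch independent of any particular HTTP client or configuration server.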