
Webpage analysis container and method

A webpage analysis technology, applied in the field of webpage analysis, that solves problems such as the inability to collect the results and content produced by running scripts, and that improves the precision and success rate of webpage collection.

Active Publication Date: 2013-10-23
BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] The technical problem to be solved by the present invention is to overcome the defect that traditional webpage collection methods in the prior art cannot collect the results and content produced after client dynamic scripts have run, and to provide a webpage parsing container that can collect and parse webpages containing client dynamic scripts, together with a webpage parsing method realized by using said webpage parsing container.



Examples


Embodiment Construction

[0029] The preferred embodiments of the present invention are given below in conjunction with the accompanying drawings to describe the technical solution of the present invention in detail.

[0030] As shown in Figure 1, the webpage analysis container of the preferred embodiment of the present invention includes a webpage download module 1, a detection module 2, a script analysis module 3 and a page rendering module 4.

[0031] The webpage download module 1 sends multiple requests to a web server to obtain the html text of a webpage from the web server. The detection module 2 examines the html text to detect the version of the html in the html text, and the version and classification of the dynamic script of at least one dynamic script trigger event in the html text. The script analysis module 3 then calls a script engine with the same version as the dynamic script of the at least one dynamic script trigger event and the same clas...
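The detection step described in [0031] can be illustrated with a minimal sketch. It is not the patent's implementation: the class name `DetectionSketch`, the helper `html_version`, the sample page and the classification labels are all assumptions made for illustration. The sketch uses Python's standard `html.parser` module to read the doctype, the declared language of each script block, and inline event-handler attributes such as `onload` or `onclick`. Detecting the script version is left out, because the source does not say how that is done.

```python
from html.parser import HTMLParser


class DetectionSketch(HTMLParser):
    """Hypothetical detector: records doctype, script kinds and trigger events."""

    def __init__(self):
        super().__init__()
        self.doctype = None
        self.script_kinds = []    # classification of each <script> block
        self.trigger_events = []  # (tag, event attribute, handler source)

    def handle_decl(self, decl):
        # Called for declarations such as <!DOCTYPE html>.
        if decl.lower().startswith("doctype"):
            self.doctype = decl

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script":
            # Assumption: a script with no declared language is javascript.
            declared = (attrs.get("type") or attrs.get("language")
                        or "javascript").lower()
            for kind in ("vbscript", "jscript", "javascript"):
                if kind in declared:
                    self.script_kinds.append(kind)
                    break
            else:
                self.script_kinds.append("unknown")
        # Inline handlers (onload, onclick, ...) are treated as trigger events.
        for name, value in attrs.items():
            if name.startswith("on"):
                self.trigger_events.append((tag, name, value))


def html_version(doctype):
    """Map a doctype declaration to a coarse, illustrative version label."""
    if doctype is None:
        return "unknown"
    text = doctype.lower().strip()
    if text == "doctype html":
        return "html5"
    if "xhtml" in text:
        return "xhtml"
    if "html 4" in text:
        return "html4"
    return "unknown"


# Illustrative sample page, not taken from the patent.
sample = (
    '<!DOCTYPE html><html><body onload="init()">'
    '<script type="text/vbscript">MsgBox 1</script>'
    '<script>var x = 1;</script>'
    '<a onclick="go()">link</a></body></html>'
)
detector = DetectionSketch()
detector.feed(sample)
detector.close()
```

After `feed` returns, `detector.script_kinds` and `detector.trigger_events` hold what a later script-analysis step would need in order to pick a matching engine, and `html_version(detector.doctype)` gives the label a rendering step could use.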


Abstract

The invention discloses a webpage analysis container and method. The webpage analysis container comprises a webpage downloading module, a detection module, a script analysis module and a page rendering module. The webpage downloading module repeatedly sends requests to a web server to obtain the html text of a webpage. The detection module detects the version of the html in the html text, and the version and classification of the dynamic script of at least one dynamic script triggering event. The script analysis module calls a script engine whose version and classification match those of the dynamic script of the dynamic script triggering event, and uses it to parse and run the at least one dynamic script triggering event. The page rendering module calls a page rendering engine that matches the detected html version to render the webpage, and adds the result produced by the script engine to the webpage. The webpage analysis container and method make it possible to collect and analyze complicated webpages that include client-side dynamic scripts, so that all of the content in the webpage can be obtained and the precision and success rate of webpage collection are improved.
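The dispatch described in the abstract, choosing a script engine by script classification and version and a rendering engine by html version, can be sketched as a pair of lookup tables. Everything named here is hypothetical: the `DetectedScript` and `ParsedPage` types, the registries, the version strings "1.5" and "5.8", and the lambda stand-ins that only echo their input instead of executing a script or rendering a page. It is a sketch of the control flow under those assumptions, not the patented implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class DetectedScript:
    kind: str     # classification, e.g. "javascript" or "vbscript"
    version: str  # script version reported by the detection step
    source: str   # handler code bound to the trigger event


@dataclass
class ParsedPage:
    html_version: str
    html_text: str
    scripts: List[DetectedScript] = field(default_factory=list)


# Hypothetical stand-ins for real script engines, keyed by (kind, version).
SCRIPT_ENGINES: Dict[Tuple[str, str], Callable[[str], str]] = {
    ("javascript", "1.5"): lambda src: f"[js 1.5 ran: {src}]",
    ("vbscript", "5.8"): lambda src: f"[vbs 5.8 ran: {src}]",
}

# Hypothetical stand-ins for rendering engines, keyed by html version.
# Each one appends the script results to the page text.
RENDER_ENGINES: Dict[str, Callable[[str, List[str]], str]] = {
    "html4": lambda text, results: text + "".join(results),
}


def parse_page(page: ParsedPage) -> str:
    """Run each detected script on a matching engine, then render the page."""
    results = []
    for script in page.scripts:
        engine = SCRIPT_ENGINES.get((script.kind, script.version))
        if engine is None:
            raise LookupError(f"no script engine for {script.kind} {script.version}")
        results.append(engine(script.source))
    render = RENDER_ENGINES.get(page.html_version)
    if render is None:
        raise LookupError(f"no rendering engine for {page.html_version}")
    return render(page.html_text, results)


page = ParsedPage(
    html_version="html4",
    html_text="<p>hi</p>",
    scripts=[DetectedScript("javascript", "1.5", "init()")],
)
rendered = parse_page(page)
```

Keying the registries on the detected values keeps the matching rule explicit: a script or page with no matching engine raises an error rather than silently running on a mismatched one.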

Description

Technical field

[0001] The present invention relates to a webpage analysis container and method, in particular to a webpage analysis container capable of collecting and analyzing webpages that include client dynamic scripts, and a webpage analysis method realized by using the webpage analysis container.

Background technique

[0002] With the rapid development of the Internet, a wide variety of websites have appeared, and many of them include webpages with attractive display effects and a good user experience. These webpages use javascript, vbscript, jscript and similar technologies (javascript, vbscript and jscript are all client script languages and client dynamic script technologies commonly used in the prior art). Because these dynamic script technologies are so widely used, the originally simple html (hypertext markup language) webpage has become very complex and very difficult to extract.

[0003] The traditional web page information collection technology simulates http (hypertext ...


Application Information

IPC(8): G06F17/30
Inventor: 黄哲铿
Owner: BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD