Web crawler system based on browser kernel

A web crawler technology based on a browser kernel, applied in the field of web search engines, that can solve problems such as omissions

Inactive Publication Date: 2017-05-10
HANGZHOU ANHENG INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

If you use static analysis to analy




Embodiment Construction

[0024] First, it should be explained that the present invention relates to web search engine technology and is an application of computer technology in the field of Internet technology. Implementing the invention involves several software function modules. The applicant believes that, after carefully reading the application documents and accurately understanding the principle and purpose of the invention, those skilled in the art can fully implement it by combining existing known technologies with their own software programming skills. The software function modules include, but are not limited to, the browser engine module, the network communication module and the policy module. Every module mentioned in the application documents belongs to this category, and the applicant will not list them one by one.

[0025] Below in conjunction with accompanyi...



Abstract

The invention relates to web search engine technology and aims to provide a web crawler system based on a browser kernel. The system comprises a browser engine module, a network communication module and a strategy module, and is used to analyze pages and discover the URLs of other pages. Using dynamic analysis, the system loads the resources a page depends on through its built-in browser kernel, executes JavaScript, and performs dynamic operations on DOM nodes, such as simulated mouse clicks, double clicks and Enter key presses, to discover new pages. This overcomes the defects of traditional crawlers.
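The division of labor described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the source does not specify the interfaces between modules, so `PolicyModule`, `crawl`, `fake_browser_engine` and `FAKE_SITE` are hypothetical names, the same-host and page-limit rules are assumed policies, and the browser kernel is replaced by a stub that returns precomputed links. The network communication module is not modeled at all.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, Set
from urllib.parse import urldefrag, urljoin, urlparse


class PolicyModule:
    """Hypothetical strategy module: decides which discovered URLs are crawled."""

    def __init__(self, seed: str, max_pages: int = 100) -> None:
        self.host = urlparse(seed).netloc   # assumed policy: stay on the seed host
        self.max_pages = max_pages          # assumed policy: cap the crawl size
        self.seen: Set[str] = set()

    def normalize(self, base: str, href: str) -> str:
        # Resolve relative links against the current page and drop the fragment.
        return urldefrag(urljoin(base, href)).url

    def admit(self, url: str) -> bool:
        # Reject duplicates, off-host URLs, and anything beyond the page limit.
        if url in self.seen or len(self.seen) >= self.max_pages:
            return False
        if urlparse(url).netloc != self.host:
            return False
        self.seen.add(url)
        return True


def crawl(seed: str,
          analyze_page: Callable[[str], Iterable[str]],
          max_pages: int = 100) -> List[str]:
    """Breadth-first crawl; analyze_page stands in for the browser engine module."""
    policy = PolicyModule(seed, max_pages)
    queue: Deque[str] = deque()
    if policy.admit(seed):
        queue.append(seed)
    visited: List[str] = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        for href in analyze_page(url):
            target = policy.normalize(url, href)
            if policy.admit(target):
                queue.append(target)
    return visited


# Stub data: in a real system these links would come from rendering the page
# in a browser kernel and firing simulated events on its DOM nodes.
FAKE_SITE = {
    "http://example.test/": ["/a", "/b#top", "http://other.test/x"],
    "http://example.test/a": ["/b", "/"],
    "http://example.test/b": ["/c"],
    "http://example.test/c": [],
}


def fake_browser_engine(url: str) -> List[str]:
    return FAKE_SITE.get(url, [])


if __name__ == "__main__":
    for page in crawl("http://example.test/", fake_browser_engine):
        print(page)
```

Passing the page analyzer in as a callable keeps the crawl loop independent of how URLs are discovered, so a static parser and a browser-driven analyzer could be swapped without touching the policy logic.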

Description

Technical field

[0001] The invention relates to the technical field of web search engines, and in particular to a web crawler system based on a browser kernel.

Background

[0002] Web crawlers have a wide range of applications. They are an important part of web search engines and are also used to collect specific information from the Internet. The core function of a web crawler is to discover the URLs of other pages from a given page.

[0003] At present, common web crawlers are all based on static analysis of pages: when analyzing a page, they neither execute the JavaScript in it nor load the resources it references, such as images and scripts. Static analysis mainly extracts tags, such as <form> tags, whose content may point to the URLs of other pages.

[0004] With the rapid development of Internet technology, the implementation methods of web pages are becoming more and more diverse, and various front-en...
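The static analysis described in paragraph [0003] can be sketched with Python's standard `html.parser` module. This is an illustration, not code from the patent: `StaticLinkExtractor`, `extract_static_urls` and the sample `PAGE` are hypothetical, and reading `href` from `<a>` tags is an assumption, since the source text names only `<form>` legibly.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class StaticLinkExtractor(HTMLParser):
    """Collects URLs from tag attributes without running any script."""

    # Assumed tag/attribute pairs; the source explicitly names only <form>.
    URL_ATTRS = {"a": "href", "form": "action"}

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative URLs against the page address.
                self.urls.append(urljoin(self.base_url, value))


def extract_static_urls(html: str, base_url: str) -> List[str]:
    parser = StaticLinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.urls


# The button's target URL exists only inside JavaScript, so a static parser
# never sees it; finding it requires executing the script and simulating
# the click, which is the gap the browser-kernel approach targets.
PAGE = """
<html><body>
  <a href="/news">News</a>
  <form action="/search" method="get"><input name="q"></form>
  <button id="more">More</button>
  <script>
    document.getElementById("more").onclick = function () {
      window.location = "/hidden" + "/page";
    };
  </script>
</body></html>
"""

if __name__ == "__main__":
    for url in extract_static_urls(PAGE, "http://example.test/"):
        print(url)
```

Only the two URLs written directly in the markup are found. The address assembled inside the click handler is missed, which illustrates the kind of omission that static analysis produces.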


Application Information

IPC(8): G06F17/30, G06F9/445
CPC: G06F9/44521, G06F16/951
Inventor: 范渊, 陈刚, 黄进
Owner: HANGZHOU ANHENG INFORMATION TECH CO LTD