Method and device of information collection

An information collection and template technology, applied in the computer field, that addresses problems such as an increased number of logins, poor customization, and failed information collection, and achieves the effect of reducing risk.

Active Publication Date: 2018-11-23
BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD +1
9 Cites, 10 Cited by

AI Technical Summary

Problems solved by technology

Conventional information collection methods cannot obtain content that is loaded asynchronously or lazily, so they cannot achieve "what you see is what you get".
In addition, conventional information collection methods are poorly customizable, and without simulated manual operations they are easily recognized as non-manual operations by the machine learning algorithms on the target webpage, which causes information collection to fail.

Embodiment Construction

[0029] Exemplary embodiments of the present invention are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present invention to facilitate understanding, and they should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

[0030] In traditional information collection, the usual technique is to initiate an HTTP request to the target webpage of the information to be collected, or to use Apache components to optimize the HTTP connection by adding pooled object management and the like, and finally to download the source code of the target webpage and carry out a series of operation...
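The traditional approach described in [0030] can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the method of the patent: `download_source`, `TextCollector`, and `extract_text` are hypothetical names, and connection pooling is omitted. It shows why content filled in later by scripts is missing from the downloaded source.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


def download_source(url, timeout=10.0):
    """Fetch the raw page source with one plain HTTP request (no rendering)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class TextCollector(HTMLParser):
    """Collects text that is already present in the static source."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(source):
    """Parse downloaded source and return the text fragments it contains."""
    parser = TextCollector()
    parser.feed(source)
    parser.close()
    return parser.chunks


# Hypothetical page: the price is filled in by a script after loading,
# so a source-only parse never sees it.
STATIC_SOURCE = (
    "<html><body><h1>Item</h1><div id='price'></div>"
    "<script>loadPrice('#price')</script></body></html>"
)
print(extract_text(STATIC_SOURCE))
```

Because nothing executes the script, the empty `price` element stays empty, which is the gap that rendering the page in a browser process is meant to close.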

Abstract

The invention provides a method and a device of information collection. The method of information collection includes: receiving an information collection task distributed from a processing center; starting one or more browser processes according to the information collection task, and loading a simulation behavior template in a process of starting the one or more browser processes; receiving a uniform resource locator (URL) of a target webpage of to-be-collected information from the processing center; carrying out rendering on the target webpage according to the received URL, and obtaining page rendering status of the target webpage; according to the type of the URL, determining whether the loaded simulation behavior template needs to be configured on the target webpage; in response to determining that the simulation behavior template needs to be configured, triggering a function, which is defined in the simulation behavior template, on the target webpage; and parsing the target webpage, and returning a parsing result to cloud storage of the processing center.
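The sequence of steps in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: `Page`, `BrowserProcess`, `url_type`, `scroll_to_bottom`, and `collect` are hypothetical names, the browser is a stub that renders nothing, the URL type rule is invented, and a plain list stands in for the cloud storage of the processing center.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List
from urllib.parse import urlparse


@dataclass
class Page:
    """Stand-in for a rendered page (no real browser is involved)."""
    url: str
    status: str = "pending"
    events: List[str] = field(default_factory=list)


# A simulation behavior template, modeled here as a mapping from a URL type
# to the function that should be triggered on pages of that type.
BehaviorTemplate = Dict[str, Callable[[Page], None]]


def scroll_to_bottom(page: Page) -> None:
    """Example simulated behavior; a real one would drive the browser."""
    page.events.append("scroll_to_bottom")


class BrowserProcess:
    """Stub browser process that loads the template when it starts."""

    def __init__(self, template: BehaviorTemplate) -> None:
        self.template = template

    def render(self, url: str) -> Page:
        # A real implementation would render the page in a browser engine.
        return Page(url=url, status="complete")


def url_type(url: str) -> str:
    """Assumed rule: the first path segment names the type ('list', 'detail')."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return segments[0] if segments else "home"


def collect(browser: BrowserProcess, url: str, cloud_storage: List[dict]) -> dict:
    page = browser.render(url)                      # render, obtain status
    behavior = browser.template.get(url_type(url))  # template needed for this type?
    # Assumption: the behavior is only triggered once rendering has completed.
    if page.status == "complete" and behavior is not None:
        behavior(page)                              # trigger template function
    result = {"url": page.url, "status": page.status, "events": list(page.events)}
    cloud_storage.append(result)                    # return the parsing result
    return result
```

With a template of `{"list": scroll_to_bottom}`, a URL such as `https://example.com/list/1` would have the scroll behavior triggered, while `https://example.com/detail/1` would be parsed without it.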

Description

Technical field

[0001] The invention relates to the field of computers, in particular to an information collection method and device.

Background

[0002] Network information collection is a set of procedures that uses network robots (commonly known as web crawlers) to automatically collect information on the Internet according to a pre-agreed specification and protocol. Different collection algorithms can be used: depending on the scenario, a depth-first algorithm, a breadth-first algorithm, or a combination of the two is used to traverse the topology of a website and obtain its information.

[0003] At present, with the optimization and improvement of resources such as server hardware and network bandwidth, and the enrichment of the front-end technology of each site, the loading of web pages consumes bandwidth and increases traffic. Most sites adopt methods such as delayed asynchronous loading and lazy loading of display information. It is t...
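The depth-first and breadth-first collection orders mentioned in [0002] can be compared on a small in-memory link graph. This is only a sketch under stated assumptions: `crawl_order` and `SITE` are hypothetical, and a dictionary of links stands in for pages that a real crawler would download and parse.

```python
from collections import deque
from typing import Dict, List

LinkGraph = Dict[str, List[str]]


def crawl_order(graph: LinkGraph, seed: str, depth_first: bool = False) -> List[str]:
    """Return the order in which pages are visited, starting from seed."""
    frontier = deque([seed])
    seen = set()
    order = []
    while frontier:
        # A stack (pop from the right) gives depth-first order,
        # a queue (pop from the left) gives breadth-first order.
        url = frontier.pop() if depth_first else frontier.popleft()
        if url in seen:
            continue  # already visited through another link
        seen.add(url)
        order.append(url)
        links = graph.get(url, [])
        # Reverse for the stack so links are explored in their listed order.
        for link in (reversed(links) if depth_first else links):
            if link not in seen:
                frontier.append(link)
    return order


# Hypothetical site structure used only for illustration.
SITE = {"/": ["/a", "/b"], "/a": ["/a1"], "/b": ["/b1"]}
print(crawl_order(SITE, "/"))
print(crawl_order(SITE, "/", depth_first=True))
```

Breadth-first order visits every page at one link distance before going further, whereas depth-first order follows one branch to its end before returning to the next.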

Application Information

IPC(8): G06F17/30
Inventor: 李杰安伟佳许斌
Owner: BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD