Multi-process multi-thread distributed crawler method, system and device

A crawler system using multi-threading technology, applied in the field of big data, addresses the problem of low data collection efficiency and achieves faster, more efficient data acquisition with improved performance.

Active Publication Date: 2021-03-12
深圳前瞻资讯股份有限公司
5 Cites, 0 Cited by
AI Technical Summary

Problems solved by technology

[0005] To address the problem of low data collection efficiency, this application provides a multi-process, multi-thread distributed crawler method, system and device.


Examples

Embodiment approach

[0071] Referring to Figure 1 and Figure 2, as a further implementation of the multi-process, multi-thread distributed crawler method, the crawler method also includes:

[0072] Alarm step 107: if a crawler is seriously abnormal, suspend the abnormal crawler task and send an alarm message.

[0073] In this further embodiment of the multi-process, multi-thread distributed crawler method, maintenance personnel can act promptly on the alarm information so that the crawler recovers as soon as possible and the task data for the crawler task is obtained as early as possible, which helps improve data collection efficiency to a certain extent. Suspending the abnormal crawler task also helps, to a certain extent, to prevent that task from being lost.

[0074] As one implementation of the alarm information, the alarm information includes, but is not limited to, emails and/or short messa...
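
Alarm step 107 can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `CrawlerTask` type, the `alarm_step` function, and the failure-count threshold are hypothetical, since this excerpt does not define what counts as "seriously abnormal". The notifiers are plain callables standing in for the email and short-message channels.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable


class TaskState(Enum):
    RUNNING = "running"
    SUSPENDED = "suspended"


@dataclass
class CrawlerTask:
    task_id: str
    state: TaskState = TaskState.RUNNING
    consecutive_failures: int = 0


# Assumption: the source does not say what "seriously abnormal" means,
# so this sketch uses a consecutive-failure count as a stand-in.
SEVERE_FAILURE_THRESHOLD = 3


def is_seriously_abnormal(task: CrawlerTask) -> bool:
    return task.consecutive_failures >= SEVERE_FAILURE_THRESHOLD


def alarm_step(task: CrawlerTask,
               notifiers: Iterable[Callable[[str], None]]) -> bool:
    """Suspend a seriously abnormal task and send an alarm message.

    Returns True if an alarm was raised, False otherwise.
    """
    if task.state is not TaskState.RUNNING or not is_seriously_abnormal(task):
        return False
    # Suspend rather than discard, so the task can be resumed after maintenance.
    task.state = TaskState.SUSPENDED
    message = (f"Crawler task {task.task_id} suspended after "
               f"{task.consecutive_failures} consecutive failures")
    for notify in notifiers:  # e.g. an email sender and a short-message sender
        notify(message)
    return True
```

Because a suspended task is skipped on later calls, repeated checks do not send duplicate alarms for the same task.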

Abstract

The invention relates to a multi-process, multi-thread distributed crawler method, system and device, and belongs to the field of big data technology. The crawler method comprises a resource configuration step, a virtual server acquisition step, a virtual server state judgment step, a collected data exception judgment step, a virtual server switching step, a data cleaning and storage step, and an alarm step. The crawler system comprises a resource configuration module, a virtual server acquisition module, a virtual server state judgment module, a collected data exception judgment module, a virtual server switching module, a data cleaning and storage module, and an alarm module. Compared with the prior art, the method, system and device alleviate the problem of relatively low data acquisition efficiency.
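
One way the named steps might fit together is sketched below in Python. This is a sketch under stated assumptions, not the patented implementation: the excerpt lists the steps but does not describe how they are ordered or what criteria they apply. The function `crawl_with_switching`, its parameters, and the rule that empty data counts as a collection exception are all hypothetical.

```python
from typing import Callable, List, Optional


def crawl_with_switching(
    url: str,
    servers: List[str],                         # resource configuration (assumed: a list of server names)
    fetch: Callable[[str, str], Optional[str]],  # (server, url) -> raw data, or None on failure
    is_healthy: Callable[[str], bool],
    store: List[str],
    alarm: Callable[[str], None],
) -> Optional[str]:
    """Try each configured virtual server in turn; return the one that succeeded."""
    for server in servers:                      # virtual server acquisition
        if not is_healthy(server):              # virtual server state judgment
            continue                            # virtual server switching
        raw = fetch(server, url)
        # Collected data exception judgment.
        # Assumption: missing or blank data is treated as abnormal.
        if raw is None or not raw.strip():
            continue                            # virtual server switching
        store.append(raw.strip())               # data cleaning and storage
        return server
    # No server produced usable data: hand over to the alarm step.
    alarm(f"All virtual servers failed for {url}")
    return None
```

In this sketch, switching is simply moving on to the next configured server; a real system would likely also distribute URLs across processes and threads, which the excerpt does not detail.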

Description

Technical field

[0001] The present application relates to the field of big data technology, and in particular to a multi-process, multi-thread distributed crawler method, system and device.

Background

[0002] Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence. Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that responds in a manner similar to human intelligence. Research in this field includes robotics, speech recognition, image recognition, natural language processing, expert systems, and so on.

[0003] Given the growing scale of China's artificial intelligence market, achieving more intelligent research requires collecting a large amount of relevant data to facilitate subsequent data analy...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/20, G06F9/50, G06F16/951, G06F16/955
CPC: G06F9/5077, G06F9/5083, G06F11/203, G06F2201/815, G06F16/951, G06F16/955
Inventor: 彭明亮
Owner: 深圳前瞻资讯股份有限公司