A highly decoupled method for dynamically managing crawlers

A dynamic crawler management technology, classified under electrical digital data processing, special data processing applications, and instruments. It addresses coupling between crawler functions and achieves reduced complexity, enhanced robustness and stability, and reduced coupling.

Active Publication Date: 2022-01-25
苏州市中地行信息技术有限公司

AI Technical Summary

Problems solved by technology

[0005] Data capture, data analysis, data scheduling, and updating are currently tightly coupled in crawler implementations. To address this, the present invention proposes a highly decoupled, dynamically manageable crawler method that achieves decoupling through effective data scheduling and updating.


Embodiment Construction

[0015] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0016] Referring to Figure 1, a method for highly decoupled, dynamic management of crawlers comprises:

[0017] Deploying the crawler host-side image and running the host-side service; the host-side image handles message transmission, data scheduling, record storage, and log analysis;

[0018] Deploying the crawler client image and running the client service; the client image handles message transmission, crawler control and crawler...
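The host/client split in [0017] and [0018] can be illustrated with a minimal in-process sketch. Everything here is an assumption made for illustration: the `Host` and `Client` classes, the `capacity` field, and the use of Python's `queue.Queue` in place of a real message queue are hypothetical. The abstract mentions a resource scheduling algorithm without specifying it, so the sketch simply picks the client with the most free slots.

```python
import json
import queue
from dataclasses import dataclass, field


@dataclass
class Client:
    """Hypothetical stand-in for a crawler client service."""
    name: str
    capacity: int  # assumed: maximum number of concurrent tasks
    inbox: queue.Queue = field(default_factory=queue.Queue)
    running: int = 0

    def free_slots(self) -> int:
        return self.capacity - self.running


class Host:
    """Hypothetical stand-in for the host-side service."""

    def __init__(self, clients):
        self.clients = list(clients)
        self.log = []  # record of (task id, client name) dispatch decisions

    def dispatch(self, task: dict) -> str:
        # Assumed policy: choose the client with the most free slots.
        target = max(self.clients, key=lambda c: c.free_slots())
        if target.free_slots() <= 0:
            raise RuntimeError("no client has spare capacity")
        target.inbox.put(json.dumps(task))  # message transmission
        target.running += 1
        self.log.append((task["id"], target.name))
        return target.name


host = Host([Client("client-a", capacity=1), Client("client-b", capacity=3)])
first = host.dispatch({"id": 1, "url": "https://example.com/list"})
second = host.dispatch({"id": 2, "url": "https://example.com/item"})
```

Because the host only writes serialized tasks to a client's inbox, the scheduling policy can be swapped out without touching client code, which is the kind of decoupling the method aims for.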

Abstract

The invention discloses a highly decoupled, dynamically managed crawler method. The method divides crawling into two stages, data analysis and new target generation, writes the rules for the acquisition targets of both stages as JSON data, and stores them on the host according to the protocol. The host runs a task and, following the resource scheduling algorithm, sends it through the message queue module to a client with sufficient resources. The client receives the task information, converts it into executable information through the crawler protocol core, runs it in the crawler running module, and obtains the data. The host then receives the data and new tasks, stores them, and updates the task pool. Separating the host from the crawler server reduces coupling in the system. With the crawler function separated out, the crawler server becomes less complex, and the host can be modified while the distributed crawler system is running to achieve specific control and management goals. The result is a decoupled, extensible module design that enhances the robustness and stability of the whole framework.
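The two-stage rule idea can be sketched as follows. This is a minimal illustration, not the patent's protocol: the JSON field names (`data_analysis`, `new_targets`), the use of regular expressions as the rule language, and the helper names `compile_rule` and `update_task_pool` are all assumptions. `compile_rule` plays the role the abstract assigns to the crawler protocol core, turning stored JSON into something executable.

```python
import json
import re

# Hypothetical rule document; the field names are assumptions.
RULE_JSON = json.dumps({
    "data_analysis": {"title": r"<h1>(.*?)</h1>"},
    "new_targets": {"pattern": r'href="(/page/\d+)"'},
})


def compile_rule(rule_json: str):
    """Turn a JSON rule into a callable that runs both stages on a page."""
    rule = json.loads(rule_json)
    field_patterns = {k: re.compile(p) for k, p in rule["data_analysis"].items()}
    link_pattern = re.compile(rule["new_targets"]["pattern"])

    def run(html: str):
        # Stage 1: data analysis, one extracted value per named field.
        data = {}
        for name, pattern in field_patterns.items():
            match = pattern.search(html)
            data[name] = match.group(1) if match else None
        # Stage 2: new target generation.
        new_targets = link_pattern.findall(html)
        return data, new_targets

    return run


def update_task_pool(pool: list, seen: set, new_targets: list) -> None:
    """Host side: append only targets that have not been seen before."""
    for target in new_targets:
        if target not in seen:
            seen.add(target)
            pool.append(target)


page = '<h1>Index</h1><a href="/page/2">next</a><a href="/page/3">last</a>'
run = compile_rule(RULE_JSON)
data, targets = run(page)
pool, seen = [], {"/page/2"}
update_task_pool(pool, seen, targets)
```

Since the rules live in JSON on the host, changing what a crawler extracts means editing data rather than redeploying client code.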

Description

Technical field

[0001] The invention relates to computer data mining technology, in particular to a highly decoupled and dynamically managed crawler method.

Background

[0002] As a tool for searching network information, a search engine collects and discovers information on the Internet according to certain strategies, then understands, extracts, organizes, and processes it to provide retrieval services for users. In 1994, crawler programs were first applied to indexing, and search engines such as Yahoo and Google appeared one after another. Yet however powerful search engines have become, they still suffer from information loss, low update rates, and low accuracy. Users need faster, more accurate, more convenient, and more effective query services, and meeting that need has become the goal of search engine research and development.

[0003] Against this background, topic crawlers, which fetch related web resources in a targeted way, came into being. Topic crawler, also known as...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/951, G06F16/16, G06F16/172, G06F16/182
CPC: G06F16/951, G06F16/164, G06F16/172, G06F16/182
Inventor: 金智辉
Owner: 苏州市中地行信息技术有限公司