Docker-based data acquisition method and device, computer equipment and storage medium

A data acquisition technology, applied in the field of big data, that addresses the problems of poor system stability, limited scale, and frequent blocking, and achieves low system resource usage, strong isolation, and strong versatility.

Inactive Publication Date: 2019-11-15
PINGAN INT SMART CITY TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] In view of this, embodiments of the present invention provide a Docker-based data collection method that addresses two problems. First, adding crawling nodes by horizontal scaling wastes cloud server resources and is easily limited in scale because cloud server resources are finite, so crawling nodes cannot be added on demand effectively. Second, when multi-threaded scheduling is used on a crawling node for data collection, the crawler threads are poorly isolated from one another, are prone to blocking, and degrade system stability.



Embodiment Construction

[0041] In order to enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the drawings in the embodiments of the present invention. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the technical field of the invention.

[0042] Appearances of the phrase "an embodiment" in various places in the specification do not necessarily all refer to the same embodiment, nor do they refer to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.

[0043] Embodiments of the present invention provide a Docker-based data acquisition method, which can be applied t...



Abstract

The invention belongs to the technical field of big data and relates to a Docker-based data acquisition method and device, computer equipment, and a storage medium. The method comprises: obtaining a data acquisition task and, according to the task, sending a crawling program container image to at least one cloud server; generating at least one crawling program container in each cloud server from the container image, where a crawling program runs in each container so that the container becomes a crawling node; and sending the data acquisition task to the crawling nodes, which execute the data acquisition operation for the task. Because Docker is used to send the crawling program container image to at least one cloud server, multiple crawling nodes can be deployed automatically on the same cloud server. The nodes occupy few system resources, cloud server resources are used effectively, crawling nodes can be added as required, isolation among crawling nodes is strong, and system stability is good.
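The three steps in the abstract (send the image, create containers that become crawling nodes, dispatch the task) can be sketched roughly as follows. This is a minimal in-memory illustration, not the patented implementation: the class names `CloudServer` and `CrawlNode`, the function `deploy_and_dispatch`, the image name `crawler:latest`, the example URLs, and the round-robin assignment of URLs to nodes are all assumptions made for the example. No Docker daemon is contacted; the comments only note where commands such as `docker pull` and `docker run` would plausibly be issued in a real deployment.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class CrawlNode:
    """One crawler container on a cloud server (hypothetical model)."""
    server: str
    container_id: str
    urls: list = field(default_factory=list)


@dataclass
class CloudServer:
    """In-memory stand-in for a cloud server that can host containers."""
    name: str
    images: set = field(default_factory=set)
    nodes: list = field(default_factory=list)

    def receive_image(self, image: str) -> None:
        # In a real deployment this would be roughly `docker pull <image>`.
        self.images.add(image)

    def start_container(self, image: str) -> CrawlNode:
        # In a real deployment this would be roughly `docker run -d <image>`.
        if image not in self.images:
            raise RuntimeError(f"{self.name} does not have image {image}")
        node = CrawlNode(self.name, f"{self.name}-c{len(self.nodes)}")
        self.nodes.append(node)
        return node


def deploy_and_dispatch(urls, servers, image, nodes_per_server):
    """Ship the image, create crawling nodes, then hand out the task."""
    nodes = []
    for server in servers:
        server.receive_image(image)              # step 1: send the image
        for _ in range(nodes_per_server):        # step 2: create containers
            nodes.append(server.start_container(image))
    # Step 3: split the task across nodes (round-robin is an assumption).
    for url, node in zip(urls, cycle(nodes)):
        node.urls.append(url)
    return nodes


servers = [CloudServer("server-a"), CloudServer("server-b")]
urls = [f"https://example.com/page/{i}" for i in range(5)]
nodes = deploy_and_dispatch(urls, servers, "crawler:latest", nodes_per_server=2)
for node in nodes:
    print(node.container_id, node.urls)
```

Running several containers per server, instead of one crawler per server or many threads in one process, is what lets the number of crawling nodes grow without adding servers while keeping each crawler in its own isolated environment.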

Description

Technical field

[0001] Embodiments of the present invention belong to the field of big data technology, and in particular relate to a Docker-based data collection method and device, computer equipment, and a storage medium.

Background art

[0002] In the era of big data, data-based systems often need to collect a large amount of raw data, part of which comes from the Internet. For this Internet-sourced data, existing data collection is generally carried out through cloud servers. To meet large-scale collection requirements, existing approaches generally either scale horizontally, that is, add cloud servers so as to increase the number of crawling nodes, or use multi-threaded scheduling on the crawling nodes to run the crawler program concurrently.

[0003] However, for the method of horizon...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/951, G06F9/455
CPC: G06F9/45558, G06F16/951, G06F2009/45595
Inventor: 林岳鹏, 吕东玉, 张川
Owner: PINGAN INT SMART CITY TECH CO LTD