Information processing method and server

An information processing method and server, applied in the fields of digital transmission systems, electrical components, and transmission systems. The technology addresses two problems for which no effective solution previously existed: the system's crawl rate can become too fast, and the rate cannot be adjusted automatically.

Active Publication Date: 2017-07-25
TENCENT TECH (BEIJING) CO LTD
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0004] 1. The crawl rate and frequency for webpage data are pre-configured in the script program and cannot be adjusted automatically according to the number of tasks during data capture. 2. The crawl rate is set from the configured project information, and different projects may configure the same domain name. The system's combined crawl rate for that domain name may then be too fast, causing the proxy Internet Protocol (IP) address to be blocked. No effective solution to these problems currently exists.
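The second problem can be illustrated with a short sketch. The project names, URLs, and rates below are hypothetical and do not come from the patent; the sketch only shows that rates configured independently per project add up when several projects target the same domain name.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical per-project configuration: each project sets its own crawl
# rate (requests per second) without knowing about the other projects.
projects = [
    {"name": "news", "url": "http://example.com/news", "rate": 5.0},
    {"name": "blog", "url": "http://example.com/blog", "rate": 5.0},
    {"name": "shop", "url": "http://example.org/items", "rate": 2.0},
]


def effective_rate_per_domain(projects):
    """Sum the configured rates of all projects that share a domain name."""
    totals = defaultdict(float)
    for project in projects:
        domain = urlparse(project["url"]).netloc
        totals[domain] += project["rate"]
    return dict(totals)


rates = effective_rate_per_domain(projects)
print(rates)
```

In this example two projects each configure 5 requests per second for example.com, so the domain actually sees their sum, which is the situation the patent describes as leading to a blocked proxy IP.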

Examples

Embodiment 1

[0089] An embodiment of the present invention provides an information processing method. Figure 1 is a schematic diagram of the composition and structure of the server in Embodiment 1 of the present invention. As shown in Figure 1, the server includes: project launcher 101, token pool 102, task pool 103, rate controller 104, token generator 105, scheduling queue 106, scheduler 107, processing queue 108, and processor 109, wherein:

[0090] The project launcher 101 is used to add the task information to be crawled to the task pool 103, where the task information includes the address information, the task identification, and the initial capture frequency of the task; and to extract the domain name information from the task information, generate a token according to the domain name information, and add the token to the token pool 102, where the token includes the domain name information, the initial crawl rate, the last token generation time, the domain name IP, and the proxy IP;

[0091] The task pool 103 is used to store task information;
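The task and token fields listed in [0090] can be sketched as simple data structures. This is a minimal illustration, not the patent's implementation: the class names, the dictionary-based pools, the units of the rate and frequency, and the choice to keep one token entry per domain name are all assumptions.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional
from urllib.parse import urlparse


@dataclass
class Task:
    url: str          # address information
    task_id: str      # task identification
    frequency: float  # initial capture frequency (units are an assumption)


@dataclass
class Token:
    domain: str                      # domain name information
    rate: float                      # initial crawl rate (assumed: tasks per second)
    last_generated: float            # last token generation time
    domain_ip: Optional[str] = None  # domain name IP, left unset in this sketch
    proxy_ip: Optional[str] = None   # proxy IP, left unset in this sketch


class ProjectLauncher:
    """Hypothetical launcher: stores tasks and creates a token per domain."""

    def __init__(self, task_pool: Dict[str, Task], token_pool: Dict[str, Token],
                 initial_rate: float = 1.0):
        self.task_pool = task_pool
        self.token_pool = token_pool
        self.initial_rate = initial_rate

    def add_task(self, task: Task) -> Token:
        # Add the task information to the task pool.
        self.task_pool[task.task_id] = task
        # Extract the domain name and create its token if none exists yet
        # (one token entry per domain is an assumption of this sketch).
        domain = urlparse(task.url).netloc
        if domain not in self.token_pool:
            self.token_pool[domain] = Token(domain, self.initial_rate, time.time())
        return self.token_pool[domain]


task_pool: Dict[str, Task] = {}
token_pool: Dict[str, Token] = {}
launcher = ProjectLauncher(task_pool, token_pool)
launcher.add_task(Task("http://example.com/a", "t1", 1.0))
launcher.add_task(Task("http://example.com/b", "t2", 1.0))
launcher.add_task(Task("http://example.org/c", "t3", 1.0))
print(sorted(token_pool))
```

Keying the token pool by domain name rather than by project means tasks from different projects that share a domain also share one token entry, which is what lets a rate be applied per domain.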

...

Embodiment 2

[0126] An embodiment of the invention also provides a server. Figure 4 is a schematic diagram of the third composition structure of the server in the embodiment of the present invention. As shown in Figure 4, the server includes: configuration unit 110, project pool 111, project launcher 101, token pool 102, task pool 103, rate controller 104, frequency controller 112, token generator 105, scheduling queue 106, scheduler 107, processing queue 108, and processor 109, wherein:

[0127] The configuration unit 110 is configured to configure the projects to be crawled, generate corresponding project information based on the configured projects, and send the project information to the project pool 111;

[0128] The project pool 111 is used to store project information; the project information includes project scripts and project status; the project status includes installed status and uninstalled status;
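The project information described in [0128] could be modeled as follows. This is a sketch under assumptions: the `name` field, the class names, and the `with_status` helper are hypothetical and are not taken from the patent, which only states that project information holds a script and a status.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class ProjectStatus(Enum):
    INSTALLED = "installed"
    UNINSTALLED = "uninstalled"


@dataclass
class ProjectInfo:
    name: str              # hypothetical identifier, not named in the source
    script: str            # project script
    status: ProjectStatus  # installed or uninstalled


class ProjectPool:
    """Hypothetical in-memory store for project information."""

    def __init__(self) -> None:
        self._projects: Dict[str, ProjectInfo] = {}

    def put(self, info: ProjectInfo) -> None:
        self._projects[info.name] = info

    def with_status(self, status: ProjectStatus) -> List[ProjectInfo]:
        # Hypothetical helper: list the projects currently in a given status.
        return [p for p in self._projects.values() if p.status is status]


pool = ProjectPool()
pool.put(ProjectInfo("news", "crawl_news.py", ProjectStatus.INSTALLED))
pool.put(ProjectInfo("blog", "crawl_blog.py", ProjectStatus.UNINSTALLED))
installed = [p.name for p in pool.with_status(ProjectStatus.INSTALLED)]
print(installed)
```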

[0129] The project launcher 101 is configured to scan project informat...

Embodiment 3

[0153] An embodiment of the invention also provides a server. Figure 5 is a schematic diagram of the fourth composition structure of the server in the embodiment of the present invention. As shown in Figure 5, the server includes: configuration unit 110, project pool 111, project launcher 101, token pool 102, task pool 103, rate controller 104, frequency controller 112, token generator 105, scheduling queue 106, scheduler 107, processing queue 108, processor 109, and domain name resolver 113, wherein:

[0154] The configuration unit 110 is configured to configure the projects to be crawled, generate corresponding project information based on the configured projects, and send the project information to the project pool 111;

[0155] The project pool 111 is used to store project information; the project information includes project script programs and project status; the project status includes installed status and uninstalled status;

[0156] The project launcher 101 is ...

Abstract

An embodiment of the invention discloses an information processing method and a server. The method comprises the following steps: adding task information for the data to be captured to a task pool; extracting the domain name information from the task information, generating tokens based on the domain name information, and adding the tokens to a token pool; determining the capture rate for the domain name information according to the number of schedulable pieces of task information under that domain name information in the token pool; scanning the token pool, determining the number of tokens for the domain name information based on the capture rate, and, when the number of tokens meets a first preset condition, sending that number of tokens to a scheduling queue; acquiring the tokens from the scheduling queue, selecting from the task pool the first task information that meets a second preset condition based on the domain name information of the tokens, and adding the first task information to a processing queue; and extracting the first task information from the processing queue and capturing the corresponding data based on the first task information.
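The steps in the abstract can be sketched as a small, deterministic simulation. This is an illustration under stated assumptions, not the patented implementation: the `capture_rate` formula, the cap of 2 tokens per second, the "at least one token" rule standing in for the first preset condition, and the "first pending task" rule standing in for the second preset condition are all hypothetical, and the processor only records task IDs instead of fetching pages.

```python
from collections import deque
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Task:
    task_id: str
    url: str
    state: str = "pending"  # pending -> queued -> done


@dataclass
class Token:
    domain: str
    rate: float = 0.0            # tokens per second
    last_generated: float = 0.0  # time of the last token generation


def capture_rate(schedulable: int, base: float = 1.0, cap: float = 2.0) -> float:
    """Hypothetical rule: the rate grows with pending tasks, up to a cap."""
    return min(cap, base * schedulable)


class Crawler:
    def __init__(self) -> None:
        self.task_pool = {}
        self.token_pool = {}
        self.scheduling_queue = deque()
        self.processing_queue = deque()
        self.fetched = []

    def add_task(self, task: Task, now: float) -> None:
        self.task_pool[task.task_id] = task
        domain = urlparse(task.url).netloc
        self.token_pool.setdefault(domain, Token(domain, last_generated=now))

    def schedulable(self, domain: str):
        return [t for t in self.task_pool.values()
                if t.state == "pending" and urlparse(t.url).netloc == domain]

    def update_rates(self) -> None:
        # Rate controller: set each domain's rate from its schedulable tasks.
        for token in self.token_pool.values():
            token.rate = capture_rate(len(self.schedulable(token.domain)))

    def generate_tokens(self, now: float) -> None:
        # Token generator: scan the token pool and emit tokens per domain.
        for token in self.token_pool.values():
            count = int((now - token.last_generated) * token.rate)
            if count >= 1:  # assumed stand-in for the first preset condition
                self.scheduling_queue.extend([token.domain] * count)
                token.last_generated = now

    def schedule(self) -> None:
        # Scheduler: each token lets one task of its domain be processed.
        while self.scheduling_queue:
            candidates = self.schedulable(self.scheduling_queue.popleft())
            if candidates:  # assumed stand-in for the second preset condition
                candidates[0].state = "queued"
                self.processing_queue.append(candidates[0])

    def process(self) -> None:
        # Processor: a real one would fetch task.url; this one records the ID.
        while self.processing_queue:
            task = self.processing_queue.popleft()
            self.fetched.append(task.task_id)
            task.state = "done"

    def tick(self, now: float) -> None:
        self.update_rates()
        self.generate_tokens(now)
        self.schedule()
        self.process()


crawler = Crawler()
for i in range(3):
    crawler.add_task(Task(f"a{i}", f"http://example.com/{i}"), now=0.0)
crawler.add_task(Task("b0", "http://example.org/0"), now=0.0)
crawler.tick(now=1.0)
crawler.tick(now=2.0)
print(crawler.fetched)
```

Passing `now` explicitly instead of reading the system clock keeps the sketch reproducible. With the assumed cap, example.com gets two tokens in the first tick and its third task waits for the second tick, while example.org is served independently.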

Description

Technical field

[0001] The present invention relates to information processing technology, in particular to an information processing method and server.

Background technique

[0002] With the rapid development of Internet technology, web pages have become the carrier of massive amounts of information. One method currently used to extract information from web pages is the web crawler, which extracts specified webpage content through a script program configured for webpage data extraction.

[0003] In the process of implementing the technical solutions of the embodiments of the present application, the inventors found at least the following technical problems in the related art:

[0004] 1. The crawling rate and frequency of webpage data are pre-configured in the script program and cannot be adjusted automatically according to the number of tasks during the data capture process; 2. The setting of the crawling rate is based on the configured ...

Application Information

IPC(8): H04L29/12, H04L12/873, H04L47/52
CPC: H04L47/527, H04L61/5061, H04L61/4511
Inventor: 黄文飞, 吴一飞, 施驭, 李振洋
Owner: TENCENT TECH (BEIJING) CO LTD