Recent message sending priority processing method and system based on crawler texts

A priority processing technology for crawler texts, applied in the field of text processing. It addresses problems such as insufficient Redis memory, difficulty in ensuring complete recovery of data, and reduced system stability, with the effects of improving maintainability and reducing the difficulty of system operation and maintenance.

Pending Publication Date: 2020-01-24
DATAGRAND TECH INC

AI Technical Summary

Problems solved by technology

However, Redis is an in-memory database. If consumers run into problems such as image download exceptions or blocking, Redis can easily run out of memory, crashing the queue and undermining system stability. Such a failure is extremely troublesome to handle, and complete recovery of the data is difficult to guarantee.
[0005] MySQL database method: Another common method is to store the posting time of each article in a MySQL database and to select the most recently posted articles by posting time. The disadvantage is that this method requires database transactions, which makes the system cumbersome. Moreover, multiple processes may operate on the same record in the data table, triggering MySQL table locks and causing all consumers to stall.
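As a rough illustration of the database approach described above, the following sketch selects and claims the most recently posted articles inside a transaction. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name, column names, and the fetch_latest helper are hypothetical; none of them come from the patent.

```python
import sqlite3


def fetch_latest(conn, limit):
    """Claim the most recently posted unprocessed articles in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        rows = conn.execute(
            "SELECT id, title FROM articles WHERE processed = 0 "
            "ORDER BY post_time DESC LIMIT ?",
            (limit,),
        ).fetchall()
        # Marking rows as processed is the step that needs a transaction;
        # with several consumers on one table this is where lock contention arises.
        conn.executemany(
            "UPDATE articles SET processed = 1 WHERE id = ?",
            [(row[0],) for row in rows],
        )
    return [row[1] for row in rows]


# Hypothetical in-memory table standing in for the MySQL article table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles ("
    "id INTEGER PRIMARY KEY, title TEXT, post_time TEXT, "
    "processed INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO articles (title, post_time) VALUES (?, ?)",
    [("old", "2020-01-24 09:00:00"), ("new", "2020-01-24 10:00:00")],
)
conn.commit()
```

Every consumer has to go through the same select-then-update transaction, which is the source of the complexity and locking risk the paragraph describes.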




Embodiment Construction

[0044] In order to better understand the technical solutions of the present invention, the embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0045] It should be clear that the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0046] The present application will be described in further detail below through specific embodiments and in conjunction with the accompanying drawings.

[0047] Embodiments of the present invention provide a method and system for prioritizing processing of recently published documents based on crawler texts.

[0048] The recent document priority processing system based on the crawler text of this application includes a web service int...


Abstract

The invention discloses a recent message sending priority processing method and system based on crawler text. In the method, a producer process stores each crawled data file in a time folder named after the publication (message sending) time, and generates in each time folder a mark file associated with each data file. A consumer process selects the time folder closest to the current time, normalizes the corresponding data files according to the mark files in that folder, and moves the normalized data files to a history folder. Because the method uses double-file control, the producer and the consumer are prevented from operating on the same file at the same time without using a system lock, which ensures data accuracy, reduces the logical complexity of the system, and improves its maintainability.
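The double-file flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the folder layout (pending/ and history/), the timestamp format, the .json and .done suffixes, and the function names are all hypothetical.

```python
import os
import shutil

# Hypothetical layout (not specified in the source):
#   <root>/pending/<YYYYMMDDHHMM>/<name>.json   data file
#   <root>/pending/<YYYYMMDDHHMM>/<name>.done   mark file
#   <root>/history/<YYYYMMDDHHMM>/<name>.json   normalized data file


def produce(root, post_time, name, text):
    """Write a data file into its time folder, then create its mark file."""
    folder = os.path.join(root, "pending", post_time)
    os.makedirs(folder, exist_ok=True)
    with open(os.path.join(folder, name + ".json"), "w", encoding="utf-8") as f:
        f.write(text)
    # The mark file is created only after the data file is fully written,
    # so a consumer that looks only at mark files never sees a partial file.
    open(os.path.join(folder, name + ".done"), "w").close()


def consume_latest(root, normalize):
    """Process the marked data files in the newest time folder that has any."""
    pending = os.path.join(root, "pending")
    if not os.path.isdir(pending):
        return []
    # Fixed-width timestamps sort chronologically, so reverse order is newest first.
    for folder in sorted(os.listdir(pending), reverse=True):
        src = os.path.join(pending, folder)
        marks = sorted(e for e in os.listdir(src) if e.endswith(".done"))
        if not marks:
            continue
        dst = os.path.join(root, "history", folder)
        os.makedirs(dst, exist_ok=True)
        processed = []
        for mark in marks:
            name = mark[: -len(".done")]
            data_path = os.path.join(src, name + ".json")
            if os.path.exists(data_path):
                with open(data_path, encoding="utf-8") as f:
                    result = normalize(f.read())
                with open(data_path, "w", encoding="utf-8") as f:
                    f.write(result)
                shutil.move(data_path, os.path.join(dst, name + ".json"))
                processed.append(name)
            # Removing the mark last means a stale mark is simply cleaned up
            # on the next pass if the consumer stopped after the move.
            os.remove(os.path.join(src, mark))
        return processed
    return []
```

In this sketch the producer only ever creates files and the consumer only touches data files that already have a mark, so the two processes never work on the same file at the same time and no lock is taken.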

Description

Technical field

[0001] The invention relates to the technical field of text processing, in particular to a crawler text-based method and system for prioritizing the processing of recently published text.

Background

[0002] Companies that display news streams often crawl news data from the World Wide Web and store it locally. The crawled articles cannot be pushed directly to production: external links must be removed, the pictures in each article downloaded, and the articles tagged and classified. Only after this normalization can the articles be published. Some articles are extremely time-sensitive, so articles cannot simply be processed in the order in which they were crawled; the most recently published articles must be processed first.

[0003] Due to the large data volume of individual articles, for example, an article containing multiple base64 pict...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/953, G06F16/16, G06F16/28, G06F16/35
CPC: G06F16/953, G06F16/35, G06F16/283, G06F16/164
Inventor: 蹇智华, 陈运文, 陈鼎, 景健, 刘友敏, 纪达麒
Owner: DATAGRAND TECH INC