Kafka-based stacked data consumption method, terminal equipment and storage medium

A Kafka-based method for consuming accumulated data, applied in the field of data processing. It addresses data accumulation caused by insufficient downstream consumption capacity, as well as the inability to monitor and respond to topic data accumulation, and it improves data throughput capacity.

Active Publication Date: 2022-07-29
XIAMEN FUYUN INFORMATION TECH CO LTD
Cites: 10; Cited by: 2

AI Technical Summary

Problems solved by technology

[0004] 1) Insufficient downstream consumption capacity leads to data accumulation;
[0005] 2) The accumulation of topic data cannot be monitored and responded to.



Examples


Embodiment 1

[0033] This embodiment of the present invention provides a Kafka-based method for consuming accumulated data. As shown in Figure 1 and Figure 2, the method includes the following steps:

[0034] S1: In each unit of time, poll the latest offset (offset_latest) and the current offset (offset_current) of each partition (Partition) of the topic, then calculate and store the difference between the two. Determine whether the topic is in a delayed state by comparing this difference with the preset maximum message accumulation threshold, and mark any topic that is in a delayed state.
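The lag calculation in step S1 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names `partition_lags` and `is_delayed` are hypothetical, the offsets are passed in as plain dictionaries instead of being polled from a broker, and the rule that a single partition over the threshold marks the whole topic is an assumption, since the text does not say how partitions are combined.

```python
from typing import Dict


def partition_lags(latest: Dict[int, int], current: Dict[int, int]) -> Dict[int, int]:
    """Return offset_latest - offset_current for every partition."""
    # Assumption: a partition with no recorded current offset starts at 0.
    return {p: max(latest[p] - current.get(p, 0), 0) for p in latest}


def is_delayed(lags: Dict[int, int], max_accumulation: int) -> bool:
    """Mark the topic as delayed if any partition exceeds the threshold (assumed rule)."""
    return any(lag > max_accumulation for lag in lags.values())


# Hypothetical offsets for a three-partition topic.
latest = {0: 1500, 1: 900, 2: 400}
current = {0: 300, 1: 880, 2: 400}

lags = partition_lags(latest, current)
print(lags)                                     # {0: 1200, 1: 20, 2: 0}
print(is_delayed(lags, max_accumulation=1000))  # True
```

In a real deployment the two dictionaries would be refreshed once per unit of time from the Kafka client in use, and the computed differences would be stored so that later steps can read them.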

[0035] In this embodiment, a topic whose difference between the latest offset and the current offset exceeds the maximum message accumulation threshold is set as a topic in a delayed state. To improve the accuracy of this judgment, it is further set that the mean or median value of the difference...
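The passage above is cut off, so the exact rule built on the mean or median is not known. One possible reading, shown here purely as an assumption, is that the stored differences from the most recent polls are aggregated and the aggregate is compared with the threshold, so that a single spike does not mark the topic as delayed. The class name `LagWindow` and the window size are hypothetical.

```python
from collections import deque
from statistics import mean, median


class LagWindow:
    """Keep the last `size` lag samples and compare their mean or median
    with the maximum message accumulation threshold (assumed rule)."""

    def __init__(self, size: int, max_accumulation: int, use_median: bool = False):
        self.samples = deque(maxlen=size)  # oldest sample is dropped automatically
        self.max_accumulation = max_accumulation
        self.aggregate = median if use_median else mean

    def add(self, lag: int) -> bool:
        """Record one polled lag and return True if the topic counts as delayed."""
        self.samples.append(lag)
        return self.aggregate(self.samples) > self.max_accumulation


window = LagWindow(size=3, max_accumulation=1000, use_median=True)
print([window.add(lag) for lag in (200, 300, 3000, 2500, 2800)])
# [False, False, False, True, True]
```

With the median, the isolated sample of 3000 is not enough to flag the topic; it is only flagged once the backlog persists across most of the window.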

Embodiment 2

[0052] The present invention also provides a Kafka-based terminal device for consuming accumulated data, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the steps of the method in Embodiment 1 of the present invention are implemented.


Abstract

The invention relates to a Kafka-based method for consuming accumulated data, a terminal device and a storage medium. The method comprises the following steps: in each unit of time, polling and calculating the difference between the latest offset and the current offset of a topic under each partition, storing the difference, judging whether the topic is in a delayed state, and marking it accordingly; when a consumption request for the topic is received, judging whether the topic is in a delayed state and, if so, creating a plurality of downstream topics and distributing the accumulated data to be consumed evenly across all the downstream topics; and, according to the total thread count and the number of partitions, dividing the remaining unconsumed offsets of each partition evenly by the total thread count and allocating the resulting segments to the threads for consumption. With this method, the number of topic partitions can be increased flexibly, and data throughput during service peak periods can be increased.
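The two distribution steps in the abstract can be sketched as below. This is an illustration under stated assumptions, not the claimed implementation: `round_robin`, `split_range` and `assign_to_threads` are hypothetical names, the downstream topic names are invented, "evenly" is read as round-robin for messages and as near-equal contiguous offset slices for threads, and offset ranges are treated as half-open intervals from the current offset up to the latest offset.

```python
from typing import Dict, List, Sequence, Tuple


def round_robin(messages: Sequence[str], topics: Sequence[str]) -> Dict[str, List[str]]:
    """Spread accumulated messages evenly over the downstream topics."""
    buckets: Dict[str, List[str]] = {t: [] for t in topics}
    for i, message in enumerate(messages):
        buckets[topics[i % len(topics)]].append(message)
    return buckets


def split_range(start: int, end: int, parts: int) -> List[Tuple[int, int]]:
    """Split the half-open offset range [start, end) into `parts` near-equal slices."""
    base, extra = divmod(end - start, parts)
    slices, cursor = [], start
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # first `extra` slices get one more
        slices.append((cursor, cursor + size))
        cursor += size
    return slices


def assign_to_threads(
    remaining: Dict[int, Tuple[int, int]], threads: int
) -> List[List[Tuple[int, int, int]]]:
    """For each thread, list the (partition, start, end) slices it should consume."""
    plan: List[List[Tuple[int, int, int]]] = [[] for _ in range(threads)]
    for partition, (current, latest) in sorted(remaining.items()):
        for t, (lo, hi) in enumerate(split_range(current, latest, threads)):
            if hi > lo:  # skip empty slices
                plan[t].append((partition, lo, hi))
    return plan


print(round_robin(["m0", "m1", "m2", "m3", "m4"], ["topic-a", "topic-b"]))
# {'topic-a': ['m0', 'm2', 'm4'], 'topic-b': ['m1', 'm3']}

for thread_id, work in enumerate(assign_to_threads({0: (300, 1500), 1: (880, 900)}, 3)):
    print(thread_id, work)
# 0 [(0, 300, 700), (1, 880, 887)]
# 1 [(0, 700, 1100), (1, 887, 894)]
# 2 [(0, 1100, 1500), (1, 894, 900)]
```

Each thread would then seek to the start of its slice and stop once it reaches the end, so no offset is consumed twice.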

Description

Technical field

[0001] The invention relates to the field of data processing, in particular to a Kafka-based method for consuming accumulated data, a terminal device and a storage medium.

Background

[0002] Kafka is a distributed messaging system developed by LinkedIn and written in Scala. It is a distributed, publish/subscribe-based messaging system that provides message persistence with a time complexity of O(1). It offers high throughput: even on very cheap commodity machines, a single machine can support the transmission of more than 100K messages per second. It supports message partitioning and distributed consumption across Kafka servers, and it guarantees that messages within each partition (Partition) are transmitted in order. It supports both offline and real-time data processing, and it is widely used for its horizontal scalability and high throughput. At present, more and more open source distributed processing systems such as Cloudera, Apache Storm, Spark, flin...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L47/56, H04L47/10
CPC: H04L47/56, H04L47/10, Y02D10/00
Inventor: 徐雄辉, 陈奋, 陈荣有, 李伟彬, 薛世平
Owner: XIAMEN FUYUN INFORMATION TECH CO LTD