A data processing method, device and electronic equipment

A data processing method and device, applied in the computer field, that address wasted CPU time, an excessive number of reads, and frequent context switching, and that achieve the effects of reducing resource consumption, improving processing efficiency, and lowering the frequency of context switching.

Active Publication Date: 2021-08-27
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

[0009] 2. In extreme cases, when a reading thread can never find available data, it effectively runs an infinite loop, consuming at least one CPU's worth of user time.
[0010] 3. If a partition generates little data on average, the corresponding reading thread reads only a small amount of data each time, yet performs a relatively large number of reads.
[0017] In this mode, too many reading threads lead to serious contention for the CPU. In addition, switching the CPU to another thread requires a context switch, that is, saving the running environment of the current thread and restoring the running environment of the thread being switched to. When there are too many reading threads, CPU context switches therefore become very frequent and consume a large amount of computing resources.
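The cost described in [0009] and [0010] can be illustrated with a small single-threaded simulation. This is only a sketch under assumptions: `poll_partition` and the `max_polls` bound are hypothetical names introduced here so the loop terminates, and a `deque` stands in for a partition. The source describes a dedicated reading thread that would poll indefinitely.

```python
from collections import deque
from typing import Tuple


def poll_partition(partition: deque, max_polls: int) -> Tuple[int, int]:
    """Poll one partition repeatedly, as a dedicated reading thread would.

    Returns (records_read, empty_polls). max_polls is a hypothetical bound
    that keeps the sketch finite; a real reader would keep looping.
    """
    records_read = 0
    empty_polls = 0
    for _ in range(max_polls):
        if partition:
            partition.popleft()   # one record consumed per poll
            records_read += 1
        else:
            empty_polls += 1      # CPU time spent without reading anything
    return records_read, empty_polls


# A high-throughput partition versus one that produces almost no data.
busy_stats = poll_partition(deque(range(100)), max_polls=100)
idle_stats = poll_partition(deque(range(2)), max_polls=100)
```

In this simulation the reader of the low-throughput partition spends nearly all of its polls finding nothing, which is the waste the listed problems describe.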


Examples


Embodiment 1

[0077] Embodiment 1. A data processing method, as shown in Figure 4, including steps S110-S120:

[0078] S110. Put the data reading task generated by the partition into the task queue;

[0079] S120. When the number of reading threads in the thread pool has not reached the predetermined upper limit, extract data reading tasks from the task queue, create reading threads according to the extracted tasks, and put them into the thread pool; wherein the thread pool is used to hold the reading threads that take turns occupying processing resources.
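Steps S110 and S120 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: `MAX_READERS`, `submit_read_task`, `dispatch_all`, and `read_partition` are hypothetical names, a semaphore stands in for the check that the pool has not reached its upper limit, and each "partition" is just a list of records.

```python
import queue
import threading

MAX_READERS = 3  # hypothetical value for the "predetermined upper limit"

task_queue = queue.Queue()
slots = threading.BoundedSemaphore(MAX_READERS)  # free places in the pool
lock = threading.Lock()
active = 0        # reading threads currently in the pool
peak_active = 0   # highest value `active` ever reached
results = []


def read_partition(partition_id, records):
    """Body of one reading thread: consume the records of one task."""
    global active, peak_active
    with lock:
        active += 1
        peak_active = max(peak_active, active)
    try:
        count = len(list(records))
        with lock:
            results.append((partition_id, count))
    finally:
        with lock:
            active -= 1
        slots.release()  # leave the pool, freeing a place for another reader


def submit_read_task(partition_id, records):
    """S110: put a partition's data reading task into the task queue."""
    task_queue.put((partition_id, records))


def dispatch_all():
    """S120: while the pool is below its limit, turn queued tasks into threads."""
    started = []
    while True:
        slots.acquire()  # blocks while MAX_READERS readers are running
        try:
            partition_id, records = task_queue.get_nowait()
        except queue.Empty:
            slots.release()
            break
        thread = threading.Thread(target=read_partition,
                                  args=(partition_id, records))
        started.append(thread)
        thread.start()
    for thread in started:
        thread.join()


for pid in range(10):
    submit_read_task(pid, [0] * pid)  # partition `pid` holds `pid` records
dispatch_all()
```

Because a place in the pool is taken before a task is extracted and released only after the reader finishes, no more than `MAX_READERS` reading threads compete for the CPU at once, however many partitions have queued tasks.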

[0080] In this embodiment, when there are many partitions and their throughput varies widely, the data generated by the partitions can be read efficiently and CPU resources can be used reasonably. On the one hand, this embodiment uses the task queue to control the number of threads that compete for CPU resources, so not every partition's data reading task can occupy CPU resources. Only data reading tasks ...

Embodiment 2

[0128] Embodiment 2. A data processing device, as shown in Figure 8, including:

[0129] A queue management module 81, used to put the data reading tasks generated by the partitions into the task queue;

[0130] An extraction module 82, used to extract data reading tasks from the task queue when the number of reading threads in the thread pool has not reached a predetermined upper limit, create reading threads according to the extracted tasks, and put them into the thread pool; wherein the thread pool is used to hold reading threads that take turns occupying processing resources.
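The two-module split can be sketched in software as two small classes that share one queue. This is an illustration under assumptions: the class and method names are hypothetical, and the standard library's `ThreadPoolExecutor` is swapped in for the thread pool. The executor caps concurrency with `max_workers` and reuses worker threads, whereas the text describes creating a reading thread per extracted task, so this shows the division of responsibilities rather than the exact mechanism.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


class QueueManagementModule:
    """Stands in for module 81: puts partition read tasks on the task queue."""

    def __init__(self, task_queue):
        self.task_queue = task_queue

    def add_task(self, partition_id, records):
        self.task_queue.put((partition_id, records))


class ExtractionModule:
    """Stands in for module 82: moves queued tasks into a bounded pool."""

    def __init__(self, task_queue, max_readers):
        self.task_queue = task_queue
        self.max_readers = max_readers  # hypothetical upper limit

    @staticmethod
    def _read(partition_id, records):
        # Placeholder "read": sum the records of one partition.
        return partition_id, sum(records)

    def run(self):
        futures = []
        # Leaving the `with` block waits for all submitted reads to finish.
        with ThreadPoolExecutor(max_workers=self.max_readers) as executor:
            while True:
                try:
                    partition_id, records = self.task_queue.get_nowait()
                except queue.Empty:
                    break
                futures.append(
                    executor.submit(self._read, partition_id, records))
        return dict(future.result() for future in futures)


shared_queue = queue.Queue()
manager = QueueManagementModule(shared_queue)
for pid in range(5):
    manager.add_task(pid, [pid] * 4)
totals = ExtractionModule(shared_queue, max_readers=2).run()
```

Keeping the queue as the only shared object means the part that enqueues tasks never needs to know how many readers are allowed to run.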

[0131] In this embodiment, the queue management module 81 is the part of the above-mentioned data processing device responsible for adding data reading tasks to the task queue, and may be software, hardware, or a combination of both.

[0132] In this embodiment, the extraction module 82 is the part responsible for generating reading threads according to the data reading tasks in the task queue in...

Embodiment 3

[0151] Embodiment 3. An electronic device for data processing, including: a memory and a processor;

[0152] The memory is used to store a program for data processing; when the program for data processing is read and executed by the processor, the following operations are performed:

[0153] Put the data reading tasks generated by the partition into the task queue;

[0154] When the number of reading threads in the thread pool has not reached the predetermined upper limit, extract data reading tasks from the task queue, create reading threads according to the extracted tasks, and put them into the thread pool; wherein the thread pool is used to hold the reading threads that take turns occupying processing resources.

[0155] When the program for data processing in this embodiment is read and executed by the processor, the operations performed correspond to steps S110-S120 in the first embodiment. For other details of the operations performed by the program, refer to the fi...


Abstract

The present application provides a data processing method, device and electronic equipment. The data processing method includes: putting the data reading tasks generated by partitions into a task queue; when the number of reading threads in the thread pool has not reached the predetermined upper limit, extracting data reading tasks from the task queue, creating reading threads according to the extracted tasks, and putting them into the thread pool; wherein the thread pool is used to hold reading threads that take turns occupying processing resources. This application can improve the efficiency of reading partition data when there are many partitions and their throughput varies widely.

Description

Technical field

[0001] The invention relates to the field of computers, in particular to a data processing method, device and electronic equipment.

Background

[0002] At present, the data source (or data transfer) of cloud computing big data modules is generally implemented with Kafka (or products similar to Kafka, such as MetaQ or LogHub). Kafka is a high-throughput distributed publish-subscribe messaging system. The publisher of a message is called the producer, and the subscriber of a message is called the consumer. MetaQ is a high-performance, high-availability, scalable distributed message middleware. LogHub is a log service that provides functions similar to Kafka.

[0003] This type of data source has two characteristics: it is divided into multiple partitions, and each partition can be consumed by only one thread, which is usually a reading thread. Under these two characteristics, it can support ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/50, G06F9/48
CPC: G06F9/4881, G06F9/5038, G06F2209/5021
Inventor: 刘峰
Owner: ALIBABA GRP HLDG LTD