Data processing method and device based on improved Flume

A data processing technology that addresses wasted CPU capacity, increased operation and maintenance costs, and the differing processing speeds of separate Flume instances, thereby improving performance and removing bottlenecks.

Active Publication Date: 2019-10-25
INSPUR SUZHOU INTELLIGENT TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, current Flume deployments mostly use a single business flow, a single channel (containing only one FileChannel), and a single sink (only one thread processes data), which leads to low disk utilization and wasted CPU capacity.
Existing approaches can mitigate these defects by deploying a pair of Flume instances, but this cannot resolve the differing processing speeds of the separate Flume instances, and it increases operation and maintenance costs.



Examples


Embodiment 1

[0061] As shown in Figure 3, the improved Flume of the present invention includes a source component, a channel component that encapsulates M FileChannels (file channels), and a sink component that includes N threads, where M is an integer greater than or equal to 2, N is an integer greater than or equal to 1, and M is greater than N. The improved Flume-based data processing method includes the following steps:

[0062] Step 301: periodically count the amount of data stored in each of the M FileChannels in the encapsulated channel component;

[0063] Here, the M FileChannels are encapsulated to form a new channel component. For example, on a machine with 2 hard disks, 4 FileChannels can be configured with two FileChannels placed on each hard disk, forming a high-performance channel component that contains 4 FileChannels. It is also possible to place all 4 FileChannels on one hard disk. You c...
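The disk placement described in the example above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the helper `place_channels`, the round-robin rule, and the directory names such as `/data1/flume/channel-0` are all hypothetical and chosen only to reproduce the "4 FileChannels on 2 hard disks, two per disk" layout.

```python
from collections import defaultdict


def place_channels(num_channels, disks):
    """Spread FileChannel directories across disks in round-robin order.

    Hypothetical helper: the directory names are illustrative only and are
    not taken from the patent or from Flume itself.
    """
    if num_channels < 2:
        raise ValueError("the encapsulated channel component needs at least 2 FileChannels")
    if not disks:
        raise ValueError("at least one disk is required")
    placement = defaultdict(list)
    for index in range(num_channels):
        disk = disks[index % len(disks)]
        placement[disk].append(f"{disk}/flume/channel-{index}")
    return dict(placement)


# 4 FileChannels on 2 hard disks: two channels end up on each disk.
layout = place_channels(4, ["/data1", "/data2"])
for disk, channel_dirs in layout.items():
    print(disk, channel_dirs)

# All 4 FileChannels on a single hard disk is also a valid layout.
single_disk_layout = place_channels(4, ["/data1"])
```

Passing a single disk covers the alternative mentioned in the text, where all 4 FileChannels share one hard disk.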

Embodiment 2

[0074] As shown in Figure 4, the improved Flume of the present invention includes a source component, a channel component that encapsulates M FileChannels (file channels), and a sink component that includes N threads, where M and N are both integers greater than or equal to 2, and M is equal to N. The improved Flume-based data processing method includes the following steps:

[0075] Step 401: periodically count the amount of data stored in each of the M FileChannels in the encapsulated channel component;

[0076] Here, the M FileChannels are encapsulated to form a new channel component. For example, on a machine with 2 hard disks, 4 FileChannels can be configured with two FileChannels placed on each hard disk, forming a high-performance channel component that contains 4 FileChannels. It is also possible to place all 4 FileChannels on one hard disk. You can also require at least 2 hard disks...
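When M equals N, one natural arrangement is for each sink thread to drain exactly one FileChannel. The excerpt does not state how threads are bound to channels, so the one-to-one pairing below is an assumption. The sketch uses in-memory `queue.Queue` objects as stand-ins for FileChannels, and the names `drain` and `run_sinks` are hypothetical.

```python
import queue
import threading


def drain(channel, delivered, lock):
    """Sink thread body: take events from one channel until a None sentinel arrives."""
    while True:
        event = channel.get()
        if event is None:
            break
        with lock:
            delivered.append(event)


def run_sinks(channels):
    """Start one sink thread per channel (assumes M == N) and wait for all of them."""
    delivered = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=drain, args=(channel, delivered, lock))
        for channel in channels
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return delivered


# Four stand-in channels, eight events spread evenly, then one sentinel each.
channels = [queue.Queue() for _ in range(4)]
for i in range(8):
    channels[i % 4].put(f"event-{i}")
for channel in channels:
    channel.put(None)

delivered = run_sinks(channels)
print(sorted(delivered))
```

Because every channel has its own consumer, a slow channel does not block the others; the order in which events reach `delivered` across channels is not deterministic.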


Abstract

The invention discloses a data processing method based on improved Flume. The method comprises: periodically counting the amount of data stored in each of the M FileChannels in an encapsulated channel component; determining a weight for each FileChannel according to its stored data volume; and scheduling the data transmission of the source component according to the weight of each FileChannel. The invention further discloses a data processing device based on the improved Flume. The method and device provided by the invention can improve the data processing performance of Flume.
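The three steps in the abstract (count, weigh, schedule) can be sketched as below. The excerpt does not give the weighting formula, so this sketch assumes that a FileChannel holding less data should receive a larger share of new events, using weight proportional to 1 / (stored count + 1). The functions `compute_weights` and `pick_channel` and the sample counts are hypothetical.

```python
import random


def compute_weights(stored_counts):
    """Turn per-channel stored-event counts into normalized weights.

    ASSUMPTION: weight is proportional to 1 / (count + 1), so emptier
    channels get more traffic. The patent excerpt does not state the formula.
    """
    raw = [1.0 / (count + 1) for count in stored_counts]
    total = sum(raw)
    return [value / total for value in raw]


def pick_channel(weights, rng):
    """Pick the channel index for the next event by weighted random choice."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


# One sampling period: counts would come from the periodic statistics step.
stored_counts = [0, 99, 999, 9999]
weights = compute_weights(stored_counts)

# Route 10,000 events from the source according to the current weights.
rng = random.Random(0)
routed = [0] * len(weights)
for _ in range(10_000):
    routed[pick_channel(weights, rng)] += 1

print(weights)
print(routed)
```

In a running system the weights would be recomputed at every sampling period, so a channel that falls behind gradually receives fewer new events until its backlog shrinks.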

Description

Technical field

[0001] The invention relates to the technical field of data processing, and in particular to a data processing method based on improved Flume.

Background

[0002] With the rapid development of the Internet industry and the continuous expansion of network scale, the business systems and network security systems of enterprises and institutions generate a large amount of log data every day. Apache Flume is a distributed, reliable, and highly available service for efficiently collecting, aggregating, and transmitting massive log data. Flume has a simple and flexible architecture based on streaming data flows; it is robust and fault tolerant, with reliability mechanisms and many failover and recovery mechanisms. Flume supports custom data senders for collecting data, and it can also perform simple processing on the data and write it to various data receivers.

[0003] As shown in Figure 1, a conventional Flume includes a source comp...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/182
CPC: G06F16/182
Inventor: 李勇
Owner: INSPUR SUZHOU INTELLIGENT TECH CO LTD