Detection and recovery system and method for RabbitMQ cluster fault

A fault detection and recovery technology for RabbitMQ clusters, applied in the field of cloud computing. It addresses problems such as unavailable queues, poor handling of network partitions, and RabbitMQ component services failing to run normally, so as to improve availability and timeliness.

Inactive Publication Date: 2019-09-27
SHANDONG LANGCHAO YUNTOU INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0012] In summary, a RabbitMQ cluster cannot handle network partitions well. RabbitMQ stores queues, exchanges, bindings, and other metadata in Mnesia, Erlang's distributed database.
When network abnormalities, RabbitMQ service node downtime, CPU soft lockups, etc. occur, th...



Examples


Embodiment 1

[0062] The RabbitMQ cluster failure detection and recovery system of the present invention includes an information collection module, an abnormality detection module, a monitoring server, a failure analysis and processing module, a recovery detection module and a data storage module.

[0063] The information collection module is connected to the RabbitMQ cluster and obtains health data from it. Health data is data related to the health status check of RabbitMQ nodes, including RabbitMQ service status, cluster status, log data, and operating system performance indicators. The operating system performance indicators include CPU, memory, disk, system load, etc.
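The health data described in this paragraph could be represented roughly as follows. This is a minimal sketch, not the patent's implementation: the `HealthData` record, its field names, and the `run_probe` callable are all hypothetical, log data is omitted, and only a few operating system indicators available from the Python standard library are gathered. In a real deployment `run_probe` might wrap commands such as `rabbitmqctl status` and `rabbitmqctl cluster_status`; that wiring is an assumption and is not shown.

```python
import os
import shutil
import time
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class HealthData:
    """One health sample for a single RabbitMQ node (hypothetical layout)."""
    node: str
    collected_at: float
    service_status: str   # raw output of a service status probe
    cluster_status: str   # raw output of a cluster status probe
    os_metrics: Dict[str, float] = field(default_factory=dict)


def collect_os_metrics(path: str = "/") -> Dict[str, float]:
    """Collect a few OS indicators using only the standard library."""
    usage = shutil.disk_usage(path)
    metrics = {
        "cpu_count": float(os.cpu_count() or 0),
        "disk_used_ratio": usage.used / usage.total if usage.total else 0.0,
    }
    # System load is only available on Unix-like systems.
    if hasattr(os, "getloadavg"):
        try:
            metrics["load_1m"] = os.getloadavg()[0]
        except OSError:
            pass
    return metrics


def collect_health_data(node: str, run_probe: Callable[[str], str]) -> HealthData:
    """Build one sample; run_probe is an injected, hypothetical probe runner."""
    return HealthData(
        node=node,
        collected_at=time.time(),
        service_status=run_probe("status"),
        cluster_status=run_probe("cluster_status"),
        os_metrics=collect_os_metrics(),
    )
```

Injecting the probe runner keeps the collector testable without a live RabbitMQ node, for example `collect_health_data("rabbit01", lambda name: "ok")`.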

[0064] The monitoring server includes a data processing module and an alarm management module. The data processing module is connected to the information collection module, and the health data collected by the information collection module is uploaded and stored in the data processing...

Embodiment 2

[0077] The RabbitMQ cluster fault detection and recovery method of the present invention is implemented on the RabbitMQ cluster fault detection and recovery system disclosed in Embodiment 1 and comprises the following steps:

[0078] S100. Collect health data through the information collection module and upload the health data to the monitoring server, the health data being data related to the health status check of the RabbitMQ node;

[0079] S200. Through the abnormality detection module, detect and analyze the health data and the consistency of the RabbitMQ cluster status and queue metadata, obtain the detection and analysis result, and upload the result to the monitoring server;

[0080] S300. Through the monitoring server, generate an alarm message when the detection and analysis result is abnormal;

[0081] S400. Generate a processing result for the RabbitMQ node according to the detection and analysis result throu...
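Steps S200 and S300 can be illustrated with a small sketch. It assumes that each node's view of the queue names has already been collected separately (how is not shown), and it treats any queue known to one node but absent from another node's view as a metadata inconsistency. The names `DetectionResult`, `detect`, and `build_alarms` and the alarm wording are hypothetical; the source does not specify the actual comparison rules.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class DetectionResult:
    cluster_ok: bool
    # node -> queue names that other nodes know about but this node does not
    missing_queues: Dict[str, Set[str]]

    @property
    def abnormal(self) -> bool:
        return (not self.cluster_ok) or bool(self.missing_queues)


def detect(expected_nodes: Set[str],
           running_nodes: Set[str],
           queues_by_node: Dict[str, Set[str]]) -> DetectionResult:
    """S200 sketch: check cluster membership and queue metadata consistency."""
    cluster_ok = expected_nodes <= running_nodes
    # The union of every node's view is taken as the reference set.
    all_queues: Set[str] = set().union(*queues_by_node.values())
    missing = {
        node: all_queues - seen
        for node, seen in queues_by_node.items()
        if all_queues - seen
    }
    return DetectionResult(cluster_ok=cluster_ok, missing_queues=missing)


def build_alarms(result: DetectionResult) -> List[str]:
    """S300 sketch: turn an abnormal result into alarm messages."""
    alarms: List[str] = []
    if not result.cluster_ok:
        alarms.append("cluster status abnormal: not all expected nodes are running")
    for node in sorted(result.missing_queues):
        queues = ", ".join(sorted(result.missing_queues[node]))
        alarms.append(f"queue metadata inconsistent on {node}: missing {queues}")
    return alarms
```

The sketch covers the case the invention targets: the cluster status can be normal (`cluster_ok` is true) while `missing_queues` is non-empty, so the result is still flagged as abnormal and an alarm is produced.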


Abstract

The invention discloses a RabbitMQ cluster fault detection and recovery system and method, belonging to the field of cloud computing. It addresses the technical problem of how to identify and rapidly recover from the fault condition in which the RabbitMQ cluster state is normal but the cluster queue metadata is inconsistent. The system comprises an information collection module, an abnormality detection module, a monitoring server, a fault analysis and processing module, a recovery detection module and a data storage module. The method comprises the steps of: collecting health data; detecting and analyzing the health data and the consistency of the RabbitMQ cluster state and the queue metadata to obtain a detection and analysis result; generating alarm information when the detection and analysis result is abnormal; generating a processing result for the RabbitMQ node according to the detection and analysis result; and, after the fault processing is finished, verifying the availability of the RabbitMQ cluster.

Description

Technical field

[0001] The invention relates to the field of cloud computing, in particular to a RabbitMQ cluster fault detection and recovery system and method.

Background technique

[0002] The main features of AMQP (Advanced Message Queuing Protocol) are message orientation, queuing, routing (including point-to-point and publish/subscribe), reliability, and security. RabbitMQ is open-source message middleware that implements AMQP and is mainly used to store and forward messages in distributed systems. The RabbitMQ server is written in Erlang and supports clients in multiple languages, such as Java, Python, and C.

[0003] RabbitMQ provides two cluster modes: ordinary cluster mode and mirror cluster mode.

[0004] Ordinary cluster mode is the default cluster mode. Taking three nodes (rabbit01, rabbit02, rabbit03) as an example: for a queue, the message entity exists on only one of the nodes, rabbit01 (or rabbit02, or rabbi...


Application Information

IPC(8): H04L12/24, H04L29/08
CPC: H04L41/0631, H04L41/0654, H04L67/1097
Inventor: 宋伟, 蔡卫卫, 谢涛涛, 赖振
Owner: SHANDONG LANGCHAO YUNTOU INFORMATION TECH CO LTD