System and method for monitoring system availability based on statistical method

A technology in the field of system monitoring and statistical methods, applied to transmission systems, digital transmission systems, electrical components, etc. It addresses the false alarms of threshold-based methods, heavy alarm noise, and manual maintenance of monitoring items, with the effect of reducing false alarms and missed alarms and improving accuracy.

Active Publication Date: 2018-09-28
NANJING TUNIU TECHNOLOGY CO LTD

AI Technical Summary

Problems solved by technology

[0006] 1. The thresholds in method 1 and method 2 must be set manually. Thresholds vary greatly between systems, and even the same system needs different thresholds in different periods, so setting and maintaining thresholds is a heavy workload.
In practice, trial and error is generally used: the threshold is relaxed after a false positive and tightened after a false negative, so both the false positive rate and the false negative rate remain high.
[0007] 2. The monitoring in method 1 only partially reflects availability and cannot serve as an actual availability indicator. A detected abnormality does not necessarily mean that system availability has dropped, and when the system is unavailable, this is not always reflected in the monitored indicators.
[0008] 3. The monitoring in method 2 directly reflects availability, but because it is a sampling method, the number of samples is small and the coverage is narrow. It can generally monitor only read operations and is rarely used for write operations.
[0009] 4. When the system is large and complex, both monitoring methods involve too many indicators and produce a large number of alarms. This alarm noise interferes with diagnosing and locating the problem.
[0010] 5. When a new system or service is launched, or when system or service deployment changes, both monitoring methods require manual maintenance of monitoring items, so they are not suitable for systems with automatic failover and dynamic scaling of service capacity.
[0011] 6. In error rate monitoring and alarming, the threshold method often causes false alarms. For example, if the error rate must not exceed 1% and only one operation occurs and fails (an error rate of 100%), an alarm is raised, although in most such cases no alarm is warranted.
[0012] 7. When multiple systems in a complex system cluster fail at the same time, it is difficult to quickly locate the system that actually failed. Operators can only investigate everything indiscriminately, wasting precious time.
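
The small-sample false alarm described in problem 6 can be illustrated with a short Python sketch. The function name `naive_alarm` and the sample counts are hypothetical and chosen only for illustration; the 1% threshold is the one from the example in [0011].

```python
def naive_alarm(errors: int, total: int, threshold: float = 0.01) -> bool:
    """Fire when the observed error rate exceeds a fixed threshold."""
    if total == 0:
        return False  # no calls in the interval, nothing to judge
    return errors / total > threshold


# One call that fails gives a 100% error rate, so the alarm fires
# even though a single failure says little about availability.
print(naive_alarm(1, 1))      # True
# 20 failures out of 1000 calls (2%) fire the same alarm.
print(naive_alarm(20, 1000))  # True
# 5 failures out of 1000 calls (0.5%) stay below the threshold.
print(naive_alarm(5, 1000))   # False
```

A fixed rate threshold treats the first two cases identically because it ignores how many operations the rate was computed from.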

Detailed Description of Embodiments

[0088] The technical solutions provided by the present invention are described in detail below with reference to specific examples. It should be understood that the following specific embodiments are only used to illustrate the present invention and not to limit its scope. In addition, the steps shown in the flowcharts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions. Although a logical sequence is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.

[0089] We assume that the number of errors a system produces per unit time t is affected by many independent random factors, each of which generally has only a small influence, so it can be treated as a random variable that follows a normal distribution. The density function of the normal distribution is:

[0090] $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, where $\mu$ is the mean and $\sigma$ is the standard deviation.

[0091] By collecting the performance data of the system in the past per...
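
The modelling assumption in [0089], that the per-interval error count is approximately normal, can be sketched in Python as follows. This is only an illustration under stated assumptions: the rest of [0091] is not available here, so the decision rule (flag a count more than `k` standard deviations above the historical mean, with `k = 3`) is a common convention and not necessarily the rule the patent uses. The function names and the sample history are hypothetical.

```python
from statistics import mean, pstdev


def learn_baseline(history):
    """Estimate (mu, sigma) from historical per-interval error counts."""
    return mean(history), pstdev(history)


def is_abnormal(current, mu, sigma, k=3.0):
    """Flag a count that lies more than k standard deviations above the mean.

    ASSUMPTION: the k-sigma rule is illustrative; the source text does not
    state which rule is applied to the fitted distribution.
    """
    return current > mu + k * sigma


# Hypothetical error counts observed in eight past intervals of length t.
history = [2, 3, 1, 4, 2, 3, 2, 3]
mu, sigma = learn_baseline(history)

print(is_abnormal(4, mu, sigma))  # within the usual range
print(is_abnormal(9, mu, sigma))  # far above the usual range
```

Because the baseline is learned from each system's own history, no per-system threshold has to be entered by hand, which is the maintenance burden described in problem 1.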

Abstract

The invention proposes a system and method for monitoring system availability based on a statistical method. The system comprises an inter-system service call log module, an alarm threshold analysis module, an early-warning and alarm module, and a monitoring alarm display module. By collecting call logs between systems, it periodically performs statistical learning on historical data to obtain the typical behavior of each system. It then analyzes the data from the most recent time unit t to determine whether each system's current error count is abnormal, whether the error rate of calls between systems is abnormal, and whether the availability of each system and of its instances is abnormal. Abnormal systems and the call relationships between them are marked as alarms on a system topology graph. When alarm information is displayed, the graph shows the state of each system, of the calls between systems, and of each system's services and instances, so that faulty systems can be located quickly when many systems fail at once.
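
The flow described in the abstract (collect call logs, compare the latest interval against learned behavior, mark abnormal call relationships on the topology) can be sketched roughly as below. Everything in this sketch is an assumption made for illustration: the log record shape `(caller, callee, ok)`, the system names, the baseline values, and the `k`-sigma comparison are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


def aggregate_calls(log_records):
    """Count total and failed calls per (caller, callee) edge."""
    stats = defaultdict(lambda: {"total": 0, "errors": 0})
    for caller, callee, ok in log_records:
        edge = stats[(caller, callee)]
        edge["total"] += 1
        if not ok:
            edge["errors"] += 1
    return dict(stats)


def abnormal_edges(stats, baselines, k=3.0):
    """Return the edges whose error count exceeds mu + k * sigma."""
    flagged = []
    for edge, counts in stats.items():
        if edge not in baselines:
            continue  # no history yet; a real system needs a policy here
        mu, sigma = baselines[edge]
        if counts["errors"] > mu + k * sigma:
            flagged.append(edge)
    return sorted(flagged)


# Hypothetical call log for the most recent interval t.
logs = [("web", "order", False)] * 5 + [("web", "order", True)] * 20
logs += [("order", "pay", True)] * 10

# Hypothetical (mu, sigma) learned from history for each edge.
baselines = {("web", "order"): (1.0, 1.0), ("order", "pay"): (0.5, 0.5)}

stats = aggregate_calls(logs)
flagged = abnormal_edges(stats, baselines)
print(flagged)  # edges to highlight on the topology graph
```

Keying the statistics by caller and callee is what lets an alarm be drawn on a specific edge of the topology graph rather than reported as an isolated indicator.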

Description

Technical field

[0001] The invention belongs to the technical field of software system monitoring and relates to a system and method for monitoring system availability based on a statistical method.

Background art

[0002] Internet companies generally run a large number of application systems. In addition to public websites and apps, there are many internal application systems that support the operation and management of the enterprise. The calling relationships among internal application systems are often complex, and the functions that one system provides to another are called services. Application system availability monitoring in the industry generally adopts the following methods:

[0003] Method 1: Use tools such as zabbix to monitor certain indicators of the system server, such as the number of web system processes/threads, CPU load, available memory, the number of abnormal HTTP status codes, and request response time. When the indicator ex...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L12/24, H04L12/26
CPC: H04L41/0631, H04L41/065, H04L41/0681, H04L41/069, H04L41/22, H04L43/0817, H04L43/16
Inventor: 梅存兵
Owner: NANJING TUNIU TECHNOLOGY CO LTD