Application failure risk impact range analysis method and system
By dynamically analyzing the associated applications and confidence levels of the target application through the real-time call chain set in the log management platform, the shortcomings of application failure risk range analysis in multi-cross collaborative scenarios are solved, and the stability of the application system is improved.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents(China)
- Current Assignee / Owner: DIGITAL ZHEJIANG TECH OPERATION CO LTD
- Filing Date: 2022-08-30
- Publication Date: 2026-06-16
AI Technical Summary
Existing application failure risk range analysis methods cannot handle the dynamic changes of multi-cross collaborative scenarios, so associated applications with potential risks cannot be detected and warned of in time, which affects the stability of application systems.
Using the real-time call chain set in the log management platform, the associated applications of the target application and their confidence levels are identified. The support levels and the failure probability of the associated applications are calculated by formula, and the potential risk range is analyzed dynamically.
Potential risks of application failures in multi-cross collaborative scenarios are identified dynamically, which reduces the likelihood of failures in other applications and improves the stability of the entire application system.
Figure CN115437812B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of operation and maintenance technology, and in particular to a method and system for analyzing the impact range of application failure risks.
Background Technology
[0002] As digital transformation deepens, there are increasingly more cross-level, cross-regional, cross-system, cross-departmental, and cross-business collaborative scenarios between applications. Furthermore, the topology of applications and the topology between them has changed from traditional single dependencies to complex multi-system and multi-dimensional dependencies. In addition, more and more application systems are adopting agile development models to adapt to the rapid changes and iterations of digital transformation-related businesses, which makes the relationships between application systems also change dynamically.
[0003] Existing application failure risk scope analysis methods provide early warning only for failures or risks in a single scenario or on a single entity. When an application fails, these methods cannot obtain information about the other applications that are associated with the failed application and are therefore potentially at risk, which affects the stability of the entire application system.
Summary of the Invention
[0004] In view of this, the purpose of the present invention is to provide a method and system for analyzing the impact range of application failure risks. By identifying the correlation between applications in each call chain, the method determines information about other applications that pose potential risks when one or more applications fail, thereby reducing the likelihood of other applications failing and improving the stability of the entire application system.
[0005] In a first aspect, embodiments of the present invention provide a method for analyzing the impact range of application failure risks, comprising: determining the associated applications of a target application based on a real-time call chain set in a log management platform, wherein the associated applications are applications that belong to at least one same call chain as the target application; determining the confidence level of the associated applications of the target application based on the association relationship between each call chain in the real-time call chain set and the target application; and determining the scope of applications affected when the target application fails based on the confidence level of the associated applications of the target application.
[0006] Furthermore, the real-time call chain set in the log management platform is constructed in the following way: obtain the call chain contained in each log message in the log management platform; obtain all applications in the call chain, and generate a real-time call chain set according to the correspondence between each call chain and its corresponding application.
[0007] Furthermore, the step of determining the associated applications of the target application based on the real-time call chain set in the log management platform includes: obtaining the target call chain containing the target application in the real-time call chain set; and determining the applications that belong to at least one target call chain as associated applications of the target application.
[0008] Furthermore, the step of determining the confidence level of the associated applications of the target application based on the association relationship between each call chain in the real-time call chain set and the target application includes: calculating the support level of the target application and the support level between the target application and the associated applications based on the real-time call chain set; and calculating the confidence level of the associated applications based on the support level of the target application and the support level between the target application and the associated applications.
[0009] Furthermore, the support level of the target application and the support level between the target application and the associated applications are calculated using the following formulas:
[0010] $$P(X) = \frac{\mathrm{count}(X)}{\mathrm{count}(D)}$$
[0011] Where P(X) represents the support level of the target application; X represents the target application; count(X) represents the number of call chains in the real-time call chain set that contain the target application; D represents the real-time call chain set; count(D) represents the total number of call chains in the real-time call chain set.
[0012] $$P(X \cap Y) = \frac{\mathrm{count}(X \cap Y)}{\mathrm{count}(D)}$$
[0013] Where P(X∩Y) represents the support level between the target application and the associated application; count(X∩Y) represents the number of call chains in the real-time call chain set that contain both the target application and the associated application.
[0014] Furthermore, the confidence level of the associated application is calculated according to the following formula:
[0015] $$P(X \rightarrow Y) = \frac{P(X \cap Y)}{P(X)}$$
[0016] Where X represents the target application; Y represents the associated application; P(X→Y) represents the probability that the associated application will fail when the target application fails; P(X∩Y) represents the support level between the target application and the associated application; and P(X) represents the support level of the target application.
[0017] Furthermore, the step of determining the scope of applications affected when the target application fails, based on the confidence level of the associated applications of the target application, includes: comparing the confidence level of the associated applications with a preset confidence level; and determining the associated applications whose confidence level is greater than the preset confidence level as the scope of affected applications.
[0018] Furthermore, the target application includes at least one application.
[0019] In a second aspect, embodiments of the present invention provide a system for analyzing the impact range of application failure risks, comprising: an associated application determination module, used to determine the associated applications of a target application based on a real-time call chain set in a log management platform, wherein an associated application is an application that belongs to at least one same call chain as the target application; a confidence calculation module, used to determine the confidence level of the associated applications of the target application based on the association relationship between each call chain in the real-time call chain set and the target application; and an affected application scope determination module, used to determine the scope of applications affected when the target application fails, based on the confidence level of the associated applications of the target application.
[0020] In a third aspect, embodiments of the present invention provide an electronic device, including a memory and a processor, wherein the memory stores a computer program that can run on the processor, and the processor executes the computer program to implement the method described above.
[0021] This invention provides a method and system for analyzing the impact scope of application failure risks, comprising: determining the associated applications of a target application based on a real-time call chain set in a log management platform, wherein the associated applications are applications that belong to at least one same call chain as the target application; determining the confidence level of the associated applications of the target application based on the association relationship between each call chain in the real-time call chain set and the target application; and determining the scope of applications affected when the target application fails based on the confidence level of the associated applications of the target application. In this method, the associated applications of the target application are determined from the real-time call chains in the log management platform, so that the other applications that are potentially at risk when one or more applications fail in multi-cross collaborative scenarios can be determined dynamically, thereby reducing the likelihood of other applications failing and improving the stability of the entire application system.
[0022] Other features and advantages of the invention will be set forth in the description which follows, and will be apparent in part from the description, or may be learned by practicing the invention. The objects and other advantages of the invention are realized and obtained in accordance with the structures particularly pointed out in the description, claims and drawings.
[0023] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, preferred embodiments are described below in detail with reference to the accompanying drawings.
Attached Figure Description
[0024] To more clearly illustrate the specific embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the specific embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.
[0025] Figure 1 is a flowchart of the application failure risk impact range analysis method provided in Embodiment 1 of the present invention;
[0026] Figure 2 is a schematic diagram of the real-time call chain set provided in Embodiment 1 of the present invention;
[0027] Figure 3 is a schematic diagram of the application failure risk impact range analysis system provided in Embodiment 2 of the present invention.
[0028] Icons: 1-Associated application determination module; 2-Confidence calculation module; 3-Affected application scope determination module.
Detailed Implementation
[0029] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0030] In multi-collaboration scenarios and complex system architectures with multiple system dependencies and multi-dimensional dependencies, the target application for operation and maintenance (O&M) changes dynamically with the scenario. The target application itself also changes dynamically. However, existing methods for analyzing the scope of application failure risks are designed for single scenarios, single failures or risks, and single applications. When an application failure risk warning is issued, it cannot perform multi-dimensional, dynamic, and in-depth analysis of the scope of the application failure's impact and provide timely warnings and handling. For example, when monitoring the CPU utilization of an O&M element, if the CPU of a server is overloaded in a distributed cluster deployment application system, currently only a single service warning can be issued. For traditional IT system O&M, this monitoring technology can achieve good results. However, for the O&M of multi-system and multi-dimensional dependent application systems, a single monitoring O&M approach is obviously risky. It cannot dynamically and instantly discover the impact scope of a single object's failure or risk, potentially exposing related applications to potential risks and even generating secondary related risks.
[0031] In digital government systems, there are tens of thousands of items and applications. Each user's access generates a call chain. Different people handle different items and access different applications, which are associated with their corresponding backend applications, forming a complex set of call chains. When a certain application system or deployment resource experiences a failure risk, we can quickly identify the affected related application systems or deployment resources in the complex call chain based on the application failure risk impact scope analysis method.
[0032] To facilitate understanding of this embodiment, the embodiments of the present invention will be described in detail below.
[0033] Embodiment 1:
[0034] Figure 1 is a flowchart of the application failure risk impact range analysis method provided in Embodiment 1 of the present invention.
[0035] Referring to Figure 1, the application failure risk impact range analysis method includes:
[0036] Step S101: Determine the associated applications of the target application based on the real-time call chain set in the log management platform; wherein, the associated applications are those that belong to at least one same call chain as the target application.
[0037] Here, the target application and the associated applications each carry information including developer information, operation and maintenance information, contact person, application description, monitoring address, and application scenario. The target application includes at least one application. When the target application changes, its corresponding call chains also change.
[0038] In one embodiment, in step S101, the real-time call chain set in the log management platform is constructed in the following manner:
[0039] Obtain the call chain contained in each log message in the log management platform;
[0040] Retrieve all applications in the call chain and generate a real-time call chain set based on the correspondence between each call chain and its corresponding application.
[0041] Here, the log management platform stores log information of the user's actual operations, including the call chain of those operations. The call chain is generated based on the user's actual actions and is dynamically changing; the correspondence between each call chain and its corresponding application is also dynamically changing.
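The construction of the real-time call chain set described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the log record layout and the field names `chain_id` and `apps` are assumptions, since the text does not specify how a log message encodes its call chain.

```python
from collections import defaultdict


def build_call_chain_set(log_messages):
    """Map each call chain identifier to the set of applications on that chain."""
    chains = defaultdict(set)
    for message in log_messages:
        # "chain_id" and "apps" are hypothetical field names, not from the source.
        chains[message["chain_id"]].update(message["apps"])
    return dict(chains)


# Hypothetical log messages; two of them belong to the same call chain "t5".
logs = [
    {"chain_id": "t3", "apps": ["i3", "i5"]},
    {"chain_id": "t5", "apps": ["i5"]},
    {"chain_id": "t5", "apps": ["i6"]},
]
call_chain_set = build_call_chain_set(logs)
```

Because the mapping is rebuilt from whatever log messages are currently available, re-running the function on newer logs reflects the dynamic changes the paragraph above describes.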
[0042] In one embodiment, step S101, which involves determining the associated application of the target application based on the real-time call chain set in the log management platform, includes:
[0043] Obtain the target call chains, that is, the call chains in the real-time call chain set that contain the target application;
[0044] Applications that belong to at least one same target call chain as the target application are identified as associated applications of the target application.
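The two sub-steps of step S101 can be sketched as follows, under the assumption that each call chain is represented as a set of application names and that the target is a single application (the text also allows a target made up of several applications). The function name and the sample data are illustrative only.

```python
def associated_applications(call_chains, target):
    """Return every application that shares at least one call chain with `target`."""
    # Sub-step 1: keep only the target call chains (those containing the target).
    target_chains = [set(chain) for chain in call_chains if target in chain]
    # Sub-step 2: collect all other applications found on those chains.
    associated = set()
    for chain in target_chains:
        associated |= chain
    associated.discard(target)
    return associated


# Hypothetical call chains.
chains = [{"a", "b"}, {"a", "c"}, {"c", "d"}]
assoc_a = associated_applications(chains, "a")
assoc_d = associated_applications(chains, "d")
```

Here `d` is not associated with `a` because the two never appear on the same call chain, even though both share a chain with `c`.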
[0045] Step S102: Determine the confidence level of the associated application of the target application based on the association relationship between each call chain in the real-time call chain set and the target application.
[0046] In one embodiment, step S102, the step of determining the confidence level of the associated application of the target application based on the association relationship between each call chain in the real-time call chain set and the target application, includes:
[0047] Based on the real-time call chain set, calculate the support level of the target application and the support level between the target application and the associated applications;
[0048] Calculate the confidence level of the associated applications based on the support level of the target application and the support level between the target application and the associated applications.
[0049] In one embodiment, the support level of the target application is calculated according to the following formula (1), and the support level between the target application and the associated applications is calculated according to the following formula (2):
[0050] $$P(X) = \frac{\mathrm{count}(X)}{\mathrm{count}(D)} \qquad (1)$$
[0051] Where P(X) represents the support level of the target application; X represents the target application; count(X) represents the number of call chains in the real-time call chain set that contain the target application; D represents the real-time call chain set; and count(D) represents the total number of call chains in the real-time call chain set.
[0052] $$P(X \cap Y) = \frac{\mathrm{count}(X \cap Y)}{\mathrm{count}(D)} \qquad (2)$$
[0053] Where P(X∩Y) represents the support level between the target application and the associated application; count(X∩Y) represents the number of call chains in the real-time call chain set that contain both the target application and the associated application.
[0054] Here, the support level represents the probability that the target application or the associated applications appear in a call chain. A threshold can be set based on the actual situation; when the support level is not less than this threshold, the target application or associated application is considered a frequent itemset.
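The support calculation and the frequent-itemset check can be sketched as follows. The sketch assumes call chains are sets of application names; the sample set `D` and the threshold `MIN_SUPPORT` are hypothetical values chosen for illustration. Exact fractions are used so that the results can be compared without rounding.

```python
from fractions import Fraction


def support(call_chains, apps):
    """Fraction of call chains that contain every application in `apps`."""
    if not call_chains:
        raise ValueError("the call chain set is empty")
    wanted = set(apps)
    hits = sum(1 for chain in call_chains if wanted <= set(chain))
    return Fraction(hits, len(call_chains))


# Hypothetical call chain set with four chains.
D = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"c"}]
MIN_SUPPORT = Fraction(1, 2)  # hypothetical threshold, not from the source

p_a = support(D, {"a"})        # count(a) / count(D) = 3/4
p_ab = support(D, {"a", "b"})  # count(a∩b) / count(D) = 2/4
is_frequent = p_ab >= MIN_SUPPORT  # "not less than" the threshold
```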
[0055] In one embodiment, the confidence level of the associated application is calculated according to the following formula (3):
[0056] $$P(X \rightarrow Y) = \frac{P(X \cap Y)}{P(X)} \qquad (3)$$
[0057] Where X represents the target application; Y represents the associated application; P(X→Y) represents the probability that the associated application will fail when the target application fails; P(X∩Y) represents the support level between the target application and the associated application; and P(X) represents the support level of the target application.
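Because both support levels are divided by the same count(D), the confidence level reduces to count(X∩Y) / count(X). The sketch below computes it that way; the function name and the sample data are assumptions made for illustration, and call chains are again modeled as sets of application names.

```python
from fractions import Fraction


def confidence(call_chains, target, associated):
    """P(target -> associated) = count(target ∩ associated) / count(target)."""
    count_x = sum(1 for chain in call_chains if target in chain)
    count_xy = sum(
        1 for chain in call_chains if target in chain and associated in chain
    )
    if count_x == 0:
        raise ValueError("the target application appears in no call chain")
    return Fraction(count_xy, count_x)


# Hypothetical call chain set with four chains.
D = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"c"}]
conf_ab = confidence(D, "a", "b")  # 2 of the 3 chains containing "a" also contain "b"
conf_cb = confidence(D, "c", "b")  # 1 of the 3 chains containing "c" also contains "b"
```

Note that the measure is directional: `confidence(D, "b", "a")` is 1, since every chain containing `b` also contains `a`, while `conf_ab` is 2/3.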
[0058] Step S103: Based on the confidence level of the target application's associated applications, determine the scope of applications affected when the target application fails.
[0059] In one embodiment, step S103, determining the scope of applications affected when the target application fails based on the confidence level of the associated applications of the target application, includes:
[0060] Compare the confidence level of each associated application with the preset confidence level;
[0061] Determine the associated applications whose confidence level is greater than the preset confidence level as the scope of affected applications.
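The comparison in step S103 can be sketched as a simple filter. The confidence values and the threshold below are hypothetical; the only detail taken from the text is that an application is included when its confidence level is strictly greater than the preset confidence level.

```python
def affected_scope(confidences, preset_confidence):
    """Keep the associated applications whose confidence exceeds the preset level."""
    return {
        app
        for app, value in confidences.items()
        if value > preset_confidence  # strictly greater, as in step S103
    }


# Hypothetical confidence levels for three associated applications.
sample_confidences = {"a": 0.75, "b": 0.5, "c": 0.25}
scope = affected_scope(sample_confidences, 0.5)
```

Application `b` is excluded because its confidence level equals the preset value rather than exceeding it.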
[0062] Here, the scope of applications affected when the target application fails is determined based on the preset confidence level. The preset confidence level is set in advance according to the actual situation and includes a first preset confidence level and a second preset confidence level.
[0063] The associated applications whose confidence level is greater than the first preset confidence level are determined as the first target associated applications, and the set of first target associated applications is determined as the first influence range. It is then determined whether all first target associated applications meet the preset associated application requirements; if so, the first influence range is determined as the range of affected applications. The preset associated application requirements are set based on actual circumstances.
[0064] If any of the first target associated applications does not meet the preset associated application requirements, the confidence level of the first target associated applications is determined again based on dynamic iteration of the algorithm and compared with the second preset confidence level. The first target associated applications whose confidence level is greater than the second preset confidence level are determined as the second target associated applications, the set of second target associated applications is determined as the second influence range, and the second influence range is determined as the range of affected applications.
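The two-stage decision can be sketched as follows. The text does not specify what the preset associated application requirements are or how the dynamic iteration recomputes the confidence level, so both are passed in as hypothetical callables (`meets_requirement` and `recompute_confidence`); only the control flow follows the description.

```python
def two_stage_scope(confidences, first_preset, second_preset,
                    meets_requirement, recompute_confidence):
    """Return the affected range using the first, then the second preset level."""
    # First influence range: confidence greater than the first preset level.
    first_targets = {app for app, c in confidences.items() if c > first_preset}
    if all(meets_requirement(app) for app in first_targets):
        return first_targets
    # Otherwise re-evaluate each first target and apply the second preset level.
    return {
        app for app in first_targets
        if recompute_confidence(app) > second_preset
    }


# Hypothetical inputs: "b" fails the requirement check, which triggers stage two.
initial = {"a": 0.9, "b": 0.7, "c": 0.2}
updated = {"a": 0.95, "b": 0.6}
result = two_stage_scope(
    initial, 0.5, 0.8,
    meets_requirement=lambda app: app != "b",
    recompute_confidence=lambda app: updated[app],
)
```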
[0065] Specifically, the application system includes an application set $I = \{i_1, i_2, \ldots, i_m\}$, where $i$ represents an application in the application system and $m$ represents the number of applications contained in the application system. The target application set in the application system is $X = \{i_1, i_2, \ldots, i_p\}$, which contains $p$ applications; the associated application set in the application system is $Y = \{i_1, i_2, \ldots, i_q\}$, which contains $q$ applications; the target application set and the associated application set are subsets of the application set. The real-time call chain set is $D = \{t_1, t_2, \ldots, t_n\}$, where $t$ represents a real-time call chain, which is generated based on the log information of the user's actual operations and is a subset of $I$. Each $t$ corresponds to the call relationship of a specific business; the real-time call chain set consists of $n$ call chains.
[0066] In one embodiment, referring to Figure 2, the diagram shows a real-time call chain set that includes 6 call chains. Call chain 1 includes i1, i5, i3, i4, i2; call chain 2 includes i2, i5, i3, i4, i6; call chain 3 includes i3, i5; call chain 4 includes i4, i3, i5; call chain 5 includes i5, i6; and call chain 6 includes i6, i3.
[0067] Here, taking i5 as the target application, i5 appears in 5 of the 6 call chains in the real-time call chain set, so count(i5) = 5 and the support level of i5 is P(i5) = 5/6. i5 and i2 appear together in 2 call chains, so count(i5∩i2) = 2 and the support level of i5∩i2 is P(i5∩i2) = 2/6. Therefore, when i5 is at risk of failure, the probability that i2 is affected is P(i5→i2) = P(i5∩i2) / P(i5) = 2/5. Similarly, the probability that i1 is affected is P(i5→i1) = 1/5; the probability that i3 is affected is P(i5→i3) = 4/5; the probability that i4 is affected is P(i5→i4) = 3/5; and the probability that i6 is affected is P(i5→i6) = 2/5.
[0068] With the preset confidence level set to a value of at least 2/5 and less than 3/5, the affected application range is {i3, i4}. That is, when application i5 fails, applications i3 and i4 will be affected.
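The worked example can be reproduced end to end with the six call chains listed for Figure 2. The preset confidence level of 1/2 used below is an assumption chosen for illustration; any value of at least 2/5 and below 3/5 yields the same range.

```python
from fractions import Fraction

# The six call chains of the Figure 2 example, as sets of application names.
CALL_CHAINS = [
    {"i1", "i5", "i3", "i4", "i2"},
    {"i2", "i5", "i3", "i4", "i6"},
    {"i3", "i5"},
    {"i4", "i3", "i5"},
    {"i5", "i6"},
    {"i6", "i3"},
]
TARGET = "i5"
PRESET_CONFIDENCE = Fraction(1, 2)  # assumed value, not taken from the text

# Call chains that contain the target application.
target_chains = [chain for chain in CALL_CHAINS if TARGET in chain]

# Count how often each associated application shares a chain with the target.
shared_counts = {}
for chain in target_chains:
    for app in chain - {TARGET}:
        shared_counts[app] = shared_counts.get(app, 0) + 1

# Confidence = count(target ∩ app) / count(target).
confidences = {
    app: Fraction(count, len(target_chains))
    for app, count in shared_counts.items()
}
affected = sorted(app for app, c in confidences.items() if c > PRESET_CONFIDENCE)
print(affected)  # ['i3', 'i4']
```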
[0069] This invention provides a method for analyzing the impact scope of application failure risks, comprising: determining the associated applications of a target application based on a real-time call chain set in a log management platform, wherein the associated applications are applications that belong to at least one same call chain as the target application; determining the confidence level of the associated applications of the target application based on the association relationship between each call chain in the real-time call chain set and the target application; and determining the scope of applications affected when the target application fails based on the confidence level of the associated applications of the target application. In this method, the associated applications of the target application are determined from the real-time call chains in the log management platform, so that the other applications that are potentially at risk when one or more applications fail in multi-cross collaborative scenarios can be determined dynamically, thereby reducing the likelihood of other applications failing and improving the stability of the entire application system.
[0070] Embodiment 2:
[0071] Figure 3 is a schematic diagram of the application failure risk impact range analysis system provided in Embodiment 2 of the present invention.
[0072] Referring to Figure 3, the application failure risk impact range analysis system includes:
[0073] The associated application determination module 1 is used to determine the associated applications of the target application based on the real-time call chain set in the log management platform; wherein, the associated application is an application that belongs to at least one same call chain as the target application;
[0074] The confidence calculation module 2 is used to determine the confidence level of the associated applications of the target application based on the association relationship between each call chain in the real-time call chain set and the target application;
[0075] The affected application scope determination module 3 is used to determine the scope of applications affected when the target application fails, based on the confidence level of the associated applications of the target application.
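One way to picture the three modules is as three methods of a single class, each consuming the output of the previous one. This is only a structural sketch: the class name, method names, and sample data are hypothetical, and a single preset confidence level is assumed.

```python
from fractions import Fraction


class ImpactAnalysisSystem:
    """Illustrative grouping of modules 1, 2 and 3 (names are hypothetical)."""

    def __init__(self, call_chains, preset_confidence):
        self.call_chains = [set(chain) for chain in call_chains]
        self.preset_confidence = preset_confidence

    def determine_associated(self, target):
        # Module 1: applications sharing at least one call chain with the target.
        found = set()
        for chain in self.call_chains:
            if target in chain:
                found |= chain
        found.discard(target)
        return found

    def calculate_confidence(self, target):
        # Module 2: count(target ∩ app) / count(target) per associated application.
        target_chains = [c for c in self.call_chains if target in c]
        return {
            app: Fraction(sum(1 for c in target_chains if app in c),
                          len(target_chains))
            for app in self.determine_associated(target)
        }

    def determine_affected_scope(self, target):
        # Module 3: keep applications above the preset confidence level.
        return {
            app
            for app, value in self.calculate_confidence(target).items()
            if value > self.preset_confidence
        }


system = ImpactAnalysisSystem([{"a", "b"}, {"a", "b", "c"}, {"a"}], Fraction(1, 2))
scope_a = system.determine_affected_scope("a")
```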
[0076] This invention provides an application failure risk impact scope analysis system which determines the associated applications of a target application based on a real-time call chain set in a log management platform, wherein the associated applications are applications that belong to at least one same call chain as the target application; determines the confidence level of the associated applications of the target application based on the association relationship between each call chain in the real-time call chain set and the target application; and determines the scope of applications affected when the target application fails based on the confidence level of the associated applications of the target application. In this system, the associated applications of the target application are determined from the real-time call chains in the log management platform, so that the other applications that are potentially at risk when one or more applications fail in multi-cross collaborative scenarios can be determined dynamically, thereby reducing the likelihood of other applications failing and improving the stability of the entire application system.
[0077] This invention also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the steps of the application failure risk impact range analysis method provided in the above embodiments.
[0078] The computer program product provided in the embodiments of the present invention includes a computer-readable storage medium storing program code. The instructions included in the program code can be used to execute the methods described in the preceding method embodiments. For specific implementation, please refer to the method embodiments, which will not be repeated here.
[0079] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working process of the system and apparatus described above can be referred to the corresponding process in the foregoing method embodiments, and will not be repeated here.
[0080] Furthermore, in the description of the embodiments of the present invention, unless otherwise explicitly specified and limited, the terms "installation," "connection," and "linking" should be interpreted broadly. For example, they can refer to a fixed connection, a detachable connection, or an integral connection; they can refer to a mechanical connection or an electrical connection; they can refer to a direct connection or an indirect connection through an intermediate medium; and they can refer to the internal connection of two components. Those skilled in the art can understand the specific meaning of the above terms in the present invention based on the specific circumstances.
[0081] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, essentially, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0082] In the description of this invention, it should be noted that the terms "center," "upper," "lower," "left," "right," "vertical," "horizontal," "inner," and "outer," etc., indicate the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings. They are used only for the convenience of describing the invention and for simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation. Therefore, they should not be construed as limitations on the invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and should not be construed as indicating or implying relative importance.
[0083] Finally, it should be noted that the above-described embodiments are merely specific implementations of the present invention, used to illustrate the technical solutions of the present invention, and not to limit it. The scope of protection of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person skilled in the art can still modify or easily conceive of changes to the technical solutions described in the foregoing embodiments within the technical scope disclosed in the present invention, or make equivalent substitutions for some of the technical features; and these modifications, changes, or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention, and should all be covered within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.
Claims
1. A method for analyzing the impact range of application failure risks, characterized in that it comprises: determining the associated applications of a target application based on the real-time call chain set in the log management platform, wherein an associated application is an application that appears together with the target application in at least one call chain; determining the confidence level of each associated application of the target application based on the association relationship between each call chain in the real-time call chain set and the target application; and determining, based on the confidence levels of the associated applications of the target application, the range of applications affected when the target application fails; wherein the step of determining the range of applications affected when the target application fails, based on the confidence levels of the associated applications of the target application, comprises: comparing the confidence level of each associated application with a preset confidence level; and determining the associated applications whose confidence level is greater than the preset confidence level as the affected application range; wherein the preset confidence level comprises a first preset confidence level and a second preset confidence level; the confidence level of each associated application is compared with the first preset confidence level; the associated applications whose confidence level is greater than the first preset confidence level are determined as first target associated applications, and the set of the first target associated applications is determined as a first influence range; whether the first target associated applications meet preset associated application requirements is determined; if the first target associated applications meet the preset associated application requirements, the first influence range is determined as the range of applications affected when the target application fails;
if any of the first target associated applications does not meet the preset associated application requirements, the confidence levels of the first target associated applications are determined by dynamic algorithm iteration and compared with the second preset confidence level; the first target associated applications whose confidence level is greater than the second preset confidence level are determined as second target associated applications, the set of the second target associated applications is determined as a second influence range, and the second influence range is determined as the range of applications affected when the target application fails.
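The two-stage threshold comparison recited in claim 1 can be sketched roughly as follows. This is only an illustration under stated assumptions: the claim does not define the "preset associated application requirements" or the "dynamic algorithm iteration", so both are modeled here as hypothetical caller-supplied functions (`meets_requirements` and `recompute_confidence`), and all names and numeric values are invented for the example.

```python
def impact_range(confidences, first_threshold, second_threshold,
                 meets_requirements, recompute_confidence):
    """Return the set of applications affected when the target application fails.

    confidences: dict mapping associated application -> confidence level.
    meets_requirements: hypothetical predicate standing in for the
        "preset associated application requirements".
    recompute_confidence: hypothetical function standing in for the
        "dynamic algorithm iteration" that re-determines a confidence level.
    """
    # First influence range: confidence strictly greater than the first threshold.
    first_range = {app for app, c in confidences.items() if c > first_threshold}

    # If every first target associated application meets the requirements,
    # the first influence range is the result.
    if all(meets_requirements(app) for app in first_range):
        return first_range

    # Otherwise re-determine the confidence of the first target associated
    # applications and compare against the second threshold.
    refined = {app: recompute_confidence(app) for app in first_range}
    return {app for app, c in refined.items() if c > second_threshold}


# Illustrative values only.
confidences = {"B": 0.9, "C": 0.6, "D": 0.2}
refined_values = {"B": 0.85, "C": 0.55}

all_ok = impact_range(confidences, 0.5, 0.7,
                      lambda app: True, refined_values.get)
one_fails = impact_range(confidences, 0.5, 0.7,
                         lambda app: app != "C", refined_values.get)
```

In this sketch, `all_ok` keeps the whole first influence range, while `one_fails` falls through to the second comparison because application `C` is treated as not meeting the (hypothetical) requirements.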
2. The method for analyzing the impact range of application failure risks according to claim 1, characterized in that the real-time call chain set in the log management platform is constructed as follows: obtaining the call chain contained in each log message in the log management platform; and obtaining all applications in each call chain, and generating the real-time call chain set according to the correspondence between each call chain and the applications it contains.
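As a minimal sketch of the construction in claim 2, the real-time call chain set can be modeled as a mapping from a call chain identifier to the set of applications on that chain. The log message layout used here (a dict with `chain_id` and `apps` fields) is an assumption made for illustration; the claim does not specify a log format.

```python
from collections import defaultdict


def build_call_chain_set(log_messages):
    """Map each call chain ID to the set of applications seen on that chain."""
    chains = defaultdict(set)
    for message in log_messages:
        # "chain_id" and "apps" are hypothetical field names.
        chains[message["chain_id"]].update(message["apps"])
    return dict(chains)


# Illustrative log messages only.
logs = [
    {"chain_id": "t1", "apps": ["A", "B"]},
    {"chain_id": "t2", "apps": ["A", "C"]},
    {"chain_id": "t1", "apps": ["B", "D"]},
    {"chain_id": "t3", "apps": ["B"]},
]
call_chain_set = build_call_chain_set(logs)
```

Using sets means that an application logged several times on the same chain is recorded once, which keeps the later per-chain counting simple.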
3. The method for analyzing the impact range of application failure risks according to claim 1, characterized in that the step of determining the associated applications of the target application based on the real-time call chain set in the log management platform comprises: obtaining the target call chains in the real-time call chain set, a target call chain being a call chain that contains the target application; and identifying the applications that appear together with the target application in at least one of the target call chains as the associated applications of the target application.
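The lookup in claim 3 can be sketched as below, assuming the call chain set is held as a dict from a chain identifier to a set of application names. That representation, and the names used, are illustrative assumptions rather than anything the claims prescribe.

```python
def associated_applications(call_chain_set, target):
    """Return applications that share at least one call chain with `target`."""
    # Target call chains: those that contain the target application.
    target_chains = [apps for apps in call_chain_set.values() if target in apps]

    associated = set()
    for apps in target_chains:
        associated |= apps
    # The target application is not its own associated application.
    associated.discard(target)
    return associated


# Illustrative data only.
chains = {"t1": {"A", "B"}, "t2": {"A", "C"}, "t3": {"B", "D"}}
related_to_a = associated_applications(chains, "A")
```

Here `D` is not associated with `A`, because the only chain containing `D` does not contain `A`.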
4. The method for analyzing the impact range of application failure risks according to claim 1, characterized in that the step of determining the confidence level of the associated application of the target application based on the association relationship between each call chain in the real-time call chain set and the target application comprises: calculating, based on the real-time call chain set, the support of the target application and the support between the target application and the associated application; and calculating the confidence level of the associated application based on the support of the target application and the support between the target application and the associated application.
5. The method for analyzing the impact range of application failure risks according to claim 4, characterized in that the support of the target application and the support between the target application and the associated application are calculated using the following formulas: $\mathrm{Support}(A) = \frac{\mathrm{count}(A)}{|D|}$, where $\mathrm{Support}(A)$ denotes the support of the target application; $A$ denotes the target application; $\mathrm{count}(A)$ denotes the number of call chains in the real-time call chain set that contain the target application; $D$ denotes the set of call chains in the real-time call chain set; and $|D|$ denotes the total number of call chains in that set; and $\mathrm{Support}(A, B) = \frac{\mathrm{count}(A, B)}{|D|}$, where $\mathrm{Support}(A, B)$ denotes the support between the target application and the associated application $B$; and $\mathrm{count}(A, B)$ denotes the number of call chains in the real-time call chain set that contain both the target application and the associated application.
6. The method for analyzing the impact range of application failure risks according to claim 5, characterized in that the confidence level of the associated application is calculated using the following formula: $P(B \mid A) = \frac{\mathrm{Support}(A, B)}{\mathrm{Support}(A)}$, where $A$ denotes the target application; $B$ denotes the associated application; $P(B \mid A)$ denotes the probability that the associated application fails when the target application fails; $\mathrm{Support}(A, B)$ denotes the support between the target application and the associated application; and $\mathrm{Support}(A)$ denotes the support of the target application.
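The support and confidence calculations of claims 4 to 6 follow the usual association-rule pattern: support is the fraction of call chains that contain the application (or both applications), and confidence is the ratio of the two supports. The sketch below assumes the call chain set is a list of sets of application names; the function names and the sample data are illustrative only.

```python
def support(call_chains, *apps):
    """Fraction of call chains that contain every application in `apps`."""
    if not call_chains:
        return 0.0
    hits = sum(1 for chain in call_chains if all(a in chain for a in apps))
    return hits / len(call_chains)


def confidence(call_chains, target, associated):
    """Support of (target, associated) divided by support of target."""
    s_target = support(call_chains, target)
    if s_target == 0:
        # The target appears in no chain, so no confidence can be derived.
        return 0.0
    return support(call_chains, target, associated) / s_target


# Illustrative call chains only.
call_chains = [{"A", "B"}, {"A", "B"}, {"A", "C"}, {"B", "D"}]

s_a = support(call_chains, "A")          # 3 of 4 chains contain A
s_ab = support(call_chains, "A", "B")    # 2 of 4 chains contain A and B
c_ab = confidence(call_chains, "A", "B")
c_ac = confidence(call_chains, "A", "C")
```

With these four chains, `A` appears in three, `A` and `B` appear together in two, and `A` and `C` appear together in one, so the confidence for `B` is twice that for `C`.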
7. The method for analyzing the impact range of application failure risks according to any one of claims 1-6, characterized in that the target application includes at least one application.
8. A system for analyzing the impact range of application failure risks, characterized in that the system, applied to the method for analyzing the impact range of application failure risks according to any one of claims 1-7, comprises: an associated application determination module, used to determine the associated applications of a target application based on the real-time call chain set in the log management platform, wherein an associated application is an application that appears together with the target application in at least one call chain; a confidence calculation module, used to determine the confidence level of each associated application of the target application based on the association relationship between each call chain in the real-time call chain set and the target application; and an affected application range determination module, used to determine the range of applications affected when the target application fails, based on the confidence levels of the associated applications of the target application.
9. An electronic device comprising a memory and a processor, wherein the memory stores a computer program executable on the processor, characterized in that, when the processor executes the computer program, the method according to any one of claims 1 to 7 is implemented.