Fault location methods and devices
By constructing a fault location model based on LSTM and BP neural networks, the fault labels and root causes of data center business systems are automatically identified, solving the problem of difficult fault location in existing technologies and achieving efficient fault handling and improved system stability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA MOBILE INFORMATION TECHNOLOGY CO LTD
- Filing Date
- 2022-03-25
- Publication Date
- 2026-06-30
AI Technical Summary
In existing data center business systems, faults are difficult to locate and their source is hard to determine. Troubleshooting is time-consuming, labor-intensive, and inefficient, and it relies on the experience and knowledge of operation and maintenance experts.
A fault location model based on a Long Short-Term Memory (LSTM) network and a backpropagation (BP) neural network is constructed. Through the label recommendation model and the fault location model, machine learning techniques are used to automatically identify fault labels and root causes.
Quickly locate the root cause of the fault, save time, reduce manual intervention, improve fault handling efficiency, reduce economic losses, and improve system stability.
Figure CN116860529B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of computer technology, specifically to a fault location method and apparatus.
Background Technology
[0002] Currently, data center business systems are deployed in a distributed manner across three layers: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). A business system is deployed on multiple virtual machines in a cluster and uses many third-party components. When a failure occurs, it is difficult to pinpoint which layer caused it. Generally, operations and maintenance experts determine the cause of the anomaly based on their experience and knowledge, check the relevant machines based on error codes, and examine each node one by one through logs, status, and other information. Sometimes it is even necessary to convene expert group meetings to jointly diagnose the problem, which is time-consuming and labor-intensive.
Summary of the Invention
[0003] This application provides a fault location method and apparatus to solve the technical problem that faults in data center business systems are difficult and slow to locate.
[0004] In a first aspect, embodiments of this application provide a fault location method, including:
[0005] Obtain the target fault description statement, input the target fault description statement into the label recommendation model, and obtain at least one fault label output by the label recommendation model;
[0006] Input the at least one fault label into the fault location model to obtain the root cause of the fault corresponding to the target fault description statement output by the fault location model;
[0007] The label recommendation model is obtained by training a long short-term memory (LSTM) network model based on fault description statement samples, fault label samples corresponding to the fault description statement samples, and a weighted binary cross-entropy loss function.
[0008] The fault location model is obtained by training a backpropagation (BP) neural network model based on the fault label samples and the corresponding fault root cause samples.
[0009] In one embodiment, the label recommendation model includes a word embedding layer, a bidirectional LSTM layer, and an attention layer;
[0010] The step of inputting the target fault description statement into the label recommendation model to obtain at least one fault label output by the label recommendation model includes:
[0011] The target fault description statement is input into the word embedding layer to obtain the word vectors corresponding to each word in the target fault description statement output by the word embedding layer.
[0012] The word vectors are input into the bidirectional LSTM layer, and forward LSTM and backward LSTM processing are performed to obtain the hidden layer states corresponding to each word output by the bidirectional LSTM layer.
[0013] The hidden layer state is input into the attention layer to determine the weight corresponding to each hidden layer state, and the vector representation of the target fault description statement is determined based on the hidden layer state and the weight corresponding to each hidden layer state.
[0014] The confidence probability of each candidate fault label is determined based on the vector representation of the target fault description statement, and the at least one fault label output by the label recommendation model is obtained based on the magnitude of the confidence probability of each candidate fault label.
[0015] In one embodiment, determining the weight corresponding to each hidden layer state includes:
[0016] Based on the weight matrix, bias vector, and each hidden layer state, the energy value of each hidden layer state is determined;
[0017] The weights corresponding to each hidden layer state are determined based on the energy value of each hidden layer state and the initial attention matrix.
[0018] In one embodiment, the method further includes:
[0019] Obtain the fault label sample and the corresponding fault root cause sample;
[0020] The fault label samples and the corresponding fault root cause samples are input into the LSTM model for training to obtain the positive and negative fault labels output by the LSTM model.
[0021] Based on the positive category fault label and the negative category fault label, the positive category loss and the negative category loss are calculated using the weighted binary cross-entropy loss function, and the total loss is obtained based on the positive category loss and the negative category loss.
[0022] If the total loss is less than a first threshold, the parameters of the LSTM model are saved to obtain the label recommendation model.
[0023] In one embodiment, the method further includes:
[0024] Determine the input layer and the output layer;
[0025] Based on the number of nodes in the input layer and the number of nodes in the output layer, the number of neurons is determined using the first formula, and the hidden layer is determined based on the number of neurons.
[0026] The BP neural network model is obtained based on the input layer, the output layer, and the hidden layer;
[0027] The first formula is:
[0028] $l = \sqrt{n + m} + a$
[0029] Where l is the number of neurons, n is the number of nodes in the input layer, m is the number of nodes in the output layer, and a is the adjustment parameter.
[0030] In a second aspect, embodiments of this application provide a fault location device, comprising:
[0031] The fault label determination module is used to: obtain a target fault description statement, input the target fault description statement into a label recommendation model, and obtain at least one fault label output by the label recommendation model;
[0032] The fault root cause determination module is used to: input the at least one fault label into the fault location model to obtain the fault root cause corresponding to the target fault description statement output by the fault location model;
[0033] The label recommendation model is obtained by training a long short-term memory (LSTM) network model based on fault description statement samples, fault label samples corresponding to the fault description statement samples, and a weighted binary cross-entropy loss function.
[0034] The fault location model is obtained by training a backpropagation (BP) neural network model based on the fault label samples and the corresponding fault root cause samples.
[0035] In one embodiment, the label recommendation model includes a word embedding layer, a bidirectional LSTM layer, and an attention layer;
[0036] The step of inputting the target fault description statement into the label recommendation model to obtain at least one fault label output by the label recommendation model includes:
[0037] The target fault description statement is input into the word embedding layer to obtain the word vectors corresponding to each word in the target fault description statement output by the word embedding layer.
[0038] The word vectors are input into the bidirectional LSTM layer, and forward LSTM and backward LSTM processing are performed to obtain the hidden layer states corresponding to each word output by the bidirectional LSTM layer.
[0039] The hidden layer state is input into the attention layer to determine the weight corresponding to each hidden layer state, and the vector representation of the target fault description statement is determined based on the hidden layer state and the weight corresponding to each hidden layer state.
[0040] The confidence probability of each candidate fault label is determined based on the vector representation of the target fault description statement, and the at least one fault label output by the label recommendation model is obtained based on the magnitude of the confidence probability of each candidate fault label.
[0041] In a third aspect, embodiments of this application provide an electronic device, including a processor and a memory storing a computer program, wherein the processor executes the program to implement the fault location method described in the first aspect.
[0042] In a fourth aspect, embodiments of this application provide a non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the fault location method described in the first aspect.
[0043] In a fifth aspect, embodiments of this application provide a computer program product, including a computer program that, when executed by a processor, implements the fault location method described in the first aspect.
[0044] The fault location method and apparatus provided in this application construct a label recommendation model and a fault location model through machine learning. When a fault occurs, the maintenance personnel only need to input the description data of the alarm problem into the label recommendation model to obtain the alarm label corresponding to the description of the alarm problem. The alarm label can then be input into the fault location model to quickly find the root cause of the fault. This saves fault location time, removes the dependence of fault location on experience, and shortens the fault handling time.
Description of the Drawings
[0045] To more clearly illustrate the technical solutions in this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0046] Figure 1 is the first flowchart of the fault location method provided in this application;
[0047] Figure 2 is a schematic diagram of the label recommendation model provided in this application;
[0048] Figure 3 is a flowchart, provided in this application, of inputting the target fault description statement into the word embedding layer to obtain the word vectors corresponding to each word in the target fault description statement output by the word embedding layer;
[0049] Figure 4 is a flowchart of determining the BP neural network provided in this application;
[0050] Figure 5 is the second flowchart of the fault location method provided in this application;
[0051] Figure 6 is a schematic diagram of the fault location device provided in this application;
[0052] Figure 7 is a schematic diagram of the structure of the electronic device provided in this application.
Detailed Implementation
[0053] To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions of this application will be clearly and completely described below with reference to the accompanying drawings of the embodiments. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0054] Figure 1 is the first flowchart of the fault location method provided in this application. Referring to Figure 1, this application provides a fault location method, which may include steps 100 and 101.
[0055] Step 100: Obtain the target fault description statement, input the target fault description statement into the label recommendation model, and obtain at least one fault label output by the label recommendation model.
[0056] Optionally, the target fault description statement can be a statement obtained from the operation and maintenance alarm system, or it can be a manually edited statement describing the fault that has occurred.
[0057] The target fault description statement includes some situations when the fault occurs, such as the system page failing to load, an error occurring when clicking submit within the system, or a network connection failure message.
[0058] Before inputting the target fault description statement into the label recommendation model, the target fault description statement needs to be preprocessed, including word segmentation and stop word removal.
[0059] Word segmentation means dividing the target fault description statement into individual words or phrases according to rules, while stop word removal means removing meaningless articles, prepositions, and modal particles from the segmented words or phrases.
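The preprocessing described above can be sketched as follows. This is a minimal illustration, assuming English text that can be split on non-alphanumeric characters and a small hypothetical stop word list (`STOP_WORDS`); the application does not name a segmenter, and Chinese-language descriptions would need a dedicated word segmentation tool.

```python
import re

# Hypothetical stop word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "in", "on", "of", "to", "when", "is"}


def preprocess(statement: str) -> list[str]:
    """Segment a fault description into lowercase words and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", statement.lower())
    return [t for t in tokens if t not in STOP_WORDS]


print(preprocess("The system page fails to load when clicking submit"))
# ['system', 'page', 'fails', 'load', 'clicking', 'submit']
```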
[0060] The label recommendation model is obtained by training a long short-term memory (LSTM) network model based on fault description statement samples, fault label samples corresponding to the fault description statement samples, and a weighted binary cross-entropy loss function.
[0061] Optionally, a Bi-LSTM model based on an attention mechanism is constructed based on the relationship between fault description statements and fault label samples in historical fault problem data, and the model is trained and tested using the training set and test set in S1 to obtain a label recommendation model.
[0062] Figure 2 is a schematic diagram of the structure of the label recommendation model provided in the embodiments of this application. As shown in Figure 2, in some embodiments, the label recommendation model includes a word embedding layer, a Bi-LSTM layer, and an attention layer, and the activation function used during training is Sigmoid.
[0063] Figure 3 is a flowchart, provided in this embodiment of the application, of inputting the target fault description statement into the word embedding layer to obtain the word vectors corresponding to each word in the target fault description statement output by the word embedding layer. As shown in Figure 3, step 100 includes steps 300, 301, 302 and 303.
[0064] Step 300: Input the target fault description statement into the word embedding layer to obtain the word vectors corresponding to each word in the target fault description statement output by the word embedding layer.
[0065] Optionally, the word embedding layer, also called the embedding layer, is used to obtain the vector representation of each preprocessed alarm problem description text. The embedding layer utilizes the previously trained Word2vec word vector model to query the word vector of each word or phrase and combine them into a sentence vector.
[0066] During training, the Trainable parameter of the Embedding layer can be set to True to perform backpropagation updates on the word vectors.
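The lookup performed by the word embedding layer can be illustrated with a plain dictionary. The table `WORD_VECTORS`, its 3-dimensional values, and the zero-vector fallback for unknown words are all assumptions made for this sketch; a real system would query a trained Word2vec model instead.

```python
# Hypothetical 3-dimensional vectors standing in for a trained Word2vec table.
WORD_VECTORS = {
    "page": [0.2, -0.1, 0.4],
    "load": [0.0, 0.3, -0.2],
}
DIM = 3


def embed(tokens):
    """Map each token to its word vector; unknown tokens get a zero vector (assumption)."""
    return [WORD_VECTORS.get(token, [0.0] * DIM) for token in tokens]


sentence = embed(["page", "timeout", "load"])  # one vector per token, in order
```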
[0067] Step 301: Input the word vector into the bidirectional LSTM layer, perform forward LSTM processing and backward LSTM processing, and obtain the hidden layer state corresponding to each word output by the bidirectional LSTM layer.
[0068] Given a target fault description statement, the word embedding layer yields the word vector representation $q = [x_1, x_2, \ldots, x_n]$, where $x_1, x_2, \ldots, x_n$ are the word vectors of the words or phrases in the target fault description statement.
[0069] This word vector representation is fed into the Bi-LSTM layer to extract features. The hidden layer unit state $h_i$ is the output obtained after the word vector at time step $i$ is processed by the Bi-LSTM layer.
[0070] Optionally, $h_i$ is formed by combining the result of forward LSTM processing, $\overrightarrow{h_i}$, and the result of backward LSTM processing, $\overleftarrow{h_i}$.
[0071] Here, $\overrightarrow{h_i}$ is calculated from the cell state $\overrightarrow{c_{i-1}}$ and hidden layer state $\overrightarrow{h_{i-1}}$ of the previous LSTM unit and the current word vector input $x_i$; $\overleftarrow{h_i}$ is obtained from the cell state $\overleftarrow{c_{i+1}}$ and hidden layer state $\overleftarrow{h_{i+1}}$ of the next LSTM unit and the current word vector input $x_i$.
[0072] $\overrightarrow{h_i}$ and $\overleftarrow{h_i}$ are calculated as follows:
[0073] $\overrightarrow{h_i} = f_{(LSTM)}(\overrightarrow{c_{i-1}}, \overrightarrow{h_{i-1}}, x_i)$
[0074] $\overleftarrow{h_i} = f_{(LSTM)}(\overleftarrow{c_{i+1}}, \overleftarrow{h_{i+1}}, x_i)$
[0075] where $f_{(LSTM)}$ denotes the LSTM algorithm.
[0076] The final hidden layer unit state $h_i$ at the $i$-th time step is obtained by concatenating $\overrightarrow{h_i}$ and $\overleftarrow{h_i}$, that is, $h_i = [\overrightarrow{h_i}, \overleftarrow{h_i}]$. In this way, this embodiment considers the context of the entire target fault description statement when extracting semantic features, which allows the long-sequence information of the problem text to be described more accurately.
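The forward and backward passes and their concatenation can be sketched in plain Python. This is an illustrative toy rather than the trained model: the weights are random, the helper names (`make_params`, `lstm_step`, `run`, `bi_lstm`) and the hidden size are assumptions, and the standard LSTM gate equations are used for the LSTM step.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_params(input_dim, hidden_dim, seed):
    """Random weights and zero biases for the input, forget, output and candidate gates."""
    rng = random.Random(seed)
    cols = input_dim + hidden_dim
    return {g: ([[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(hidden_dim)],
                [0.0] * hidden_dim) for g in "ifog"}


def lstm_step(params, c_prev, h_prev, x):
    """One LSTM step: (previous cell state, previous hidden state, input) -> (c, h)."""
    z = x + h_prev  # list concatenation [x_i, h_prev]

    def gate(name, act):
        W, b = params[name]
        return [act(sum(w * v for w, v in zip(row, z)) + bj) for row, bj in zip(W, b)]

    i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
    g = gate("g", math.tanh)
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return c, h


def run(params, xs, hidden_dim):
    """Run one direction over the sequence and collect the hidden state at each step."""
    c, h, states = [0.0] * hidden_dim, [0.0] * hidden_dim, []
    for x in xs:
        c, h = lstm_step(params, c, h, x)
        states.append(h)
    return states


def bi_lstm(xs, hidden_dim=2):
    """Concatenate forward and backward hidden states for every time step."""
    forward = run(make_params(len(xs[0]), hidden_dim, seed=0), xs, hidden_dim)
    backward = run(make_params(len(xs[0]), hidden_dim, seed=1), xs[::-1], hidden_dim)[::-1]
    return [f + b for f, b in zip(forward, backward)]


states = bi_lstm([[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]])  # 3 states of length 4
```

Each returned state has twice the hidden size, because the forward half summarizes the words up to position i and the backward half summarizes the words from position i onward.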
[0077] Step 302: Input the hidden layer state into the attention layer, determine the weight corresponding to each hidden layer state, and determine the vector representation of the target fault description statement based on the hidden layer state and the weight corresponding to each hidden layer state.
[0078] Optionally, determining the weights corresponding to each hidden layer state includes:
[0079] Based on the weight matrix, bias vector, and each hidden layer state, the energy value of each hidden layer state is determined;
[0080] The weights corresponding to each hidden layer state are determined based on the energy value of each hidden layer state and the initial attention matrix.
[0081] It should be noted that, in order to capture the key parts of the alarm problem description text, an attention mechanism is applied to focus more on information closely related to the alarm label.
[0082] In the method of this invention, the attention mechanism first assigns a weight $\alpha_i$ ($1 \le i \le n$) to each hidden layer state $h_i$. The weight is calculated as follows:
[0083] $\alpha_i = \dfrac{\exp(a_i^{\top} h_0)}{\sum_{j=1}^{n} \exp(a_j^{\top} h_0)}$
[0084] where $n$ is the number of hidden states, $h_0$ is the randomly initialized attention matrix, and $a_i$ is the energy value determined by the hidden state $h_i$. The larger $a_i$ is, the greater the attention weight assigned to $h_i$.
[0085] $a_i$ is calculated using the following formula:
[0086] $a_i = \tanh(W_h \cdot h_i + b_h)$
[0087] where $W_h$ is the weight matrix and $b_h$ is the bias vector.
[0088] Based on each hidden state $h_i$ and its corresponding weight $\alpha_i$, the vector representation $v$ of the target fault description statement after feature extraction can be calculated using the following formula:
[0089] $v = \sum_{i=1}^{n} \alpha_i h_i$
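The attention step can be sketched as below. The energy value follows the tanh formula given in the text; normalizing the scores of each energy value against `h0` with a softmax is an assumption of this sketch, as are the function name `attention` and the toy parameter values.

```python
import math


def attention(hidden_states, W_h, b_h, h0):
    """Return (weights, sentence_vector) for a list of hidden layer states."""
    scores = []
    for h in hidden_states:
        # Energy value a_i = tanh(W_h . h_i + b_h)
        a = [math.tanh(sum(w * v for w, v in zip(row, h)) + b) for row, b in zip(W_h, b_h)]
        # Scalar score of the energy value against the attention vector h0 (assumption).
        scores.append(sum(ai * qi for ai, qi in zip(a, h0)))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    vector = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, vector


weights, vector = attention(
    [[0.5, -0.5], [0.1, 0.9]],          # hidden states h_1, h_2
    W_h=[[1.0, 0.0], [0.0, 1.0]],       # toy weight matrix
    b_h=[0.0, 0.0],                     # toy bias vector
    h0=[0.3, 0.7],                      # toy attention vector
)
```

The weights always sum to 1, so the sentence vector is a weighted average of the hidden states.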
[0090] Step 303: Determine the confidence probability of each candidate fault label based on the vector representation of the target fault description statement, and obtain the at least one fault label output by the label recommendation model based on the magnitude of the confidence probability of each candidate fault label.
[0091] In the final layer of the neural network in the label recommendation model, the Sigmoid function is used as the activation function to obtain a value representing the confidence probability of each label. Unlike Softmax, Sigmoid ensures that the confidence probability of each label is independent.
[0092] Given the input $Q = [q_1, q_2, \ldots, q_n]$ of a fully connected layer and the weight vector $W = [W_1, W_2, \ldots, W_n]$, the list of independent probabilities of the candidate labels, $\hat{y} = [\hat{y}_1, \ldots, \hat{y}_n]$, can be calculated using the following formula:
[0093] $\hat{y}_i = \dfrac{1}{1 + e^{-(W_i q_i + b)}}$
[0094] where $q_i$ is the $i$-th element of $Q$, $W_i$ is the $i$-th element of $W$, $b$ is the bias vector, and $n$ is the number of labels in the candidate label set.
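A short sketch of the independent Sigmoid confidences and the selection of the highest-scoring labels follows. The label names, the cut-off `k`, and the use of a single scalar bias `b` are assumptions made for illustration.

```python
import math


def label_probabilities(q, w, b):
    """Independent confidence per candidate label: sigmoid(w_i * q_i + b)."""
    return [1.0 / (1.0 + math.exp(-(wi * qi + b))) for qi, wi in zip(q, w)]


def recommend(labels, probs, k=2):
    """Return the k labels with the highest confidence probability."""
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


labels = ["network", "database", "storage"]  # hypothetical candidate labels
probs = label_probabilities([2.0, -1.0, 0.5], [1.0, 1.0, 1.0], b=0.0)
print(recommend(labels, probs))
# ['network', 'storage']
```

Because each probability is computed on its own, the values need not sum to 1, which is what allows several labels to be recommended for one description.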
[0095] In some embodiments, the fault location method further includes training the label recommendation model, specifically including:
[0096] Obtain the fault label sample and the corresponding fault root cause sample.
[0097] The fault label samples and the corresponding fault root cause samples are input into the LSTM model for training, and the positive and negative fault labels output by the LSTM model are obtained.
[0098] Based on the positive and negative fault labels, the positive and negative class losses are calculated using a weighted binary cross-entropy loss function, and the total loss is obtained based on these losses. Considering the significant bias in the distribution of recommended and unrecommended labels—that is, most candidate labels are not recommended to the alarm problem description—a weighted binary cross-entropy loss function is set to balance the losses between the positive and negative classes, defined as follows:
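[0099] $L = -\sum_{i=1}^{n} \left[ \beta\, y_i \log \hat{y}_i + (1 - y_i) \log (1 - \hat{y}_i) \right]$
[0100] where $y$ is the list of actual confidence probabilities, $\hat{y}$ is the list of predicted confidence probabilities, and $\beta$ is the weight attached to the positive samples.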
[0101] If the total loss is less than a first threshold, the parameters of the LSTM model are saved to obtain the label recommendation model.
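The positive-class and negative-class losses can be sketched as follows. Summing over the labels (rather than averaging) and the small constant `eps` that guards against log(0) are assumptions of this sketch; only the weight `beta` on the positive term is taken from the text.

```python
import math


def weighted_bce(actual, predicted, beta, eps=1e-12):
    """Weighted binary cross-entropy: beta scales the positive-class loss."""
    positive = -sum(beta * y * math.log(p + eps) for y, p in zip(actual, predicted))
    negative = -sum((1 - y) * math.log(1 - p + eps) for y, p in zip(actual, predicted))
    return positive + negative  # total loss


# One recommended label and one non-recommended label, both predicted at 0.5.
loss_plain = weighted_bce([1, 0], [0.5, 0.5], beta=1.0)
loss_weighted = weighted_bce([1, 0], [0.5, 0.5], beta=3.0)
```

With `beta` greater than 1, a missed positive label costs more than a false positive, which counteracts the fact that most candidate labels are negative for any given description.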
[0102] The Bi-LSTM model based on the attention mechanism can calculate the independent confidence probability of each label in the candidate set, and the labels with the highest confidence probability values will be recommended to the user as fault labels.
[0103] Step 101: Input the at least one fault label into the fault location model to obtain the root cause of the fault corresponding to the target fault description statement output by the fault location model.
[0104] The fault location model is obtained by training a backpropagation (BP) neural network model based on the fault label samples and the corresponding fault root cause samples. The BP neural network model includes an input layer, an output layer, and a hidden layer. The number of neurons in the hidden layer is determined based on the number of nodes in the input layer and the number of nodes in the output layer.
[0105] Among them, the BP neural network (Back-Propagation Network) is also known as the back-propagation neural network. The BP network can be trained based on sample data, and the network weights and thresholds are continuously adjusted to make the error function decrease along the negative gradient direction, approaching the desired output. When the loss is less than the threshold, the model parameters of the BP neural network are saved to obtain the fault location model.
[0106] The embodiments of this application use a three-layer multi-input single-output BP network with one hidden layer to establish a prediction model.
[0107] The root cause of a fault refers to the source that causes the fault, including: network fiber optic cables, servers, storage devices, operating systems, server programs, and databases, etc.
[0108] It should be noted that the fault location model is a multi-input single-output network model. After inputting at least one fault label into the fault location model, the fault location model outputs the fault root cause corresponding to at least one fault label, which is the fault root cause corresponding to the target fault description statement.
[0109] The fault location method provided in this application constructs a label recommendation model and a fault location model through machine learning. When a fault occurs, the maintenance personnel only need to input the description data of the alarm problem into the label recommendation model to obtain the alarm label corresponding to the alarm problem description. The alarm label can then be input into the fault location model to quickly find the root cause of the fault. This saves fault location time and removes the previous dependence of fault location on experience, thereby shortening the fault handling time, reducing the impact on the production system, reducing economic losses caused by faults, and improving the stability of the system.
[0110] In some embodiments, the fault location method further includes: determining a BP neural network.
[0111] Figure 4 is a flowchart of determining the BP neural network according to an embodiment of this application. As shown in Figure 4, determining the BP neural network includes steps 400, 401, and 402.
[0112] Step 400: Determine the input layer and the output layer.
[0113] Optionally, the BP neural network model takes alarm labels from historical fault data as input and alarm root causes from historical fault data as output.
[0114] Step 401: Determine the number of neurons based on the number of nodes in the input layer and the number of nodes in the output layer, and determine the hidden layer based on the number of neurons.
[0115] Determining the number of hidden layer neurons is crucial in network design. Too many neurons increase computational cost and can lead to overfitting; too few neurons negatively impact performance and fail to achieve the desired results. The number of hidden layer neurons is directly related to the complexity of the problem, the number of neurons in the input and output layers, and the desired error. Currently, there is no definitive formula for determining the number of hidden layer neurons; only some empirical formulas exist. The final determination of the number of neurons still requires experience and multiple experiments.
[0116] In some embodiments, the following formula is used to determine the number of hidden layer neurons:
[0117] $l = \sqrt{n + m} + a$
[0118] Where l is the number of neurons, n is the number of nodes in the input layer, m is the number of nodes in the output layer, and a is the adjustment parameter.
[0119] Optionally, a is an adjustment constant between [1, 10].
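As a worked example of sizing the hidden layer, the sketch below assumes the widely used empirical rule l = sqrt(n + m) + a, which matches the symbols defined above. Rounding the square root to the nearest integer and the function name `hidden_neurons` are assumptions.

```python
import math


def hidden_neurons(n: int, m: int, a: int) -> int:
    """Empirical hidden layer size: sqrt(n + m) + a, with a in [1, 10]."""
    if not 1 <= a <= 10:
        raise ValueError("a must be between 1 and 10")
    return round(math.sqrt(n + m)) + a


# 8 input nodes (fault labels), 1 output node (root cause), adjustment constant 2.
print(hidden_neurons(8, 1, 2))
# 5
```

In practice the result serves only as a starting point, and several values of `a` would be tried and compared by validation error.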
[0120] Step 402: Obtain the BP neural network model based on the input layer, the output layer, and the hidden layer.
[0121] It can be understood that a BP neural network model consists of an input layer, a hidden layer, and an output layer. Connecting these three network layers in sequence yields the BP neural network model.
[0122] In some embodiments, the fault location method further includes: training the label recommendation model and the fault location model.
[0123] Figure 5 is the second flowchart of the fault location method provided in the embodiments of this application. As shown in Figure 5, the fault location method includes steps 500, 501, 502, 503 and 504.
[0124] Step 500: Obtain historical fault and problem data.
[0125] Optionally, the alarm problem description text in the collected historical fault problem data needs to be preprocessed, including word segmentation, stop word removal and data cleaning.
[0126] Word segmentation means dividing the alarm problem description into individual words or phrases according to rules; stop word removal means removing meaningless articles, prepositions, and modal particles from the segmented words or phrases; and data cleaning means removing abnormal data from the alarm problem description text.
[0127] Step 501: Extract fault label samples, fault description statement samples, and fault root cause samples from historical fault problem data.
[0128] Optionally, fault label samples, fault description sentence samples, and fault root cause samples can be extracted from historical fault problem data by manual annotation, or text recognition can be used to extract keywords from historical fault problem data to obtain fault label samples, fault description sentence samples, and fault root cause samples.
[0129] Step 502: Divide the fault label samples, fault description statement samples, and fault root cause samples into training sets and test sets.
[0130] Optionally, the samples can be divided into a training set and a test set in a 2:1 ratio. The training set is used to train the model, and the test set is used to test the training results.
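A 2:1 division can be sketched as below. Shuffling before the split and the fixed `seed` are assumptions added so that the result is reproducible; the text specifies only the ratio.

```python
import random


def split_samples(samples, train_parts=2, test_parts=1, seed=0):
    """Shuffle the samples and split them into training and test sets by ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]


train_set, test_set = split_samples(range(9))  # 6 training samples, 3 test samples
```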
[0131] Step 503: Train the model using the training set and the test set to obtain the label recommendation model and the fault location model.
[0132] Optionally, the fault location model is trained using a BP neural network.
[0133] BP neural networks typically use differentiable sigmoid functions and linear functions as activation functions. Optionally, the hyperbolic tangent sigmoid function (tansig) is chosen as the activation function for hidden layer neurons. However, since the output of the BP neural network needs to be normalized to the range [-1, 1], the prediction model selects the logarithmic sigmoid function (logsig) as the activation function for output layer neurons.
[0134] After normalizing the training sample data, input it into the BP neural network. Set the activation functions for the hidden layer and output layer to tansig and logsig, respectively. The network training function is traingdx, and the network performance function is mse. The initial number of hidden layer neurons is set to 2. Configure the network parameters: 5000 epochs, an expected error goal of 0.00000001, and a learning rate lr of 0.01. After setting the parameters, begin training the network.
[0135] The BP neural network reaches the expected error after 24 iterations, completing its learning process.
[0136] After the BP neural network is trained, a fault location model is obtained. Simply input the alarm label into the fault location model to obtain the predicted alarm root cause data corresponding to the alarm label.
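The training loop described above can be approximated by the following sketch of a three-layer, multi-input single-output network with a tanh hidden layer (the equivalent of tansig) and a logistic output (logsig), trained on mean squared error. Plain per-sample gradient descent stands in here for traingdx, which additionally uses momentum and an adaptive learning rate. The function name `train_bp` and the toy data are assumptions; the default `hidden`, `lr`, `epochs` and `goal` values mirror the settings quoted in the text.

```python
import math
import random


def train_bp(samples, hidden=2, lr=0.01, epochs=5000, goal=1e-8, seed=0):
    """Train a small BP network; returns (predict function, last epoch MSE)."""
    rng = random.Random(seed)
    n = len(samples[0][0])
    W1 = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b) for row, b in zip(W1, b1)]
        y = 1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(W2, h)) + b2)))
        return h, y

    mse = float("inf")
    for _ in range(epochs):
        mse = 0.0
        for x, target in samples:
            h, y = forward(x)
            err = y - target
            mse += err * err / len(samples)
            dy = err * y * (1 - y)  # gradient at the output pre-activation
            for j in range(hidden):
                dh = dy * W2[j] * (1 - h[j] ** 2)  # back-propagated to hidden unit j
                W2[j] -= lr * dy * h[j]
                for k in range(n):
                    W1[j][k] -= lr * dh * x[k]
                b1[j] -= lr * dh
            b2 -= lr * dy
        if mse < goal:  # stop once the expected error is reached
            break
    return (lambda x: forward(x)[1]), mse


# Toy data: two one-hot encoded fault labels mapped to two root-cause codes.
toy = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]
predict, final_mse = train_bp(toy, lr=0.5, goal=1e-3)
```

The toy call uses a larger learning rate and a looser error goal than the quoted settings so that it finishes quickly on two samples.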
[0137] Step 504: Use the model obtained after training to locate the fault.
[0138] Input the fault description statement into the label recommendation model to obtain the fault label output by the label recommendation model. Input the fault label into the fault location model to obtain the fault root cause output by the fault location model.
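The two-stage flow of step 504 can be expressed as a simple composition of the two models. The function name `locate_fault` and the stand-in lambda models are hypothetical; in a real system they would be replaced by the trained label recommendation model and fault location model.

```python
from typing import Callable, List


def locate_fault(statement: str,
                 recommend_labels: Callable[[str], List[str]],
                 locate_root_cause: Callable[[List[str]], str]) -> str:
    """Chain the label recommendation model and the fault location model."""
    labels = recommend_labels(statement)
    if not labels:
        raise ValueError("label recommendation model returned no fault label")
    return locate_root_cause(labels)


# Hypothetical stand-ins for the two trained models.
fake_recommender = lambda s: ["database"] if "db" in s else []
fake_locator = lambda labels: "database" if "database" in labels else "unknown"

print(locate_fault("db connection refused", fake_recommender, fake_locator))
# database
```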
[0139] The fault location method provided in this application constructs a label recommendation model and a fault location model through machine learning. When a fault occurs, the maintenance personnel only need to input the description data of the alarm problem into the label recommendation model to obtain the alarm label corresponding to the alarm problem description. The alarm label can then be input into the fault location model to quickly find the root cause of the fault. This saves fault location time and removes the previous dependence of fault location on experience, thereby shortening the fault handling time, reducing the impact on the production system, reducing economic losses caused by faults, and improving the stability of the system.
[0140] The fault location device provided in the embodiments of this application is described below. The descriptions of the fault location device below and of the fault location method above may be read with reference to each other.
[0141] Figure 6 is a schematic diagram of the fault location device provided in the embodiments of this application. As shown in Figure 6, the fault location device 600 includes a fault label determination module 610 and a fault root cause determination module 620.
[0142] The fault label determination module 610 is used to: obtain a target fault description statement, input the target fault description statement into a label recommendation model, and obtain at least one fault label output by the label recommendation model;
[0143] The fault root cause determination module 620 is used to: input the at least one fault label into the fault location model to obtain the fault root cause corresponding to the target fault description statement output by the fault location model;
[0144] The label recommendation model is obtained by training a long short-term memory (LSTM) network model based on fault description statement samples, fault label samples corresponding to the fault description statement samples, and a weighted binary cross-entropy loss function.
[0145] The fault location model is obtained by training a backpropagation (BP) neural network model based on the fault label samples and the corresponding fault root cause samples.
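The two-module structure of device 600 can be sketched as follows. This is a minimal illustration, not the implementation of this application: the class name, the method names, the type aliases, and the two keyword-based stand-in models are all hypothetical, and real models would be the trained LSTM and BP networks.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interfaces; the text does not define a programming API.
LabelModel = Callable[[str], List[str]]      # fault description -> fault labels
LocationModel = Callable[[List[str]], str]   # fault labels -> fault root cause


@dataclass
class FaultLocationDevice:
    """Mirrors device 600: module 610 (labels) feeds module 620 (root cause)."""
    label_recommendation_model: LabelModel
    fault_location_model: LocationModel

    def determine_fault_labels(self, description: str) -> List[str]:
        # Module 610: the model is expected to output at least one label.
        labels = self.label_recommendation_model(description)
        if not labels:
            raise ValueError("label recommendation model returned no fault label")
        return labels

    def determine_root_cause(self, labels: List[str]) -> str:
        # Module 620: map the fault labels to a root cause.
        return self.fault_location_model(labels)

    def locate(self, description: str) -> str:
        return self.determine_root_cause(self.determine_fault_labels(description))


# Stand-in models for illustration only (keyword lookup, not neural networks).
def toy_label_model(description: str) -> List[str]:
    keywords = {"timeout": "network", "disk": "storage"}
    return [label for word, label in keywords.items() if word in description.lower()]


def toy_location_model(labels: List[str]) -> str:
    return "IaaS layer" if "storage" in labels or "network" in labels else "unknown"


device = FaultLocationDevice(toy_label_model, toy_location_model)
```

Passing the models in as plain callables keeps the two stages independently replaceable, which matches the description of the modules as separate functional units.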
[0146] Optionally, the label recommendation model includes a word embedding layer, a bidirectional LSTM layer, and an attention layer;
[0147] The step of inputting the target fault description statement into the label recommendation model to obtain at least one fault label output by the label recommendation model includes:
[0148] The target fault description statement is input into the word embedding layer to obtain the word vectors corresponding to each word in the target fault description statement output by the word embedding layer.
[0149] The word vectors are input into the bidirectional LSTM layer, and forward LSTM and backward LSTM processing are performed to obtain the hidden layer states corresponding to each word output by the bidirectional LSTM layer.
[0150] The hidden layer state is input into the attention layer to determine the weight corresponding to each hidden layer state, and the vector representation of the target fault description statement is determined based on the hidden layer state and the weight corresponding to each hidden layer state.
[0151] The confidence probability of each candidate fault label is determined based on the vector representation of the target fault description statement, and the at least one fault label output by the label recommendation model is obtained based on the magnitude of the confidence probability of each candidate fault label.
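The final selection step can be sketched as below. The text only states that labels are chosen according to the magnitude of their confidence probabilities; the sigmoid mapping from raw scores, the `top_k` cutoff, and the `min_confidence` threshold are assumptions made for illustration.

```python
import math
from typing import Dict, List


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def select_fault_labels(scores: Dict[str, float],
                        top_k: int = 3,
                        min_confidence: float = 0.5) -> List[str]:
    """Rank candidate labels by confidence and keep the strongest ones.

    `scores` holds one raw score per candidate label (assumed to come from
    a dense layer applied to the sentence vector).
    """
    confidences = {label: sigmoid(s) for label, s in scores.items()}
    ranked = sorted(confidences, key=confidences.get, reverse=True)
    chosen = [label for label in ranked[:top_k] if confidences[label] >= min_confidence]
    # "At least one fault label": fall back to the single most confident label.
    return chosen or ranked[:1]
```

An independent sigmoid per label is a natural fit for a model trained with binary cross-entropy, because each candidate label is treated as its own yes/no decision.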
[0152] Optionally, determining the weights corresponding to each of the hidden layer states includes:
[0153] Based on the weight matrix, bias vector, and each hidden layer state, the energy value of each hidden layer state is determined;
[0154] The weights corresponding to each hidden layer state are determined based on the energy value of each hidden layer state and the initial attention matrix.
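One common way to realize these two steps is additive attention, sketched below. The exact formulas are not given in this passage, so the form used here is an assumption: the energy is taken as e_t = tanh(W h_t + b), the "initial attention matrix" is treated as a vector `u`, and the weights are a softmax over the dot products u · e_t.

```python
import math
from typing import List, Tuple

Vector = List[float]


def energy(W: List[Vector], b: Vector, h: Vector) -> Vector:
    """Energy of one hidden state: e = tanh(W h + b)."""
    return [math.tanh(sum(w * x for w, x in zip(row, h)) + bias)
            for row, bias in zip(W, b)]


def attention_pool(hidden_states: List[Vector], W: List[Vector],
                   b: Vector, u: Vector) -> Tuple[Vector, Vector]:
    """Return (attention weights, weighted sum of hidden states)."""
    # Score each hidden state against the attention vector u.
    scores = [sum(ui * ei for ui, ei in zip(u, energy(W, b, h)))
              for h in hidden_states]
    # Softmax with max-subtraction for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Sentence vector: attention-weighted sum of the hidden states.
    dim = len(hidden_states[0])
    sentence = [sum(a * h[i] for a, h in zip(weights, hidden_states))
                for i in range(dim)]
    return weights, sentence
```

The weights sum to 1, so the sentence vector stays on the same scale as the individual hidden states regardless of sentence length.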
[0155] Optionally, the fault location device 600 also includes a model training module for:
[0156] Obtain the fault label sample and the corresponding fault root cause sample;
[0157] The fault label samples and the corresponding fault root cause samples are input into the LSTM model for training to obtain the positive and negative fault labels output by the LSTM model.
[0158] Based on the positive category fault label and the negative category fault label, the positive category loss and the negative category loss are calculated using the weighted binary cross-entropy loss function, and the total loss is obtained based on the positive category loss and the negative category loss.
[0159] If the total loss is less than a first threshold, the parameters of the LSTM model are saved to obtain the label recommendation model.
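The loss computation and the save condition can be sketched as follows. The class weights `pos_weight` and `neg_weight`, the averaging over the number of entries, and the helper `should_save` are assumptions for illustration; this passage does not state the weight values or the exact normalization.

```python
import math
from typing import Sequence, Tuple


def weighted_bce(targets: Sequence[int], probs: Sequence[float],
                 pos_weight: float = 2.0, neg_weight: float = 1.0,
                 eps: float = 1e-12) -> Tuple[float, float, float]:
    """Return (positive-class loss, negative-class loss, total loss).

    targets: 1 if the label applies to the sample, 0 otherwise.
    probs:   predicted probability that the label applies.
    """
    pos_loss = 0.0
    neg_loss = 0.0
    for y, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        if y == 1:
            pos_loss -= pos_weight * math.log(p)
        else:
            neg_loss -= neg_weight * math.log(1.0 - p)
    n = len(targets)
    return pos_loss / n, neg_loss / n, (pos_loss + neg_loss) / n


def should_save(total_loss: float, first_threshold: float) -> bool:
    """Save the model parameters once the total loss drops below the threshold."""
    return total_loss < first_threshold
```

Weighting the positive class more heavily is a typical choice when each fault description carries only a few of the many candidate labels, since otherwise the negative entries dominate the total loss.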
[0160] Optionally, the fault location device 600 further includes a model determination module for:
[0161] Determine the input layer and the output layer;
[0162] Based on the number of nodes in the input layer and the number of nodes in the output layer, the number of neurons is determined using the first formula, and the hidden layer is determined based on the number of neurons.
[0163] The BP neural network model is obtained based on the input layer, the output layer, and the hidden layer;
[0164] The first formula is:
[0165]
[0166] Where l is the number of neurons, n is the number of nodes in the input layer, m is the number of nodes in the output layer, and a is the adjustment parameter.
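The first formula itself is not reproduced in this text. The sketch below therefore assumes the widely used empirical rule l = sqrt(n + m) + a, which uses the same four symbols; this is an assumption, not a statement of the formula in this application. The function names are hypothetical, and rounding the square root to the nearest integer is likewise a choice made here.

```python
import math
from typing import List


def hidden_neuron_count(n: int, m: int, a: int) -> int:
    """Assumed empirical rule: l = sqrt(n + m) + a.

    n: number of input-layer nodes
    m: number of output-layer nodes
    a: adjustment parameter (often a small integer in the literature)
    """
    return round(math.sqrt(n + m)) + a


def layer_sizes(n: int, m: int, a: int) -> List[int]:
    """Input, hidden, and output layer sizes for a one-hidden-layer BP network."""
    return [n, hidden_neuron_count(n, m, a), m]
```

For example, with 12 input nodes, 4 output nodes, and a = 3, this rule gives a hidden layer of 7 neurons.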
[0167] It should be noted that the apparatus provided in this embodiment of the invention can implement all the method steps implemented in the above method embodiment and can achieve the same technical effect. Therefore, the parts and beneficial effects that are the same as those in the method embodiment will not be described in detail here.
[0168] Figure 7 is a schematic diagram of the physical structure of an electronic device. As shown in Figure 7, the electronic device may include: a processor 710, a communication interface 720, a memory 730, and a communication bus 740, wherein the processor 710, the communication interface 720, and the memory 730 communicate with each other via the communication bus 740. The processor 710 can call a computer program in the memory 730 to execute the steps of the fault location method, for example:
[0169] Obtain the target fault description statement, input the target fault description statement into the label recommendation model, and obtain at least one fault label output by the label recommendation model;
[0170] Input the at least one fault label into the fault location model to obtain the root cause of the fault corresponding to the target fault description statement output by the fault location model;
[0171] The label recommendation model is obtained by training a long short-term memory (LSTM) network model based on fault description statement samples, fault label samples corresponding to the fault description statement samples, and a weighted binary cross-entropy loss function.
[0172] The fault location model is obtained by training a backpropagation (BP) neural network model based on the fault label samples and the corresponding fault root cause samples.
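Training the fault location model can be illustrated with a small one-hidden-layer network trained by backpropagation. Everything concrete here is an assumption for illustration: the tanh hidden activation, the softmax output with cross-entropy loss, the learning rate, the class and method names, and the toy samples (multi-hot fault label vectors paired with a root cause index).

```python
import math
import random
from typing import List, Tuple


def softmax(z: List[float]) -> List[float]:
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


class BPNetwork:
    """One hidden layer, trained by stochastic gradient descent."""

    def __init__(self, n_in: int, n_hidden: int, n_out: int, seed: int = 0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x: List[float]) -> Tuple[List[float], List[float]]:
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.W1, self.b1)]
        logits = [sum(w * h for w, h in zip(row, hidden)) + b
                  for row, b in zip(self.W2, self.b2)]
        return hidden, softmax(logits)

    def train_step(self, x: List[float], target: int, lr: float = 0.1) -> float:
        hidden, probs = self.forward(x)
        # Output error for softmax + cross-entropy: p - one_hot(target).
        d_out = [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]
        # Propagate the error back through W2 (before updating it); tanh' = 1 - h^2.
        d_hidden = [(1.0 - h * h) * sum(d_out[k] * self.W2[k][j] for k in range(len(d_out)))
                    for j, h in enumerate(hidden)]
        for k, d in enumerate(d_out):
            for j, h in enumerate(hidden):
                self.W2[k][j] -= lr * d * h
            self.b2[k] -= lr * d
        for j, d in enumerate(d_hidden):
            for i, xi in enumerate(x):
                self.W1[j][i] -= lr * d * xi
            self.b1[j] -= lr * d
        return -math.log(probs[target])  # loss measured before the update

    def predict(self, x: List[float]) -> int:
        probs = self.forward(x)[1]
        return probs.index(max(probs))


# Toy data: multi-hot fault label vectors -> root cause index (illustrative only).
samples = [([1, 0, 0], 0), ([0, 1, 0], 1), ([0, 0, 1], 2), ([1, 1, 0], 1)]
net = BPNetwork(n_in=3, n_hidden=4, n_out=3)
epoch_losses = []
for _ in range(300):
    epoch_losses.append(sum(net.train_step(x, y) for x, y in samples) / len(samples))
```

In practice the input width would equal the number of candidate fault labels and the output width the number of known root causes.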
[0173] Furthermore, the logical instructions in the aforementioned memory 730 can be implemented as software functional units and, when sold or used as independent products, can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0174] In another aspect, embodiments of this application also provide a computer program product comprising a computer program that can be stored on a non-transitory computer-readable storage medium. When the computer program is executed by a processor, the computer performs the steps of the fault location method provided in the above embodiments, for example:
[0175] Obtain the target fault description statement, input the target fault description statement into the label recommendation model, and obtain at least one fault label output by the label recommendation model;
[0176] Input the at least one fault label into the fault location model to obtain the root cause of the fault corresponding to the target fault description statement output by the fault location model;
[0177] The label recommendation model is obtained by training a long short-term memory (LSTM) network model based on fault description statement samples, fault label samples corresponding to the fault description statement samples, and a weighted binary cross-entropy loss function.
[0178] The fault location model is obtained by training a backpropagation (BP) neural network model based on the fault label samples and the corresponding fault root cause samples.
[0179] In another aspect, embodiments of this application also provide a processor-readable storage medium storing a computer program that causes a processor to perform the steps of the methods provided in the above embodiments, for example:
[0180] Obtain the target fault description statement, input the target fault description statement into the label recommendation model, and obtain at least one fault label output by the label recommendation model;
[0181] Input the at least one fault label into the fault location model to obtain the root cause of the fault corresponding to the target fault description statement output by the fault location model;
[0182] The label recommendation model is obtained by training a long short-term memory (LSTM) network model based on fault description statement samples, fault label samples corresponding to the fault description statement samples, and a weighted binary cross-entropy loss function.
[0183] The fault location model is obtained by training a backpropagation (BP) neural network model based on the fault label samples and the corresponding fault root cause samples.
[0184] The processor-readable storage medium can be any available medium or data storage device that the processor can access, including but not limited to magnetic memory (e.g., floppy disk, hard disk, magnetic tape, magneto-optical disk (MO)), optical memory (e.g., CD, DVD, BD, HVD), and semiconductor memory (e.g., ROM, EPROM, EEPROM, non-volatile memory (NAND FLASH), solid-state drive (SSD)).
[0185] The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. Those skilled in the art can understand and implement this without any creative effort.
[0186] Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus necessary general-purpose hardware platforms, and of course, it can also be implemented by hardware. Based on this understanding, the above technical solutions, in essence or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product can be stored in a computer-readable storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in the various embodiments or some parts of the embodiments.
[0187] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application.
Claims
1. A fault location method, characterized in that it comprises: obtaining a target fault description statement, inputting the target fault description statement into the word embedding layer of a label recommendation model, and obtaining the word vectors, output by the word embedding layer, corresponding to each word in the target fault description statement; inputting the word vectors into the bidirectional LSTM layer of the label recommendation model and performing forward LSTM and backward LSTM processing to obtain the hidden layer states, output by the bidirectional LSTM layer, corresponding to each word; inputting the hidden layer states into the attention layer of the label recommendation model to determine the weight corresponding to each hidden layer state, and determining the vector representation of the target fault description statement based on the hidden layer states and the weight corresponding to each hidden layer state; determining the confidence probability of each candidate fault label based on the vector representation of the target fault description statement, and obtaining at least one fault label output by the label recommendation model based on the magnitude of the confidence probability of each candidate fault label; and inputting the at least one fault label into a fault location model to obtain the fault root cause, output by the fault location model, corresponding to the target fault description statement; wherein the label recommendation model is obtained by training a long short-term memory (LSTM) network model based on fault description statement samples, fault label samples corresponding to the fault description statement samples, and a weighted binary cross-entropy loss function; and the fault location model is obtained by training a backpropagation (BP) neural network model based on the fault label samples and the corresponding fault root cause samples.
2. The fault location method according to claim 1, characterized in that determining the weight corresponding to each hidden layer state includes: determining the energy value of each hidden layer state based on a weight matrix, a bias vector, and each hidden layer state; and determining the weight corresponding to each hidden layer state based on the energy value of each hidden layer state and an initial attention matrix.
3. The fault location method according to claim 1, characterized in that the method further includes: obtaining the fault label samples and the corresponding fault root cause samples; inputting the fault label samples and the corresponding fault root cause samples into the LSTM model for training to obtain the positive category fault labels and negative category fault labels output by the LSTM model; calculating the positive category loss and the negative category loss using the weighted binary cross-entropy loss function based on the positive category fault labels and the negative category fault labels, and obtaining the total loss based on the positive category loss and the negative category loss; and, if the total loss is less than a first threshold, saving the parameters of the LSTM model to obtain the label recommendation model.
4. The fault location method according to claim 1, characterized in that the method further includes: determining the input layer and the output layer; determining the number of neurons using a first formula based on the number of nodes in the input layer and the number of nodes in the output layer, and determining the hidden layer based on the number of neurons; and obtaining the BP neural network model based on the input layer, the output layer, and the hidden layer; wherein the first formula is: [formula not reproduced]; where l is the number of neurons, n is the number of nodes in the input layer, m is the number of nodes in the output layer, and a is the adjustment parameter.
5. A fault location device, characterized in that it comprises: a fault label determination module, used to: obtain a target fault description statement, input the target fault description statement into the word embedding layer of a label recommendation model, and obtain the word vectors, output by the word embedding layer, corresponding to each word in the target fault description statement; input the word vectors into the bidirectional LSTM layer of the label recommendation model and perform forward LSTM and backward LSTM processing to obtain the hidden layer states, output by the bidirectional LSTM layer, corresponding to each word; input the hidden layer states into the attention layer of the label recommendation model to determine the weight corresponding to each hidden layer state; determine the vector representation of the target fault description statement based on the hidden layer states and the weight corresponding to each hidden layer state; determine the confidence probability of each candidate fault label based on the vector representation of the target fault description statement; and obtain at least one fault label output by the label recommendation model based on the magnitude of the confidence probability of each candidate fault label; and a fault root cause determination module, used to: input the at least one fault label into a fault location model to obtain the fault root cause, output by the fault location model, corresponding to the target fault description statement; wherein the label recommendation model is obtained by training a long short-term memory (LSTM) network model based on fault description statement samples, fault label samples corresponding to the fault description statement samples, and a weighted binary cross-entropy loss function; and the fault location model is obtained by training a backpropagation (BP) neural network model based on the fault label samples and the corresponding fault root cause samples.
6. An electronic device comprising a processor and a memory storing a computer program, characterized in that, when the processor executes the computer program, it implements the fault location method according to any one of claims 1 to 4.
7. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the fault location method according to any one of claims 1 to 4.
8. A computer program product comprising a computer program, characterized in that, when the computer program is executed by a processor, it implements the fault location method according to any one of claims 1 to 4.