Fault detection in onboard computing systems
The method of logging and analyzing messages on an on-board computing system's data bus using predefined rules quickly identifies and diagnoses faults, addressing the challenge of delayed fault detection in standalone systems by reducing analysis time and enabling continuous operation.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- BAE SYSTEMS PLC
- Filing Date
- 2024-04-29
- Publication Date
- 2026-06-11
AI Technical Summary
On-board computing systems, particularly in vehicles like aircraft, face challenges in real-time fault detection and diagnosis due to their standalone nature, leading to delayed identification of software faults that can be dangerous, and existing methods require extensive post-fault analysis of large data logs.
A method involving logging messages on a data bus over a rolling time span based on predefined rules, detecting target message transmission events, and analyzing logs for conformity to valid digital chains to quickly identify and diagnose faults, with the option to notify users or external devices.
Enables real-time fault identification and diagnosis, reducing the time and resources needed for post-fault analysis by focusing on relevant data, allowing continuous operation and immediate corrective actions.
Abstract
Description
【Technical Field】 【0001】 The present invention relates to fault detection in an on-board computing system. 【Background Art】 【0002】 On-board computing systems, such as the mission systems of vehicles, sometimes develop faults in their software. Such faults generally fall into two categories: a first category in which a user inputs a command to the on-board computing system and the expected output is not produced, or an output different from the expected one is produced; and a second category in which the on-board computing system issues an output that was not commanded. Such faults can be benign or extremely dangerous, depending on the use case of the on-board computing system. 【0003】 Typically, on-board computing systems, especially those used on vehicles such as aircraft, are stand-alone, i.e., they are not networked, in order to maintain the integrity of the computing system. The integrity of an on-board computing system is particularly important in the context of a computing system that handles safety-related decisions, such as an aircraft's mission system or autopilot system. Therefore, real-time fault diagnosis by the system user or by an external off-board device, such as a ground station, is generally not possible; diagnosis is only possible after an error has occurred, usually only when the vehicle on which the on-board computing system is installed returns to the maintenance base for post-fault analysis. 【0004】 To identify and diagnose faults in an on-board computing system, the system may be configured to create a data log that shows all messages sent by the various system components within the system. Such data logs can be very large, and it often takes a considerable amount of time to pinpoint the source of a fault within the system and diagnose its root cause. 
【0005】 An objective of exemplary arrangements of the present invention is to at least partially avoid or overcome one or more drawbacks of the prior art, whether identified herein or elsewhere, or to at least provide an alternative to existing fault detection methods. [Summary of the Invention] 【0006】 According to a first aspect of the present invention, a computer-implemented method for detecting a fault in an onboard computing system is provided, the method comprising: recording a log of messages transmitted on the data bus of the onboard computing system over a rolling time span in accordance with a rule stored in the memory system of the onboard computing system; detecting a target message transmission event associated with the rule; retrieving the log of messages in response to the detection of the target message transmission event; determining whether the log conforms to a valid digital chain associated with the rule, wherein the valid digital chain comprises a group of messages associated with the target message transmission event; and storing or maintaining the log in a log database in response to the log not conforming to the valid digital chain of the rule. 【0007】 The rule, or each rule, may comprise a target message transmission event, a valid digital chain comprising a group of messages associated with the target message transmission event, and a rolling time span. 【0008】 Optionally, the method further includes the step of discarding the log from the log database in response to the log conforming to the valid digital chain of the rule. 【0009】 Optionally, the method further comprises a step of determining whether the log conforms to a valid digital chain of a second rule in response to the log failing to conform to a first rule. 
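The rule structure and conformity check described above can be sketched as follows. This is only an illustrative sketch, not the claimed implementation: the `Message` tuple layout, the `Rule` class, the function names, and the interpretation of "conforms" as "the chain's messages appear in the log in timestamp order" are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical message layout: (timestamp_ms, source_component, value).
Message = Tuple[float, str, str]
Step = Tuple[str, str]  # (source_component, value)


@dataclass(frozen=True)
class Rule:
    target_event: Step             # message that triggers the check
    valid_chain: Tuple[Step, ...]  # expected messages, in order
    time_span_ms: float            # rolling time span for logging


def conforms(log: List[Message], rule: Rule) -> bool:
    """True if every chain step appears in the log, in timestamp order."""
    ordered = sorted(log, key=lambda m: m[0])
    seen = iter((src, val) for _, src, val in ordered)
    # `step in seen` consumes the iterator, so order is enforced.
    return all(step in seen for step in rule.valid_chain)


def check_event(log: List[Message], rule: Rule, log_database: list) -> bool:
    """Store the log for later analysis only if it does not conform."""
    if conforms(log, rule):
        return True
    log_database.append(list(log))
    return False
```

With this shape, a conforming log leaves the database untouched, so only logs that indicate a possible fault are retained.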
【0010】 Optionally, the method further comprises logging messages transmitted on the data bus over a rolling time span according to either the first or second rule, wherein the time span is configured to be the larger of the two time spans associated with the first and second rules. 【0011】 Optionally, this method further includes a step of notifying the user of the onboard computing system if the log does not conform to the valid digital chain of the rule. 【0012】 Optionally, if the log does not fit into the valid digital chain of the rule, the method further includes the step of identifying one or more messages in the log that do not fit into the valid digital chain of the rule. 【0013】 Optionally, the method further comprises the step of logging messages sent over the data bus, and including creating a new entry in the log each time a message is sent over the data bus. 【0014】 Optionally, this method further includes the step of assigning a timestamp to each new entry in the log. 【0015】 Optionally, the timestamp can be configured to be recorded as Coordinated Universal Time (UTC), local time, or a predetermined zero-time. 【0016】 Optionally, in the step of determining whether the log fits into a valid digital chain, the method further comprises checking the order of groups of messages in the log according to their respective timestamps. 【0017】 Optionally, the logging step further comprises the step of generating a memory stack in the log for each type of message transmitted over the data bus. 【0018】 According to a second embodiment, a fault detection system is provided which comprises a data processing device and a memory system having computer-readable instructions that, when executed by the data processing device, cause the data processing device to perform the method of the first embodiment. 
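As a rough illustration of two of the options above, the sketch below picks the logging span as the larger of the spans of the active rules and produces a timestamp in one of the three described forms. The function names and the `mode` strings are hypothetical, and using a monotonic clock for the zero-time option is an assumption made here.

```python
import time
from datetime import datetime, timezone
from typing import Iterable, Optional, Union


def logging_span_ms(rule_spans_ms: Iterable[float]) -> float:
    """Log over the largest span so that no rule's chain is cut short."""
    return max(rule_spans_ms)


def make_timestamp(mode: str, zero: Optional[float] = None) -> Union[datetime, float]:
    """Return a timestamp as UTC, local time, or seconds since a zero time."""
    if mode == "utc":
        return datetime.now(timezone.utc)
    if mode == "local":
        return datetime.now().astimezone()
    if mode == "zero":
        if zero is None:
            raise ValueError("zero-time mode needs a reference value")
        # `zero` is a time.monotonic() reading taken at, e.g., system start.
        return time.monotonic() - zero
    raise ValueError(f"unknown timestamp mode: {mode}")
```

For example, with two rules whose spans are 60 ms and 120 ms (values chosen only for illustration), `logging_span_ms([60.0, 120.0])` selects 120 ms.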
【0019】 According to a third aspect, an onboard computing system is provided having a plurality of system components and a data bus, wherein the system components are configured to transmit messages via the data bus, and the system includes a data processing unit and a memory system having computer-readable instructions that, when executed by the data processing unit, cause the data processing unit to perform the method of the first aspect. 【0020】 According to the fourth aspect, a vehicle is provided which is equipped with the onboard computing system of the third aspect, and optionally the vehicle is an aircraft. 【0021】 According to the fifth aspect, a computer program is provided that, when executed by a computer, includes instructions causing the computer to perform the method described in the first aspect. 【0022】 According to a sixth aspect, a computer implementation method for detecting failures in an onboard computing system is provided, the method comprising: determining a rule comprising a target message transmission event, a valid digital chain comprising a group of messages associated with the target message transmission event, and a rolling time span (116); uploading the rule to the memory system of the onboard computing system; logging messages transmitted on the data bus (101) of the onboard computing system over a rolling time span (116) in accordance with the rule (124); detecting a target message transmission event associated with the rule; retrieving a message log in response to the detection of the target event; determining whether the log fits into a valid digital chain associated with the rule, wherein the valid digital chain comprises a group of messages associated with the target message transmission event, and storing or maintaining the log in a log database in response to the log not fitting into the valid digital chain of the rule. 
【0023】 Optionally, this method further includes a step of downloading recorded logs from the onboard computing system. 【0024】 Optionally, if the log does not fit into the valid digital chain of the rule, the method further includes the step of identifying one or more messages in the log that do not fit into the valid digital chain of the rule. 【0025】 Optionally, the method further comprises a step of restarting the system components of the onboard computing system associated with a message that did not conform to the valid digital chain of rules. 【0026】 Optionally, the method further comprises disabling system components of the onboard computing system associated with messages that did not conform to a valid digital chain of rules. 【0027】 Optionally, the method further comprises discarding the log from the log database in response to the log conforming to a valid digital chain of rules. 【0028】 Optionally, the onboard computing system is a mission system, and the rules are uploaded to a fault detection module as part of a mission data pack. 【0029】 Optionally, the method further comprises determining whether the log conforms to a valid digital chain of a second rule in response to not conforming to a first rule. 【0030】 Optionally, the method further comprises recording a log of messages transmitted on the data bus over a rolling time span according to a first rule or a second rule, the time span being configured to be the larger of two time spans associated with the first rule and the second rule. 【0031】 Optionally, in response to the log not conforming to the rule, the method further comprises notifying a user of the onboard computing system. 【0032】 Optionally, in response to the recorded log not conforming to the rule, the method comprises an additional step of sending the log to an external off-board device. 
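One way to picture the identification and response options above is sketched below. It is an assumption-laden illustration rather than the described method: the message layout, the function names, the rule that the "offending" component is the source of the first chain step that cannot be matched in order, and the action strings are all hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Message = Tuple[float, str, str]  # (timestamp_ms, source_component, value)
Step = Tuple[str, str]            # (source_component, value)


def first_broken_step(log: List[Message], valid_chain: Sequence[Step]) -> Optional[Step]:
    """Return the first chain step not found in order, or None if all match."""
    ordered = sorted(log, key=lambda m: m[0])
    pos = 0
    for step in valid_chain:
        # Advance until this step is found at or after the current position.
        while pos < len(ordered) and (ordered[pos][1], ordered[pos][2]) != step:
            pos += 1
        if pos == len(ordered):
            return step
        pos += 1
    return None


def plan_response(broken_step: Optional[Step], disable: bool = False) -> List[str]:
    """List hypothetical actions for the component tied to the broken step."""
    if broken_step is None:
        return []
    component = broken_step[0]
    action = "disable" if disable else "restart"
    return ["notify_user", f"{action}:{component}"]
```

Whether a component is restarted or disabled would in practice depend on the system's safety case; the `disable` flag here merely stands in for that decision.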
【0033】 According to a seventh aspect, an onboard computing system is provided having a plurality of system components and a data bus, wherein the system components are configured to transmit messages via the data bus, and the system includes a data processing unit and a memory system having computer-readable instructions that, when executed by the data processing unit, cause the data processing unit to perform the method of the sixth aspect. 【0034】 According to an eighth aspect, a vehicle is provided which is equipped with the onboard computing system of the seventh aspect; optionally, the vehicle is an aircraft. 【0035】 According to a ninth aspect, a computer program is provided that includes instructions which, when executed by a computer, cause the computer to perform the method of the sixth aspect. 【0036】 Next, embodiments of the present invention will be described, by way of example only, with reference to the figures. [Brief Description of the Drawings] 【0037】 [Figure 1] A schematic diagram illustrating a valid digital chain is shown. [Figure 2] A schematic diagram of an example of a computing system incorporating the present invention is shown. [Figure 3] A schematic diagram of an exemplary fault detection system is shown. [Figure 4] This shows an alternative configuration for the fault detection system. [Figure 5] An exemplary memory stack according to the present invention is shown. [Figure 6] This shows multiple memory stacks of the fault detection system. [Figure 7] An example of an invalid digital chain is shown. [Figure 8] This shows a vehicle equipped with an onboard computing system. [Figure 9] This diagram shows a schematic representation of the rule tailoring stage and the fault analysis stage in the context of the onboard computing system. [Figure 10] This diagram shows an example of how the onboard computing system operates. 
[Figure 11] This diagram shows an exemplary workflow for detecting faults in an onboard computing system. [Modes for carrying out the invention] 【0038】 This specification describes methods, apparatus, and systems for detecting faults in an onboard computing system. Generally, an onboard computing system may comprise multiple system components (such as computing devices) arranged to send messages via a data bus to other system components or memory systems within the onboard computing system. The messages are discrete data fragments and may relate to the status of the transmitting system component or to functions, queries, etc., directed to other system components or memory systems within the onboard computing system. Generally, the present invention monitors these messages transmitted on the data bus within the onboard computing system to check for faults. The onboard computing system may be integrated into a platform, such as a vehicle. The onboard computing system may be standalone and may not be routinely connected to an external network. 【0039】 System components can be configured to periodically transmit messages on the data bus. This is known as the refresh rate or refresh cycle. For example, one system component may be configured to transmit a message every 80ms, while another system component may be configured to transmit a message every 50ms. The refresh rate of each system component is determined according to the manufacturer of the individual system component, the needs of the system in which the system component is used, or the protocol / standard on which the system component operates. For example, in the context of aircraft systems, a system component handling avionics control inputs may refresh at a faster rate than a system component handling radar information. 
Generally, the message transmitted by a system component remains the same until a command is received by the onboard computing system from either a user of the computing system or another system component within the onboard computing system. For example, a system component may periodically transmit on the data bus that it is in an "OFF" state until a message in the form of a query requests that the system component be turned "ON," and then, in the next refresh cycle, the system component transmits "ON" on the data bus. This change in message is an example of a message transmission event. Whenever a message changes its status, for example, from "OFF" to "ON," there is a digital chain of messages associated with the change in that particular message. For example, a system component requested to be "ON" will have received a message from another system component (typically a few milliseconds earlier) querying whether that particular system component was previously "ON." Such a chain of messages transmitted across the data bus is called a "valid digital chain" or "digital thread" if it reflects the expected behavior of the system. Every message transmission event in the onboard computing system has a valid digital chain. In other words, if every message transmission event is a "result," then there is a known, predetermined "cause." In an alternative configuration, a message transmission event could be a pattern of messages sent on the data bus. In another configuration, a message transmission event could be the absence of one or more messages sent on the data bus. 【0040】 Figure 1 schematically illustrates a valid digital chain 100 comprising a first system component 102, a second system component 104, a third system component 106, a fourth system component 108, and a fifth system component 110. At the end of chain 100 is a message transmission event 112. 
In this example, the event is a message that changes a status in the onboard computing system, for example, a system component changing from "OFF" to "ON". In this example, event 112 is associated with the fifth system component 110. Event 112 has a known cause 114 that follows the valid digital chain. In this example, the cause is associated with the first system component 102. For event 112 to be valid in this example, the first system component 102 must send a message to the second system component 104, then a message is sent to the third system component 106, then to the fourth system component 108, and finally to the fifth system component 110, which then triggers event 112. Therefore, a valid digital chain 100 effectively represents the group of messages that must be associated with event 112 for event 112 to be valid. In some examples, there may be more than one valid digital chain associated with a particular event 112. In other words, there may be more than one valid cause 114, or more than one path through the various system components within the onboard computing system, that can cause a particular event 112. 【0041】 An error has occurred in the onboard computing system if a message transmission event occurs that does not conform to a valid digital chain, i.e., if event 112 occurs without an obvious "cause" 114, and / or if the path between cause 114 and event 112 does not conform to a valid digital chain 100. 【0042】 The inventors recognized that when attempting to determine whether a valid digital chain 100 exists in the onboard computing system in response to a message transmission event 112, it is necessary to record all messages transmitted on the data bus over a continuous rolling time span, and when an event occurs, to further analyze all messages transmitted on the data bus immediately before the message transmission event 112 (i.e., in the time leading up to the event) to determine whether the digital chain 100 is valid for event 112. 
This presents a technical challenge because recording all messages transmitted on the data bus (often multiple data buses) from multiple system components / memory systems within the onboard computing system generates a very large amount of data. Furthermore, post-processing of this data by forensic analysis is extremely time-consuming in detecting whether a valid digital chain exists for every message transmission event in the onboard computing system. 【0043】 Accordingly, the inventors have devised a method for detecting failures in an onboard computing system. Generally, the method comprises logging messages transmitted on the data bus of the onboard computing system over a rolling time span according to rules stored in the memory system of the onboard computing system. The rules may be predefined before the onboard computing system is put into operation. The rules determine that certain message transmission events (referred to herein as target message transmission events) should be monitored within the onboard computing system. To avoid doubt, the target message transmission events of the rules correspond to at least one type of message known to be transmitted on the data bus by multiple system components. As will be discussed later, several rules can be implemented simultaneously, and therefore several target message transmission events can be monitored simultaneously within the onboard computing system. 【0044】 The inventors have recognized that every message transmission event in an onboard computing system has a time span 116 associated with a valid digital chain 100. The time span 116 is the maximum time required from the time a command is received by the onboard computing system until a message transmission event occurs and is transmitted on the data bus in the next refresh cycle of the relevant system components. For example, the time span 116 associated with each message from “cause” 114 to event 112 is always known. 
Thus, a rule may determine that messages transmitted on the data bus 101 of the onboard computing system should be logged over a specific rolling time span 116 associated with the rule’s target message transmission event and stored in the memory system within the onboard computing system. In the example shown in Figure 1, the time span 116 associated with the digital thread 100 is 30ms between cause 114 and event 112. In this way, all messages transmitted on the data bus over the last time span 116 (30ms) are logged; that is, they are logged over the rolling time span according to the rule. 【0045】 When a target message transmission event associated with a rule occurs, a log of all messages transmitted on the data bus over the rolling time span 116 associated with the rule is retrieved from the memory system and analyzed to determine whether the log fits into a valid digital chain 100 associated with the rule. If the log does not fit into the rule, the log is stored in the log database for further evaluation. In this way, errors within the onboard computing system can be quickly identified and diagnosed. 【0046】 Figure 2 shows a schematic diagram of an example of an onboard computing system 200 according to the present invention. The onboard computing system 200 comprises a data bus 101 that is communicatively connected to a plurality of system components, including a first system component 102, a second system component 104, a third system component 106, a fourth system component 108, and a fifth system component 110. In this example, five system components are shown, but the exact number of system components is not considered to be limited to the present invention. In general, at least one system component may be provided that is communicatively connected to the data bus 101. Similarly, more than one data bus 101 may be provided within the onboard computing system 200. 
For example, the first system component 102 and the second system component 104 may be connected by a first data bus, and the second system component 104 and the third system component 106 may be connected by a second data bus. Data buses may overlap with certain system components so that one system component may be connected to more than one data bus simultaneously. The precise placement of system components and data buses is determined by the design of the onboard computing system. 【0047】 Each of the multiple system components 102-110 may be associated with a line-replaceable item (LRI). LRIs may be referred to as line-replaceable units (LRUs) without distinction. An LRI is a standalone device or "black box" configured to perform a specific function within the onboard computing system 200. LRIs may be selectively removed from the onboard computing system 200 and may therefore be referred to as "plug and play". In the context of an onboard computing system on an aircraft, the first system component 102 may be a radar-handling LRI, the second system component 104 may be an autopilot system, and the third system component 106 may be an environmental control system, etc. 【0048】 Each of the multiple system components may have its own computer processing unit (CPU) (not shown) and / or memory (not shown). In some examples, one or more of the multiple system components may share a common CPU and / or memory. 【0049】 In this example, a separate memory system 118 is provided, distinct from the memory systems of multiple system components 102-110. In other examples, the memory system may be included as a system component or may be included within a system component. The memory system 118 is communicatively connected to the data bus 101. The memory system 118 may be a central global recorder configured to store information such as messages from multiple system components via the data bus 101. 
Similarly, multiple system components may be able to retrieve information from the memory system 118 via the data bus 101. The memory system 118 may be an LRI. The memory system 118 may be removable from the onboard computing system 200. 【0050】 Next, the operation of the onboard computing system 200 will be described. As previously mentioned, the multiple system components are arranged to periodically transmit messages on the data bus 101. Each of the multiple system components may be arranged to transmit one or more types of messages on the data bus 101. Each of the multiple system components may be arranged to periodically transmit messages on the data bus 101 at the same refresh rate or at different refresh rates. In this example, the first system component 102 is configured to refresh every 30 ms, the second system component 104 and the third system component 106 every 100 ms, the fourth system component 108 every 30 ms, and the fifth system component 110 every 300 ms. 【0051】 The data bus 101 is a communication system arranged to transfer messages between the multiple system components and / or the memory system 118. The data bus may comprise multiple wires and connectors that transport messages between the multiple system components and / or the memory system 118. The data bus may be a parallel bus or a serial bus. The data bus may be an avionics data bus, for example, a data bus in accordance with ARINC 429 or MIL-STD-1553. 【0052】 The onboard computing system 200 may also include a fault detection system 120 that is communicatively connected to the data bus 101. The fault detection system 120 is configured to monitor messages transmitted on the data bus 101. The fault detection system 120 may be separate from or included in the system components 102-110. 
For example, the fault detection system 120 may consist of a standalone unit comprising a data processing unit and a memory system, or alternatively, may be configured to use the data processing unit and memory system of the wider onboard computing system 200 via the data bus 101. For example, the fault detection system 120 may be configured to use the data processing unit of one or more of the system components and / or the memory store 118. The memory system may include computer-readable instructions that, when executed by the data processing unit, cause the data processing unit to perform the actions described herein. 【0053】 Although the fault detection system 120 is described as implementing the method for illustrative purposes, this is not limiting, and the method may be performed by an onboard computing system 200 in which the fault detection system 120 is not present. For example, one or more system components may be configured to implement the fault detection method described herein. 【0054】 The fault detection system 120 may be an LRI configured to be selectively removed from the onboard computing system 200. In some configurations with multiple data buses 101, the fault detection system 120 is further configured to be communicatively connected to the multiple data buses 101 and to monitor messages transmitted on the multiple data buses 101. The operation of the fault detection system 120 will now be described. 【0055】 Figure 3 shows a schematic diagram of an exemplary fault detection system 120. The fault detection system 120 comprises a data processing device 122 and a memory system 124. The fault detection system may further include a log database 126. The log database is configured to store logs that do not conform to a valid digital chain defined by a rule. In this example, the log database 126 is located within the fault detection system 120, but as will be discussed later, the log database 126 may be located outside the fault detection system 120. 
【0056】 The memory system 124 stores rules. A rule has three elements. First, the rule defines a target message transmission event to be detected on the data bus 101. The second element of the rule is the time span 116 associated with the target message transmission event. The third element of the rule is a valid digital chain comprising a group of messages associated with the target message transmission event. The rule may be predefined before the onboard system 200 is run. The rule may optionally be uploaded to the memory system 124 in the fault detection system 120 before the onboard system 200 is run. As mentioned above, the time span 116 of the valid digital chain 100 associated with each target message transmission event is always known. Therefore, only messages transmitted on the data bus 101 during the rolling time span 116 need to be recorded. 【0057】 For example, a rule might require monitoring a target message transmission event that has a valid digital chain 100 known to take up to 120 ms. Messages transmitted on the data bus 101 more than 120 ms earlier are therefore not related to the valid digital chain of the target message transmission event. Consequently, messages transmitted on the data bus 101 that are older than the rolling time span 116 can be discarded. In this way, the memory requirements of the fault detection system are significantly reduced. Furthermore, the inventors have recognized that the fault detection system 120 retains only the relevant messages necessary for fault identification and diagnosis, rather than all messages recorded since the onboard computing system 200 began transmitting messages on the data bus 101 (for example, since the initial startup of the onboard computing system), thus significantly reducing the post-event analysis needed when a fault occurs. 【0058】 Next, the operation of the fault detection system 120 will be described in detail. 
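The rolling retention described in paragraph 【0057】 can be sketched as below. This is a minimal sketch under stated assumptions: the `RollingLog` class and its methods are hypothetical names, timestamps are assumed to be in milliseconds and non-decreasing, and a message exactly as old as the span is kept.

```python
from collections import deque
from typing import Deque, List, Tuple

Message = Tuple[float, str, str]  # (timestamp_ms, source_component, value)


class RollingLog:
    """Keeps only messages that fall inside the rolling time span."""

    def __init__(self, span_ms: float) -> None:
        self.span_ms = span_ms
        self.entries: Deque[Message] = deque()  # oldest entry on the left

    def record(self, timestamp_ms: float, source: str, value: str) -> None:
        self.entries.append((timestamp_ms, source, value))
        cutoff = timestamp_ms - self.span_ms
        # Discard anything older than the span; it cannot belong to the chain.
        while self.entries and self.entries[0][0] < cutoff:
            self.entries.popleft()

    def snapshot(self) -> List[Message]:
        """Copy of the current window, e.g. for retrieval on a target event."""
        return list(self.entries)
```

Because old entries are dropped on every write, memory use is bounded by the bus traffic within one time span rather than by total operating time.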
As mentioned above, the fault detection system 120 is configured to detect target message transmission events on the data bus 101 associated with the rule. When the fault detection system 120 detects a target message transmission event associated with the rule, it is configured to retrieve a log from the memory system 124 and determine whether the log conforms to the valid digital chain associated with the rule. If the log does not conform to the rule, that is, if the recorded log does not match the group of messages associated with the target message transmission event, a fault has occurred within the onboard computing system 200. 【0059】 If a log does not conform to the rule (for example, only if it does not conform), the log may be stored in the log database 126. Alternatively, a log may be stored in the log database 126 when it is first recorded and maintained in (i.e., not removed from) the log database if the log does not conform to the rule. If a log is stored in the log database 126 when it is first recorded, it may be discarded from the log database 126 if it is found to conform to the rule. 【0060】 More generally, in response to a log not conforming to a rule, the fault detection system 120 is configured to store or maintain the log in the log database 126, thereby enabling forensic analysis of the recorded log to identify and diagnose a fault. Optionally, when the log conforms to a rule, i.e., when the log contains the group of messages of the valid digital chain associated with the target message transmission event, there is no fault in the onboard computing system 200 and the log may be discarded from the log database 126. 【0061】 In this way, the memory requirements of the log database 126 are significantly reduced, and only logs that identify faults within the onboard computing system 200 are taken for further analysis. 【0062】 In one configuration, the user of the onboard computing system 200 is notified in response to a log not conforming to the valid digital chain of a rule. 
This notification to the user may be performed by a fault detection system 120 or another system component within the onboard computing system 200. For example, the onboard computing system 200 may have a graphical user interface (GUI), and when a fault is detected by the fault detection system 120, a message or warning is presented to the user. Alternatively or additionally, messages may be sent from the onboard computing system 200 to an external, non-onboard device outside the computing system 200. For example, a message may be sent to a base station or ground station to alert a mission controller or maintenance operator of a fault in the onboard computing system 200. In an alternative configuration, logs may be sent to an external, non-onboard device outside the onboard computing system 200. In this way, real-time fault identification and diagnosis, and potentially correction, can be performed while the onboard computing system 200 continues to operate. 【0063】 Figure 4 shows an alternative configuration of the fault detection system 120. In this example, the data bus 101 and memory system 124 are as described in relation to Figure 3, and for brevity, their functions will not be repeated here. In this configuration, the log database 126 is located outside of the fault detection system 120. For example, the log database 126 may be part of one of several system components, e.g., the first system component 102. Alternatively, the log database 126 may be part of the memory store 118. By using externally available memory, e.g., system component 102 or memory store 118, the computation and / or power requirements of the fault detection system 120 can be significantly reduced. 【0064】 The method may further include a step of determining whether the log conforms to a valid digital chain of a second rule in response to non-conformity with a first rule. In this way, multiple target message transmission events can be monitored on the data bus 101. 
In this way, in response to the occurrence of a first fault, a further search for other faults within the onboard computing system 200 can be initiated. The inventors have recognized that this so-called "if this then that" function can be useful when it is known that a fault in one system component may cause one or more faults in other related system components within the onboard computing system 200. 【0065】 In one configuration, there may be multiple rules. The number of rules may correspond to the number of distinct types of message transmission events that may occur on the data bus 101. In some cases, every message transmission event on the data bus 101 of the onboard computing system 200 may have an associated rule, so that the onboard computing system 200 is fully monitored. 【0066】 The method may further include a step of logging messages transmitted on the data bus 101 over a rolling time span according to a first or second rule, the time span being configured to be the larger of the two time spans associated with the first and second rules. In this way, messages transmitted on the data bus can be logged over the rolling time span 116 associated with the longest digital chain among the two or more rules. For example, suppose the first rule is configured to monitor a first target message transmission event with a rolling time span of 60 ms (the time expected for the valid digital chain 100 of the first target message transmission event to propagate through the onboard computing system 200), and the second rule is configured to monitor a second target message transmission event with a time span of 120 ms. The log will then record messages transmitted on the data bus 101 over 120 ms, thereby ensuring that all relevant messages are fully captured by the fault detection system 120 and preventing "clipping" of the valid digital chain. 【0067】 Figure 5 shows a log comprising a single memory stack 300 according to the present invention. 
The memory stack 300 is a series of recorded entries that capture messages transmitted on the data bus 101. The method may include the step of generating a memory stack 300 in the log for each type of message transmitted on the data bus. In an alternative arrangement, a single memory stack 300 may record multiple types of messages transmitted on the data bus. 【0068】 Each time a message is sent over the data bus, for example, each time the associated system component is refreshed, a new entry is created in the memory stack 300. Figure 5 shows a memory stack 300 with six entries: the first entry 302, the second entry 304, the third entry 306, the fourth entry 308, the fifth entry 310, and the sixth entry 312. The entries are stored in chronological order, with the first entry 302 being the most recent and the sixth entry 312 being the oldest. The six entries cover the rolling time span 116 defined by the rule. Each time a new entry is recorded, the oldest entry is discarded once it is older than the rolling time span 116. In this way, the memory stack 300 records and stores only messages within the rolling time span 116. 【0069】 The method may include assigning a timestamp 314 to each new entry in the log. The timestamp 314 may be recorded as Coordinated Universal Time (UTC), i.e., Greenwich Mean Time; as local time, e.g., the time in the time zone in which the onboard computing system 200 is operating; or as a time measured from a predetermined zero time, e.g., a mission start time, where the time at which the onboard computing system 200 starts operating is taken as zero and subsequent times are measured from it. In this example, the memory stack 300 is configured to record one type of message, corresponding to the status of a system component in the onboard computing system 200. The sixth (oldest) entry 312 records the time 314 and the message "OFF". Similarly, the fifth entry 310 and the fourth entry 308 also record the time 314 and the message "OFF". 
In the third entry 306, the time 314 is recorded and the message changes the status to "Operation". In the second entry 304, the time 314 is also recorded and the message changes the status back to "OFF". Similarly, in the most recent first entry 302, the time 314 is recorded, and the message remains "OFF". The memory stack 300 is shown as recording both the time 314 and the message 316, although in some configurations only the message may be stored. To aid understanding of the fault detection system 120, the operation of the memory stacks 300 together with the valid digital chain 100 is described next. 【0070】 Figure 6 shows a log 600 comprising multiple memory stacks 300a to 300d. In particular, a first memory stack 300a, a second memory stack 300b, a third memory stack 300c, and a fourth memory stack 300d are shown. Each of the multiple memory stacks 300a, 300b, 300c, and 300d is assigned a specific type of message transmitted on the data bus 101 to record and store. In this example, the first memory stack 300a is assigned to record messages transmitted by the first system component 102, the second memory stack 300b is assigned to record messages transmitted by the second system component 104, the third memory stack 300c is assigned to record messages transmitted by the third system component 106, and the fourth memory stack 300d is assigned to record messages transmitted by the fourth system component 108. 【0071】 The multiple memory stacks 300a to 300d continuously record and store messages over the rolling time span 116 according to the rule. In this example, the memory stacks hold different numbers of entries because the system components have different refresh rates. 
For example, the first memory stack 300a has 6 entries (corresponding to 6 refresh cycles by the first system component 102 within the rolling time span 116), the second memory stack 300b has 9 entries (corresponding to 9 refresh cycles by the second system component 104 within the rolling time span 116), the third memory stack 300c has 6 entries (corresponding to 6 refresh cycles by the third system component 106 within the rolling time span 116), and the fourth memory stack 300d has 4 entries (corresponding to 4 refresh cycles by the fourth system component 108 within the rolling time span 116). The entries recorded by all of the multiple memory stacks 300a to 300d over the rolling time span 116 are stored in the log database 126 as the log 600. 【0072】 As mentioned above, the rule determines the target message transmission event to be detected. When the fault detection system 120 detects the target message transmission event on the data bus 101, indicated by reference numeral 602, the fault detection system 120 retrieves the log 600 from memory. The log is compared with the valid digital chain associated with the rule. Figure 6 shows a valid digital chain 100 comprising a group of messages associated with the target message transmission event (shown as shaded cells) spanning the multiple memory stacks 300a to 300d. In this way, the fault detection system 120 can trace the message changes backward from the fourth memory stack 300d to the first memory stack 300a. In this example, the digital chain is valid, and therefore the target message transmission event occurred during normal operation of the onboard computing system 200, i.e., no error occurred. Optionally, at this point, the log 600 is discarded from the log database 126. 【0073】 Figure 7 shows an example of an invalid digital chain. 
In this example, the multiple memory stacks 300a to 300d are the same as in Figure 6, and their functions and respective messages / system components remain unchanged; for brevity, they are not described again here. In response to the detection of a target message transmission event, the log 600 is retrieved from memory and compared with the rule. In this example, it can be seen that the valid digital chain 100 extends only from the fourth memory stack 300d to the third memory stack 300c. Two messages from the group of messages associated with the target message transmission event are missing from the first and second memory stacks. Because the log does not contain the full group of messages associated with the target message transmission event, it follows that an error has occurred in the onboard computing system 200. 【0074】 In some configurations, in response to the log 600 not conforming to a rule, the fault detection system 120 is configured to identify which message in the log did not conform to the rule. In this example, the second entry in the third memory stack 300c is the one that did not conform to the rule. In this way, faults can be quickly identified and corrected. As previously mentioned, such information may be sent to the user or to an external, non-onboard device. 【0075】 In some configurations, the step of checking whether the log 600 conforms to the valid digital chain further includes checking the order of the group of messages within the log 600 according to their respective timestamps 314. 【0076】 In some configurations, the method includes a further step of restarting a system component of the onboard computing system 200 associated with a message that did not conform to the valid digital chain 100 of the rule. For example, if a message sent by the second system component 104 did not conform to the valid digital chain 100 of the rule, the second system component 104 may be restarted. 
By restarting the component in question, the fault can be removed from the onboard computing system 200. 【0077】 In an alternative configuration, the method further includes the step of disabling system components of the onboard computing system 200 associated with messages that did not conform to the valid digital chain 100 of the rules. This may be advantageous if a component is permanently compromised due to physical damage or malfunction, for example, a malware bug in the component in question. 【0078】 Figure 8 shows a vehicle 800 equipped with the onboard computing system 200 described herein. In this example, the vehicle 800 is an aircraft. The onboard computing system may be part of the vehicle's mission system. The vehicle may be provided with one or more onboard computing systems 200. Although an aircraft is shown in Figure 8, this is not limiting, and the present invention is applicable to any type of land, sea, air, or space vehicle. 【0079】 Figure 9 shows a schematic diagram 900 illustrating the rule adjustment and fault analysis stages in the context of the onboard computing system 200. In particular, the first non-onboard rule adjustment stage 902, the second onboard deployment stage 904, and the third non-onboard fault analysis stage 906 are shown. The first non-onboard rule adjustment stage 902 comprises a mission data toolset 908 configured to generate a mission data pack 910 for uploading to the onboard computing system 200. Generally, the mission data toolset 908 is a software package. The toolset 908 is configured to define rules (or a set of rules) used in the onboard computing system 200. In particular, the toolset 908 may be configured to define target message transmission events to be detected in the onboard computing system 200 and to define a valid digital chain 100 comprising a group of messages associated with the target message transmission events. 
The toolset 908 may generate other instructions for the onboard computing system 200, such as instructions for multiple system components, or other relevant data, such as navigation information, targeting information, and store information. Once the mission data pack 910 is generated, the pack is uploaded to the onboard computing system 200 during the onboard deployment phase 904. At this stage, any rules held in the mission data pack 910 are uploaded to the memory system of the onboard computing system. 【0080】 After a failure occurs in the onboard computing system 200, the log 600 is transferred from the log database 126 to an external non-onboard device 914 in a third non-onboard failure analysis stage 906. In this example, the external non-onboard device is a forensic toolset 914. The forensic toolset 914 is a computer program configured to analyze the recorded log 600 and identify and diagnose the failure within the onboard computing system 200. The recorded log 600 may be transferred from the onboard computing system 200 after the system has stopped moving, or it may be sent immediately after the failure is detected. For example, in the context of an onboard computing system 200 on an aircraft, the recorded log 600 may be transferred to an external non-onboard device after landing. 【0081】 The forensic toolset 914 may have a feedback loop 916 to the mission data toolset 908, so that in response to diagnosing a fault in the onboard system, the mission data toolset 908 can create new rules for uploading to the fault detection system 120. 【0082】 Figure 10 shows a flowchart 1000 of an exemplary method for detecting failures in an onboard computing system. 【0083】 In step s1002, the onboard computing system is executed. 【0084】 In step s1004, a log of messages transmitted over the data bus is recorded over a rolling time span and stored in the memory system of the onboard computing system. 
The log consists of multiple memory stacks, each memory stack recording a specific type of message transmitted over the data bus. 【0085】 In step s1006, the fault detection system checks whether a target message transmission event has occurred. If no target message transmission event has occurred, the method returns to step s1004, continuing to record messages over the rolling time span and discarding entries older than the rolling time span. 【0086】 In response to detecting a target message transmission event, the method proceeds to step s1008, in which the log is retrieved from the memory system. 【0087】 In step s1010, the log is compared to the valid digital chain associated with the rule. 【0088】 In step s1012, a determination is made as to whether the log conforms to the rule. 【0089】 In response to the log not conforming to the rule, in step s1014, the log is saved to the log database. Optionally, in step s1016, in response to the log not conforming to the first rule, the log is compared to the valid digital chain of a second rule. 【0090】 Optionally, in step s1018, the log is sent to an external non-onboard device outside the onboard computing system for further analysis, as described above, or is downloaded from the onboard computing system. 【0091】 Optionally, in step s1020, if the log conforms to the rule, the log is discarded from the log database. 
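The flow of Figure 10 can be illustrated with a minimal Python sketch. It is not the implementation described in this specification: the names `Rule`, `FaultDetector`, and `on_message`, the representation of a message as a (type, value) pair, and the example message names are all assumptions made for illustration. The sketch also assumes, for simplicity, that a log conforms when every message of the valid digital chain is present within the rolling time span; steps s1016 to s1020 are omitted.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule layout (names are illustrative, not from the source)."""
    target: tuple   # (message_type, value) that triggers a check
    chain: tuple    # (message_type, value) pairs expected in the log
    span_ms: int    # rolling time span in milliseconds


class FaultDetector:
    def __init__(self, rule):
        self.rule = rule
        self.stacks = defaultdict(deque)  # one memory stack per message type
        self.log_database = []            # holds non-conforming logs only

    def on_message(self, t_ms, msg_type, value):
        """Record one bus message (s1004), then run the check (s1006 to s1014)."""
        self.stacks[msg_type].appendleft((t_ms, value))  # newest entry first
        # Discard entries older than the rolling time span.
        for stack in self.stacks.values():
            while stack and t_ms - stack[-1][0] > self.rule.span_ms:
                stack.pop()
        if (msg_type, value) != self.rule.target:
            return None                                    # s1006: no target event
        log = {k: list(v) for k, v in self.stacks.items()}  # s1008: retrieve log
        conforms = all(                                    # s1010 and s1012
            any(v == want for _, v in log.get(kind, []))
            for kind, want in self.rule.chain
        )
        if not conforms:
            self.log_database.append(log)                  # s1014: keep for analysis
        return conforms


# Assumed example: an actuator output that should follow a switch press
# and a controller command within 60 ms.
rule = Rule(target=("actuator", "OPEN"),
            chain=(("switch", "PRESSED"), ("controller", "OPEN_CMD")),
            span_ms=60)
detector = FaultDetector(rule)
detector.on_message(0, "switch", "PRESSED")
detector.on_message(20, "controller", "OPEN_CMD")
commanded = detector.on_message(40, "actuator", "OPEN")     # chain present
uncommanded = detector.on_message(200, "actuator", "OPEN")  # chain has aged out
```

In this sketch the second actuator message arrives after the earlier entries have left the rolling time span, so the check fails and the log is kept in `log_database`, which corresponds to the second category of fault described in the background (an output that was not commanded).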
【0092】 Figure 11 shows a flowchart 1100 of an exemplary method for detecting a fault in an onboard computing system. The method includes: step 1102, determining a rule comprising a target message transmission event, a valid digital chain having a group of messages associated with the target message transmission event, and a rolling time span; step 1104, uploading the rule to the memory system of the onboard computing system; step 1106, logging messages transmitted on the data bus of the onboard computing system over the rolling time span in accordance with the rule stored in the memory system of the onboard computing system; step 1108, using multiple memory stacks to record each message over the time span according to the rule, wherein the multiple memory stacks are configured to store a new entry for each message each time that message is transmitted on the data bus; step 1110, detecting a target message transmission event associated with the rule; step 1112, in response to detecting the target message transmission event, retrieving the log of messages from memory; step 1114, determining whether the log conforms to the valid digital chain associated with the rule, wherein the valid digital chain comprises the group of messages associated with the target message transmission event; and step 1116, storing the log in a log database in response to the log not conforming to the valid digital chain of the rule. 【0093】 Although the system components are described as individual devices within an onboard computing system, this is considered non-limiting, and in some configurations, the system components may be discrete objects or functions held within a single system component and connected via a data bus. Embodiments of the components described herein can be implemented using any suitable software application, programming language, data editor, etc., and can be represented / stored / processed using any suitable data structure, etc. 
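As an illustration of the rule determined in step 1102 and the conformity check of step 1114, the following Python sketch shows one possible representation. It is a sketch under stated assumptions rather than the method of this specification: the `Rule` fields, the functions `logging_span` and `check_log`, the message names, and the choice to treat a log as a list of (timestamp, message) entries are all hypothetical. The 60 ms and 120 ms spans are taken from the example given earlier in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule layout; field names are illustrative only."""
    target: str    # target message transmission event
    chain: tuple   # messages of the valid digital chain, in expected order
    span_ms: int   # rolling time span associated with the rule


def logging_span(rules):
    """Log over the largest span so that no rule's chain is clipped."""
    return max(rule.span_ms for rule in rules)


def check_log(log, rule):
    """Return (conforms, missing) for a log of (timestamp_ms, message) entries.

    A chain message counts as found only if it occurs after the previously
    matched chain message, so the order is checked using the timestamps.
    """
    entries = sorted(log)  # chronological order by timestamp
    missing, pos = [], 0
    for expected in rule.chain:
        for i in range(pos, len(entries)):
            if entries[i][1] == expected:
                pos = i + 1
                break
        else:
            missing.append(expected)  # identifies the non-conforming message
    return (not missing, missing)


# Assumed example rules and logs.
rule_a = Rule("DOOR_OPEN", ("SWITCH_PRESSED", "OPEN_CMD", "DOOR_OPEN"), 60)
rule_b = Rule("GEAR_DOWN", ("LEVER_DOWN", "GEAR_DOWN"), 120)
span = logging_span([rule_a, rule_b])

good = [(0, "SWITCH_PRESSED"), (20, "OPEN_CMD"), (40, "DOOR_OPEN")]
bad = [(20, "OPEN_CMD"), (40, "DOOR_OPEN")]
swapped = [(0, "OPEN_CMD"), (20, "SWITCH_PRESSED"), (40, "DOOR_OPEN")]
result_good = check_log(good, rule_a)
result_bad = check_log(bad, rule_a)
result_swapped = check_log(swapped, rule_a)
```

Returning the list of missing messages alongside the conformity result reflects the optional step of identifying which message did not conform, which in turn could be used to decide which system component to restart or disable.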
【0094】 Attention is directed to all documents and literature filed concurrently with or prior to this specification and made available to the public together with this specification, and the contents of all such documents and literature are incorporated herein by reference. 【0095】 All of the features disclosed herein (including any appended claims, abstract, and drawings) and / or all of the steps of any method or process so disclosed may be combined in any combination, except for combinations in which at least some of such features and / or steps are mutually exclusive. 【0096】 Each feature disclosed herein (including any appended claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose unless otherwise specified. Therefore, unless otherwise specified, each disclosed feature is merely one example of a generic series of equivalent or similar features. 【0097】 The present invention is not limited to the details of the embodiments described above. The present invention extends to any novel one, or any novel combination, of the features disclosed herein (including any appended claims, abstract, and drawings), or to any novel one, or any novel combination, of the steps of any method or process so disclosed.
Claims
[Claim 1] A computer-implemented method for detecting a fault in an onboard computing system (200), the method comprising: recording a log of messages transmitted on a data bus (101) of the onboard computing system over a rolling time span (116) in accordance with a rule stored in a memory system of the onboard computing system; detecting a target message transmission event associated with the rule; in response to detecting the target message transmission event, retrieving the log of messages; determining whether the log conforms to a valid digital chain associated with the rule, wherein the valid digital chain comprises a group of messages associated with the target message transmission event; and in response to the log not conforming to the valid digital chain of the rule, storing or maintaining the log in a log database. [Claim 2] The method according to claim 1, further comprising discarding the log from the log database in response to the log conforming to the valid digital chain of the rule. [Claim 3] The method according to claim 1 or 2, further comprising determining whether the log conforms to a valid digital chain of a second rule in response to the log not conforming to the first rule. [Claim 4] The method according to claim 3, further comprising logging the messages transmitted on the data bus over a rolling time span in accordance with the first rule or the second rule, wherein the time span is configured to be the larger of two time spans associated with the first rule and the second rule. [Claim 5] The method according to any one of claims 1 to 4, further comprising notifying the user of the onboard computing system if the log does not conform to the valid digital chain of the rule. 
[Claim 6] The method according to any one of claims 1 to 5, further comprising identifying one or more messages in the log that do not conform to the valid digital chain of the rule if the log does not conform to the valid digital chain of the rule. [Claim 7] The method according to any one of claims 1 to 6, wherein the step of logging the messages transmitted on the data bus comprises creating a new entry in the log each time a message is transmitted on the data bus. [Claim 8] The method according to claim 7, further comprising the step of assigning a timestamp to each new entry in the log. [Claim 9] The method according to claim 8, wherein the timestamp is configured to be recorded as Coordinated Universal Time (UTC), local time, or a predetermined zero-hour start time. [Claim 10] The method according to claim 8 or 9, wherein the step of determining whether the log fits the valid digital chain further comprises checking the order of the group of messages in the log according to each of the timestamps. [Claim 11] The method according to any one of claims 1 to 10, wherein the step of recording the log further comprises the step of generating a memory stack in the log for each type of message transmitted on the data bus. [Claim 12] A fault detection system comprising a data processing device and a memory system having a computer-readable instruction that, when executed by the data processing device, causes the data processing device to perform the method according to any one of claims 1 to 11. [Claim 13] An onboard computing system having a plurality of system components and a data bus, wherein the system components are configured to transmit messages via the data bus, and the system includes a data processing device and a memory system having computer-readable instructions that, when executed by the data processing device, cause the data processing device to perform the method according to any one of claims 1 to 11. 
[Claim 14] A vehicle comprising the onboard computing system according to claim 13, wherein the vehicle is optionally an aircraft. [Claim 15] A computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out the method according to any one of claims 1 to 11.