Runtime Observer Pairs and Bayesian Network Reasoners On-board FPGAs: Flight-Certifiable System Health Management for Embedded Systems

Runtime Observer Pairs and Bayesian Network Reasoners On-board FPGAs: Flight-Certifiable System Health Management for Embedded Systems

Johannes Geist, Kristin Yvonne Rozier, and Johann Schumann


These pages contain further details of the experiments described in "Runtime Observer Pairs and Bayesian Network Reasoners On-board FPGAs: Flight-Certifiable System Health Management for Embedded Systems" by Johannes Geist, Kristin Yvonne Rozier, and Johann Schumann. The BibTeX for this paper is available here.


Safety-critical systems, like Unmanned Aerial Systems (UAS) that must operate totally autonomously, e.g., to support ground-based emergency services, must also provide assurance they will not endanger human life or property in the air or on the ground. Previously, a theoretical construction for paired synchronous and asynchronous runtime observers with Bayesian reasoning was introduced that demonstrated the ability to handle runtime assurance within the strict operational constraints to which the system must adhere. In this paper, we show how to instantiate and implement temporal logic runtime observers and Bayesian network diagnostic reasoners that use the observers' outputs, on-board a field-standard Field Programmable Gate Array (FPGA) in a way that satisfies the strict flight operational standards of Realizability, Responsiveness, and Unobtrusiveness. With this type of compositionally constructed diagnostics framework we can develop compact, hierarchical, and highly expressive health management models for efficient, on-board fault detection and system monitoring. We describe an instantiation of our System Health Management (SHM) framework, rt-R2U2, on standard FPGA hardware, which is suitable to be deployed on-board a UAS. We run our system with a full set of real flight data from NASA's Swift UAS, and highlight a case where our runtime SHM framework would have been able to detect and diagnose a fault from subtle evidence that initially eluded traditional real-time diagnosis procedures.

Results from this paper were featured in the NASA Ames 2014 Director's Colloquium Summer Series "Dr. Kristin Yvonne Rozier - No More Helicopter Parenting: Intelligent, Autonomous UAS" on the NASA Ames YouTube Channel, June 10, 2014.

Our data files are compiled using the SamIam and Ace tools, which are free and open source and can be downloaded here. Documentation is packaged in the tool downloads.

Disclaimer: The files distributed on this page contain research prototype code and examples published in the paper above. The files are compatible with SamIam release 3.0 and Ace version 2.0; we make no claims regarding compatibility with any other versions. Please feel free to email me concerning clarifications, bugs, or other corrections.

Case Study: Fluxgate Magnetometer Buffer Overflow:

Assuming a standard set of sensors sending input signals along the bus to the embedded FPGA...

Step 0:

Define a set of Linear/Metric Temporal Logic specifications over the set of signals.

ltl.txt Hand-coded file of temporal logic formulas for the buffer overflow example. Future work would be to automate this task.

Step 1:

Use SamIam to design Bayes Net reasoners that utilize the set of temporal logic formulas as inputs. Output from SamIam. Input to Ace; compile with default compiler options.

Step 2:

Compile the resulting arithmetic circuit using Ace.
Compilation with Ace: Output from SamIam. Input to Ace; compile with default compiler options. This is the main Bayes Net reasoner used in our case study. Input to Ace; needed to compile This is the instantiation file needed for BN evaluation and it can be created automatically, but a predefined state of any node would influence the resulting arithmetic circuit. This version was hand-coded to ensure control over the structure of the resulting arithmetic circuit.* Output from Ace, one instance (since algorithm is not deterministic and the number of nodes and the structure both change with every compilation cycle). Output from Ace, one instance (since algorithm is not deterministic). Visualization of above using DOT.

* Ace is programmed to heavily use optimasation. So, if the .inst file states that a certain Bayes node is always in a certain state, then this is exploited and the AC will get smaller. However, in our implementation, we cannot use this feature, because our BN will always have to adapt to new indicators. Therefore, Ace's pre-compilation optimizations would create a non-general arithmetic circuit that would be incorrect, but only for our specific framework. Here is a specific example that illustrates why an automatically generated a .inst file it might contain states that don't change, that we need to change, and that would cause the resulting AC to be wrong in the context of the R2U2 framework because the unchanging states that should actually be changing states would be 'optimized' aka eliminated. For example, in our evaluation of the barometric altimeter sensor data, a .inst file could look like this: <?xml version="1.0" encoding="UTF-8"?> <instantiation date="Aug 25, 2006 9:23:45 AM"> <inst id="S_BARO_ALT" value="increasing"/> </instantiation>

This information is used by the Ace to evaluate the barometric altimeter is in the state "increasing." As Ace sees in this file that this state will not change, therefore Ace eliminates all nodes correlated to this information during compilation of the arithmetic circuit. However, the barometric altimeter will not always indicate an increasing altitude in real life. That is why the .inst file is not that important for our approach and we better make an .inst file like this:

<?xml version="1.0" encoding="UTF-8"?> <instantiation date="Aug 25, 2006 9:23:45 AM"> </instantiation>

Step 3:

Run rt-R2U2 code to combine together all of the artifacts from previous steps, creating a combination of temporal logic runtime observers and Bayes Net reasoners, all in a format suitable for upload to an FPGA to program the gates in the array. (We do not currently have a stand-alone version of this code; the version run for the paper was part of a large and onerous java GUI. A new, more lithe, standalone version is being created as part of 'future work' that should be both scalable and executable on any high-performance Linux platform.)
rt-R2U2 Compilation:

ltl_web_readable.txt Output from rt-R2U2 code, though this version was hand-coded for the experimental evaluation used in this paper. Input file to FPGA containing readable instructions for the rt-R2U2 process.
rtR2U2-quick reference.pdf Quick reference file with basic documentation of our process. This is Johannes Geist's lab notebook and internal manual for BN compilation.
instructions_web.txt Output from rt-R2U2 code. Input file to FPGA containing readable instructions for the rt-R2U2 process.
RV14_code_snippets.html Page of code snippets executing the critical functions in the rt-R2U2 code.

Legend for instructions_web.txtltl_web_readable.txt:

a atomic
i intermediate (i.e. i1 = true and i0 = false)
s instruction result (i.e. s1 = result of evaluating the line labeled 01:...)

Step 4: [Optional]

Preprocess the real test flight data using Matlab for data privacy reasons. In this paper, the real test flight data was normalized, preserving all pertinent properties of the data, but shifting the values to meet minimum requirements for data privacy at NASA.

RV14_atomics_web.csv Preprocessed input data file used for experimental evaluation in paper.

Additional View of Case Study Data
Normalized data for Acceptable Bad Packets (N^R_{tot}) and Bad Packets (N^R_b) used in Fluxgate Magnetometer case study. These values were used in evaluating three specifications: (a) "The fluxgate packet transmission rate is appropriate," (b) "The number of bad packets is low," and (c) "The bad packet rate is not increasing."

Step 5:

Run the FPGA as if it were onboard, in the field, processing the .csv data as an input stream from the onboard sensors, programmed with the bundled binary versions of the program files instructions_web.txt and ltl_web_readable.txt. This .zip contains all of the sources and the binary files used to communicate to the FPGA in our case study: the C program used to send the preprocessed input data file RV14_atomics_web.csv to the FPGA as if it were a sensor stream read off of the bus; all .c and .h files needed to build this code; along with a Makefile.

Kristin Yvonne Rozier 2014-07-12