Soft errors are transient bit-flips, mostly caused by cosmic radiation, package radioactivity, or signal integrity issues. They can degrade hardware reliability, a major concern for high-end microprocessors, where data integrity and a long mean time between failures are desired. As part of an activity addressing this concern, a set of tools and methodologies is being developed to help design teams meet their hardware reliability requirements more easily.
As part of this activity, we have developed a run-time executable (RTX) for Fusion that performs soft error injections efficiently. The RTX exploits Mesa's ability to simulate multiple copies of a given model in a single run. By instantiating a secondary copy of the main model, creating a so-called dual model, the RTX can keep track of injected errors and detect when an error has vanished, e.g., through logical masking. This raises the utilization of simulation resources by allowing a large number of injections in a single test run.
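The dual-model idea can be illustrated with a minimal Python sketch. It is not the Fusion RTX or the Mesa interface: the `ToyModel` register, the `run_dual` function, and the stimulus format are all hypothetical, chosen only to show a reference copy and an injected copy stepping in lockstep, with a new injection issued as soon as the previous error has vanished.

```python
class ToyModel:
    """Hypothetical 8-bit register standing in for a design model."""

    def __init__(self):
        self.state = 0

    def step(self, data_in, load):
        if load:
            # Overwriting the register logically masks any earlier bit-flip.
            self.state = data_in & 0xFF
        else:
            # Rotate left by one bit; a flipped bit survives this operation.
            self.state = ((self.state << 1) | (self.state >> 7)) & 0xFF


def run_dual(stimulus, bit=0):
    """Drive a reference copy and an injected copy with the same stimulus.

    stimulus is a list of (data_in, load) pairs, one per cycle.
    Returns (injected, vanished): the number of bit-flips injected and the
    number observed to vanish because the two copies reconverged.
    """
    reference, injected_copy = ToyModel(), ToyModel()
    injected = vanished = 0
    error_live = False

    for data_in, load in stimulus:
        if not error_live:
            # No outstanding error, so the run can absorb another injection.
            injected_copy.state ^= 1 << bit
            injected += 1
            error_live = True

        reference.step(data_in, load)
        injected_copy.step(data_in, load)

        if reference.state == injected_copy.state:
            # The copies agree again: the injected error has vanished.
            vanished += 1
            error_live = False

    return injected, vanished


if __name__ == "__main__":
    cycles = [(0x12, False), (0x34, True), (0x56, False), (0x78, True)]
    print(run_dual(cycles))
```

In this toy setting, comparing the two copies after every cycle is what allows a single run to host several injections in sequence; a real design would compare far more state and would also need to handle errors that never vanish.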
Other directions of this activity include methods for providing insight into which parts of the design are protected or unprotected against soft errors, as well as methods for identifying the parts that are most vulnerable to soft error hits.