# Testing Fault Tolerance Properties: Soft-core Processor-based Experimental Robot Controller

### Jakub Podivinsky, Zdenek Kotasek

Brno University of Technology, Faculty of Information Technology, Centre of Excellence IT4Innovations
Bozetechova 2, 612 66 Brno, Czech Republic
{ipodivinsky, kotasek}@fit.vutbr.cz

### **Abstract**

Various electronic systems play an important role in our everyday lives. Some of them serve for fun or make our lives easier. These systems are useful but not essential; when they malfunction, the consequences are not critical. On the other hand, there are systems which are more or less critical, and their failure can have serious consequences. For example, a failure in medical, aviation, military or automotive systems can cause high economic losses and/or endanger human health. These systems must be protected against the impact of faults so that their correct operation is preserved. Fault tolerance is one of the techniques that help to achieve this.

Many fault-tolerance methodologies targeting various systems and technologies exist, and new ones are under investigation. A large share of them target *Field Programmable Gate Arrays* (FPGAs). The first reason is that FPGAs are becoming more popular due to their flexibility and reconfigurability. The second reason is their sensitivity to faults, combined with their ability to be reconfigured when a fault occurs.

The configuration of an FPGA is stored as a *bitstream* in SRAM memory. The problem is that FPGAs are quite sensitive to faults caused by charged particles. Such a particle can invert a bit in the bitstream, which may change the behaviour of the implemented circuit. This event is called a *Single Event Upset* (SEU). Because fault-tolerance techniques are meant to mitigate such events, it is also important to verify that they actually do so.
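To make the notion of an SEU concrete, a single upset can be modelled as the inversion of one bit in a byte buffer standing in for the bitstream. The sketch below is purely illustrative: the buffer contents, the `inject_seu` helper and the fixed random seed are assumptions made for this example, and it does not reflect the real bitstream format or the fault injector used in the evaluation platform.

```python
import random


def inject_seu(bitstream: bytearray, bit_index: int) -> None:
    """Invert one bit in place, modelling a single event upset."""
    byte_index, bit_in_byte = divmod(bit_index, 8)
    bitstream[byte_index] ^= 1 << bit_in_byte


# Hypothetical 16-byte stand-in for a configuration bitstream.
golden = bytearray(range(16))
faulty = bytearray(golden)

rng = random.Random(0)  # fixed seed so the run is repeatable
target = rng.randrange(len(faulty) * 8)
inject_seu(faulty, target)

# Count the bits that differ between the golden and the faulty copy.
differing_bits = sum(bin(a ^ b).count("1") for a, b in zip(golden, faulty))
print(differing_bits)  # prints 1: exactly one bit was inverted
```

Inverting the same bit a second time restores the original content, which is the software analogue of repairing an upset by rewriting the affected configuration.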
An evaluation platform for testing fault-tolerance methodologies targeted at SRAM-based FPGAs was presented and demonstrated in our previous work. The platform is based on *functional verification*, whose main task is to check whether a verified circuit meets its specification. It compares the outputs of the verified circuit running in an RTL simulator with those of a reference model. For fault injection, the verified circuit must be implemented in the FPGA, so we do not use classical simulation-based functional verification but a modified, FPGA-based functional verification. Our platform uses functional verification as a tool for monitoring the impact of faults injected into an electronic controller implemented in the FPGA. Implementing the controller on an FPGA development board allows us to inject faults directly into the FPGA.

A robot seeking a path through a maze, together with its processor-based robot controller, serves as the experimental case study. Experimental results for the unhardened and hardened versions of the new processor-based robot controller are presented and discussed. Two fault injection strategies are used in these experiments: *multiple faults* and *single faults*. The experiments cover the unhardened version and the TMR version of the processor-based robot controller. For each version of the robot controller and each fault injection strategy, 5000 verification runs were performed. The results are compared with those of the same experiments on the original hard-coded robot controller.

The experimental results for the multiple fault injection strategy are summarized in Table 1. It shows the results for both the unhardened and the TMR versions of the processor-based robot controller, and it contains a comparison with the original hard-coded robot controller.
One can see that the unhardened version failed electronically in 44.02% of the runs and the TMR version in 8.14% of the runs. This confirms that TMR is a beneficial approach, even though the increase in resource consumption is high. The table also shows the impact of faults on the mechanical robot: most electronic failures lead to the robot stopping in place, which is less critical than a collision with a wall. In comparison with the original hard-coded robot controller, the processor-based robot controller is more susceptible to faults. This holds for both the unhardened and the TMR version. This was expected, because the processor is a more complex design composed of many components. The experiments confirmed our expectations.

Table 1: A comparison of the impact of *multiple* faults injected into the unhardened (noft) and hardened (tmr) versions of the processor-based robot controller (RC) and the original hard-coded robot controller.

| Monitored impact            | Processor-based RC |       | Original hard-coded RC |       |
|-----------------------------|--------------------|-------|------------------------|-------|
|                             | noft               | tmr   | noft                   | tmr   |
| Electronic OK [-]           | 2751               | 4593  | 3544                   | 4839  |
| Electronic failed [-]       | 2201               | 407   | 1456                   | 161   |
| Electronic failed [%]       | 44.02%             | 8.14% | 29.12%                 | 3.22% |
| Finish not reached [-]      | 2179               | 403   | 1429                   | 161   |
| Collision with wall [-]     | 55                 | 7     | 11                     | 0     |
| Robot stop on place [-]     | 2124               | 396   | 1418                   | 161   |
| Reliability improvement [%] | 81.5%              |       | 88.9%                  |       |

As future work, we plan to apply more sophisticated fault-tolerance techniques to the presented experimental electro-mechanical system and repeat the complete evaluation process. One possible improvement is the use of reconfiguration to recover a faulty module, followed by synchronization of the recovered module (the processor in our case study) with the failure-free modules.
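The reliability improvement values in Table 1 are consistent with the relative reduction in the number of electronic failures between the unhardened and the TMR version. This definition is not stated explicitly in the abstract, so the sketch below should be read as an assumption that happens to reproduce the tabulated values; the function name and the dictionary layout are likewise illustrative.

```python
def reliability_improvement(failed_unhardened: int, failed_hardened: int) -> float:
    """Relative reduction in failures, in percent (assumed definition)."""
    if failed_unhardened <= 0:
        raise ValueError("unhardened failure count must be positive")
    return 100.0 * (failed_unhardened - failed_hardened) / failed_unhardened


# "Electronic failed [-]" counts taken from Table 1, as (noft, tmr) pairs.
TABLE_1_FAILED = {
    "processor-based RC": (2201, 407),
    "original hard-coded RC": (1456, 161),
}

for name, (noft, tmr) in TABLE_1_FAILED.items():
    print(f"{name}: {reliability_improvement(noft, tmr):.1f}%")
```

Under this assumption, the script prints 81.5% for the processor-based controller and 88.9% for the original hard-coded controller, matching the last row of Table 1.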
## Paper origin

The original paper has been accepted for presentation at the Euromicro Conference on Digital System Design (DSD 2018) in Prague [1].

### Acknowledgement

This work was supported by the Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPU II); project IT4Innovations excellence in science - LQ1602 and the BUT project FIT-S-17-3994.

#### References

[1] J. Podivinsky, J. Lojda, O. Cekan, R. Panek and Z. Kotasek. Evaluation Platform for Testing Fault Tolerance Properties: Soft-core Processor-based Experimental Robot Controller. Accepted to the 21st Euromicro Conference on Digital System Design. Prague, 2018.