Chapter 1. Introduction.- Chapter 2. Background on Soft Errors.- Chapter 3. Fault Injection Framework Using Virtual Platforms.- Chapter 4. Performance and Accuracy Assessment of Fault Injection Frameworks Based on VPs.- Chapter 5. Extensive Soft Error Evaluation.- Chapter 6. Machine Learning Applied to Soft Error Assessment in Multicore Systems.
Felipe Rocha da Rosa holds a Ph.D. in microelectronics and a Bachelor of Computer Engineering degree from the Federal University of Rio Grande do Sul. For the past six years, he has been researching and developing tools for performance and reliability analysis of Arm-based processors.
Felipe has also researched the application of data science and machine learning techniques to improve soft error investigation during the early design space exploration process. Currently, he works as a Modelling Engineer at Arm Cambridge in its Virtual Platforms development group.
Luciano Ost received his Ph.D. degree in Computer Science from PUCRS, Brazil, in 2010. During his Ph.D., Luciano worked as an invited researcher at the Microelectronic Systems Institute of the Technische Universitaet Darmstadt (from 2007 to 2008) and at the University of York (October 2009). After completing his doctorate, he worked as a research assistant and then as an assistant professor at the University of Montpellier II/LIRMM in France, until joining the University of Leicester as a Lecturer in 2014. Dr. Ost is a faculty member of Loughborough University, UK, and he is the author of more than 70 scientific papers published in peer-reviewed international journals and conferences. His primary research directions comprise the design and exploration of reliable and performance-efficient multi/many-core systems.
Ricardo Reis (M’81–SM’06) received the Electrical Engineering degree from the Federal University of Rio Grande do Sul (UFRGS), Porto Alegre, Brazil, in 1978, and the Ph.D. degree in informatics, option microelectronics, from the Institut National Polytechnique de Grenoble, Grenoble, France, in 1983. He received the Doctor Honoris Causa from the University of Montpellier, France, in 2016. He has been a Full Professor with UFRGS since 1981. He is at research level 1A of the CNPq (Brazilian National Science Foundation), Brazil, and the head of several research projects supported by government agencies and industry. He has published more than 600 papers in journals and conference proceedings and authored or co-authored several books. His current research interests include physical design, physical design automation, design methodologies, digital design, EDA, circuits tolerant to radiation, and microelectronics education. Prof. Reis was a recipient of the IEEE Circuits and Systems Society (CASS) Meritorious Service Award 2015. He was the Vice President of the IEEE CASS and of the Brazilian Microelectronic Society, and President of the Brazilian Computer Society (SBC). He was the chair of IFIP TC10. He has also organized several international conferences. He was a member of the CASS Distinguished Lecturer Program 2014–2015. He is a member of the IEEE CASS BoG (2018–2020) and the IEEE CEDA BoG.
This book describes the benefits and drawbacks inherent in the use of virtual platforms (VPs) to perform fast and early soft error assessment of multicore systems. The authors show that VPs provide engineers with appropriate means to investigate new and more efficient fault injection and mitigation techniques. Coverage also includes the use of machine learning techniques (e.g., linear regression) to speed up the soft error evaluation process by pinpointing the parameters (e.g., architectural) with the most substantial impact on the dependability of the software stack. The book draws its information and insight from more than 3 million individual scenarios and 2 million simulation-hours. Further, it explores the use of machine learning techniques to navigate large fault injection datasets.
Describes the most suitable and efficient virtual platforms for incorporating fault injection capabilities, aiming to support the soft error analysis of state-of-the-art processor models;
Includes the analysis and porting of several benchmarks from the embedded and HPC domains, including the Rodinia and NASA NAS Parallel Benchmark (NPB) suites;
Introduces four novel, non-intrusive fault injection (FI) techniques enabling software engineers to perform in-depth and relevant soft error evaluation, addressing the gap between the available FI tools and industry requirements;
Explores machine learning techniques that can be used to identify the individual (or combinations of) microarchitectural and software parameters that have the most substantial relationship with each detected soft error or failure.