Result: MEEK: Re-thinking Heterogeneous Parallel Error Detection Architecture for Real-World OoO Superscalar Processors

Title:
MEEK: Re-thinking Heterogeneous Parallel Error Detection Architecture for Real-World OoO Superscalar Processors
Publisher Information:
Institute of Electrical and Electronics Engineers (IEEE)
Department of Computer Science and Technology
https://doi.org/10.1109/dac63849.2025.11132986
Publication Year:
2025
Collection:
Apollo - University of Cambridge Repository
Document Type:
Conference object
File Description:
application/pdf
Language:
English
DOI:
10.17863/CAM.118304
Rights:
Attribution 4.0 International ; https://creativecommons.org/licenses/by/4.0/
Accession Number:
edsbas.EDCD15B9
Database:
BASE

Further Information

Heterogeneous parallel error detection is an approach to building fault-tolerant processors that uses multiple power-efficient cores to re-execute software originally run on a high-performance core. Yet its complex components, which gather data across the chip from many parts of the core, raise the question of how to build it into commodity cores without heavy design invasion and extensive re-engineering. We build the first full-RTL design, MEEK, into an open-source SoC, from microarchitecture and ISA to the OS and programming model. We identify and solve bottlenecks and bugs overlooked in previous work, and demonstrate that MEEK offers microsecond-level detection capacity with affordable overheads. By trading off architectural functionality across co-designed hardware-software layers, MEEK requires only light changes to a mature out-of-order superscalar core, simple coordinating software layers, and a few lines of operating-system code.

MEEK's source code repository: https://github.com/SEU-ACAL/reproduce-MEEK-DAC-25

Funding: National Key Research and Development Program (Grant No. 2024YFB4405600), the National Natural Science Foundation of China (Grant Nos. 62472086 and 62204036), and the Basic Research Program of Jiangsu (Grant No. BK20243042).