Result: RICH: implementing reductions in the cache hierarchy

Title:

RICH: implementing reductions in the cache hierarchy

Authors:

Dimic, Vladimir, Moretó Planas, Miquel, Casas, Marc, Ciesko, Jan, Valero Cortés, Mateo

Contributors:

Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Barcelona Supercomputing Center, Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions

Publisher Information:

Association for Computing Machinery (ACM)

Publication Year:

2020

Collection:

Universitat Politècnica de Catalunya, BarcelonaTech: UPCommons - Global access to UPC knowledge

Subject Terms:

Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors, High performance computing, Memory management (Computer science), Parallel programming (Computer science), Shared memory, Caches, Reductions, Task-based programming model, Superordinadors, Gestió de memòria (Informàtica), Programació en paral·lel (Informàtica)

Document Type:

Conference object

File Description:

13 p.; application/pdf

Language:

English

Relation:

info:eu-repo/grantAgreement/MINECO//TIN2015-65316-P/ES/COMPUTACION DE ALTAS PRESTACIONES VII/; info:eu-repo/grantAgreement/AGAUR/2017 SGR 1414; info:eu-repo/grantAgreement/AGAUR/2017-SGR-1328; info:eu-repo/grantAgreement/AEI/RYC-2016-21104; info:eu-repo/grantAgreement/MINECO/2PE/RYC-2017-23269; info:eu-repo/grantAgreement/EC/FP7/321253/EU/Riding on Moore's Law/ROMOL; http://hdl.handle.net/2117/192495

DOI:

10.1145/3392717.3392736

Availability:

http://hdl.handle.net/2117/192495
https://doi.org/10.1145/3392717.3392736

Rights:

Open Access

Accession Number:

edsbas.98A61DA5

Database:

BASE

Further information

Reductions are a frequent algorithmic pattern in high-performance and scientific computing. Sophisticated techniques are needed to ensure their correct and scalable concurrent execution on modern processors. Reductions on large arrays are the most demanding case, where traditional approaches are not always applicable because they scale poorly. To address these challenges, we propose RICH, a runtime-assisted solution that relies on architectural and parallel programming model extensions. RICH updates the reduction variable directly in the cache hierarchy with the help of added in-cache functional units. Our programming model extensions fit the most relevant parallel programming solutions for shared-memory environments, such as OpenMP. RICH does not modify the ISA, which allows the use of algorithms with reductions from pre-compiled external libraries. Experiments show that our solution achieves an average performance improvement of 11.2% over state-of-the-art hardware-based approaches, while introducing 2.4% area and 3.8% power overhead.

This work has been supported by the RoMoL ERC Advanced Grant (GA 321253), by the European HiPEAC Network of Excellence, by the Spanish Ministry of Economy and Competitiveness (contract TIN2015-65316-P), and by Generalitat de Catalunya (contracts 2017-SGR-1414 and 2017-SGR-1328). V. Dimić has been partially supported by the Agency for Management of University and Research Grants (AGAUR) of the Government of Catalonia under Ajuts per a la contractació de personal investigador novell fellowship number 2017 FI_B 00855. M. Moretó has been partially supported by the Spanish Ministry of Economy, Industry and Competitiveness under Ramón y Cajal fellowship number RYC-2016-21104. M. Casas has been partially supported by the Spanish Ministry of Economy, Industry and Competitiveness under Ramón y Cajal fellowship number RYC-2017-23269.
This manuscript has been co-authored by National Technology & Engineering Solutions of Sandia, LLC. under ...
