Result: Reducing data movement on large shared memory systems by exploiting computation dependencies

Title:

Reducing data movement on large shared memory systems by exploiting computation dependencies

Authors:

Barrera, I.S., Ayguadé Parra, Eduard, Valero Cortés, Mateo, Moretó Planas, Miquel, Labarta Mancho, Jesús José, Casas, Marc

Contributors:

Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Barcelona Supercomputing Center, Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions

Publisher Information:

Association for Computing Machinery (ACM)

Publication Year:

2018

Collection:

Universitat Politècnica de Catalunya, BarcelonaTech: UPCommons - Global access to UPC knowledge

Subject Terms:

Àrees temàtiques de la UPC::Informàtica::Sistemes d'informació::Emmagatzematge i recuperació de la informació, Parallel programming (Computer science), NUMA, Scheduling, Shared memory, Task-based programming model, Data transfer, Graph theory, Intelligent control, Memory architecture, Virtual storage, Graph Partitioning, Non uniform memory access, Parallel application, Performance improvements, Shared memory system, Task-based programming, Data reduction, Programació en paral·lel (Informàtica)

Document Type:

Conference object

File Description:

11 p.; application/pdf

Language:

English

Relation:

https://dl.acm.org/citation.cfm?id=3205310; info:eu-repo/grantAgreement/MINECO//TIN2015-65316-P/ES/COMPUTACION DE ALTAS PRESTACIONES VII/; info:eu-repo/grantAgreement/AGAUR/PRI2010-2013/2014 SGR 1051; info:eu-repo/grantAgreement/AGAUR/PRI2010-2013/2014 SGR 1272; info:eu-repo/grantAgreement/EC/H2020/779877/EU/Mont-Blanc 2020, European scalable, modular and power efficient HPC processor/Mont-Blanc 2020; info:eu-repo/grantAgreement/EC/H2020/671697/EU/Mont-Blanc 3, European scalable and power efficient HPC platform based on low-power embedded technology/Mont-Blanc 3; https://hdl.handle.net/2117/125137

DOI:

10.1145/3205289.3205310

Availability:

https://hdl.handle.net/2117/125137
https://doi.org/10.1145/3205289.3205310

Rights:

Open Access

Accession Number:

edsbas.608B5F40

Database:

BASE

Further information

Shared memory systems are becoming increasingly complex as they typically integrate several storage devices. That brings different access latencies or bandwidth rates depending on the proximity between the cores where memory accesses are issued and the storage devices containing the requested data. In this context, techniques to manage and mitigate non-uniform memory access (NUMA) effects consist of migrating threads, memory pages, or both, and are generally applied by the system software. We propose techniques at the runtime system level to further mitigate the impact of NUMA effects on parallel applications' performance. We leverage runtime system metadata expressed in terms of a task dependency graph, where nodes are pieces of serial code and edges are control or data dependencies between them, to efficiently reduce data transfers. Our approach, based on graph partitioning, adds negligible overhead and is able to provide performance improvements up to 1.52× and average improvements of 1.12× with respect to the best state-of-the-art approach when deployed on a 288-core shared-memory system. Our approach reduces the coherence traffic by 2.28× on average with respect to the state-of-the-art.

This work has been supported by the RoMoL ERC Advanced Grant (GA 321253), by the European HiPEAC Network of Excellence, by the Spanish Ministry of Economy and Competitiveness (contract TIN2015-65316-P), by the Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272) and by the European Union’s Horizon 2020 research and innovation programme (grant agreements 671697 and 779877). I. Sánchez Barrera has been partially supported by the Spanish Ministry of Education, Culture and Sport under Formación del Profesorado Universitario fellowship number FPU15/03612. M. Moretó has been partially supported by the Spanish Ministry of Economy, Industry and Competitiveness under Ramón y Cajal fellowship number RYC-2016-21104.

Peer Reviewed

Postprint (published version)
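
The abstract describes partitioning a task dependency graph so that tasks sharing data run close to each other. The following is a minimal illustrative sketch of that general idea only, not the authors' algorithm or runtime: it uses a simple greedy heuristic, and the function names (`partition_tasks`, `cut_weight`), the edge weights (bytes shared between two tasks), and the balancing rule are all assumptions made for this example.

```python
from collections import defaultdict


def partition_tasks(num_tasks, edges, num_nodes):
    """Greedily assign tasks to NUMA nodes (hypothetical heuristic).

    edges maps (producer, consumer) -> bytes shared, an assumed weight.
    Returns a list where result[t] is the NUMA node chosen for task t.
    """
    # Build an undirected weighted adjacency map from the dependency edges.
    neighbors = defaultdict(dict)
    for (u, v), w in edges.items():
        neighbors[u][v] = neighbors[u].get(v, 0) + w
        neighbors[v][u] = neighbors[v].get(u, 0) + w

    capacity = -(-num_tasks // num_nodes)  # ceiling division keeps parts balanced
    load = [0] * num_nodes
    placement = [None] * num_tasks

    for task in range(num_tasks):
        # Affinity: bytes shared with neighbours already placed on each node.
        affinity = [0] * num_nodes
        for other, w in neighbors[task].items():
            if placement[other] is not None:
                affinity[placement[other]] += w
        candidates = [n for n in range(num_nodes) if load[n] < capacity]
        # Prefer the highest affinity, then the least loaded node.
        best = max(candidates, key=lambda n: (affinity[n], -load[n]))
        placement[task] = best
        load[best] += 1
    return placement


def cut_weight(edges, placement):
    """Total bytes on edges whose endpoints sit on different NUMA nodes."""
    return sum(w for (u, v), w in edges.items() if placement[u] != placement[v])


# Two tightly coupled task pairs joined by one light dependency.
example_edges = {(0, 1): 100, (2, 3): 100, (1, 2): 1}
example_placement = partition_tasks(4, example_edges, 2)
```

In this toy graph, the weight of the edges that cross partitions stands in for the data that would have to move between NUMA nodes; a production system would more likely rely on a dedicated graph partitioning library than on a greedy pass like this one.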
