
Title:
Benefits of MPI Sessions for GPU MPI applications
Contributors:
Laboratoire d'Informatique en Calcul Intensif et Image pour la Simulation (LICIIS UR 3690 LRC DIGIT), Université de Reims Champagne-Ardenne (URCA); Laboratoire en Informatique Haute Performance pour le Calcul et la simulation (LIHPC), CEA DAM Île-de-France (DAM/DIF), Direction des Applications Militaires (DAM), Commissariat à l'énergie atomique et aux énergies alternatives (CEA), Université Paris-Saclay; The ATOS BXI team
Source:
EuroMPI '21 - 28th European MPI Users' Group Meeting, Sep 2021, Leibniz, Germany; https://cea.hal.science/cea-03322976
Publisher Information:
CCSD
Publication Year:
2021
Collection:
Université de Reims Champagne-Ardenne: Archives Ouvertes (HAL)
Document Type:
Conference object
Language:
English
Rights:
info:eu-repo/semantics/OpenAccess
Accession Number:
edsbas.805045CE
Database:
BASE

Further Information

International audience ; Heterogeneous supercomputers are now considered the most valuable solution to reach the Exascale. Nowadays, compute nodes frequently contain more than one GPU accelerator, and programming such architectures efficiently is challenging. MPI is the de facto standard for distributed computing, and CUDA-aware libraries were introduced to ease inter-node GPU communication. However, they induce some overhead that can degrade overall performance. The MPI 4.0 specification draft introduces the MPI Sessions model, which offers the ability to initialize specific resources for a specific component of the application. In this paper, we present a way to reduce the overhead induced by CUDA-aware libraries with a solution inspired by MPI Sessions. In this way, we minimize the overhead induced by GPUs in an MPI context and improve the efficiency of CPU + GPU programs. We evaluate our approach on various micro-benchmarks and on proxy applications such as Lulesh, MiniFE, Quicksilver, and Cloverleaf. We demonstrate that this approach can provide up to a 7x speedup compared to the standard MPI model.
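
For context, the MPI Sessions model referenced in the abstract (standardized in MPI 4.0) lets each component of an application initialize only the MPI resources it needs, instead of relying on a single global MPI_Init. The sketch below is a minimal illustration of that standard API, not the authors' implementation described in the paper; the string tag "org.example.gpu-component" is an arbitrary illustrative name.

#include <mpi.h>
#include <stdio.h>

int main(void)
{
    MPI_Session session = MPI_SESSION_NULL;
    MPI_Group   group   = MPI_GROUP_NULL;
    MPI_Comm    comm    = MPI_COMM_NULL;

    /* Create a session instead of calling MPI_Init: only the resources
       associated with this session are initialized. */
    MPI_Session_init(MPI_INFO_NULL, MPI_ERRORS_RETURN, &session);

    /* Build a group from the built-in process set containing all
       processes known to this session. */
    MPI_Group_from_session_pset(session, "mpi://WORLD", &group);

    /* Derive a communicator for this particular application component.
       The string tag is illustrative and chosen by the application. */
    MPI_Comm_create_from_group(group, "org.example.gpu-component",
                               MPI_INFO_NULL, MPI_ERRORS_RETURN, &comm);

    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    printf("rank %d of %d in the component communicator\n", rank, size);

    MPI_Comm_free(&comm);
    MPI_Group_free(&group);
    MPI_Session_finalize(&session);
    return 0;
}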