Result: Enabling high-level parallel programming on multi-FPGA clusters

Title:
Enabling high-level parallel programming on multi-FPGA clusters
Contributors:
Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Barcelona Supercomputing Center, Universitat Politècnica de Catalunya. PM - Programming Models
Publisher Information:
Association for Computing Machinery (ACM)
Publication Year:
2024
Collection:
Universitat Politècnica de Catalunya, BarcelonaTech: UPCommons - Global access to UPC knowledge
Document Type:
Conference object
File Description:
9 p.; application/pdf
Language:
English
Relation:
info:eu-repo/grantAgreement/EC/H2020/946002/EU/The MareNostrum Experimental Exascale Platform/MEEP; info:eu-repo/grantAgreement/EC/H2020/956831/EU/Towards EXtreme scale Technologies and Accelerators for euROhpc hw%2FSw Supercomputing Applications for exascale/TEXTAROSSA; info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PCI2021-121964/ES/TOWARDS EXTREME SCALE TECHNOLOGIES AND ACCELERATORS FOR EUROHPC HW%2FSW SUPERCOMPUTING APPLICATIONS FOR EXASCALE/; info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PDC2022-133323-I00/ES/PROCESADOR FUERA DE ORDEN MULTINUCLEO CONSCIENTE DE LA APLICACION BASADO EN INSTRUCCIONES ABIERTAS RISC-V/; info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-107255GB-C21/ES/BSC - COMPUTACION DE ALTAS PRESTACIONES VIII/; https://hdl.handle.net/2117/411960
DOI:
10.1145/3665283.3665292
Rights:
Open Access
Accession Number:
edsbas.1584132A
Database:
BASE

Further information

Field Programmable Gate Arrays (FPGAs) are still relatively new in the High Performance Computing (HPC) field. Hence, they still lack a mature ecosystem that allows non-FPGA experts to scale an application across many devices operating in parallel. In this paper, we add support for message passing inspired by the Message Passing Interface (MPI) to the MareNostrum Experimental Exascale Platform (MEEP) cluster, a state-of-the-art FPGA cluster. We use the OmpSs@FPGA programming model, which allows C/C++ code to run on the FPGA and call functions that behave like the well-known MPI_Send/MPI_Recv. To this end, we implement the message passing runtime over the MEEP 100 Gb Ethernet network. This network includes a switch connected to the QSFP port of the FPGA cards. The switch enables all-to-all connectivity without adding routers in the FPGA fabric. We also introduce a method to manage FPGAs that are PCIe-hosted by remote CPU nodes: from any CPU node we can load the bitstream, configure it, and transfer data to FPGAs that are attached to a different node. Finally, we evaluate the bandwidth of FPGA-FPGA communication and of local and remote CPU-FPGA communication, as well as the performance of benchmarks scaling from 1 to 64 FPGAs using the infrastructure presented in this paper. The benchmarks are N-body, Heat with a Gauss-Seidel solver, and Cholesky. We compare the results with the MareNostrum 4 supercomputer and obtain 2.3x and 3.5x better performance per watt for N-body and Heat, respectively.

We thank the MEEP project [Horizon 2020 grant number 946002] for providing the FPGA cluster used in this work. This work was supported by the TEXTAROSSA project [Horizon 2020 grant number 956831]; the Spanish Government [grant numbers PCI2021-121964, PDC2022-133323-I00, PID2019-107255GB-C21, MICIU/AEI/10.13039/501100011033]; and the Generalitat de Catalunya [grant number 2021 SGR 01007].

Peer Reviewed. Postprint (author's final draft).
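To illustrate the point-to-point message-passing pattern the abstract refers to, the following is a minimal sketch in Python, not the paper's OmpSs@FPGA runtime or its API. It assumes threads stand in for FPGA ranks and in-memory queues stand in for the switched Ethernet network; the names `Mailboxes` and `ring_shift` are hypothetical. Unlike a real MPI_Send, the `send` here is always buffered and never blocks.

```python
import queue
import threading


class Mailboxes:
    """One FIFO queue per (source, destination) pair.

    Assumption: this stands in for all-to-all connectivity through a switch.
    """

    def __init__(self, n_ranks):
        self.n_ranks = n_ranks
        self._boxes = {
            (src, dst): queue.Queue()
            for src in range(n_ranks)
            for dst in range(n_ranks)
        }

    def send(self, src, dst, payload):
        # Buffered send: returns immediately (a simplification of MPI_Send).
        self._boxes[(src, dst)].put(payload)

    def recv(self, src, dst):
        # Blocking receive: waits until a message from `src` has arrived.
        return self._boxes[(src, dst)].get()


def ring_shift(n_ranks):
    """Each rank sends its id to its right neighbour and receives from its left."""
    net = Mailboxes(n_ranks)
    received = [None] * n_ranks

    def worker(rank):
        right = (rank + 1) % n_ranks
        left = (rank - 1) % n_ranks
        net.send(rank, right, rank)
        received[rank] = net.recv(left, rank)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Calling `ring_shift(4)` runs four simulated ranks, each of which ends up holding the id of its left neighbour. Because the send is buffered, every rank can send before receiving without deadlock; with a synchronous send the same ordering could stall, which is why real message-passing codes often alternate the send/receive order between neighbours.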