Title:
A Distributed‐Heterogeneous Design for Explicit Hyperbolic Solvers. Application to Tsunami Urban Run‐Up Modelling.
Authors:
Conde, Daniel A. S.; Ferreira, Rui M. L. (ruimferreira@tecnico.ulisboa.pt); Canelas, Ricardo; Ricardo, Ana Margarida; Mendes, Luís
Source:
Journal of Advances in Modeling Earth Systems. Dec 2025, Vol. 17, Issue 12, pp. 1-23 (23 pages).
Database:
GreenFILE

Further Information

A distributed multi‐architecture design for massively parallel hyperbolic solvers is herein introduced and benchmarked. A unified object‐oriented central processing unit (CPU) + graphics processing unit (GPU) approach is complemented with an inter‐device communication layer, enabling both coarse‐ and fine‐grain parallelism in hyperbolic solvers. The approach combines three different programming platforms, namely OpenMP, CUDA and MPI. The efficiency of this distributed‐heterogeneous approach is quantified under static and dynamic loads on consumer and professional grade CPUs and GPUs. An asynchronous communications scheme is implemented and described, showing very low overheads and nearly linear scalability for multiple device combinations. For simulations (or systems) with non‐homogeneous workloads (or devices), the domain decomposition algorithm incorporates a low‐frequency load‐to‐device fitting function to ensure computational balance. A real‐world application to high‐resolution hydrodynamic modelling is presented: the propagation of a tsunami in the estuary of a large river and its run‐up in an urban mesh. The proposed implementation shows speedups of up to two orders of magnitude, opening new perspectives for solvers with high‐demand requirements but relatively simple hardware in multi‐architecture machines.

Plain Language Summary: A new approach to implementing, in computer code, the methods for solving the mathematical equations that describe shallow flows is presented and tested. This approach seeks computational speed on readily available hardware, although it can work on powerful and dedicated servers too. It uses a combined set of protocols and instructions that allow both standard processors (CPUs) and high performance graphics processors (GPUs) to work together efficiently. These include instructions in OpenMP (a tool for parallel programming on CPUs), CUDA (a tool developed by NVIDIA for programming on GPUs) and MPI (a tool for communication between different processors). The communication between devices is managed in a way that minimizes delays and scales well as more devices are added. For systems with uneven workloads or different types of devices, the implementation ensures that the work is evenly distributed. The resulting mathematical model has been tested on various types of CPUs and GPUs, both for everyday use and professional purposes, under different workloads. A real‐world example of application is the propagation of a tsunami in the estuary of the River Tagus and its run‐up on the Lisbon waterfront. The earthquake that triggers the tsunami and the main tsunami generation features are similar to those of the 1755 Lisbon earthquake. The implementation is much faster than traditional methods, achieving up to 100 times the speed of a single CPU, making it a powerful tool for complex simulations using relatively simple hardware.

Key Points:
- A new efficient implementation of a finite volume discretization of hyperbolic PDEs, using both central processing units (CPUs) and graphics processing units (GPUs), is presented.
- Using OpenMP, CUDA, and MPI, efficient communication and load balancing between workers (CPUs or GPUs) allow for large speedups.
- A real‐world application, high‐resolution tsunami run‐up in a dense urban mesh, showcases the performance capabilities on simple hardware.

[ABSTRACT FROM AUTHOR]
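
The abstract does not give the form of the load‐to‐device fitting function. As a minimal illustrative sketch only, and not the authors' method, the idea of rebalancing a domain decomposition across unequal devices could be expressed as follows, assuming that each device's share of cells is made proportional to its measured throughput (cells processed per second). The function names `fit_partition` and `rebalance` are hypothetical.

```python
from typing import List


def fit_partition(n_cells: int, throughputs: List[float]) -> List[int]:
    """Split n_cells among devices in proportion to their throughput.

    Assumption: throughput (cells per second) is a fair proxy for how
    much work a device can take; the paper's fitting function may differ.
    """
    if n_cells < 0 or not throughputs or any(t <= 0 for t in throughputs):
        raise ValueError("need n_cells >= 0 and strictly positive throughputs")

    total = sum(throughputs)
    # Ideal fractional share of the mesh for each device.
    ideal = [n_cells * t / total for t in throughputs]
    # Start from the integer part of each share.
    counts = [int(x) for x in ideal]
    # Hand the remaining cells to the devices with the largest remainders.
    leftover = max(n_cells - sum(counts), 0)
    by_remainder = sorted(
        range(len(ideal)), key=lambda i: ideal[i] - counts[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


def rebalance(n_cells: int, counts: List[int], step_times: List[float]) -> List[int]:
    """Re-fit the partition from timings of the current partition.

    counts[i] is the number of cells device i currently owns (assumed > 0)
    and step_times[i] is the wall time it needed for one time step.
    """
    throughputs = [c / t for c, t in zip(counts, step_times)]
    return fit_partition(n_cells, throughputs)


if __name__ == "__main__":
    # Two devices start with equal shares; the second is four times faster.
    print(rebalance(1000, [500, 500], [1.0, 0.25]))
```

In such a scheme, `rebalance` would be called only every so many time steps (a low frequency), since moving cells between devices has a communication cost of its own.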

Copyright of Journal of Advances in Modeling Earth Systems is the property of Wiley-Blackwell and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)