Submitted as: development and technical paper
13 Jan 2022
Status: this preprint is currently under review for the journal GMD.

Assessing the robustness and scalability of the accelerated pseudo-transient method towards exascale computing

Ludovic Räss1,2, Ivan Utkin1,3, Thibault Duretz4,5, Samuel Omlin6, and Yuri Y. Podladchikov7,8
  • 1Laboratory of Hydraulics, Hydrology and Glaciology (VAW), ETH Zurich, Zurich, Switzerland
  • 2Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Birmensdorf, Switzerland
  • 3Faculty of Mechanics and Mathematics, Lomonosov Moscow State University, Moscow, Russia
  • 4Institut für Geowissenschaften, Goethe-Universität Frankfurt, Frankfurt, Germany
  • 5Univ. Rennes, CNRS, Géosciences Rennes UMR 6118, F-35000 Rennes, France
  • 6Swiss National Supercomputing Centre (CSCS), ETH Zurich, Lugano, Switzerland
  • 7Institute of Earth Sciences, University of Lausanne, Lausanne, Switzerland
  • 8Swiss Geocomputing Centre, University of Lausanne, Lausanne, Switzerland

Abstract. The development of highly efficient, robust and scalable numerical algorithms lags behind the rapid increase in massive parallelism of modern hardware. We address this challenge with the accelerated pseudo-transient iterative method and present a physically motivated derivation of it. We analytically determine optimal iteration parameters for a variety of basic physical processes and confirm the validity of the theoretical predictions with numerical experiments. We provide an efficient numerical implementation of pseudo-transient solvers on graphical processing units (GPUs) using the Julia language. We achieve a parallel efficiency of over 96 % on 2197 GPUs in distributed-memory weak-scaling benchmarks. These 2197 GPUs allow for unprecedented terascale solutions of 3D variable-viscosity Stokes flow on 4995³ grid cells, involving over 1.2 trillion degrees of freedom. We verify the robustness of the method by handling contrasts of up to 9 orders of magnitude in material parameters such as viscosity, and arbitrary distributions of viscous inclusions for different flow configurations. Moreover, we show that this method is well suited to tackle strongly nonlinear problems such as shear banding in a visco-elasto-plastic medium. A GPU-based implementation can outperform CPU-based direct-iterative solvers in terms of wall time even at relatively low resolution. We additionally motivate the accessibility of the method by its conciseness, flexibility, physically motivated derivation, and ease of implementation. This solution strategy thus has great potential for future high-performance computing applications, and for paving the road to exascale in the geosciences and beyond.
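The core idea of the accelerated pseudo-transient method is to augment the steady-state equations with pseudo-time derivatives plus a damping (inertial) term, so the iteration behaves like a damped wave equation rather than pure diffusion and converges in far fewer sweeps. As a minimal illustration of the technique only — not the authors' Julia/GPU implementation — the following Python sketch applies a damped (second-order) pseudo-transient iteration to steady 1-D diffusion; the pseudo-time step `dtau` and damping factor `damp` are illustrative choices, not tuned values from the paper.

```python
import numpy as np

# Accelerated pseudo-transient (PT) solve of steady 1-D diffusion
#   d2u/dx2 = 0,  u(0) = 0, u(1) = 1,
# whose exact solution is the linear profile u(x) = x.
# dtau and damp are illustrative, not values from the paper.
n    = 101
dx   = 1.0 / (n - 1)
u    = np.zeros(n); u[-1] = 1.0        # Dirichlet boundary conditions
dudt = np.zeros(n - 2)                 # "inertial" term providing the acceleration
dtau = dx**2 / 2.1                     # pseudo-time step (diffusion stability limit)
damp = 1.0 - 4.0 / n                   # damping ~ 1 - O(1)/n for fast convergence

for it in range(100 * n):
    # residual of the steady equation (centred second difference)
    R = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2
    # accelerated update: keep a damped memory of the previous pseudo-velocity
    dudt = damp * dudt + dtau * R
    u[1:-1] += dudt
    if np.max(np.abs(R)) < 1e-8 / dx**2:   # converged to steady state
        break

err = np.max(np.abs(u - np.linspace(0.0, 1.0, n)))
```

With `damp = 0` this degenerates to a plain (Richardson/Jacobi-like) pseudo-transient iteration needing O(n²) sweeps; the damped memory term turns the error dynamics wave-like, reducing the count to O(n) sweeps — the acceleration the paper analyses and tunes per physical process.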


Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on gmd-2021-411', Lawrence Hongliang Wang, 10 Feb 2022
    • AC1: 'Reply on RC1', Ludovic Räss, 16 May 2022
  • RC2: 'Comment on gmd-2021-411', Anonymous Referee #2, 29 Apr 2022
    • AC2: 'Reply on RC2', Ludovic Räss, 16 May 2022



Total article views: 1,127 (including HTML, PDF, and XML), cumulative since 13 Jan 2022:
  • HTML: 870
  • PDF: 237
  • XML: 20
  • Total: 1,127
  • BibTeX: 12
  • EndNote: 8

Viewed (geographical distribution)

Total article views: 1,016 (including HTML, PDF, and XML), thereof 1,016 with geography defined and 0 with unknown origin.
Latest update: 26 Jun 2022
Short summary
Continuum-mechanics-based modelling of physical processes at large scales requires the huge computational resources provided by massively parallel hardware such as graphical processing units. We present a suite of numerical algorithms, implemented in the Julia language, that efficiently leverage this parallelism. We demonstrate that our implementation is efficient, scalable and robust, and we showcase applications to various geophysical problems.