High performance BLAS formulation of the multipole-to-local operator in the Fast Multipole Method
COULAUD, Olivier
Algorithms and high performance computing for grand challenge applications [SCALAPPLIX]
FORTIN, Pierre
Algorithms and high performance computing for grand challenge applications [SCALAPPLIX]
ROMAN, Jean
Algorithms and high performance computing for grand challenge applications [SCALAPPLIX]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Language
en
Journal article
This document was published in
Journal of Computational Physics. 2008, vol. 227, no. 3, p. 1836-1862
Elsevier
Abstract
The multipole-to-local (M2L) operator is the most time-consuming part of the far field computation in the Fast Multipole Method for the Laplace equation. Its natural expression, though commonly used, does not respect a sharp error bound; we first prove the correctness of a second expression. We then propose a matrix formulation, implemented with BLAS (Basic Linear Algebra Subprograms) routines, to speed up the computation of both expressions. We also introduce special data storage schemes in memory to gain further computational efficiency. Finally, this BLAS scheme is compared, for uniform distributions, with other M2L improvements such as block FFT, rotations and plane wave expansions. Considering runtime, extra memory storage, numerical stability and common precisions for the Laplace equation, the BLAS version appears to be the best one.
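As a rough illustration of why a matrix formulation lends itself to BLAS, the sketch below assumes the operator can be written as a dense matrix T acting on a vector of multipole coefficients. That representation, the names T, M, matvec and matmat, and the toy sizes are assumptions made here for illustration; they are not the storage scheme or the formulation of the article. Under that assumption, applying the same T to several expansions one at a time (a sequence of matrix-vector products) gives the same result as stacking the expansions as columns and doing a single matrix-matrix product, which is the kind of operation that level 3 BLAS routines such as zgemm handle efficiently.

```python
# Hypothetical illustration only: T, M and the helper names are assumptions,
# not the formulation used in the article.

def matvec(T, m):
    """Apply a dense operator T (list of rows) to one coefficient vector m."""
    return [sum(t * x for t, x in zip(row, m)) for row in T]

def matmat(T, M):
    """Apply T to several coefficient vectors at once (stored as columns of M)."""
    cols = list(zip(*M))  # extract the columns of M
    return [[sum(t * x for t, x in zip(row, col)) for col in cols] for row in T]

# Toy 2x2 complex operator and three expansions stored as the columns of M.
T = [[1 + 1j, 2], [0, 1 - 1j]]
M = [[1, 0, 1j],
     [0, 1, 1]]

# One product per expansion (what a matrix-vector routine would do) ...
one_by_one = [matvec(T, list(col)) for col in zip(*M)]
# ... versus a single product for all expansions (matrix-matrix analogue).
batched = matmat(T, M)

# The columns of the batched result match the one-by-one results.
batched_columns = [list(col) for col in zip(*batched)]
print(batched_columns == one_by_one)
```

In a real code the two loops would be replaced by calls to an optimized BLAS library; the pure Python loops here only show that the grouping does not change the result.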
Keywords
Fast Multipole Methods
Laplace equation
BLAS routines
error bound
Fast Fourier Transform
rotations
plane waves
uniform distribution
Origin
Imported from HAL