Dynamic Thread Pinning for Phase-Based OpenMP Programs
TOUATI, Sid
Models and methods of analysis and optimization for systems with real-time and embedding constraints [AOSTE]
BARTHOU, Denis
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Efficient runtime systems for parallel architectures [RUNTIME]
Language
en
Conference paper
Published in
Euro-Par 2013 Parallel Processing, The Euro-Par 2013 conference, 2013-08-26, Aachen, vol. 8097, p. 53-64
Springer
Abstract (English)
Thread affinity has emerged as an important technique for improving overall program performance and performance stability. However, for a program with multiple phases, a single thread affinity is unlikely to yield the best performance in every phase. In the case of OpenMP, applications may have multiple parallel regions, each with a distinct inter-thread data sharing pattern. In this paper, we propose an approach that changes thread affinity dynamically (thread migrations) between parallel regions at runtime to account for these distinct inter-thread data sharing patterns. We demonstrate that, as far as cache sharing is concerned for SPEC OMP01, not all the tested OpenMP applications exhibit distinct phase behavior. However, we show that while fixing thread affinity for the whole execution may improve performance by up to 30%, allowing dynamic thread pinning may improve performance by up to 40%. Furthermore, we analyze the conditions required to improve the effectiveness of the approach.
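The idea of re-pinning a fixed team of threads at each phase boundary can be illustrated with a minimal sketch. It is not the authors' implementation: it uses Python threads in place of an OpenMP team, it assumes a Linux system (where `os.sched_setaffinity` with pid 0 acts on the calling thread), and the names `placement`, `run_phases`, "compact" and "scatter" are hypothetical choices made for this example only.

```python
import os
import threading

def placement(strategy, n_threads, cpus):
    """Map thread index -> CPU id for one phase.

    'compact' takes consecutive CPUs, 'scatter' strides through them.
    Both strategies are illustrative assumptions, not taken from the paper.
    """
    cpus = sorted(cpus)
    if strategy == "compact":
        return [cpus[i % len(cpus)] for i in range(n_threads)]
    if strategy == "scatter":
        stride = max(1, len(cpus) // n_threads)
        return [cpus[(i * stride) % len(cpus)] for i in range(n_threads)]
    raise ValueError(f"unknown strategy: {strategy}")

def run_phases(phases, n_threads):
    """Run every phase on the same threads, re-pinning them per phase.

    `phases` is a list of (strategy, work) pairs, where work(tid) is the
    body of one phase. Returns the affinity mask each thread observed in
    each phase, indexed as result[phase][tid].
    """
    allowed = os.sched_getaffinity(0)  # CPUs this process may use
    plans = [placement(s, n_threads, allowed) for s, _ in phases]
    observed = [[None] * n_threads for _ in phases]
    barrier = threading.Barrier(n_threads)

    def worker(tid):
        for p, (_, work) in enumerate(phases):
            # Migrate the calling thread to its CPU for this phase.
            os.sched_setaffinity(0, {plans[p][tid]})
            work(tid)
            observed[p][tid] = os.sched_getaffinity(0)
            barrier.wait()  # phase boundary, like the end of a parallel region

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return observed
```

A call such as `run_phases([("compact", f), ("scatter", g)], 4)` would run `f` and then `g` on the same four threads, each time under a different pinning. Choosing which placement suits which phase is the actual subject of the paper and is not addressed by this sketch.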
Keywords (English)
OpenMP
thread level parallelism
thread affinity
multicores
Origin
Imported from HAL