Parallel scheduling of task trees with limited memory
EYRAUD-DUBOIS, Lionel
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Reformulations based algorithms for Combinatorial Optimization [Realopt]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Reformulations based algorithms for Combinatorial Optimization [Realopt]
MARCHAL, Loris
Laboratoire de l'Informatique du Parallélisme [LIP]
Optimisation des ressources : modèles, algorithmes et ordonnancement [ROMA]
Voir plus >
Laboratoire de l'Informatique du Parallélisme [LIP]
Optimisation des ressources : modèles, algorithmes et ordonnancement [ROMA]
EYRAUD-DUBOIS, Lionel
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Reformulations based algorithms for Combinatorial Optimization [Realopt]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Reformulations based algorithms for Combinatorial Optimization [Realopt]
MARCHAL, Loris
Laboratoire de l'Informatique du Parallélisme [LIP]
Optimisation des ressources : modèles, algorithmes et ordonnancement [ROMA]
Laboratoire de l'Informatique du Parallélisme [LIP]
Optimisation des ressources : modèles, algorithmes et ordonnancement [ROMA]
VIVIEN, Frédéric
Optimisation des ressources : modèles, algorithmes et ordonnancement [ROMA]
Laboratoire de l'Informatique du Parallélisme [LIP]
< Réduire
Optimisation des ressources : modèles, algorithmes et ordonnancement [ROMA]
Laboratoire de l'Informatique du Parallélisme [LIP]
Langue
en
Article de revue
Ce document a été publié dans
ACM Transactions on Parallel Computing. 2015-07, vol. 2, n° 2, p. 36
Association for Computing Machinery
Résumé en anglais
This paper investigates the execution of tree-shaped task graphs using multiple processors. Each edge of such a tree represents some large data. A task can only be executed if all input and output data fit into memory, and ...Lire la suite >
This paper investigates the execution of tree-shaped task graphs using multiple processors. Each edge of such a tree represents some large data. A task can only be executed if all input and output data fit into memory, and a data can only be removed from memory after the completion of the task that uses it as an input data. Such trees arise in the multifrontal method of sparse matrix factorization. The peak memory needed for the processing of the entire tree depends on the execution order of the tasks. With one processor the objective of the tree traversal is to minimize the required memory. This problem was well studied and optimal polynomial algorithms were proposed. Here, we extend the problem by considering multiple processors, which is of obvious interest in the application area of matrix factorization. With multiple processors comes the additional objective to minimize the time needed to traverse the tree, i.e., to minimize the makespan. Not surprisingly, this problem proves to be much harder than the sequential one. We study the computational complexity of this problem and provide inapproximability results even for unit weight trees. We design a series of practical heuristics achieving different trade-offs between the minimization of peak memory usage and makespan. Some of these heuristics are able to process a tree while keeping the memory usage under a given memory limit. The different heuristics are evaluated in an extensive experimental evaluation using realistic trees.< Réduire
Mots clés en anglais
multi-criteria optimization
memory usage
Approximation algorithms
scheduling
task graphs
pebble-game
Project ANR
Solveurs pour architectures hétérogènes utilisant des supports d'exécution - ANR-13-MONU-0007
Origine
Importé de halUnités de recherche