A Framework to Analyze the Performance of Load Balancing Schemes for Ensembles of Stochastic Simulations

Ahn, Tae-Hyuk and Sandu, Adrian and Watson, Layne T. and Shaffer, Clifford A. and Cao, Yang and Baumann, William T. (2012) A Framework to Analyze the Performance of Load Balancing Schemes for Ensembles of Stochastic Simulations. Technical Report TR-12-06, Computer Science, Virginia Tech.

Full text available as:
LoadBalancingTPDS12.pdf (4394570)

Abstract

Ensembles of simulations are employed to estimate the statistics of possible future states of a system, and are widely used in important applications such as climate change and biological modeling. Ensembles of runs can naturally be executed in parallel. However, when the CPU times of individual simulations vary considerably, a simple strategy of assigning an equal number of tasks per processor can lead to serious work imbalances and low parallel efficiency. This paper presents a new probabilistic framework to analyze the performance of dynamic load balancing algorithms for ensembles of simulations where many tasks are mapped onto each processor, and where the individual compute times vary considerably among tasks. Four load balancing strategies are discussed: most-dividing, all-redistribution, random-polling, and neighbor-redistribution. Simulation results with a stochastic budding yeast cell cycle model are consistent with the theoretical analysis. Because the global rebalancing algorithms raise scalability concerns, it is especially significant that the local rebalancing algorithms achieve a provable global decrease in load imbalance. The overall simulation time is reduced by up to 25%, and the total processor idle time by 85%.
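
The imbalance described in the abstract can be illustrated with a minimal Python sketch. It compares an equal-count static assignment with a simple dynamic rule that hands the next task to whichever processor becomes free first. The exponential task-time distribution, the processor count, and all function names are assumptions made for illustration. The greedy rule is a generic dynamic scheme and is not one of the four strategies analyzed in the report.

```python
import heapq
import random


def static_schedule(task_times, num_procs):
    """Equal task counts per processor: task i goes to processor i % num_procs."""
    loads = [0.0] * num_procs
    for i, t in enumerate(task_times):
        loads[i % num_procs] += t
    return loads


def dynamic_schedule(task_times, num_procs):
    """Greedy dynamic rule (illustrative assumption): each task, in order,
    goes to the processor that becomes free earliest."""
    finish = [0.0] * num_procs  # current finish time of each processor
    heapq.heapify(finish)
    for t in task_times:
        earliest = heapq.heappop(finish)
        heapq.heappush(finish, earliest + t)
    return sorted(finish)


def makespan_and_idle(loads):
    """Return (overall run time, total processor idle time) for given loads."""
    makespan = max(loads)
    idle = sum(makespan - load for load in loads)
    return makespan, idle


# Hypothetical ensemble: 1000 tasks with exponentially distributed run times.
rng = random.Random(0)
task_times = [rng.expovariate(1.0) for _ in range(1000)]
num_procs = 16

static_result = makespan_and_idle(static_schedule(task_times, num_procs))
dynamic_result = makespan_and_idle(dynamic_schedule(task_times, num_procs))

print("static  (makespan, idle):", static_result)
print("dynamic (makespan, idle):", dynamic_result)
```

In this sketch the total work is the same under both schedules. Only the distribution across processors changes, and that distribution determines the overall run time and the idle time.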

Item Type: Departmental Technical Report
Keywords: Dynamic load balancing (DLB), probabilistic framework analysis, ensemble simulations, stochastic simulation algorithm (SSA), high-performance computing (HPC), budding yeast cell cycle.
Subjects: Computer Science > Parallel Computation
ID Code: 1189
Deposited By: Ahn, Tae-Hyuk
Deposited On: 24 March 2012