Search-based Job Scheduling for Parallel Computer Workloads
Sangsuree Vasupongayya, Su-Hui Chiang, and Bart Massey
IEEE International Conference on Cluster Computing (Cluster 2005)
Boston, Massachusetts, USA, September 27-30, 2005
Abstract
To balance different performance goals and to allow administrators to specify high-level performance goals declaratively, we apply complete search algorithms to design on-line job scheduling policies for workloads that run on parallel computer systems. Our search-based policies use a two-level objective that contains two conflicting criteria: (1) minimizing the total excessive wait time; (2) minimizing the average bounded slowdown. Ten monthly workloads that ran on a Linux cluster (IA-64) from NCSA are used to evaluate the performance of the policies. A wide range of measures is used for performance evaluation, including the average slowdown, average wait, maximum wait, and new measures based on total excessive wait time. Our results show that the best search-based scheduling policy reported here (i.e., DDS/lxf/dynB) simultaneously beats both LXF-backfill and FCFS-backfill, which provide rough lower bounds on the average slowdown and on the maximum wait, respectively, for the workloads studied.
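As an illustration only, the two-level objective can be sketched as a lexicographic comparison of two measures. The sketch below is not the authors' implementation. It assumes that excessive wait is the portion of a job's wait beyond a fixed threshold and that bounded slowdown follows the common definition max(1, (wait + runtime) / max(runtime, bound)); the abstract defines neither. The names `Job`, `threshold`, and `bound`, and the values 3600 s and 60 s, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Job:
    wait: float     # seconds the job spent in the queue
    runtime: float  # seconds the job actually ran


def total_excessive_wait(jobs: List[Job], threshold: float) -> float:
    # Assumed definition: sum of each job's wait in excess of the threshold.
    return sum(max(0.0, j.wait - threshold) for j in jobs)


def avg_bounded_slowdown(jobs: List[Job], bound: float = 60.0) -> float:
    # Common definition; the bound keeps very short jobs from dominating.
    # The 60-second default is an assumption, not taken from the abstract.
    slowdowns = [
        max(1.0, (j.wait + j.runtime) / max(j.runtime, bound)) for j in jobs
    ]
    return sum(slowdowns) / len(slowdowns)


def objective(jobs: List[Job], threshold: float) -> Tuple[float, float]:
    # Two-level objective: the first criterion dominates, and the second
    # only breaks ties. Python compares tuples lexicographically.
    return (total_excessive_wait(jobs, threshold), avg_bounded_slowdown(jobs))


def better(a: List[Job], b: List[Job], threshold: float) -> bool:
    # True if schedule outcome `a` is preferred over `b`.
    return objective(a, threshold) < objective(b, threshold)


# Two hypothetical outcomes for the same pair of jobs.
schedule_a = [Job(wait=100, runtime=50), Job(wait=4000, runtime=100)]
schedule_b = [Job(wait=2000, runtime=50), Job(wait=3000, runtime=100)]
print(objective(schedule_a, threshold=3600.0))
print(objective(schedule_b, threshold=3600.0))
print(better(schedule_b, schedule_a, threshold=3600.0))
```

In this toy case, `schedule_b` has the higher average bounded slowdown but no excessive wait, so it is preferred. This is the sense in which the two criteria conflict and the first level takes priority.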