Evaluating Per-User Fairness of Goal-Oriented Parallel Computer Job Scheduling Policies

A fair share objective was recently incorporated into goal-oriented parallel computer job scheduling policies. However, the previous work presented only the overall scheduling performance, so an evaluation of the per-user performance of these policies is still lacking. This work evaluates per-user fair share performance under the Tradeoff-fs(Tx:avgX) policy in detail. A basic fair share priority backfill policy, RelShare(1d), is also studied. The performance of all policies is measured using an event-driven simulator with three real job traces as input. The experimental results show that high-demand users usually benefit under most policies, either because their jobs are large or because they submit many jobs. In the former case, executing a single large job may put the user over the fair share for that period. In the latter case, the jobs may be backfilled to improve performance. However, users with a mixture of jobs may suffer: while their smaller jobs are executing, the priority of the remaining jobs from the same user is lowered. Further analysis does not show any significant impact from users with many jobs or from users with large runtime approximation errors.



