HPC Server Dynamic Job Scheduling: when jobs spawn jobs

Posted by JoshReuben on Geeks with Blogs
Published on Wed, 10 Oct 2012 11:34:14 GMT

HPC Job Types

HPC has 3 types of jobs - http://technet.microsoft.com/en-us/library/cc972750(v=ws.10).aspx

· Task Flow – vanilla sequence

· Parametric Sweep – concurrently run multiple instances of the same program, each with a different work unit input

· MPI – message passing between master & slave tasks

But when you try to go outside the box – job tasks that spawn jobs and block the parent task until the children finish – you run the risk of resource starvation, deadlocks, and recursive work that never converges or blows up exponentially.
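To make the starvation risk concrete, here is a minimal Python sketch of a toy scheduler. It is not HPC Server code: the `simulate` function, its parameters, the one-core-per-task rule and the naive FIFO ordering are all assumptions made for illustration.

```python
from collections import deque

def simulate(total_cores: int, parents: int, children_per_parent: int) -> str:
    """Toy model: every task needs one core, and a parent keeps its core
    while it blocks waiting for all of its child tasks to finish."""
    if children_per_parent < 1:
        raise ValueError("each parent must spawn at least one child")
    free = total_cores
    waiting_parents = deque(range(parents))  # parents not yet started
    pending_children = deque()               # child tasks queued by running parents
    remaining = {}                           # parent id -> unfinished children

    while waiting_parents or remaining:
        progressed = False
        # Start queued parents first (naive FIFO scheduling).
        while free > 0 and waiting_parents:
            p = waiting_parents.popleft()
            free -= 1
            remaining[p] = children_per_parent
            pending_children.extend([p] * children_per_parent)
            progressed = True
        # Children only run if a core is still free; each one finishes at once.
        while free > 0 and pending_children:
            p = pending_children.popleft()
            remaining[p] -= 1
            if remaining[p] == 0:
                del remaining[p]
                free += 1  # parent unblocks and releases its core
            progressed = True
        if not progressed:
            return "deadlock"  # every core is held by a blocked parent
    return "completed"
```

With 4 cores, 4 blocking parents occupy every core and their children can never start, while 3 parents leave one core free and everything drains. Reserving capacity for children, or not blocking the parent at all, avoids the first case.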

The solution to this is to write some performance monitoring and job scheduling code. You can do this in 2 ways:

  1. Manually - control scheduling directly: allocate / de-allocate resources, change job priorities, pause & resume tasks, restrict long-running tasks to specific compute clusters.
  2. Semi-automatically - set threshold params for scheduling.

How – Control Job Scheduling

In order to manage the tasks and resources that are associated with a job, you will need to access the ISchedulerJob interface - http://msdn.microsoft.com/en-us/library/microsoft.hpc.scheduler.ischedulerjob_members(v=vs.85).aspx

This really allows you to control how a job is run – you can access & tweak the following features:

  • max / min resource values
  • whether job resources can grow / shrink, whether jobs can be pre-empted, and whether the job is exclusive per node
  • the creator process id & the job pool
  • timestamp of job creation & completion
  • job priority, hold time & run time limit
  • Re-queue count
  • Job progress
  • Max / min number of cores, nodes, sockets, RAM
  • Dynamic task list – can add / cancel tasks on the fly
  • Job counters
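As a rough illustration of how the min / max and grow / shrink settings interact, here is a small Python sketch. `JobSettings` and `next_allocation` are hypothetical stand-ins invented for this example; they are not the .NET ISchedulerJob interface, and the field names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class JobSettings:
    """Hypothetical stand-in for a few job properties (not the real interface)."""
    min_cores: int
    max_cores: int
    can_grow: bool = True
    can_shrink: bool = True

def next_allocation(job: JobSettings, current: int, wanted: int) -> int:
    """Return the core count the job may move to, honouring its limits."""
    # Clamp the request into the job's [min_cores, max_cores] range.
    target = max(job.min_cores, min(job.max_cores, wanted))
    if target > current and not job.can_grow:
        return current  # growth is disabled for this job
    if target < current and not job.can_shrink:
        return current  # shrinking is disabled for this job
    return target
```

A request for 16 cores on a job capped at 8 is clamped to 8, and a job that cannot grow stays at its current allocation whatever is requested.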

When – poll perf counters

Tweak the job scheduler on the basis of resource utilization, as reported by PerfMon counters – HPC exposes 2 Perf objects: Compute Clusters and Compute Nodes

http://technet.microsoft.com/en-us/library/cc720058(v=ws.10).aspx

You can monitor running jobs according to dynamic thresholds – use your own discretion:

  • Percentage processor time
  • Number of running jobs
  • Number of running tasks
  • Total number of processors
  • Number of processors in use
  • Number of processors idle
  • Number of serial tasks
  • Number of parallel tasks
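A semi-automatic policy over counters like these could look like the Python sketch below. The `CounterSample` fields, the `Thresholds` defaults and the action names are assumptions picked for illustration, and reading the actual PerfMon counters is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CounterSample:
    """One reading of cluster counters (field names are illustrative)."""
    processors_total: int
    processors_in_use: int
    running_tasks: int

@dataclass(frozen=True)
class Thresholds:
    high_utilization: float = 0.90  # at or above this, stop spawning child jobs
    low_utilization: float = 0.50   # at or below this, spawning more work is safe
    max_running_tasks: int = 200    # hard ceiling regardless of utilization

def decide(sample: CounterSample, limits: Thresholds) -> str:
    """Map a counter sample to a coarse scheduling action."""
    if sample.processors_total <= 0:
        return "hold"  # no usable reading, change nothing
    utilization = sample.processors_in_use / sample.processors_total
    if (utilization >= limits.high_utilization
            or sample.running_tasks >= limits.max_running_tasks):
        return "throttle"
    if utilization <= limits.low_utilization:
        return "expand"
    return "hold"
```

A polling loop would take a sample every few seconds, call `decide`, and translate "throttle" or "expand" into whatever job changes suit your cluster.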

Design your algorithms correctly

Finally, don’t assume you have unlimited compute resources in your cluster – design your algorithms with the following factors in mind:

· Branching factor - http://en.wikipedia.org/wiki/Branching_factor - dynamically optimize the number of children per node

· cutoffs to prevent explosions - http://en.wikipedia.org/wiki/Limit_of_a_sequence - not all functions converge after n attempts, so cap the number of iterations. You also need a "good enough" threshold so you can stop once returns diminish

· heuristic shortcuts - http://en.wikipedia.org/wiki/Heuristic - sometimes an exhaustive search is impractical and shortcuts are suitable

· pruning - http://en.wikipedia.org/wiki/Pruning_(algorithm) - remove / de-prioritize unnecessary tree branches

· avoid local minima / maxima - http://en.wikipedia.org/wiki/Local_minima - sometimes an algorithm can't reach the global optimum because it gets stuck in a local one – try simulated annealing, random-restart hill climbing or genetic algorithms to get out of these ruts

· watch out for rounding errors - http://en.wikipedia.org/wiki/Round-off_error - many iterations running in parallel can quickly amplify them & blow up your algo! Compare floating point values with an epsilon, and keep truncations and approximations to a minimum
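Several of these guards can sit in one search loop. The Python sketch below is one possible arrangement, not a prescribed algorithm: `guarded_search` and all of its parameters are hypothetical names, and it assumes candidates are plain floats scored by a caller-supplied function where higher is better.

```python
from typing import Callable, List, Tuple

def guarded_search(
    root: float,
    expand: Callable[[float], List[float]],
    score: Callable[[float], float],
    max_depth: int = 20,     # hard cutoff on the number of levels
    max_children: int = 4,   # cap on the branching factor
    beam_width: int = 8,     # branches kept per level after pruning
    epsilon: float = 1e-6,   # "good enough" improvement threshold
) -> Tuple[float, int]:
    """Level-by-level search returning (best candidate, levels explored)."""
    frontier = [root]
    best = root
    for depth in range(1, max_depth + 1):
        children: List[float] = []
        for node in frontier:
            # Branching factor: never take more children than allowed.
            children.extend(expand(node)[:max_children])
        if not children:
            return best, depth - 1
        # Pruning: keep only the most promising branches.
        children.sort(key=score, reverse=True)
        frontier = children[:beam_width]
        gain = score(frontier[0]) - score(best)
        if gain > 0:
            best = frontier[0]
        # Diminishing returns: stop once the gain falls below epsilon.
        if gain < epsilon:
            return best, depth
    return best, max_depth  # cutoff reached without converging
```

In a cluster setting, `max_children` and `beam_width` could be derived from the number of idle processors instead of being fixed, so the tree only widens when there is capacity to run it.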

Happy Coding!

© Geeks with Blogs or respective owner