Democratizing SLURM Scheduler Prioritization Policy

Cluster: Athena

Date: April 18, 2025

The Democratizing SLURM Scheduler policy governs the Athena HPC cluster. It lets public users run jobs on idle contributor-owned nodes while preserving fairness for contributors as partial owners of the hardware, so resources are shared rather than left idle.

Cluster Configuration

  • Nodes:
    • athena: Public, special-purpose
    • n314: benisty_prj
  • Partitions:
    • post: Public (athena, PriorityJobFactor=500, Default=NO, AllowQOS=2h_2g)
    • l40s: benisty_prj (n314, PriorityJobFactor=500, Default=YES, AllowQOS=benisty,2h_2g)
  • Accounts:
    • technion: Root (FairShare=parent)
    • contrib: Contributor parent (FairShare=parent)
    • benisty_prj: Contributor (FairShare=500, Priority=100)
    • bitton_prj, kovacsi-katz_prj, slavin_prj: Non-contributors (FairShare=0, Priority=10)
  • Users: rayb, hadasbe, orik, jenyas (inherit FairShare, Priority)
  • QoS:
    • 2h_2g: Public (2h, 64 CPUs, 240G RAM, 2 GPUs, Priority=0)
    • benisty: benisty_prj (Priority=100, Preempt=2h_2g)
  • SLURM Settings (collected as a slurm.conf excerpt after this list):
    • PriorityType=priority/multifactor, PreemptType=preempt/qos, PreemptMode=CANCEL
    • Weights: PriorityWeightAssoc=100, PriorityWeightQOS=2000, PriorityWeightPartition=500, PriorityWeightAge=80, PriorityWeightFairshare=1000, PriorityWeightJobSize=0
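
Collected as they might appear in /etc/slurm/slurm.conf, these settings form roughly the excerpt below (assembled from the values above, not copied from the live file):

    # Priority and preemption settings (excerpt)
    PriorityType=priority/multifactor
    PreemptType=preempt/qos
    PreemptMode=CANCEL
    PriorityWeightAssoc=100
    PriorityWeightQOS=2000
    PriorityWeightPartition=500
    PriorityWeightAge=80
    PriorityWeightFairshare=1000
    PriorityWeightJobSize=0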

Core Principles

  • Golden Ticket: Contributors (e.g., hadasbe) receive top priority on the nodes they contributed, via a dedicated owner QoS that can preempt public jobs (see the submission examples after this list).
  • Public Access: All users can run on any partition through the public 2h_2g QoS.
  • Ownership-Based Priority: FairShare and Priority values reflect each account's contribution to the cluster.
  • Automation: The ownership scripts (see Scripts below) keep FairShare values in line with this policy.
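
For illustration, submissions under this policy might look like the following (the job script name and resource requests are hypothetical):

    # Public user: runs within the 2h_2g limits, preemptible by owner jobs
    sbatch --partition=l40s --qos=2h_2g --time=02:00:00 --gres=gpu:1 job.sh

    # Contributor (e.g., hadasbe): owner QoS on the contributed node, may preempt 2h_2g jobs
    sbatch --partition=l40s --qos=benisty --time=1-00:00:00 --gres=gpu:1 job.sh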

Policy Procedures

Adhere to these procedures:

  1. Add Account
    sacctmgr create account eyal_prj parent=contrib description="Eyal Project" organization=cs priority=100
    ./update_ownership.sh

    Reasoning: Creates a contributor account under contrib with Priority=100; update_ownership.sh then recalculates FairShare across contributor accounts.
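
    As a quick check (Fairshare and Priority are standard sacctmgr association format fields):
    sacctmgr show assoc where account=eyal_prj format=Account,User,Fairshare,Priority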

  2. Add User
    sacctmgr create user eyal defaultaccount=eyal_prj partition=l40s defaultqos=eyal qos=eyal

    Reasoning: Links the user to the account with the correct partition and QoS. Note that the owner QoS (here, eyal) must already exist before it can be assigned, so in practice step 4 runs before this step.
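
    The resulting association can be verified with (all fields shown are standard sacctmgr format options):
    sacctmgr show assoc where user=eyal format=User,Account,Partition,QOS,DefaultQOS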

  3. Add Public QoS
    sacctmgr add qos 4h_4g MaxWall=04:00:00 Priority=0 MaxTRESPU=cpu=128,mem=480G,gres/gpu=4 MaxTRESPerJob=cpu=128,mem=480G,gres/gpu=4,gres/shard=32 MaxJobsPU=2 MaxSubmitJobsPU=6 GraceTime=600
    sacctmgr modify account where parent=technion set qos+=4h_4g

    Reasoning: Creates the 4h_4g public QoS with per-user and per-job resource caps, then grants it to every account under technion so all users can submit with it.
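
    The new QoS and its limits can be confirmed with:
    sacctmgr show qos 4h_4g format=Name,Priority,MaxWall,MaxTRESPU,MaxJobsPU,GraceTime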

  4. Add Owner QoS
    sacctmgr add qos eyal Priority=100 MaxWall=7-00:00:00 Preempt=2h_2g,4h_4g
    sacctmgr modify account where account=eyal_prj set qos+=eyal,2h_2g,4h_4g

    Reasoning: Gives the owner QoS Priority=100 and lets it preempt public jobs running under 2h_2g or 4h_4g, then attaches the owner and public QoS to the contributor account.
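
    A quick confirmation of the preemption relationship:
    sacctmgr show qos eyal format=Name,Priority,Preempt,MaxWall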

  5. Add Partition
    # Add to /etc/slurm/slurm.conf:
    PartitionName=newpart Nodes=new_node Default=YES DefMemPerCPU=1024 DefCpuPerGPU=8 MaxMemPerNode=2313080 OverSubscribe=NO AllowQOS=eyal,2h_2g,4h_4g DefaultTime=02:00:00 PriorityJobFactor=500
    scontrol reconfigure
    ./update_ownership.sh

    Reasoning: Defines the partition on the contributed node, allows both owner and public QoS, and applies the standard PriorityJobFactor=500. Note that SLURM permits only one Default=YES partition cluster-wide, so set Default accordingly.
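
    Once reconfigured, the partition definition can be verified with:
    scontrol show partition newpart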

  6. Update Ownership
    ./update_ownership.sh

    Reasoning: Recalculates and applies FairShare values; run this after any change to accounts, partitions, or node ownership.
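
    The resulting share tree can be reviewed with sshare (the -a flag lists all users):
    sshare -a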

Scripts

  • calculate_ownership.sh: Computes FairShare values from each account's contribution.
  • update_ownership.sh: Applies the computed FairShare values to the SLURM accounting database (see the sketch below).
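
A minimal sketch of what update_ownership.sh might look like, assuming calculate_ownership.sh prints one "account:fairshare" pair per line (both the output format and the loop are assumptions, not the actual script):

    #!/bin/bash
    # Sketch only: apply FairShare values produced by calculate_ownership.sh.
    # Assumed input format: one "account:fairshare" pair per line.
    set -euo pipefail
    ./calculate_ownership.sh | while IFS=':' read -r account fairshare; do
        # -i answers "yes" to sacctmgr's confirmation prompt
        sacctmgr -i modify account where name="$account" set fairshare="$fairshare"
    done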

Job Priority Formula

Job_priority = (PriorityWeightAssoc     * assoc_priority)
             + (PriorityWeightQOS       * qos_priority)
             + (PriorityWeightPartition * partition_factor)
             + (PriorityWeightAge       * age_factor)
             + (PriorityWeightFairshare * fairshare_factor)

(Stock SLURM normalizes each factor to the range 0.0-1.0 before weighting; the raw values below are used for relative comparison under this policy.)

  • Weights:
    • PriorityWeightAssoc=100
    • PriorityWeightQOS=2000
    • PriorityWeightPartition=500
    • PriorityWeightAge=80
    • PriorityWeightFairshare=1000
    • PriorityWeightJobSize=0
  • Parameters:
    • assoc_priority: 100 (benisty_prj), 10 (non-contributors)
    • qos_priority: 100 (benisty), 0 (2h_2g)
    • partition_factor: 500 (post, l40s)
    • age_factor: 86,400 (1 day, in seconds)
    • fairshare_factor: 500 (benisty_prj), 0 (non-contributors)
  • Examples (worked in full after this list):
    • hadasbe on l40s (benisty): ~7,872,000
    • rayb on post (2h_2g): ~7,163,000
    • rayb on l40s (2h_2g): ~7,163,000
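
Worked through with the weights and parameters above, the hadasbe example is:

    Job_priority = (100 * 100) + (2000 * 100) + (500 * 500) + (80 * 86400) + (1000 * 500)
                 = 10,000 + 200,000 + 250,000 + 6,912,000 + 500,000
                 = 7,872,000

For rayb under 2h_2g, the QoS and fairshare terms drop to zero and assoc_priority falls to 10, giving 1,000 + 0 + 250,000 + 6,912,000 + 0 = 7,163,000; the value is identical on post and l40s because both partitions carry PriorityJobFactor=500.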

Policy Adherence

This document defines the Athena cluster's operational policy; all account, QoS, and partition changes should follow the procedures above so that public access stays open and contributors retain the priority their contributions earn.