Democratizing SLURM Scheduler Policy
Cluster: Athena
Date: April 18, 2025
The Democratizing SLURM Scheduler defines the operational policy for the Athena HPC cluster. It lets public users run jobs on idle private nodes while contributors, as partial owners, keep priority on the nodes they contributed, so the hardware is shared without being unfair to its owners.
Cluster Configuration
- Nodes:
  - athena: Public, special-purpose
  - n314: benisty_prj
- Partitions:
  - post: Public (athena, PriorityJobFactor=500, Default=NO, AllowQOS=2h_2g)
  - l40s: benisty_prj (n314, PriorityJobFactor=500, Default=YES, AllowQOS=benisty,2h_2g)
- Accounts:
  - technion: Root (FairShare=parent)
  - contrib: Contributor parent (FairShare=parent)
  - benisty_prj: Contributor (FairShare=500, Priority=100)
  - bitton_prj, kovacsi-katz_prj, slavin_prj: Non-contributors (FairShare=0, Priority=10)
- Users: rayb, hadasbe, orik, jenyas (inherit FairShare, Priority)
- QoS:
  - 2h_2g: Public (2h, 64 CPUs, 240G RAM, 2 GPUs, Priority=0)
  - benisty: benisty_prj (Priority=100, Preempt=2h_2g)
- SLURM Settings: PriorityType=priority/multifactor, PreemptType=preempt/qos, PreemptMode=CANCEL
- Weights:
  - PriorityWeightAssoc=100
  - PriorityWeightQOS=2000
  - PriorityWeightPartition=500
  - PriorityWeightAge=80
  - PriorityWeightFairshare=1000
  - PriorityWeightJobSize=0
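The settings listed above could appear in slurm.conf roughly as follows. This is a sketch, not the cluster's actual file: the helper name render_policy_fragment is hypothetical, only values stated in this policy are included, and node definitions and every other required slurm.conf entry are left out.

```shell
#!/usr/bin/env bash
# Print a slurm.conf fragment mirroring the settings listed in this policy.
# render_policy_fragment is an illustrative name, not an existing tool.
render_policy_fragment() {
  cat <<'EOF'
# Scheduling and preemption
PriorityType=priority/multifactor
PreemptType=preempt/qos
PreemptMode=CANCEL

# Multifactor weights
PriorityWeightAssoc=100
PriorityWeightQOS=2000
PriorityWeightPartition=500
PriorityWeightAge=80
PriorityWeightFairshare=1000
PriorityWeightJobSize=0

# Partitions (node definitions omitted)
PartitionName=post Nodes=athena Default=NO AllowQOS=2h_2g PriorityJobFactor=500
PartitionName=l40s Nodes=n314 Default=YES AllowQOS=benisty,2h_2g PriorityJobFactor=500
EOF
}

render_policy_fragment
```

Printing the fragment rather than editing /etc/slurm/slurm.conf directly keeps the sketch safe to run anywhere and lets the output be compared against the live configuration by hand.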
Core Principles
- Golden Ticket: Contributors (e.g., hadasbe) receive priority on their nodes via QoS and preemption.
- Public Access: All users can access every partition with the 2h_2g QoS.
- Ownership-Based Priority: FairShare and Priority reflect contributions.
- Automation: Scripts ensure compliance.
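From the user's side, the first two principles come down to which QoS is requested at submission time. The sketch below only prints the sbatch command lines instead of running them. The helper name sbatch_cmd and the script name job.sh are placeholders, while --partition and --qos are standard sbatch options.

```shell
#!/usr/bin/env bash
# Build (but do not run) an sbatch command line for a partition and QoS.
# sbatch_cmd and job.sh are illustrative placeholders.
sbatch_cmd() {
  local partition="$1" qos="$2" script="${3:-job.sh}"
  printf 'sbatch --partition=%s --qos=%s %s\n' "$partition" "$qos" "$script"
}

# Public access: any user, either partition, under the 2h_2g limits.
sbatch_cmd post 2h_2g
sbatch_cmd l40s 2h_2g

# Golden ticket: a benisty_prj member on the group's own node.
# The benisty QoS may preempt 2h_2g jobs, and PreemptMode=CANCEL
# means a preempted public job is cancelled.
sbatch_cmd l40s benisty
```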
Policy Procedures
Adhere to these procedures:
- Add Account
sacctmgr create account eyal_prj parent=contrib description="Eyal Project" organization=cs priority=100
./update_ownership.sh
Reasoning: Creates a contributor account with Priority=100. update_ownership.sh calculates FairShare.
- Add User
sacctmgr create user eyal defaultaccount=eyal_prj partition=l40s defaultqos=eyal qos=eyal
Reasoning: Links users to accounts with correct partition and QoS.
- Add Public QoS
sacctmgr add qos 4h_4g MaxWall=04:00:00 Priority=0 MaxTRESPU=cpu=128,mem=480G,gres/gpu=4 MaxTRESPerJob=cpu=128,mem=480G,gres/gpu=4,gres/shard=32 MaxJobsPU=2 MaxSubmitJobsPU=6 GraceTime=600
sacctmgr modify account where parent=technion set qos+=4h_4g
Reasoning: Creates 4h_4g and adds it to the accounts under technion, ensuring universal access.
- Add Owner QoS
sacctmgr add qos eyal Priority=100 MaxWall=7-00:00:00 Preempt=2h_2g,4h_4g
sacctmgr modify account where account=eyal_prj set qos+=eyal,2h_2g,4h_4g
Reasoning: Assigns Priority=100 and Preempt=2h_2g,4h_4g, so the owner's jobs can preempt jobs running under the public QoS.
- Add Partition
# Add to /etc/slurm/slurm.conf:
PartitionName=newpart Nodes=new_node Default=YES DefMemPerCPU=1024 DefCpuPerGPU=8 MaxMemPerNode=2313080 OverSubscribe=NO AllowQOS=eyal,2h_2g,4h_4g DefaultTime=02:00:00 PriorityJobFactor=500
scontrol reconfigure
./update_ownership.sh
Reasoning: Configures the partition and allows the contributor QoS alongside the public QoS.
- Update Ownership
./update_ownership.sh
Reasoning: Updates FairShare.
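The contributor-onboarding procedures above can be chained in one script. The sketch below is a dry run that prints the commands by default. The names run, onboard_contributor and DRY_RUN are hypothetical, the description and organization fields are omitted for brevity, and Preempt lists only 2h_2g on the assumption that 4h_4g has not been created yet. The owner QoS is created before the user here because a QoS generally has to exist before an association can reference it.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the commands for onboarding one contributor.
# Set DRY_RUN=0 to execute them (requires Slurm administrator rights).
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# onboard_contributor USER ACCOUNT QOS PARTITION
onboard_contributor() {
  local user="$1" account="$2" qos="$3" partition="$4"
  # -i (immediate) commits without the interactive confirmation prompt.
  run sacctmgr -i create account "$account" parent=contrib priority=100
  # Add 4h_4g to Preempt and to the account's QoS list once it exists.
  run sacctmgr -i add qos "$qos" Priority=100 MaxWall=7-00:00:00 Preempt=2h_2g
  run sacctmgr -i modify account where account="$account" set qos+="$qos",2h_2g
  run sacctmgr -i create user "$user" defaultaccount="$account" \
    partition="$partition" defaultqos="$qos" qos="$qos"
  # Recalculate FairShare after the accounting changes.
  run ./update_ownership.sh
}

onboard_contributor eyal eyal_prj eyal l40s
```

Reviewing the printed commands before rerunning with DRY_RUN=0 gives a chance to catch a wrong account or partition name before the accounting database changes.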
Scripts
- calculate_ownership.sh: Calculates FairShare.
- update_ownership.sh: Updates FairShare.
Job Priority Formula
Job_priority = (PriorityWeightAssoc * assoc_priority) + (PriorityWeightQOS * qos_priority) + (PriorityWeightPartition * partition_factor) + (PriorityWeightAge * age_factor) + (PriorityWeightFairshare * fairshare_factor)
- Weights:
  - PriorityWeightAssoc=100
  - PriorityWeightQOS=2000
  - PriorityWeightPartition=500
  - PriorityWeightAge=80
  - PriorityWeightFairshare=1000
  - PriorityWeightJobSize=0
- Parameters:
  - assoc_priority: 100 (benisty_prj), 10 (non-contributors)
  - qos_priority: 100 (benisty), 0 (2h_2g)
  - partition_factor: 500 (post, l40s)
  - age_factor: 86,400 (1 day)
  - fairshare_factor: 500 (benisty_prj), 0 (non-contributors)
- Examples:
  - hadasbe on l40s (benisty): ~7,872,000
  - rayb on post (2h_2g): ~7,163,000
  - rayb on l40s (2h_2g): ~7,163,000
Policy Adherence
This documentation defines the Athena cluster’s operational policy, ensuring equitable access and fairness.