slurm

Table of Contents

  1. Overview
  2. Module Description
  3. Setup
  4. Usage
  5. Limitations
  6. Development

Overview

Installs and configures Slurm, a job scheduler and resource manager for HPC.

Module Description

This module installs Slurm and generates a usable configuration. It is tested with the Debian packages published by the EDF-HPC team. The sources with the Debian packaging files can be found at:

This module sets up all the components of Slurm in different classes:

Setup

What slurm affects

Setup Requirements

This module needs a working authentication mechanism in the cluster. This is usually munge, which should be set up on every node.

The dbd class needs access to a MySQL database.

This module depends on:

Beginning with slurm

You should set up the different components of Slurm. Nodes generally include one or more features:

The difference to keep in mind between a submission host and an administration node is that the submission host must have srun prologs. On smaller configurations, a single node can combine several features.

It's possible to separate dbd and ctld.

The main module class (slurm) does not install anything but handles the configuration file that is common to all other classes (slurm.conf).
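As an illustration, the sketch below combines the ctld and dbd components on one node and puts the exec class (see Cgroups) on the compute nodes. The class names slurm::ctld, slurm::dbd and slurm::exec are assumptions derived from the component names used in this README, the node names are hypothetical, and any parameters these classes may require are left out.

```puppet
# Hypothetical node definitions; class names are assumed, not confirmed here.
node 'master' {
  include ::slurm        # common slurm.conf
  include ::slurm::ctld  # controller (assumed class name)
  include ::slurm::dbd   # accounting daemon, needs MySQL access (assumed class name)
}

node /^node\d+$/ {
  include ::slurm        # same slurm.conf as the controller
  include ::slurm::exec  # execution daemon (assumed class name)
}
```

Separating dbd and ctld would then amount to moving the dbd include to another node definition.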

Usage

Configuration

The configuration is built by setting values in the $config_options parameter. The values are merged with the default configuration.

The $partitions_options parameter can be used to define the nodes and partitions.

A minimal configuration would look like this:

class { '::slurm':
  config_options     => {
    'ClusterName' => {
      value   => 'hpccluster',
      comment => 'The name by which this SLURM managed cluster is known in the accounting database',
    },
    'ControlMachine' => {
      value   => 'master',
      comment => 'Hostname of the machine where SLURM control functions are executed',
    },
  },
  partitions_options => [
    'NodeName=node[001-010] CPUs=1 State=UNKNOWN',
    'PartitionName=std Nodes=node[001-010] Default=YES MaxTime=INFINITE State=UP',
  ],
}

See the hpclib::print_config documentation for the $config_options hash syntax.

Every entry in slurm.conf can be defined. It is also possible to add Include entries.
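For instance, other slurm.conf entries can be set with the same value/comment hash syntax as ClusterName. The sketch below is only an example: SelectType and SelectTypeParameters are standard slurm.conf options, but the values chosen here are illustrative and may not suit a given cluster. Include entries are not shown because their exact syntax depends on hpclib::print_config.

```puppet
class { '::slurm':
  config_options => {
    'SelectType'           => {
      value   => 'select/cons_res',
      comment => 'Allocate individual cores instead of whole nodes',
    },
    'SelectTypeParameters' => {
      value   => 'CR_Core',
      comment => 'Treat cores as the consumable resource',
    },
  },
}
```

Entries not listed here keep the values from the default configuration, since the hash is merged with it.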

Cgroups

Cgroups are configured in a separate file that is only used by the exec class. This class provides the parameters to configure the cgroups in Slurm. The main parameters are $enable_cgroup and $cgroup_options.
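A possible declaration is sketched below. It rests on several assumptions: that the exec class is named slurm::exec, that $enable_cgroup takes a boolean, and that $cgroup_options uses the same value/comment hash syntax as $config_options. ConstrainCores and ConstrainRAMSpace are standard cgroup.conf options.

```puppet
# Sketch only: class name and parameter formats are assumptions.
class { '::slurm::exec':
  enable_cgroup  => true,
  cgroup_options => {
    'ConstrainCores'    => {
      value   => 'yes',
      comment => 'Confine jobs to their allocated cores',
    },
    'ConstrainRAMSpace' => {
      value   => 'yes',
      comment => 'Confine jobs to their allocated memory',
    },
  },
}
```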

Limitations

This module is mainly tested on Debian, but it is meant to also work with RHEL and derivatives.

Development

Patches and issues can be submitted on GitHub: https://github.com/edf-hpc/puppet-hpc