GPU Resources
Latest revision as of 16:32, 28 June 2024
When submitting jobs, you can ask for GPUs in one of two ways. One is:
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
That will ask for 1 GPU generically on a node with a free GPU. This request is more specific:
#SBATCH --partition=gpu
#SBATCH --gres=gpu:A5500:3
That requests 3 A5500 GPUs only.
We have several GPU types on the cluster which may fit your specific needs:
nVidia RTX A5500 : 24GB RAM
nVidia A100 : 80GB RAM
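As a rough illustration of matching a request to these GPU types, the sketch below picks a --gres value from the GPU memory a job needs. The helper name pick_gres is hypothetical, and the "A100" type name is an assumption: only "A5500" appears as a type name in the requests above, so check the cluster's actual configuration before relying on it.

```shell
#!/bin/bash
# Sketch: choose a --gres value from the GPU memory (in GB) a job needs.
# pick_gres is a hypothetical helper, not a tool installed on the cluster.
# Assumption: "A100" is the gres type name for the 80GB cards; only "A5500"
# is confirmed by the examples on this page.
pick_gres() {
    local need_gb="$1" count="${2:-1}"
    if [ "$need_gb" -le 24 ]; then
        # Every GPU type listed has at least 24GB, so a generic request is enough.
        echo "gpu:${count}"
    elif [ "$need_gb" -le 80 ]; then
        echo "gpu:A100:${count}"
    else
        echo "no single GPU type has ${need_gb}GB of memory" >&2
        return 1
    fi
}

# Possible use (job.sh is a placeholder for your own batch script):
#   sbatch --partition=gpu --gres="$(pick_gres 40)" job.sh
```

The thresholds are the nominal memory sizes from the list; the memory actually usable by a program is somewhat lower, so leave some headroom.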
For the most part, Slurm makes sure that each job sees and uses only the GPUs assigned to it. Within the job, CUDA_VISIBLE_DEVICES is set in the environment, but Slurm renumbers the GPUs assigned to each job so that they appear to start at 0, so the variable always lists as many GPUs as you requested, counting from 0. If you need the "real" GPU numbers (to log or to pass along to Docker), they are available in the SLURM_JOB_GPUS (for sbatch) or SLURM_STEP_GPUS (for srun) environment variable.
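To see the two numberings side by side, a job could log both. This is a minimal sketch; report_gpus is a hypothetical helper name, and it assumes that the step-level variable should take precedence whenever it is set.

```shell
#!/bin/bash
# Sketch: print the GPU numbering a job sees next to the node's real GPU numbers.
# report_gpus is a hypothetical helper name.
report_gpus() {
    # Job-local numbering, which Slurm renumbers to start at 0.
    echo "job-local GPUs: ${CUDA_VISIBLE_DEVICES:-unset}"
    # Real numbers: step-level variable under srun, job-level under sbatch.
    echo "physical GPUs: ${SLURM_STEP_GPUS:-${SLURM_JOB_GPUS:-unset}}"
}

report_gpus
```

Outside of a Slurm job, both lines simply report "unset".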
Running GPU Workloads
To actually use an nVidia GPU, you need to run a program that uses the CUDA API. There are a few ways to obtain such a program.
Prebuilt CUDA Applications
The Slurm cluster nodes have the nVidia drivers installed, as well as basic CUDA tools like nvidia-smi.
Some projects, such as tensorflow, may ship pre-built binaries that can use CUDA. You should be able to run these binaries directly, if you download them.
Building CUDA Applications
The cluster nodes do not have the full CUDA Toolkit. In particular, they do not have the nvcc CUDA-enabled compiler. If you want to compile applications that use CUDA, you will need to install the development environment yourself for your user.
Once you have nvcc available to your user, building CUDA applications should work. To run them, you will have to submit them as jobs, because the head node does not have a GPU.
Containerized GPU Workloads
Instead of directly installing binaries, or installing and using the CUDA Toolkit, it is often easiest to use containers to download a prebuilt GPU workload and all of its libraries and dependencies. There are a few options for running containerized GPU workloads on the cluster.
Running Containers in Singularity
You can run containers on the cluster using Singularity, and give them access to the GPUs that Slurm has selected using the --nv option. For example:
singularity pull docker://tensorflow/tensorflow:latest-gpu
srun -c 8 --mem 10G --partition=gpu --time=00:20:00 --gres=gpu:1 singularity run --nv docker://tensorflow/tensorflow:latest-gpu python -c 'from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())'
This will produce output showing that the TensorFlow container is indeed able to talk to one GPU:
INFO: Using cached SIF image
2023-05-15 11:36:33.110850: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-05-15 11:36:38.799035: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1635] Created device /device:GPU:0 with 22244 MB memory: -> device: 0, name: NVIDIA RTX A5500, pci bus id: 0000:03:00.0, compute capability: 8.6
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 8527638019084870106
xla_global_id: -1
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 23324655616
locality {
  bus_id: 1
  links {
  }
}
incarnation: 1860154623440434360
physical_device_desc: "device: 0, name: NVIDIA RTX A5500, pci bus id: 0000:03:00.0, compute capability: 8.6"
xla_global_id: 416903419
]
Slurm's containment of the Slurm job to the correct set of GPUs is also passed through to the Singularity container; there is no need to specifically direct Singularity to use the right GPUs unless you are doing something unusual.
Running Containers in Slurm
Slurm itself also supports a --container option for jobs, which allows a whole job to be run inside a container. If you are able to convert your container to OCI Bundle format, you can pass it directly to Slurm instead of using Singularity from inside the job. However, Docker-compatible image specifiers can't be given to Slurm, only paths to OCI bundles on disk.
Stand-alone tools to download a Docker image from Docker Hub in OCI bundle format (skopeo and umoci) are not yet installed on the cluster, but the method using the docker command should work.
Slurm containers should have access to their assigned GPUs, but it is not clear if tools like nvidia-smi are injected into the container, as they would be with Singularity or the nVidia Container Runtime.
Running Containers in Docker
You might be used to running containers with Docker, or containerized GPU workloads with the nVidia Container Runtime or Toolkit. Docker is installed on all the nodes and the daemon is running; if the docker command does not work for you, ask cluster-admin to add you to the right groups.
The nvidia runtime is set up and will automatically be used.
While Slurm configures each Slurm job with a cgroup that directs it to the correct GPUs, using Docker to run another container escapes Slurm's confinement, and using --gpus=1 will always use the first GPU in the system, whether that GPU is assigned to your job or not. When using Docker, you must consult the SLURM_JOB_GPUS (for sbatch) or SLURM_STEP_GPUS (for srun) environment variable and pass that along to your container. You should also impose limits on all other resources used by your Docker container, so that your whole job stays within the resources allocated by Slurm's scheduler. (TODO: find out how cgroups handles oversubscription between a Docker container and the Slurm container that launched it).
An example of a working command is:
srun -c 1 --mem 4G --partition=gpu --time=00:20:00 --gres=gpu:2 bash -c 'docker run --rm --gpus=\"device=$SLURM_STEP_GPUS\" nvidia/cuda:12.1.1-base-ubuntu22.04 nvidia-smi'
Note that the double-quotes are included in the argument to --gpus as seen by the Docker client, and that bash and single-quotes are used to ensure that $SLURM_STEP_GPUS is evaluated within the job itself, and not on the head node.
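The command above passes only the GPU list. To also keep the container's CPU and memory within the job's allocation, the Docker options could be derived from the Slurm environment, as in the sketch below. The helper name docker_limit_args is hypothetical. The sketch assumes that SLURM_CPUS_PER_TASK and SLURM_MEM_PER_NODE (in megabytes) are set when -c and --mem are given, and it has not been checked against this cluster's configuration.

```shell
#!/bin/bash
# Sketch: build "docker run" options that mirror the Slurm allocation.
# docker_limit_args is a hypothetical helper, not a tool installed on the cluster.
docker_limit_args() {
    # Real GPU numbers: step-level variable under srun, job-level under sbatch.
    local gpus="${SLURM_STEP_GPUS:-${SLURM_JOB_GPUS:-}}"
    if [ -z "$gpus" ]; then
        echo "no GPUs assigned by Slurm" >&2
        return 1
    fi
    # The double-quotes are part of the value that the Docker client must see.
    printf '%s\n' "--gpus=\"device=${gpus}\""
    # Assumption: these variables exist only when -c / --mem were requested,
    # so each limit is emitted only if its variable is set.
    if [ -n "${SLURM_CPUS_PER_TASK:-}" ]; then
        printf '%s\n' "--cpus=${SLURM_CPUS_PER_TASK}"
    fi
    if [ -n "${SLURM_MEM_PER_NODE:-}" ]; then
        # SLURM_MEM_PER_NODE is in megabytes; Docker accepts an "m" suffix.
        printf '%s\n' "--memory=${SLURM_MEM_PER_NODE}m"
    fi
}

# Possible use inside a job (bash), one option per output line:
#   mapfile -t limits < <(docker_limit_args)
#   docker run --rm "${limits[@]}" nvidia/cuda:12.1.1-base-ubuntu22.04 nvidia-smi
```

Emitting one option per line lets the caller collect them into an array, so the embedded double-quotes reach Docker without another round of shell quoting.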