Slurm Tips for vg

From UCSC Genomics Institute Computing Infrastructure Information

Latest revision as of 16:58, 16 January 2025

This page explains how to set up a development environment for vg on the Phoenix cluster.

Setting Up

1. After connecting to the VPN, connect to an interactive node:

ssh razzmatazz.prism

This node is relatively small, so you shouldn't run real work on it, but it is where you need to be logged in to submit Slurm jobs.

2. Make yourself a user directory under /private/groups, which is where large data must be stored. For example, if you are in the Paten lab:

mkdir /private/groups/patenlab/$USER

3. (Optional) Create a workspace directory inside it and symlink that into your home directory, so your repos are easy to reach from home but actually live on /private/groups. The /private/groups storage may be faster than the home directory storage.

mkdir -p /private/groups/patenlab/$USER/workspace
ln -s /private/groups/patenlab/$USER/workspace ~/workspace

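4. Make sure you have an SSH key pair, and add the public key to GitHub. The command below prints your existing public key, or generates a new key pair first if you don't have one.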

cat ~/.ssh/id_ed25519.pub || (ssh-keygen -t ed25519 && cat ~/.ssh/id_ed25519.pub)
# Paste into https://github.com/settings/ssh/new

5. Make a place to put your clone, and clone vg:

mkdir -p ~/workspace
cd ~/workspace
git clone --recursive git@github.com:vgteam/vg.git
cd vg

6. vg's dependencies should already be installed on the cluster nodes. If any of them seem to be missing, tell cluster-admin@soe.ucsc.edu to install them.

7. Build vg as a Slurm job. This will send the build out to the cluster as a 64-core, 80G memory job with a 30-minute time limit, and keep the output logs in your terminal.

srun -c 64 --mem=80G --time=00:30:00 make -j64

This will leave your vg binary at ~/workspace/vg/bin/vg.
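
If you would rather not keep a terminal open for the build, the same resource request can be written as a batch script. The following is only a sketch, not part of the original instructions: the function name write_build_script, the file name build-vg.sh, and the job name build-vg are hypothetical, and the resource values simply mirror the srun command above.

```shell
#!/usr/bin/env bash
# Sketch: write a batch script that requests the same resources as the
# srun build command above. write_build_script, build-vg.sh, and the
# job name are hypothetical names chosen for illustration.

write_build_script() {
    local script_path="$1"
    # The quoted 'EOF' keeps ${...} unexpanded so it is evaluated inside the job.
    cat > "$script_path" <<'EOF'
#!/usr/bin/env bash
#SBATCH --job-name=build-vg
#SBATCH --cpus-per-task=64
#SBATCH --mem=80G
#SBATCH --time=00:30:00

# Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is given;
# fall back to 1 if the script is run outside of Slurm.
make -j"${SLURM_CPUS_PER_TASK:-1}"
EOF
}

# Usage (from ~/workspace/vg on the interactive node):
#   write_build_script build-vg.sh
#   sbatch build-vg.sh
```

Unlike srun, sbatch returns immediately, and the build output goes to a log file in the directory you submitted from instead of your terminal.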

Misc Tips

  • For a lightweight job that outputs to your terminal or that can be waited for in a Bash script, run an individual command directly from srun:
srun -c1 --mem 2G --partition short --time 1:00:00 sleep 10
  • If you need to run a few commands in the same shell, use sbatch --wrap:
sbatch -c1 --mem 2G --partition short --time 1:00:00 --wrap ". venv/bin/activate; ./script1.py && ./script2.py"
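  • To watch a batch job's output live, note the job ID in the Submitted batch job 5244464 line that sbatch prints. By default the job's output goes to slurm-<job ID>.out in the directory you submitted from, so run: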
tail -f slurm-5244464.out
  • Danger! If you really need an interactive session with appreciable resources, you can schedule one with srun --pty. But it is very easy to waste resources like this, since the job will happily sit there not doing anything until it hits the timeout. Only do this for testing! For real work, use one of the other methods!
srun -c 16 --mem 120G --time=08:00:00 --partition=medium --pty bash -i
  • To send out a job without making a script file for it, use sbatch --wrap "your command here".
  • Any option that can go on an #SBATCH line in a batch script can also be passed to sbatch on the command line!
  • You can use Slurm Commander to watch the state of the cluster with the scom command.
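
The submit-then-tail tip can be scripted so that you do not have to copy the job ID by hand. This is a minimal sketch under stated assumptions: the helper names job_id_from_sbatch_output and default_log_for_job are hypothetical, and the log name assumes the job did not override the default output file.

```shell
#!/usr/bin/env bash
# Sketch: pull the job ID out of sbatch's "Submitted batch job N" line and
# derive the default log file name. Both function names are hypothetical.

# Print the job ID from a line such as "Submitted batch job 5244464".
# Returns non-zero if the line does not look like a submission message.
job_id_from_sbatch_output() {
    local line="$1"
    if [[ "$line" =~ ^Submitted\ batch\ job\ ([0-9]+) ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

# Print the default output file name for a job ID (slurm-<job ID>.out).
default_log_for_job() {
    printf 'slurm-%s.out\n' "$1"
}

# Usage on the cluster:
#   submit_line="$(sbatch -c1 --mem 2G --partition short --time 1:00:00 --wrap "sleep 10")"
#   job_id="$(job_id_from_sbatch_output "$submit_line")"
#   tail -F "$(default_log_for_job "$job_id")"
```

The usage lines use tail -F rather than tail -f because the log file may not exist until the job starts running, and GNU tail's -F keeps retrying until it appears.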