Firewalled Environment Storage Overview: Difference between revisions

From UCSC Genomics Institute Computing Infrastructure Information

 
(35 intermediate revisions by the same user not shown)
Line 1: Line 1:
== Server Types and Management==
After confirming your VPN software is working, you can ssh into one of the compute servers behind the VPN:
'''crimson.prism''': 256GB RAM, 32 cores, 5.5TB local scratch space, CentOS 7.9
'''razzmatazz.prism''': 256GB RAM, 32 cores, 5.5TB local scratch space, Ubuntu 22.04
'''mustard.prism''': 1.5TB RAM, 160 cores, 9TB local scratch space, Ubuntu 22.04
These servers are managed by the Genomics Institute Cluster Admin group.  If you need software installed on any of these servers, please make your request by emailing cluster-admin@soe.ucsc.edu.
== Storage ==
== Storage ==


These servers mount two types of storage; home directories and group storage directories.
Our servers mount two types of ''shared'' storage; home directories and group storage directories. These home directories will mount over the network to all shared compute servers and the phoenix cluster, so any server you login to will have these filesystems available:


'''Filesystem Specifications'''
'''Filesystem Specifications'''
Line 20: Line 10:
! /private/home
! /private/home
! /private/groups
! /private/groups
! /private/warm-archive
|-
|-
| style="font-weight:bold; text-align:left;" | Default Soft Quota
| style="font-weight:bold; text-align:left;" | Default Quota
| 30 GB
| 100 GB
| 15 TB
| 15 TB
|-
| 50 TB
| style="font-weight:bold; text-align:left;" | Default Hard Quota
| 31 GB
| 16 TB
|-
|-
| style="font-weight:bold; text-align:left;" | Total Capacity
| style="font-weight:bold; text-align:left;" | Total Capacity
| 19 TB
| 19 TB
| 500 TB
| 1.7 PB
| 4.6 PB
|- style="text-align:left;"
|- style="text-align:left;"
| style="font-weight:bold;" | Access Speed
| style="font-weight:bold;" | Access Speed
| Slow - Moderate (Spinning Disk)
| Very Fast (NVMe Flash Media)
| Very Fast (NVMe Flash Media)
| Very Fast (NVMe Flash Media, Distributed Filesystem)
| Slower (Spinning Disks)
|- style="text-align:left;"
|- style="text-align:left;"
| style="font-weight:bold;" | Intended Use
| style="font-weight:bold;" | Intended Use
| This space should be used for login scripts, small bits of code or software repos, etc.  No large data should be stored here.
| This space should be used for login scripts, small bits of code or software repos, etc.  No large data should be stored here.
| This space should be used for large computational/shared data, large software installations and the like.
| This space should be used for large computational/shared (hot) data, large software installations and the like.
| Archival Use Only.  Not meant for active cluster computation.  Warm/Cold data only.
|}
|}


'''Home Directories (/private/home/username)'''
'''Home Directories (/private/home/username)'''


Your home directory will be located as "/private/home/username" and has a 30GB soft quota and a 31GB hard quota.  Your home directory is meant for small scripts and login data, or a git repo.  Please do not try to store large data there or compute on large jobs using data in your home directory.
Your home directory will be located as "/private/home/username" and has a 100GB quota.  Your home directory is meant for small scripts and login data, or a git repo.  Please do not try to store large data there or compute on large jobs using data in your home directory.


'''Groups Directories (/private/group/groupname'''
'''Groups Directories (/private/groups/labname)'''


The group storage directories are created per PI, and each group directory has a default 15TB soft quota and 16TB hard quota.  For example, if David Haussler is the PI that you report to directly, then the directory would exist as /private/groups/hausslerlab.  Request access to that group directory and you will then be able to write to it.  Each of those group directories are shared by the lab it belongs to, so you must be wary of everyone's data usage and share the 15TB available per group accordingly.
The group storage directories are created per PI, and each group directory has a default 15TB hard quota.  For example, if David Haussler is the PI that you report to directly, then the directory would exist as /private/groups/hausslerlab.  Request access to that group directory and you will then be able to write to it.  Each of those group directories are shared by the lab it belongs to, so you must be wary of everyone's data usage and share the 15TB available per group accordingly.


On the compute servers you can check your group's current quota usage by using the '/usr/bin/viewquota' command. You can only check the quota of a group you are part of (you would be a member of the UNIX group of the same name). If you wanted to check the quota usage of /private/groups/hausslerlab for example, you would do:
The groups storage directories have a file in the root of them that contains current quota usage info. These files (.quota_info) are updated once an hour. For example:


  $ viewquota hausslerlab
  cat /private/groups/corbettlab/.quota_info
==================================================
  CephFS Storage Quota Report: corbettlab
==================================================
Filesystem Tier: CephFS
Space Used:      72T
Total Quota:    92T
Available:      21T
   
   
  Project quota on /export (/dev/mapper/export)
  Last Updated:    2026-05-30 09:22:31 PDT
Project ID  Used  Soft  Hard Warn/Grace 
  ==================================================
---------- ---------------------------------  
  hausslerlab  1.8T    15T    16T  00 [------]


Every lab directory has a '''.quota_info''' file in its root.


'''Soft Versus Hard Quotas'''
'''Archive Directories (/private/warm-archive/labname)'''


We use soft and hard quotas for disk space.
The archive storage directories are created per PI '''by request''', and each group directory has a default 50TB hard quota.  For example, if David Haussler is the PI that you report to directly, then the directory would exist as /private/warm-archive/hausslerlab.


Once you exceed a directory's soft quota, a one-week countdown timer starts. When that timer runs out, you will no longer be able to create new files or write more data in that directory.  You can reset the countdown timer by dropping down to under the soft quota limit.
The archive storage directories have a file in the root of them that contains current quota usage info. These files (.quota_info) are updated once an hour.  For example:


You will not be permitted to exceed a directory's hard quota at all. Any attempt to try will produce an error; the precise error will depend on how your software responds to running out of disk space.
cat /private/warm-archive/corbettlab/.quota_info
==================================================
  ZFS Storage Quota Report: corbettlab
==================================================
Project ID:    105
Space Used:    148G
Total Quota:    50T
Available:      50T
Last Updated:  2026-05-30 08:45:57 PDT
==================================================


When quotas are first applied to a directory, or are reduced, it is possible to end up with more data or files in the directory than the quota allows for.  This outcome does not trigger deletion of any existing data, but will prevent creation of new data or files.
Every lab directory has a '''.quota_info''' file in its root.


== /scratch Space on the Servers ==
== Storage Quota Alerting ==


Each server will generally have a local /scratch filesystem that you can use to store temporary files.  '''BE ADVISED''' that /scratch is not backed up, and the data there could disappear in the event of a disk failure or anything else.  Do not store important data there. If it is important, it should be moved somewhere else very soon after creation.
If you and/or folks in your lab would like an automated alert when the '''/private/groups/labname''' quota is getting to a certain percentage of fullness, we can set that up for you and others in your lab.  Just email '''cluster-admin@soe.ucsc.edu''' with the following information:


== Actually Doing Work and Computing ==
1: Which directory you would like to watch quotas on (i.e. /private/groups/somelab)
2: What % full you would like an email alert at
3: What email addresses you want on the alert list


When doing research, running jobs and the like, please be careful of your resource consumption on the server you are on.  Don't run too many threads or cores at once if such a thing overruns the RAM available or the disk IO available.  If you are not sure of your potential RAM, CPU or disk impact, start small with one or two processes and work your way up from there.  Also, before running your stuff, check what else is already happening on the server by using the 'top' command to see who else and what else is running and what kind of resources are already being consumed.  If, after starting a process, you realize that the server slows down considerably or becomes unusable, kill your processes and re-evaluate what you need to make things work. These servers are shared resources - be a good neighbor!
After setup, our alerting system will alert folks on that email list ''every 4 hours'' until the quota in question is reduced to an amount under the alerting % threshold you asked for.  So it is a bit noisy, but will force folks to delete data in order to stop the alerts.  When the system notices that the quota usage has decreased to under the alert threshold, you will receive one final email with an "OK" notification that things are OK now.


== The Firewall ==
== /data/scratch Space on the Servers ==


All servers are behind a firewall in this environment, and as such, you must connect to the VPN in order to access them.  They will not be accessible from the greater Internet without VPN.  Although you will be able to connect outbound from them to other servers on the internet to copy data in, sync git repos, stuff like that.  It is only inbound connections that will be blocked.  All machines behind the firewall have the private domain name suffix of "*.prism".
Each server will generally have a local /data/scratch filesystem that you can use to store temporary files.  '''BE ADVISED''' that /data/scratch is not backed up, and the data there could disappear in the event of a disk failure or anything else.  Do not store important data there.  If it is important, it should be moved somewhere else very soon after creation.


== The Phoenix Cluster ==
== Backups ==


This is a cluster of ~20 Ubuntu 22.04 nodes, some of which have GPUs in them.  Each node generally has about 2TB RAM and 128 cores, although the cluster is heterogeneous and has multiple node types.
/private/groups is backed up monthly on the first of the month (which usually takes a week to complete).  Please note that the following directories in the tree '''WILL NOT''' be backed up:


The cluster head node, from which all jobs are submitted via the SLURM job scheduling framework, is '''phoenix.prism'''. To learn more about how to use Slurm, refer to:
tmp/
temp/
TMP/
TEMP/
cache/
.cache/
scratch/
*.tmp/


  https://giwiki.gi.ucsc.edu/index.php/Genomics_Institute_Computing_Information#Slurm_at_the_Genomics_Institute
So if you have data that you know isn't important and should be excluded from the backups, put them in a directory suffixed with ".tmp". Such as this example:


For scratch on the cluster, TMPDIR will be set to /data/tmp.  That area is cleaned often so don't store any data there that isn't being used by your jobs.
/private/groups/clusteradmin/mybams.tmp/

Latest revision as of 10:43, 30 May 2026

Storage

Our servers mount two types of shared storage: home directories and group storage directories. Both are mounted over the network on all shared compute servers and the phoenix cluster, so these filesystems are available on any server you log in to:

Filesystem Specifications

{|
! Filesystem
! /private/home
! /private/groups
! /private/warm-archive
|-
| Default Quota
| 100 GB
| 15 TB
| 50 TB
|-
| Total Capacity
| 19 TB
| 1.7 PB
| 4.6 PB
|-
| Access Speed
| Very Fast (NVMe Flash Media)
| Very Fast (NVMe Flash Media, Distributed Filesystem)
| Slower (Spinning Disks)
|-
| Intended Use
| This space should be used for login scripts, small bits of code or software repos, etc. No large data should be stored here.
| This space should be used for large computational/shared (hot) data, large software installations and the like.
| Archival Use Only. Not meant for active cluster computation. Warm/Cold data only.
|}

Home Directories (/private/home/username)

Your home directory is located at "/private/home/username" and has a 100GB quota. It is meant for small scripts, login data, or a git repo. Please do not store large data there or run large compute jobs on data in your home directory.

Groups Directories (/private/groups/labname)

The group storage directories are created per PI, and each group directory has a default 15TB hard quota. For example, if David Haussler is the PI that you report to directly, then the directory would exist as /private/groups/hausslerlab. Request access to that group directory and you will then be able to write to it. Each group directory is shared by the lab it belongs to, so be mindful of everyone's data usage and share the 15TB available per group accordingly.

Each group storage directory has a file in its root that contains current quota usage info. These files (.quota_info) are updated once an hour. For example:

cat /private/groups/corbettlab/.quota_info

==================================================
 CephFS Storage Quota Report: corbettlab
==================================================
Filesystem Tier: CephFS
Space Used:      72T
Total Quota:     92T
Available:       21T

Last Updated:    2026-05-30 09:22:31 PDT
==================================================

Every lab directory has a .quota_info file in its root.
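
For scripted checks, the report can be read as plain "Key: value" lines. The following is a minimal Python sketch, not an official tool: the function names are hypothetical, the field names are taken from the sample report above, and the size suffixes (G, T, ...) are assumed to be 1024-based, which the report itself does not state.

```python
# Assumed 1024-based multipliers for the size suffixes seen in the reports.
UNITS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4, "P": 1024**5}


def parse_size(text):
    """Convert a size such as '72T' or '148G' into a number of bytes."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return float(text[:-1]) * UNITS[text[-1].upper()]
    return float(text)


def parse_quota_info(report):
    """Collect the 'Key: value' lines of a .quota_info report into a dict."""
    fields = {}
    for line in report.splitlines():
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def percent_used(report):
    """Return used space as a percentage of the total quota."""
    fields = parse_quota_info(report)
    return 100.0 * parse_size(fields["Space Used"]) / parse_size(fields["Total Quota"])
```

To use it, pass in the file contents, for example percent_used(open("/private/groups/corbettlab/.quota_info").read()). Because the file is refreshed only once an hour, the result can lag behind actual usage.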

Archive Directories (/private/warm-archive/labname)

The archive storage directories are created per PI by request, and each archive directory has a default 50TB hard quota. For example, if David Haussler is the PI that you report to directly, then the directory would exist as /private/warm-archive/hausslerlab.

Each archive storage directory has a file in its root that contains current quota usage info. These files (.quota_info) are updated once an hour. For example:

cat /private/warm-archive/corbettlab/.quota_info

==================================================
 ZFS Storage Quota Report: corbettlab
==================================================
Project ID:     105
Space Used:     148G
Total Quota:    50T
Available:      50T

Last Updated:   2026-05-30 08:45:57 PDT
==================================================

Every lab directory has a .quota_info file in its root.

Storage Quota Alerting

If you and/or folks in your lab would like an automated alert when usage of the /private/groups/labname quota reaches a certain percentage, we can set that up for you and others in your lab. Just email cluster-admin@soe.ucsc.edu with the following information:

1: Which directory you would like to watch quotas on (e.g. /private/groups/somelab)
2: What % full you would like an email alert at
3: What email addresses you want on the alert list

After setup, our alerting system will email everyone on that list every 4 hours until usage drops below the alert threshold you asked for. So it is a bit noisy, but it will push folks to delete data in order to stop the alerts. When the system notices that usage has dropped below the threshold, you will receive one final "OK" email.
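
To make the alert cadence concrete, here is a small Python sketch of the behavior described above. It is only an illustration, not the code the alerting system runs: the function name is hypothetical, each list entry stands for one 4-hourly check, and usage exactly at the threshold is assumed to count as over it.

```python
def notifications(usage_percentages, threshold):
    """Return one entry per check: 'ALERT', 'OK', or None when no email is sent."""
    messages = []
    alerting = False
    for pct in usage_percentages:
        if pct >= threshold:
            # At or over the threshold: an alert goes out on every check.
            messages.append("ALERT")
            alerting = True
        elif alerting:
            # First check back under the threshold: one final OK email.
            messages.append("OK")
            alerting = False
        else:
            messages.append(None)
    return messages
```

With an 80% threshold, the checks 70, 85, 90, 79, 75 would produce no email, two alerts, one OK, and then no email again.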

/data/scratch Space on the Servers

Each server will generally have a local /data/scratch filesystem that you can use to store temporary files. BE ADVISED that /data/scratch is not backed up, and the data there could disappear in the event of a disk failure or anything else. Do not store important data there. If it is important, it should be moved somewhere else very soon after creation.
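
One way to follow this advice is to do temporary work in a throwaway directory under /data/scratch and copy anything worth keeping to backed-up storage as soon as it is produced. The Python sketch below shows the idea; the function name and its arguments are hypothetical, and it assumes you can create directories under /data/scratch on the server you are using.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_scratch(work, keep_dir, scratch_root="/data/scratch"):
    """Run work(tmp_dir) in a throwaway scratch directory and keep its outputs.

    work must return the paths of the files worth keeping; they are copied
    to keep_dir before the scratch directory is deleted.
    """
    keep_dir = Path(keep_dir)
    keep_dir.mkdir(parents=True, exist_ok=True)
    # The temporary directory and everything in it is removed on exit.
    with tempfile.TemporaryDirectory(dir=scratch_root) as tmp:
        outputs = work(Path(tmp))
        return [Path(shutil.copy2(path, keep_dir)) for path in outputs]
```

Here keep_dir would typically be somewhere under your lab's /private/groups directory, so results do not depend on the scratch disk surviving.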

Backups

/private/groups is backed up monthly on the first of the month (which usually takes a week to complete). Please note that the following directories in the tree WILL NOT be backed up:

tmp/
temp/
TMP/
TEMP/
cache/
.cache/
scratch/
*.tmp/

So if you have data that you know isn't important and should be excluded from the backups, put it in a directory whose name ends in ".tmp", for example:

/private/groups/clusteradmin/mybams.tmp/
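
To check ahead of time whether a file would fall under one of these exclusions, the list above can be turned into a small Python helper. This is a sketch under stated assumptions, not the backup system's actual matching logic: it assumes the names are matched case-sensitively against directory names at any depth in the tree, and the function name is hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Directory names listed above as skipped by the backup.
EXCLUDED_DIR_PATTERNS = ["tmp", "temp", "TMP", "TEMP", "cache", ".cache", "scratch", "*.tmp"]


def is_excluded_from_backup(path):
    """Return True if any directory component of path matches an excluded name."""
    parts = PurePosixPath(path).parts
    # Only directory components are checked; the final component is treated
    # as a file unless the path ends with "/".
    dirs = parts if str(path).endswith("/") else parts[:-1]
    return any(
        fnmatchcase(part, pattern)
        for part in dirs
        for pattern in EXCLUDED_DIR_PATTERNS
    )
```

Under these assumptions, a file inside /private/groups/clusteradmin/mybams.tmp/ is reported as excluded, while a regular file that merely ends in ".tmp" is not, since only directory names are compared.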