How to access the public servers
Latest revision as of 17:07, 24 October 2023
How to Gain Access to the Public Genomics Institute Compute Servers
If you need access to the Genomics Institute compute servers, please complete the request form at https://app.smartsheet.com/b/form/a76dbd90ba0240ab9ea9d39b390586ce and note that the process has two parts:
1. The user fills in ALL required fields and submits the form.
2. The sponsor/PI then receives an email from Smartsheet, fills in all required fields, and submits.
Once we receive the completed request, we will create your account and go over the details with you in a short Zoom meeting.
Account and Storage Cost
Costs for having an active UNIX account and for storage (per TB) are listed in this document, under "Genomics Project Support", specifically "Genomics IT Systems User Support per user" and "Genomics Data Storage per TB":
https://planning.ucsc.edu/budget/rates-and-assessments/recharge-rates/docs/2021-22-approved-recharge-rates.pdf
Account Expiration
Your UNIX account will have an expiration date associated with it after creation, as requested by your sponsor. Please take note of this expiration date when your account is created.
You will receive notice by email when your account is about to expire. To renew, ask the PI who sponsored you (named in the notice) to email cluster-admin@soe.ucsc.edu requesting that your account be renewed for another year, or for any other amount of time.
If your account expires, it will be suspended and you will no longer be able to log in or view any data you may have on our systems. Any automated scripts you own that run via cron or other mechanisms will stop working.
Server Types and Management
You can log in to our public compute servers via SSH:
courtyard.gi.ucsc.edu: 1TB RAM, 64 cores, 672GB local scratch space, Ubuntu 22.04.2
park.gi.ucsc.edu: 256GB RAM, 32 cores, 5TB local scratch space, Ubuntu 22.04.2
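As a minimal sketch of connecting to either server listed above, the helper below builds the SSH target string. The function name `gi_target` and the username `your_username` are placeholders made up for this example; only the two hostnames come from this page.

```shell
# Build the SSH target for one of the two public servers named above.
# gi_target is a hypothetical helper; "your_username" stands in for
# your own UNIX account name.
gi_target() {
    local user="$1" server="$2"
    case "$server" in
        courtyard|park)
            printf '%s@%s.gi.ucsc.edu\n' "$user" "$server"
            ;;
        *)
            echo "unknown server: $server (expected courtyard or park)" >&2
            return 1
            ;;
    esac
}

# Example (not run here): open a login shell on courtyard.
# ssh "$(gi_target your_username courtyard)"
```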
These servers are managed by the Genomics Institute Cluster Admin group. If you need software installed on them, please make your request by emailing cluster-admin@soe.ucsc.edu.
Storage
These servers mount two types of storage: home directories and group storage directories. Your home directory is located at "/public/home/username" and has a 30GB quota. Group storage directories are created per PI, and each has a default 15TB quota (although in some cases the quota is higher). For example, if David Haussler is the PI that you report to directly, then the directory would be /public/groups/hausslerlab. Request access to that group directory and you will then be able to write to it. Each group directory is shared by the whole lab it belongs to, so be mindful of everyone's data usage and share the 15TB available per group accordingly.
On the compute servers you can check your group's current quota usage with the '/usr/bin/viewquota' command. You can only check the quota of a group you are part of (that is, you are a member of the UNIX group of the same name). For example, to check the quota usage of /public/groups/hausslerlab, you would run:
 $ viewquota hausslerlab
 Project quota on /export (/dev/mapper/export)
 Project ID   Used   Soft   Hard   Warn/Grace
 ---------- ---------------------------------
 hausslerlab  1.8T   15T    16T    00 [------]
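viewquota reports the total for the whole group. To see which subdirectories account for that total, one option is the standard `du` and `sort` tools, sketched below. The function name `usage_by_subdir` is made up for this example, and it assumes GNU coreutils (for `sort -h`), which Ubuntu provides. Note that `du` walks every file, so on a large directory tree it can generate noticeable disk I/O.

```shell
# List the immediate subdirectories of a directory, largest first.
# du and sort are standard GNU coreutils tools, not part of viewquota.
# usage_by_subdir is a hypothetical helper name.
usage_by_subdir() {
    local dir="$1"
    # -s: one total per argument, -h: human-readable sizes
    du -sh -- "$dir"/*/ 2>/dev/null | sort -rh
}

# Example (not run here):
# usage_by_subdir /public/groups/hausslerlab
```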
Actually Doing Work and Computing
When running jobs, please be mindful of your resource consumption on the server you are on. Don't run so many threads or processes at once that you exhaust the available RAM or disk I/O. If you are not sure of your potential RAM, CPU or disk impact, start small with one or two processes and work your way up from there. Before starting a job, use the 'top' command to check who else is on the server, what is running, and which resources are already being consumed. If the server slows down considerably or becomes unusable after you start a process, kill your processes and re-evaluate what you need to make things work. These servers are shared resources - be a good neighbor!
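The "check first, start small" advice above can be sketched as a rough heuristic: subtract the current 1-minute load average from the core count and use at most that many processes. This is only an illustration, not site policy. The function names are made up, and `suggest_jobs_now` assumes a Linux host with `nproc` and /proc/loadavg available.

```shell
# Suggest an upper bound on parallel processes: cores minus the current
# load, never less than 1. A heuristic for illustration only.
suggest_jobs() {
    local cores="$1" load="$2"
    awk -v c="$cores" -v l="$load" 'BEGIN {
        n = int(c - l)       # cores not already busy, rounded down
        if (n < 1) n = 1     # always allow a single process
        print n
    }'
}

# Same calculation using this machine's core count and 1-minute load.
suggest_jobs_now() {
    suggest_jobs "$(nproc)" "$(cut -d " " -f 1 /proc/loadavg)"
}

# Example (not run here): run a job at low priority with that many threads.
# nice -n 10 my_tool --threads "$(suggest_jobs_now)"   # my_tool is hypothetical
```

This only looks at CPU load. RAM and disk I/O still need to be checked by hand, for example in 'top', before scaling up.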
Serving Files to the Public via the Web
If you want to set up a web page on courtyard, or serve files over HTTP from there, do this:
 mkdir /public/home/your_username/public_html
 chmod 755 /public/home/your_username/public_html
Put data in the public_html directory. The URL will be:
http://public.gi.ucsc.edu/~username/
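A minimal sketch of publishing a single file follows. It assumes that files, like the directory itself, must be world-readable for the web server to serve them, and that $HOME points at your /public/home directory. Neither is stated on this page. The function name `publish_file` is made up for this example.

```shell
# Copy a file into public_html and make it world-readable.
# publish_file is a hypothetical helper; the default web root assumes
# $HOME is /public/home/your_username.
publish_file() {
    local src="$1" webroot="${2:-$HOME/public_html}"
    mkdir -p "$webroot" && chmod 755 "$webroot" || return 1
    cp -- "$src" "$webroot/" || return 1
    # 644: owner can write, everyone can read
    chmod 644 "$webroot/$(basename -- "$src")"
}

# Example (not run here):
# publish_file results.tsv
# The file would then be expected at http://public.gi.ucsc.edu/~username/results.tsv
```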
/data/scratch Space on the Servers
Each server will generally have a local /data/scratch filesystem that you can use to store temporary files. BE ADVISED that /data/scratch is not backed up, and data there could be lost to a disk failure or any other problem. Do not store important data there. Move anything important somewhere else soon after creating it.
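One way to follow this advice is to work in a private temporary directory under /data/scratch and copy the results elsewhere as soon as the job finishes. The sketch below assumes you are free to create your own subdirectory there. The function names, the SCRATCH_ROOT override, and the per-user directory naming are all made up for this example, and the destination is whatever location you choose (for instance, somewhere under your group directory).

```shell
# Create a private working directory under the scratch filesystem.
# SCRATCH_ROOT is a hypothetical override; it defaults to /data/scratch.
make_scratch_dir() {
    local root="${SCRATCH_ROOT:-/data/scratch}"
    mktemp -d "$root/${USER:-user}.XXXXXX"
}

# Copy everything from the working directory to a destination,
# then delete the working directory only if the copy succeeded.
save_and_clean() {
    local workdir="$1" dest="$2"
    mkdir -p "$dest" && cp -r -- "$workdir"/. "$dest"/ && rm -rf -- "$workdir"
}

# Example (not run here):
# work=$(make_scratch_dir)
# ... write temporary files into "$work" ...
# save_and_clean "$work" /public/groups/hausslerlab/your_username/run1
```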