A quick tutorial on how to use PSC Bridges with an XSEDE allocation.
See here for our research proposal and here for examples of startup allocation proposals.
1. Login
There are two ways. (Pick one. If one doesn't work, try the other.)
A. Direct login to bridges
ssh -p 2222 ljcohen@bridges.psc.xsede.org
Enter your password, then select a DUO option to authenticate.
Now you're on bridges. The command prompt should look similar to this:
[ljcohen@login006 ~]$
B. SSO (single sign-on hub)
ssh ljcohen@login.xsede.org
Enter your password, then select a DUO option to authenticate.
The prompt will then look like this:
[ljcohen@ssohub ~]$
Then
gsissh bridges
Now you're on bridges. The command prompt should look similar to this:
[ljcohen@login018 ~]$
2. Where are we?
Type:
projects
This will show you the allocations we have.
We have three separate PSC Bridges allocations. Keep track of how much of each allocation you're using with the projects command.
1. Storage
2. RM
3. LM
Under "BRIDGES PYLON STORAGE", it will list the directories path where the Lustre storage is located. All users who are on the allocation will have their own directory within this project. By default, you do not have read or write access to other users' directories, only your own. So, don't worry about making a mistake and deleting someone else's work.
Note that space in your ~/ home directory is limited. PSC says 10 GB, but I've run out of space there installing miniconda software, so I suspect it's more like 10 or 100 MB (I haven't confirmed this).
Because of this limited space in home, all files should go in your Lustre storage directory. For example, I've set up miniconda to install software in my storage directory:
/pylon5/bi5fpmp/ljcohen/miniconda3/bin
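If you need to do the same, here's a minimal sketch of installing miniconda into pylon5 storage instead of home (the installer URL is an assumption, so check the current one on the conda site, and swap in your own group/username):
# work in your Lustre storage directory, not in the small home directory
cd /pylon5/bi5fpmp/ljcohen/
# download the Linux Miniconda installer (URL assumed; verify the current link)
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
# -b runs non-interactively, -p sets the install prefix to pylon5 storage
bash Miniconda3-latest-Linux-x86_64.sh -b -p /pylon5/bi5fpmp/ljcohen/miniconda3
# add it to your PATH (put this line in ~/.bashrc to make it permanent)
export PATH=/pylon5/bi5fpmp/ljcohen/miniconda3/bin:$PATH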
If you're on more than one allocation, your working directories should be in account-specific, project-specific storage. For example, I'm on 2 allocations:
Allocation 1:
cd /pylon5/bi5fpmp/ljcohen/
ls -lah
Allocation 2:
cd /pylon5/mc5phkp/ljcohen/
ls -lah
3. Computing Things.
- PSC bridges user guide
- Running jobs
- There are generally 3 compute node partitions available, depending on what you need to do (see the sinfo sketch after this list):
- Large Memory (LM)
- Regular Memory (RM)
- Regular Memory-Interactive (RM-interactive), for interactive login to test commands so they don't run on the head node.
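To check which partitions are actually configured and what their limits are, you can ask Slurm directly; this is a generic sinfo invocation (standard Slurm options, not Bridges-specific):
# show each partition's time limit, node count, CPUs per node, and memory per node
sinfo -o "%P %l %D %c %m"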
Examples:
Run an interactive job for 1 hr to see if commands/software will run. (It might take a little while, ~5-10 min, for the job to be queued, so be patient):
interact -p RM-shared --ntasks-per-node=4 -N 1 -t 1:00:00
LM job example script:
#!/bin/bash -l
#SBATCH -D /pylon5/mc5phkp/ljcohen/kfish_abyss
#SBATCH -J Axen-abyss
#SBATCH -o /pylon5/mc5phkp/ljcohen/kfish_abyss/abyss-Axen%j.o
#SBATCH -e /pylon5/mc5phkp/ljcohen/kfish_abyss/abyss-Axen%j.o
#SBATCH -t 120:00:00
#SBATCH -p LM
#SBATCH --mem=1000GB
#SBATCH --ntasks-per-node 14
source ~/.bashrc
source activate assembly
cd /pylon5/mc5phkp/ljcohen/kfish_abyss
SPECIES=Axen
# $LOCAL is the node-local scratch space on the compute node; assemble there, then copy results back
PROJECTDIR=$LOCAL/$SPECIES/
mkdir -p $PROJECTDIR
cd $PROJECTDIR
cp /pylon5/mc5phkp/ljcohen/kfish_abyss/Axenica_USPD16092508-N706-AK391_HV3JCCCXY_L6_1.qc.fq.gz .
cp /pylon5/mc5phkp/ljcohen/kfish_abyss/Axenica_USPD16092508-N706-AK391_HV3JCCCXY_L6_2.qc.fq.gz .
cp /pylon5/mc5phkp/ljcohen/kfish_abyss/Axenica_USPD16092508-N706-AK391_HV3JCCCXY_L7_1.qc.fq.gz .
cp /pylon5/mc5phkp/ljcohen/kfish_abyss/Axenica_USPD16092508-N706-AK391_HV3JCCCXY_L7_2.qc.fq.gz .
abyss-pe k=51 name=Axen_abyss in="Axenica_USPD16092508-N706-AK391_HV3JCCCXY_L6_1.qc.fq.gz Axenica_USPD16092508-N706-AK391_HV3JCCCXY_L6_2.qc.fq.gz Axenica_USPD16092508-N706-AK391_HV3JCCCXY_L7_1.qc.fq.gz Axenica_USPD16092508-N706-AK391_HV3JCCCXY_L7_2.qc.fq.gz" contigs
# Grab the output assembly files (abyss-pe prefixes them with the name= value) and copy them back to storage
cp $PROJECTDIR/Axen_abyss* /pylon5/mc5phkp/ljcohen/kfish_abyss/
# remove temp files on $LOCAL
rm -rf $PROJECTDIR
RM job example script:
#!/bin/bash -l
#SBATCH -D /pylon5/mc5phkp/ljcohen/kfish_Illumina/
#SBATCH -J Axen-diginorm
#SBATCH -o /pylon5/mc5phkp/ljcohen/kfish_Illumina/diginorm-Axen%j.o
#SBATCH -e /pylon5/mc5phkp/ljcohen/kfish_Illumina/diginorm-Axen%j.o
#SBATCH -t 48:00:00
#SBATCH -p RM
#SBATCH --ntasks-per-node 8
#SBATCH --cpus-per-task 1
source /home/ljcohen/.bashrc
source activate sourmash_conda
cd /pylon5/mc5phkp/ljcohen/kfish_Illumina/
# interleave the L6 and L7 paired reads, stream them through khmer's trim-low-abund.py,
# then split the output back into properly paired and orphaned reads
(paste \
<(zcat {Axenica_USPD16092508-N706-AK391_HV3JCCCXY_L6_1.qc.fq.gz,Axenica_USPD16092508-N706-AK391_HV3JCCCXY_L7_1.qc.fq.gz} | paste - - - - ) \
<(zcat {Axenica_USPD16092508-N706-AK391_HV3JCCCXY_L6_2.qc.fq.gz,Axenica_USPD16092508-N706-AK391_HV3JCCCXY_L7_2.qc.fq.gz} | paste - - - - ) \
| tr '\t' '\n' ) | \
(trim-low-abund.py -k 20 -C 2 - -o - -x 5e7) | \
(extract-paired-reads.py --gzip -p A_xen.paired.gz -s A_xen.single.gz) > /dev/null
Queue management
Submit a script to the Slurm queue as a job:
sbatch yourfile
Check the queue status:
squeue -u ljcohen
Cancel job:
scancel 5280622
In an emergency, if you've (hypothetically) submitted a lot of jobs you don't want, cancel all job IDs beginning with 528:
squeue -u ljcohen | grep "528" | sed -r -e "s/[\t\ ]+/./g" | cut -d"." -f 2 | xargs scancel
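An alternative sketch that avoids the sed/cut gymnastics, using standard squeue formatting options (the 528 prefix is just the example above):
# -h drops the header, -o "%i" prints only job IDs; keep the ones starting with 528 and cancel them
squeue -u ljcohen -h -o "%i" | grep "^528" | xargs scancel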
Let's do something simple.
From the BLAST command-line tutorial from ANGUS:
cd /pylon5/mc5phkp/ljcohen
mkdir test
cd test
Download:
curl -o mouse.1.protein.faa.gz -L https://osf.io/v6j9x/download
curl -o mouse.2.protein.faa.gz -L https://osf.io/j2qxk/download
curl -o zebrafish.1.protein.faa.gz -L https://osf.io/68mgf/download
Decompress:
gunzip *.faa.gz
Load module:
module load blast/2.7.1
Make the BLAST database:
makeblastdb -in zebrafish.1.protein.faa -dbtype prot
Run an interactive job:
interact -p RM-shared --ntasks-per-node=4 -N 1 -t 1:00:00
Run this interactively, or create a script file and submit as an RM job:
cd test
blastp -query mouse.1.protein.faa -db zebrafish.1.protein.faa -out mm.x.zebrafish.tsv -outfmt 6
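If you'd rather submit it as a batch job, here is a minimal sketch of a script modeled on the RM examples above (the job name, output paths, time, and thread count are assumptions; adjust them to your own setup):
#!/bin/bash -l
#SBATCH -D /pylon5/mc5phkp/ljcohen/test
#SBATCH -J mouse-blast
#SBATCH -o /pylon5/mc5phkp/ljcohen/test/blast-%j.o
#SBATCH -e /pylon5/mc5phkp/ljcohen/test/blast-%j.o
#SBATCH -t 1:00:00
#SBATCH -p RM-shared
#SBATCH --ntasks-per-node 4
# load the same BLAST module used interactively above
module load blast/2.7.1
cd /pylon5/mc5phkp/ljcohen/test
# tabular output (-outfmt 6), using the 4 cores requested above
blastp -query mouse.1.protein.faa -db zebrafish.1.protein.faa -out mm.x.zebrafish.tsv -outfmt 6 -num_threads 4
Submit it with sbatch as described above under Queue management.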
4. How to Transfer files
To scp a directory of files from Bridges to your local computer, you must use port 2222:
scp -r -P 2222 ljcohen@data.bridges.psc.edu:/location/ .
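Going the other direction (local computer to Bridges) works the same way; here's a sketch, with the destination path taken from the storage examples above and my_local_directory as a placeholder:
# run this from your local computer; my_local_directory is a placeholder for whatever you want to upload
scp -r -P 2222 my_local_directory ljcohen@data.bridges.psc.edu:/pylon5/bi5fpmp/ljcohen/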