User access
-
How can I change my password?
-
Use the passwd command once connected to genotoul.toulouse.inra.fr (http://snp.toulouse.inra.fr).
-
Which operating system is required to connect via SSH to the platform?
-
Any system with an SSH client can be used to connect to genotoul.toulouse.inra.fr (Windows, Linux, MacOS).
-
How to transfer your files from/to the platform?
-
To transfer files between your Windows desktop and the Unix server (genotoul.toulouse.inra.fr), use a secure program such as WinSCP or FileZilla. With FileZilla, you have to set the transfer port to 22.
For WinSCP:
Enter the host address, login and password, and set the connection type to SCP.
File Transfer:
Frame 1 corresponds to the directory tree on your local desktop.
Frame 2 corresponds to your directory tree on genotoul.toulouse.inra.fr.
To transfer files, drag them from one frame to the other.
-
How to add auto-completion on your account?
-
Connect to genotoul.toulouse.inra.fr. Edit the .cshrc file and add the following commands:
set autoexpand
set autolist
Then save, log out and reconnect.
-
How to access the platform?
-
You can access the platform either through the website (public access) or via a command-line SSH connection. To access it via SSH, you must have a user account. To request a user account, please complete the following form.
-
How to connect to the platform under MS-Windows?
-
Download Xming and PuTTY. Only the Xming installation requires administrator rights.
Launch PuTTY by double-clicking on its icon. Once it has started, enter snp.toulouse.inra.fr in the Host Name field.
Then configure X11 forwarding in Connection > SSH > X11.
You can save the configuration using the Save button.
Once the configuration is complete, click Open to start a session.
Enter your Unix login and password (use this page to ask for an account).
You are now in your Unix environment.
For more information about the configuration of PuTTY, refer to the online documentation.
Job submission
-
Where can I find the Open Grid Scheduler documentation?
-
-
How can I check my quota usage on the /work directory?
-
Use the following command line (on the genotoul server):
# mmlsquota -u user_name
-
How can I launch Java on the nodes?
-
$ java -Xmx4g
-
How can I book more than 1 CPU ?
-
With default parameters, each job is limited to 1 CPU. To book more, use the following options:
# Book n slots on the same node (up to 48)
$ qsub -pe parallel_smp n myscript.sh
# Book n slots on any nodes (could be the same)
$ qsub -pe parallel_fill n myscript.sh
# Book n slots on strictly different nodes
$ qsub -pe parallel_rr n myscript.sh
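Inside the job script, SGE exposes the number of slots actually granted through the NSLOTS environment variable, so the script does not have to repeat the value passed to -pe. The following is a minimal sketch, assuming a bash job script and a hypothetical multi-threaded program my_tool with a --threads option (neither comes from the platform documentation). Outside the scheduler NSLOTS is unset, so the sketch falls back to 1.

```shell
#!/bin/bash
#$ -pe parallel_smp 4

# NSLOTS is set by SGE to the number of slots granted to the job.
# When the script runs outside the scheduler it is unset, so default to 1.
THREADS="${NSLOTS:-1}"
echo "Using ${THREADS} thread(s)"

# Hypothetical program and option, shown for illustration only:
# my_tool --threads "${THREADS}" input.fa
```

Passing the granted slot count to the program keeps the number of threads in line with what was booked.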
Which scheduler is used ?
-
SGE (Sun Grid Engine) version 6.2. Full documentation is available as a PDF here.
-
Which commands can I use to submit my job ?
-
[BATCH]
qsub: submit a batch job to Sun Grid Engine (default "workq").
qarray: submit a batch job-array to Sun Grid Engine.
[INTERACTIVE]
qlogin: submit an interactive X-windows session to Sun Grid Engine (automatically "interq")
qrsh: submit an interactive login session to Sun Grid Engine.
-
What are the available queues ?
-
Each job is submitted to a specific queue (the default one is workq). Each queue has a different priority, depending on the maximum execution time it allows.
Queue       Priority   Max computing time allowed
workq       300        48h (default)
unlimitq    100        unlimited
hypermemq   0          on demand
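The queue is selected with the -q option of qsub. As an illustration only, the helper below picks a queue from the expected run time, using the limits listed above. pick_queue is a hypothetical function, not a platform tool, and hypermemq is left out because it is granted on demand.

```shell
#!/bin/bash
# pick_queue HOURS: print the queue matching an expected run time in hours.
# Hypothetical helper based on the limits listed above (workq: 48h).
pick_queue() {
    local hours="$1"
    if [ "$hours" -le 48 ]; then
        echo "workq"
    else
        echo "unlimitq"
    fi
}

# Print the command without running it; remove "echo" to submit for real.
echo qsub -q "$(pick_queue 72)" myscript.sh
```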
-
Which disk spaces are available to users?
-
/home/user: 100 MB available to store your configuration files.
/work/user: 1 TB available as a working directory. You have read/write access from any cluster node. Files are automatically deleted if they have not been accessed within the last 60 days.
/save/user: 200 GB available for data you want to save, with 30 days of retention. You have read-only access to this directory from any cluster node.
If you need more space in /work or in /save, please fill in the resources request form.
/usr/local/bioinfo/bin: directory gathering all bioinformatics binaries
/bank: biological banks in different formats
-
What resources are available on the cluster ?
-
34 cluster nodes (1632 cores): ceri001...ceri034, each with 4*12 cores and 384 GB of memory.
1 node with 1 TB of memory (32 cores): cerimem01, with 4 octo-core processors.
-
With default parameters, what are my job limitations ?
-
Without any parameters, on any queue, every job is limited to mem=4 GB and h_vmem=8 GB of memory, and to 1 CPU.
A user cannot submit more than 25 000 jobs and cannot have more than 816 jobs running.
-
How can I submit a simple job on the cluster ?
-
1 - First, write a script (e.g. myscript.sh) containing your command lines, as follows:
#$ -o /work/.../output.t
#$ -e /work/.../error.txt
#$ -q longq
#$ -M my_email@toulouse.inra.fr
#$ -m bea
# My command lines I want to run on the cluster
blastall -d swissprot -p blastx -i /save/.../z72882.fa
2 - To submit the job, use the qsub command as follows:
qsub myscript.sh
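When several submissions are chained in a script, it helps to capture the job identifier at submission time. The following is a minimal sketch, assuming that the -terse option of qsub (which prints only the job id) is available in the installed SGE version. submit_job is a hypothetical wrapper name.

```shell
#!/bin/bash
# submit_job SCRIPT: submit SCRIPT and print only the job id.
# Assumes qsub supports -terse; check "man qsub" on the cluster.
submit_job() {
    local script="$1"
    if ! command -v qsub >/dev/null 2>&1; then
        echo "qsub not found: run this on the cluster" >&2
        return 1
    fi
    qsub -terse "$script"
}

# Usage on the cluster:
#   job_id=$(submit_job myscript.sh)
#   echo "Submitted job ${job_id}"
```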
How can I submit an MPI job?
-
1. First of all, the parallel environment has to be booked on the cluster using the -pe option. For example, with qsub:
#!/bin/bash
#$ -pe parallel_rr 100
2. Then load the environment variables and the desired MPI compiler, as follows:
module load compiler/intel-2013.1.117
module load mpi/openmpi-1.5.4
mpirun cmd_name
-
How can I monitor a running job ?
-
Use the qstat command. Some useful options:
$ qstat -u user : list only the specified user's jobs.
$ qstat -j job_id : provide detailed information on the specified job.
$ qstat -s r : list only the running jobs.
A graphical user interface providing the same information is also available through the qmon command.
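To wait for a job from within a script, a common idiom is to poll qstat -j until it fails. The sketch below assumes that qstat -j returns a non-zero exit status once the job is no longer known to the scheduler. wait_for_job and the 60-second default interval are illustrative choices, not platform settings.

```shell
#!/bin/bash
# wait_for_job JOB_ID [INTERVAL]: block until JOB_ID has left the queue.
# Assumes "qstat -j JOB_ID" fails once the job is finished or deleted.
wait_for_job() {
    local job_id="$1"
    local interval="${2:-60}"   # seconds between two checks
    while qstat -j "$job_id" >/dev/null 2>&1; do
        sleep "$interval"
    done
    echo "Job ${job_id} is no longer in the queue"
}

# Usage on the cluster (job_id as reported by qsub):
#   wait_for_job job_id
```

A long polling interval keeps the load on the scheduler low.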
How can I book more than 4Gb of memory ?
-
With default parameters, each job is limited to 4 GB of memory. To request more, use the -l h_vmem=XG -l mem=YG options. For example:
$ qsub -l h_vmem=10G -l mem=8G myscript.sh
-
How can I retrieve information on a finished job ?
-
Use the qacct command as follows:
$ qacct -j job_id
This command can also be used to produce SGE usage statistics.
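The output of qacct -j is a list of "name value" lines, so a single field can be extracted with awk. The following is a sketch under that assumption. job_field is a hypothetical helper, and field names such as exit_status or maxvmem should be checked against the actual output on the cluster.

```shell
#!/bin/bash
# job_field JOB_ID FIELD: print the value of FIELD from "qacct -j JOB_ID".
# Only the first word of the value is printed; an array job yields one
# line per task.
job_field() {
    local job_id="$1"
    local field="$2"
    qacct -j "$job_id" | awk -v f="$field" '$1 == f { print $2 }'
}

# Usage on the cluster:
#   job_field job_id exit_status
#   job_field job_id maxvmem
```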
How can I kill my job ?
-
Use the qdel command. Some useful options:
# Kill the specified job
$ qdel job_id
# Kill all jobs launched by the specified user
$ qdel -u user
Banks
-
How can I access the list of available banks?
-
Bank list
The banks and their update dates are listed here.
General information
Information on the available banks (version, updates, location, and so on) can be found on this page.
You can obtain the same information using the following command lines:
1. biomaj.sh --status: provides the status of every available bank
2. biomaj.sh --status bank_name: returns the status of the specified bank
Specific information on NCBI Blast
To list the banks that can be used with the NCBI Blast command line or web site, use the following command line:
ls /bank/blastdb/
Bwa index
You can access the bwa index list available with the following command line:
ls /bank/bwadb/
Bowtie index
You can access the bowtie index list available with the following command line:
ls /bank/bowtiedb/
NGS data
-
Do you have the Newbler user guide?
-
Yes, here is the pdf.
-
How can I publish my short reads data ?
-
If you want to publish results based on short reads, you have to publish your data.
Genomic data can be submitted to the Short Read Archive (SRA).
The European mirror of SRA is the ENA; have a look at the video in the paragraph "Submitting public access data using SRA Webin".
RNAseq data can be submitted using MAGE-TAB: http://www.ebi.ac.uk/cgi-bin/microarray/magetab.cgi.
Follow these steps:
- create your account
- fill in all the required information
- compute the md5 of your raw file, or ask the bioinformatics platform if your data are in NG6. The md5 is a checksum computed from the content of a file. To compute it on Windows, use WinMD5; on Linux, run md5sum /path/to/file in a terminal
- transfer your data when required, or ask the bioinformatics platform if your data are in NG6.
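For the checksum step on Linux, md5sum prints the checksum followed by the file name, whereas submission forms usually expect the checksum alone. The minimal sketch below keeps only the first field. file_md5 is a hypothetical helper name.

```shell
#!/bin/bash
# file_md5 FILE: print only the MD5 checksum of FILE (without the file name).
file_md5() {
    md5sum "$1" | awk '{ print $1 }'
}

# Usage:
#   file_md5 /path/to/file
```

Recomputing the checksum after the transfer and comparing the two values confirms that the file arrived intact.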
Self training
-
Where to train yourself in biostatistics ?
-
Here is an online course website: https://www.coursera.org/course/biostats
-
Where to train yourself in bio-informatics?
-
Here is a self-study site for bio-informatics (in French).
Cite us
-
How to cite the Genotoul Bio-informatic platform in your publications?
-
Research teams can thank the Toulouse Midi-Pyrenees bioinformatics platform by including the following sentence in their publications: "Project partially supported by the platform bioinformatics Toulouse Midi-Pyrenees".
In case of collaboration, you can directly cite the person who participated in the project: Name, bioinformatics platform Toulouse Midi-Pyrenees, UBIA, INRA.
-
Where can I find the platform's publications?
-
You can find them by following this link.
