Computational Genomics Lab at IICB: October 2013

Wednesday, October 23, 2013

Full length cDNA Technologies

Transcriptome analysis is proving its immense importance day by day, as RNAs are crucial to understanding the dynamic nature of cellular events. Since RNAs are not stable, they are reverse transcribed to cDNA, and the cDNAs are sequenced instead to obtain the RNA sequences. But a mammoth obstacle in this procedure is the incomplete synthesis of cDNA.

There are several reasons why the first strand of cDNA is not fully synthesized during reverse transcription. Some of the causes, and ways of overcoming them, are discussed here.

1) Problem: inefficient synthesis of the first cDNA strand
Cause: inefficiency of the reverse transcriptase enzyme.
Solution: addition of trehalose.
Mechanism: trehalose confers heat resistance on enzymes, including reverse transcriptase. This discovery made it possible to run the reverse transcription reaction at 60°C instead of the previously standard 42°C. At 60°C the secondary structure of the template RNA is weakened, so regions with stable secondary structure, which often occur in the non-coding 5' end of mRNAs, can be reverse transcribed very efficiently.

2) Problem: absence of the actual 3' end of the cDNA (i.e. the 5' end of the mRNA)
Cause: inefficiency of reverse transcriptase.
Solution: Cap-trapper method.
Mechanism: the cap structure, which is specific to the mRNA of eukaryotic organisms, is biotinylated. The biotinylated cap is then selectively removed from hybrids carrying non-full-length cDNA (by enzymatic degradation of single-stranded mRNA). The remaining full-length cDNA hybrids, which still carry a biotinylated cap, can be captured with streptavidin-coated magnetic beads. The selected full-length cDNA can then be eluted by alkaline treatment. Synthesizing the second strand from this first-strand cDNA (which becomes single stranded when released from the beads) makes it possible to obtain full-length cDNA selectively.

Fig: Cap Trapper method

Monday, October 21, 2013

Installing Ensembl API

Ensembl has a cool and very useful set of subroutines. For some people installing and testing the API is very easy, but for others it can be very difficult, especially when setting up a new server. You can get the API from http://asia.ensembl.org/info/docs/api/core/core_tutorial.html where step-by-step instructions are given.

Many times, running /ensembl/misc-scripts/ping_ensembl.pl will tell you what the problem is. In our case it kept complaining about the installation of DBI, even though DBI was very much there. Looking into the complaining part:

eval {
  require DBI;
  require DBD::mysql;
  require Bio::Perl;
  require Bio::EnsEMBL::Registry;
  require Bio::EnsEMBL::ApiVersion;
  require Bio::EnsEMBL::LookUp if $ensembl_genomes;
  $api_version = Bio::EnsEMBL::ApiVersion::software_version();

I realized that DBD::mysql is required and was not available on my machine. I logged in as root and typed cpan at the command line, but 'install DBD::mysql' exited with a lot of errors. Tracing back, it said it could not find mysql.h. I looked for the MySQL header files and never found them, since I had only done a yum install mysql-server. In this case, you first have to do a yum install mysql-devel. Then go back to cpan and do force install DBD::mysql. It should then install.
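Since the error message pointed at DBI while the real culprit was DBD::mysql, it can help to test each required module on its own. Here is a minimal sketch of such a check; the function name missing_perl_modules is my own invention and not part of the Ensembl distribution. It only relies on perl -M, which exits with an error when a module cannot be loaded.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ensembl): print every Perl module
# from the argument list that fails to load on this machine.
missing_perl_modules() {
    for module in "$@"; do
        # "perl -MName -e 1" exits non-zero when the module cannot be loaded
        if ! perl "-M$module" -e 1 >/dev/null 2>&1; then
            echo "$module"
        fi
    done
}
```

For example, missing_perl_modules DBI DBD::mysql Bio::Perl Bio::EnsEMBL::Registry prints one line per module that could not be loaded and nothing when all of them are fine.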

Set Path for Ensembl:

As described in the documentation, set the paths for the Ensembl packages as below:

export PERL5LIB=$PERL5LIB:/home/sutripa/ensembl/modules
export PERL5LIB=$PERL5LIB:/home/sutripa/ensembl-compara/modules
export PERL5LIB=$PERL5LIB:/home/sutripa/ensembl-functgenomics/modules
export PERL5LIB=$PERL5LIB:/home/sutripa/ensembl-variation/modules

export PERL5LIB=$PERL5LIB:/home/sutripa/ensembl-tools/
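If you have many checkouts, the same thing can be done in a loop. This is only a sketch: add_ensembl_modules is a hypothetical helper name, and it assumes every checkout keeps its Perl code in a modules subdirectory under one common base directory. It skips directories that do not exist (a typo in a path is otherwise silent) and avoids a leading colon when PERL5LIB starts out empty.

```shell
#!/bin/sh
# Hypothetical helper: append <base>/<repo>/modules to PERL5LIB for each
# repo directory that exists, and warn about the ones that do not.
add_ensembl_modules() {
    base=$1
    shift
    for repo in "$@"; do
        dir="$base/$repo/modules"
        if [ -d "$dir" ]; then
            # Only insert the ":" separator when PERL5LIB is already non-empty
            PERL5LIB="${PERL5LIB:+$PERL5LIB:}$dir"
        else
            echo "warning: $dir not found, skipped" >&2
        fi
    done
    export PERL5LIB
}
```

With the layout used above, the call would be add_ensembl_modules /home/sutripa ensembl ensembl-compara ensembl-variation, with the other checkouts added the same way. The ensembl-tools line does not end in /modules, so it still needs its own export.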

Then go to ensembl/misc-scripts, run ./ping_ensembl.pl and see if you get a message like this:

Installation is good. Connection to Ensembl works and you can query the human core database

Then you are good to go....

Thursday, October 17, 2013

Installation of SecretomeP on your server

We are also working on probiotic bacteria and analyzed a few secretory proteins identified by MALDI. As it turns out, some of the secretory proteins are predicted to be SignalP positive and some are not. Intrigued, we ran a SecretomeP analysis on those and found all the secretory proteins to be either SignalP positive or SecretomeP positive. We got a copy of SecretomeP from http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?secretomep and untarred the package. However, it has a large number of dependencies:

1. ProP 1.0 http://www.cbs.dtu.dk/services/ProP/
2. PSORT II http://www.psort.org/
3. seg ftp://ftp.ncbi.nih.gov/pub/seg/seg
4. TMHMM 1.0 http://www.cbs.dtu.dk/services/TMHMM/
5. SignalP 3.0 http://www.cbs.dtu.dk/services/SignalP/

ProP 1.0
ProP predicts arginine and lysine cleavage sites using an ensemble of neural networks. By default it does a furin-specific prediction. It can also do a proprotein convertase prediction, and it is integrated with SignalP.
To install this package, open the 'prop' executable file and edit the paths in the following section:

else if ( $SYSTEM == "Linux" ) then # typical Linux
    setenv AWK /usr/bin/gawk
    setenv ECHO "/bin/echo -e"
    setenv GNUPLOT /usr/local/bin/gnuplot # /usr/local/bin/gnuplot-3.7
    #setenv PPM2GIF /usr/bin/ppmtogif
    # I could not find a ppmtogif package in my installation so I did the following:
    setenv PPM2GIF /usr/bin/ppm2tiff
    setenv SIGNALP /usr/cbs/bio/bin/signalp

Set the appropriate paths after checking where the above-mentioned programs are installed on your system.

Now set PROPHOME correctly in the prop script:

setenv PROPHOME /usr/adadata/prop-1.0c

Then just test by running ./prop and see if it runs fine.

PSORT II

I obtained a copy by emailing Yuki Saito <yuki-s@hgc.jp>, secretary to Prof. Kenta Nakai.

This is a very useful program for predicting the sub-cellular localization of a protein, and it is a perl script. All you need to do is set the correct path to perl in the shebang line!

Test it by running ./psort . If it runs fine, then installation is OK.
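If ./psort fails with a "bad interpreter" style error, the shebang line is the first thing to look at. Below is a small sketch for reading it; shebang_interpreter is a hypothetical helper name of my own, and it assumes the script starts with a standard "#!" line.

```shell
#!/bin/sh
# Hypothetical helper: print the interpreter path from a script's shebang line.
# Prints nothing when the first line is not a shebang.
shebang_interpreter() {
    # Take the first line, drop the leading "#!" and any spaces after it,
    # and keep only the first word (the interpreter, without its options).
    head -n 1 "$1" | sed -n 's/^#![[:space:]]*\([^[:space:]]*\).*/\1/p'
}
```

You can then compare the output of shebang_interpreter ./psort with the output of command -v perl. If the two differ and the first path does not exist on your machine, edit the first line of the script.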

Seg:

It comes with the blast distribution.

SignalP and TMHMM:

Both can be obtained from the same place (http://www.cbs.dtu.dk) and installed.

Once all the installations are done, check that all of these programs are working. If they all run fine, go back to the secretomep script and set the correct paths to all the installed programs:

Here is a template of my changes:

else if ( `uname` == "Linux" ) then
    setenv ECHO "/bin/echo -e"
    setenv AWK /usr/bin/gawk
    setenv PERL /usr/bin/perl
    set prop = /usr/adadata/prop-1.0c/prop
    set psort = /usr/adadata/psort/psort
    set seg = /usr/adadata/ncbi-blast-2.2.28+/bin/segmasker
    set tmhmm = /usr/adadata/tmhmm-2.0c/bin/tmhmm
    set signalp = /usr/adadata/signalp-4.1/signalp
endif
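A wrong path in this section only shows up when secretomep reaches that program, so it may be worth checking all of them up front. Here is a minimal sketch; report_missing_tools is a hypothetical helper name, and the paths you pass to it are whatever you have put in your own script.

```shell
#!/bin/sh
# Hypothetical helper: print each given path that is not an executable file.
report_missing_tools() {
    for tool in "$@"; do
        # -x is also true for searchable directories, so rule those out
        if [ ! -x "$tool" ] || [ -d "$tool" ]; then
            echo "$tool"
        fi
    done
}
```

With the template above, the call would be report_missing_tools /usr/adadata/prop-1.0c/prop /usr/adadata/psort/psort /usr/adadata/ncbi-blast-2.2.28+/bin/segmasker /usr/adadata/tmhmm-2.0c/bin/tmhmm /usr/adadata/signalp-4.1/signalp. No output means every path points to something executable.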

Once done just run ./secretomep in silent mode and see if it runs fine...

[our installation was successful]
