Friday, October 24, 2014

Upgrading Ubuntu and the consequences

When you upgrade Ubuntu, there may be many unpleasant side effects. For instance, I got an email saying that our server was not accessible for citation purposes. I checked the web document roots and changed some permissions (which seem to have changed during the upgrade), but the site still came up blank.

To check the Ubuntu version, do the following:
lsb_release -a
Mine was 14.04.

So I went ahead with a restart of Apache; the commands are slightly different from those on Red Hat Linux.

sudo /etc/init.d/apache2 restart

Restarting web server apache2
apache2: Could not reliably determine the server's fully qualified domain name, using for ServerName
... waiting apache2:
Could not reliably determine the server's fully qualified domain name, using for ServerName

Then, after browsing several websites, I did the following:

Created a file servername.conf inside /etc/apache2/conf-available:
sudo vim /etc/apache2/conf-available/servername.conf

Inside this file I entered the line:

ServerName   MyDomainName

Then I enabled it (the argument is the name of the file created):
sudo a2enconf servername

Then I reloaded Apache:
sudo service apache2 reload
and restarted it using:

sudo /etc/init.d/apache2 restart

The warning message disappeared, but the web page was still not up.
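
The ServerName steps above can be sketched as a small script. This is only a sketch under stated assumptions: the function name write_servername_conf, the scratch directory and example.org are made up for illustration, and it writes into a temporary directory so that nothing under /etc is touched.

```shell
#!/bin/sh
# Sketch: create a one-line ServerName snippet like the one described above.
# write_servername_conf, the scratch directory and example.org are
# illustrative; on a real host the target directory would be
# /etc/apache2/conf-available, and writing there needs root.
write_servername_conf() {
    conf_dir="$1"      # directory that will hold servername.conf
    server_name="$2"   # value for the ServerName directive
    printf 'ServerName %s\n' "$server_name" > "$conf_dir/servername.conf"
}

scratch_dir="$(mktemp -d)"
write_servername_conf "$scratch_dir" "example.org"
cat "$scratch_dir/servername.conf"

# On the server itself, the remaining steps from the post would be:
#   sudo a2enconf servername
#   sudo service apache2 reload
```

Running sudo apache2ctl configtest before the reload is a cheap way to catch a typo in the new file.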

Then you may have to change the document root directory. In our case it was /var/www earlier, but it is now /var/www/html.
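
To see which document root Apache is configured with, you can read the DocumentRoot directive out of the site configuration. A minimal sketch follows. The helper name print_document_root and the sample file are made up for illustration, and the path in the closing comment is my assumption about where the default site file lives, so check your own layout.

```shell
#!/bin/sh
# Sketch: print the DocumentRoot value(s) found in an Apache site file.
# print_document_root is a hypothetical helper, not an Apache tool.
print_document_root() {
    # Match lines whose first word is DocumentRoot and print the path.
    awk '$1 == "DocumentRoot" { print $2 }' "$1"
}

# Demonstrate on a throwaway sample file instead of the live config.
sample_conf="$(mktemp)"
printf '%s\n' \
    '    ServerAdmin webmaster@localhost' \
    '    DocumentRoot /var/www/html' > "$sample_conf"

print_document_root "$sample_conf"

# On the server (the path is an assumption):
#   print_document_root /etc/apache2/sites-available/000-default.conf
```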

If you depend on BioPerl modules, you may also find that most of your Perl modules have disappeared. In that case, search for the particular module using the command:

find / -name

You may see that your @INC path has changed. You will then want to place your BioPerl files in a directory that is on the @INC path.
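
An alternative to moving files is to add their directory to PERL5LIB, which Perl puts in front of @INC. The sketch below assumes a hypothetical install location, /opt/bioperl/lib, and a helper name, prepend_perl5lib, that I made up for illustration.

```shell
#!/bin/sh
# Sketch: build a PERL5LIB value with one extra directory in front.
# prepend_perl5lib and /opt/bioperl/lib are illustrative assumptions.
prepend_perl5lib() {
    extra_dir="$1"
    if [ -n "${PERL5LIB:-}" ]; then
        # Keep whatever was already set, after the new directory.
        printf '%s:%s\n' "$extra_dir" "$PERL5LIB"
    else
        printf '%s\n' "$extra_dir"
    fi
}

PERL5LIB="$(prepend_perl5lib /opt/bioperl/lib)"
export PERL5LIB
echo "PERL5LIB=$PERL5LIB"

# Show the resulting search path if perl is installed.
if command -v perl >/dev/null 2>&1; then
    perl -e 'print "$_\n" for @INC'
fi
```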

Change the permissions of some files and then it will start working!

Thursday, October 16, 2014

Days 2 and 3, Beyond the Genome 2014

Talk 2: Genetics of glioblastoma:
There are different cell populations in glioblastoma, which fits the cancer stem cell model.
ChIP-seq and functional elements. In vitro model. Differentiated glioblastoma cells. Xenograft model. Introduced TFs in vitro to induce tumors.
Core TFs bind to active TPC regulatory elements.
Single-cell RNA-seq for glioblastoma. There is receptor diversity inside the glioblastoma tumor. 430 cells were sequenced. Around 6000 genes were detected per cell. PDGFRA and TGFR are negatively co-regulated.
Core TFs are highly correlated with stemness and negatively correlated with MES. Cells can then be classified as high stemness or low stemness. Each cell has a dominant transcription signature. Cells can switch from one subclass to another. Tumors are more heterogeneous than was thought before.
Master regulators of tumor initiation, progression…
SynGen algorithm for predicting synergy between molecules. The ARACNe algorithm was used for reconstructing the genetic network.
Transcription profile: regulatory network and functional network
Cell types, perturbations and phenotypic assays. Cell types can be cancer cell lines or any other cell lines; perturbations can be drug related. Assays can be transcriptomic or proteomic. LINCS L1000 data (CMapIII). 22119 genetic reagents, 77 cellular contexts, 20413 chemical reagents.
Genomics, epigenomic…
Myc: cell cycle, apoptosis, cellular transformation, cell proliferation. Myc is overexpressed in many cancers.
Drugging the cancer interactions
Multifaceted target assessment for druggability
Cancer drug targets form distinct subnetworks inside a network.

A complete catalogue, identification of drivers. Fewer data sets are available for epigenetic modifications.
There are 40 epigenetic marks.
Understand the chromatin states along Normal -> Tumor -> Metastasis.
ChIP-seq for 35 chromatin marks. Generated a lot of data. Chromatin state prediction with ChromHMM.
Relative changes in chromatin marks.
Loss of acetylation in tumorigenic cells.
Six billion reads have been generated. Epigenomic plasticity
Personalized medicine
BRAF is mutated in human melanoma
Everolimus -> a person who responded well had 17000 somatic mutations. MAP2K1 15 bp deletion.
For all patients with solid tumors, the kind of mutation they carry needs to be determined before deciding on their treatment type.
341 genes are listed for the assay.
Ten trillion bacterial cells: ten times more bacterial cells than human cells,
and 100 times more genes than human genes.
Circulating tumor DNA. 90% of cell-free DNA is from hematopoietic stem cells. Cell-free DNA increases in cancer patients. Plasma-Seq. The coverage depth is 0.1X.
GRAIL is a text mining tool.
Finding cancer driver genes
Blair, Cell 2013: co-morbidity studies. 15 cancers in TCGA have somatic mutations. Gain, mutated, loss. OMIM has germline mutations. Genetic links network, pathways.
Cancer is co-morbid with other genetic diseases that are caused by mutation. Albinism is associated with some genes that are also associated with melanoma.
1/3 of the Mendelian diseases have co-morbidity with cancer.
Bacterial-human somatic lateral gene transfer for cancer.
The fourth chromosome of Drosophila has 20% of its genome from bacteria.
Day 3
Talk1: Anchored Assembly: You can try at
Bioinformatics challenge:
BTG Informatics challenge: Single cell Copy Number analysis

Baslan, Nature Protocols 2012
Visualization of multi-dimensional cancer data, Genome Medicine 2013
Copy number prediction using TITAN, Ha et al., Genome Research 2014
Genomic media and clinical cancer medicine: Dana-Farber Institute:
Guided visualization exploration of cancer genomics

Jian Ma

Thursday, October 9, 2014

Beyond the Genome meeting, 8-10 October 2014: highlights of the first day

The day opened at 11 AM with registration, followed by a sumptuous lunch, and the sessions started at 12.30 PM at the Joseph B. Martin Conference Center, Harvard Medical School. The talks were fabulous. While listening, I was also scribbling down points. The notes may not be very coherent, but they make some sense. Here are the excerpts:

FIRST TALK: MutSigCV: a tool for correcting for the mutation rate. Cancer genes and evolution
1 mut/Mb, 5-20 drivers
Driver event: An event that increased the fitness of the cell when it occurred
Cancer genes: genes that harbor driver events.
Gene length plays a role: the longer the gene, the greater its chance of carrying mutations.
Lung cancer: 10 mutations/Mb -> 450 genes
Tumor types have different mutation rates; blood cancers have the lowest and lung cancer has the highest.
21 tumor types, 4729 tumors > 3 * 10^6 mutations
If mutations are clustered, they are not there by chance and may have a role. If they are in a non-conserved region, they may be there by chance. They applied FDR < 0.01 and used a Q-Q plot to look at the data that does not fall on the straight line. The results include new genes in different cancer types.
With more data we get more cancer genes
Genes that are mutated at a higher rate (>20%) don't vary much with down-sampling. But as the % of mutation per gene went down, the sample size mattered, e.g., an increase in sample size led to an increased number of genes.
2000 tumor samples will be appropriate for discovery for each cancer type, so altogether we will need 2000 * 50 = 100,000 samples.
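
The sample-size estimate is simple multiplication and can be checked in the shell. The variable names below are mine; the figures of 2000 samples per cancer type and 50 cancer types are the ones quoted in the talk.

```shell
#!/bin/sh
# Back-of-the-envelope check of the total sample count.
samples_per_type=2000   # samples suggested per cancer type
cancer_types=50         # number of cancer types used in the estimate
total_samples=$((samples_per_type * cancer_types))
echo "$total_samples"   # prints 100000
```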

Origin and consequences of genomic structural variation
Illumina, followed by SOLiD, and least by 454. Depth is 8X, length is 90 bp.
Deletion: BreakDancer, DELLY, CNVnator, Pindel
Many novel deletions.
Inversions are a very complex type in the cancer genome, in the 1000 Genomes Project.
Using MinION (Oxford Nanopore) to evaluate PacBio sequencing.
International Cancer Genome Consortium (ICGC): sequence entire genomes for normal and cancer patients.
Chromothripsis is a major cause of cancer occurrence. Conclusion: existing data is not enough, more data is needed.
Some chromosomes do circularize as a result of chromothripsis.
Pan-Cancer Analysis of Whole Genomes (PCAWG). Deeply sequenced data from 2000 patients.
Discovery of driver alterations in intergenic regions.
Therapeutic aspects of cancer drivers
Vemurafenib blocks BRAF.
6792 tumor samples covering 28 cancer types.
1. Identify the drivers, find drugs targeting these drivers and assign drugs to the patients for testing
2. 4068 tumors from 16 tumor types for somatic mutation
3. Yates and Campbell et al 2012
4. Finding positive selection signals could be indicative of driver mutations. MuSiC-SMG/MutSigCV tools are used for detecting positive selection, leading to driver mutation finding. OncodriveFM (Functional Impact bias) is based on the type of mutation (synonymous codon, stop codon gain or frameshift), with the most damaging getting the highest scores.
5. Some hotspot mutations can be identified by OncodriveCLUST
6. has the mutation pipeline. You select the cancer type and then look for the driver genes.
7. Pipelines are also available to download and run locally.
8. Different methods can be used for detecting lowly visible driver mutations.
9. 460 cancer drivers identified
10. Pooled analysis and per-project analysis do bring up some non-overlapping genes.
11. 200 or so new driver genes, including regulatory genes
12. Act = gain of function (activating), LoF = loss of function (tumor suppressor genes)
13. 207 LoF, 170 Act and 83 are unclassified
14. 73 are major drivers
15. 460 are mutational drivers, 29 are CNV (copy number variation) drivers
16. 90% of the tumors have at least one driver event
17. 91 targeted drivers used for testing. 65.3% are in clinical trials
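
The driver counts in the list above are internally consistent, which is easy to confirm with shell arithmetic. The variable names are illustrative; the numbers are the ones from my notes.

```shell
#!/bin/sh
# Check that the three driver classes add up to the reported total.
lof_drivers=207            # loss of function
act_drivers=170            # activating
unclassified_drivers=83    # not assigned to either class
total_drivers=$((lof_drivers + act_drivers + unclassified_drivers))
echo "$total_drivers"      # prints 460
```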
Epigenome Alterations
1. Third most common pediatric brain tumor, 45% incurable. Chemotherapy never works for these cases.
2. Two clear groups identified. Tumor B patients survive; Tumor A tumors are very aggressive. So patients can be reclassified.
3. Deep exome sequencing for PFA. Recurrent tumors have no SNV, no CNV, no mutation. Looked at the epigenome. CpG methylation patterns are very different. PFA tumors are much more methylated than PFB. They have a high number of promoter methylation events / gene silencing. In PFB, no patterns converged on a pathway. In PFA, they converged on a pathway where genes like to stay undifferentiated.
4. Recapitulating embryonic state
5. It is a non-heritable disease. This is a de novo disease.
Tumor/normal exome sequencing in dogs for somatic mutations.
1. Variant calling was done using MuTect, and significantly mutated genes were identified using MuSiC.
2. Human data from TumorPortal, Lawrence et al., Nature 505 (7484), 2014.
Pan-cancer gene fusion
Oligo dimerization and tyrosine kinase domain fusions
1. COSMIC and Mitelman databases -> 212 fusion genes in >3 samples
2. Discordant reads and anchor reads are used for fusion discovery. This leads to a large number of false positives. Minimum Overlap Junction Optimizer (MOJO).
3. Fewer false positives. Excluded all fusion genes from GTEx data and also ignored those in >1% of TCGA normals.
4. Most of the gene fusions occur between genes < 1 Mb apart.
5. 90% of the tumors have at least one fusion
6. 1578 are recurrent fusions in >2 samples.
7. 38 fusions are recurrent in 10 or more samples
Enriching NA NGS analysis for CNVs, SNVs and gene fusions (sponsored talk)
Human Genome Analysis:
Hubs are more sensitive to mutation. Do the mutations break a motif? Do they hit a network hub?
Delivering large-scale clinical testing of cancer predisposition genes – what does it take?
Nazneen Rahman
TruSight workflow:
96 sample pipeline
Clonal evolution in breast cancer revealed by single nucleus genome sequencing
TCGA and ICGC (inter and intra tumor heterogeneity)
Mono-clonal, poly-clonal, self-seeding, mutator phenotype or cancer stem cells.
For this, single-cell genome sequencing was pioneered. Nuc-seq: whole exome/genome sequencing using G2/M single cells. Minimum coverage depth is 60X and coverage breadth in exomes is about 90%.
Randomly picked cells from suspension
Exome sequencing identifies highly recurrent MED12 somatic mutations in breast fibroadenoma:
MED12 driver mutation in young women. Mutations occur at a hotspot and are in-frame deletions. Mutation at codon 44. These are benign and are in the stromal cells.
These mutations are reported in prostate cancer, adrenocortical carcinoma, uterine leiomyosarcoma. Possible dysregulation of estrogen signaling.
They show higher expression in response to estrogen. They arise in stromal regions.
Extensive variation between primary disease and metastatic disease.