Scoring genes in light of their ‘friends’, and a naval approach to science
From the lab, Heiko Horn led the development and implementation of a network-based statistic that identifies cancer driver genes with high accuracy from cancer genomes. The results were validated using a massively parallel in vivo tumorigenesis assay in mice and by re-analyzing 660 lung adenocarcinoma patients, of whom ~1/3 do not have mutations or copy number changes in known oncogenes. This identified two new cancer-driving genes underlying this cancer type.
This project is a collaboration with Jesse Boehm and Gad Getz from the Broad Institute and MGH Cancer Center.
Paper can be found here:
Heiko Horn, Michael S Lawrence, Candace R Chouinard, Yashaswi Shrestha, Jessica Xin Hu, Elizabeth Worstell, Emily Shea, Nina Ilic, Eejung Kim, Atanas Kamburov, Alireza Kashani, William C Hahn, Joshua D Campbell, Jesse S Boehm, Gad Getz & Kasper Lage
Methods that integrate molecular network information and tumor genome data could complement gene-based statistical tests to identify likely new cancer genes; but such approaches are challenging to validate at scale, and their predictive value remains unclear. We developed a robust statistic (NetSig) that integrates protein interaction networks with data from 4,742 tumor exomes. NetSig can accurately classify known driver genes in 60% of tested tumor types and predicts 62 new driver candidates. Using a quantitative experimental framework to determine in vivo tumorigenic potential in mice, we found that NetSig candidates induce tumors at rates that are comparable to those of known oncogenes and are ten-fold higher than those of random genes. By reanalyzing nine tumor-inducing NetSig candidates in 242 patients with oncogene-negative lung adenocarcinomas, we find that two (AKT2 and TFDP2) are significantly amplified. Our study presents a scalable integrated computational and experimental workflow to expand discovery from cancer genomes.
Congratulations to Kasper on being promoted to Associate Professor at Harvard Medical School as of November 1st 2017!!!
For more information about the meeting click this link.
Title of Kasper’s talk:
Pathway modeling of genetic datasets in neurodegenerative and psychiatric diseases using
More information on the conference can be found here.
Large-scale protein-protein interaction networks in human neurons coalesce schizophrenia risk loci into unexpected pathways
Eugene Nacu (1,3), April Kim (1,2), Edyta Malolepsza (1,2), Taibo Li (1,2), William Crotty (1,3), Natalie Petrossian (1,3), Benjamin Tanenbaum (4), Stephan Ripke (1,2), Jake Jaffe (4), Monica Schenone (4), Mark Daly (1,2), Kevin Eggan (1,3), Kasper Lage (1,2)
1) Stanley Center for Psychiatric Research at the Broad Institute. 2) Massachusetts General Hospital. 3) Harvard Stem Cell Institute. 4) Broad Institute.
Recent genome-wide association studies in schizophrenia have revealed many risk loci encoding genes likely to be involved in this disorder, and exciting glimpses of molecular pathways have emerged from the data (e.g., chromatin remodeling, calcium signaling, synaptic pruning and synaptic transmission). Such examples illustrate how some genes associated with schizophrenia interact at the level of proteins to form networks involved in diverse areas of neurobiology. However, most of the identified genes do not connect with each other in well-defined cellular pathways, and it is clear that the disease also includes largely uncharted and incomplete networks that are probably unique to the human brain. This is a key bottleneck towards biological insight and therapeutic intervention. Here, we describe a large-scale approach to overcome some of these challenges by executing systematic interaction experiments, in human neurons, on proteins encoded in schizophrenia risk loci. First, our approach capitalizes on unbiased genetic data to choose corresponding proteins as the starting point of the protein interaction experiments. Second, we developed several parallel workflows, using both manual production and automated approaches on robots, to generate human upper layer cortical excitatory neurons from embryonic stem cells at scale (meaning routinely producing billions of cells). Third, we exploited state-of-the-art proteomics technologies to map quantitative interaction networks of the index proteins at high resolution. Fourth, we developed a new analytical platform (Genoppi) to quality-control and integrate cell-type-specific protein interaction experiments and genome-wide association data to identify unexpected pathway relationships between risk loci.
Our analysis shows that a large fraction of the high-quality and reproducible protein interactions we identify are unique to human neurons, meaning that the interactions have not previously been reported in the literature and are not identified in non-brain tissues. These observations illustrate the importance of executing the experiments in a human cell type of relevance to the trait being analyzed. Importantly, we uncover many unexpected pathway connections between schizophrenia risk loci. For example, our analysis reveals neuron-specific protein-protein interactions between calcium channels and the classic complement cascade at three different time points of neuronal differentiation. This observation would have been missed in other cell types and provides an unexpected link between calcium signaling and synaptic pruning in schizophrenia. More generally, the experimental and analytical approaches we develop here for a neuropsychiatric disease could potentially be applied to functionally annotate loci and provide new pathway insights into other common complex traits.
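One common way to ask whether the interactors of an index protein are enriched for genes in risk loci is a hypergeometric overlap test. The sketch below is only an illustration under stated assumptions: the abstract does not say which test Genoppi uses, and the function names, gene identifiers, and toy sets are all hypothetical.

```python
from math import comb


def hypergeom_enrichment(n_background, n_risk, n_interactors, n_overlap):
    """Upper-tail probability P(X >= n_overlap) for a hypergeometric draw.

    n_background: proteins detected in the experiment (the universe)
    n_risk: background proteins encoded in risk loci
    n_interactors: background proteins called as interactors
    n_overlap: interactors that are also risk-locus proteins
    """
    upper = min(n_risk, n_interactors)
    total = comb(n_background, n_interactors)
    tail = sum(
        comb(n_risk, k) * comb(n_background - n_risk, n_interactors - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def overlap_enrichment(detected, interactors, risk_genes):
    """Restrict both gene sets to the detected background, then test the overlap."""
    background = set(detected)
    hits = set(interactors) & background
    risk = set(risk_genes) & background
    overlap = hits & risk
    p_value = hypergeom_enrichment(len(background), len(risk), len(hits), len(overlap))
    return len(overlap), p_value


# Toy data (hypothetical identifiers, not from the study).
detected = [f"g{i}" for i in range(10)]
n_shared, p_shared = overlap_enrichment(detected, {"g0", "g1", "g5"}, {"g0", "g1", "g2"})
n_none, p_none = overlap_enrichment(detected, {"g5", "g6", "g7"}, {"g0", "g1", "g2"})
```

Restricting both sets to the proteins actually detected in the experiment matters here: testing against the whole genome instead would overstate enrichment in a cell type that expresses only part of it.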
Title of Kasper’s talk: “Large-scale protein-protein interaction experiments of schizophrenia risk genes in human neurons coalesce GWAS loci into unexpected pathways”
The event will occur September 13 and 14: See more information here.
We’re delighted to announce that the Lage Lab has been awarded a three-year grant from the Simons Foundation Autism Research Initiative to study the protein networks perturbed by genetics in autism. Kasper Lage is thrilled to join the community of Simons Foundation Investigators. The project will enable us to leverage and strengthen our current experimental work in the Stanley Center for Psychiatric Research at the Broad Institute and is a collaboration with Kevin Eggan’s lab in the Stanley Center and the Department of Stem Cell and Regenerative Biology, Harvard University.
For more about the Stanley Center at the Broad Institute click here
For more about Kevin Eggan’s Lab click here
For more about the Simons Foundation Autism Research Initiative click here
The recent explosion in genetics in autism spectrum disorders has revealed many genes likely to be involved in these debilitating disorders. These efforts have resulted in exciting glimpses of molecular pathways emerging from the data (e.g., chromatin remodeling and synaptic transmission). While such examples illustrate how genes linked to autism interact at the level of proteins to form networks involved in diverse areas of neurobiology, most of the identified genes do not fall into any well-defined cellular pathway and it is now clear that the biology also includes largely uncharted and incomplete networks that are probably unique to the human brain. This is a key bottleneck towards biological insight and therapeutic intervention. Here, we propose to overcome these challenges through an integrative approach that leverages recent genetic discoveries with large-scale proteomics experiments to derive human brain networks (of physically interacting proteins) perturbed by genetics in autism. This network will serve as an accelerator of functional insight from current and future psychiatric genetics data and it sits at the inflection point of transformative technology and data that have just become mature: First, we will capitalize on new unbiased genetic data to choose corresponding proteins (termed “index proteins” throughout the proposal) as the starting point of the network analysis. Second, we will exploit new proteomics technologies to map the tissue-specific quantitative interaction networks of these index proteins at high resolution. We believe that this network will be of broad value to interpret current and future studies in psychiatric genetics, and that it will immediately contribute to guiding therapeutic insight and intervention. 
Third, the proteomics experiments will be coupled to exciting progress in our ability to generate human neurons from induced pluripotent stem cells so that the interactions of index proteins are derived in a biologically meaningful cellular (and human) context. Fourth, we will experimentally follow up on discoveries from our analyses using reductionist neurobiological assays we have access to through collaborators. It is an important aspect of this application that we will establish a robust statistical methodology, which is currently lacking, for integrative analyses of experimental proteomics networks and genetic data that can be a model for others to use in any area of genetics in the future. Overall, the goal of this project is to leverage the genetic analyses to map, validate and follow up on the brain-specific cellular networks perturbed by genetics in autism. This will catalyze biological insight and inform future therapeutic opportunities.
Taibo has recently been awarded a full MD/PhD scholarship to Johns Hopkins School of Medicine. Taibo started in the Lage Lab in April 2013 and worked on several projects, including InWeb, GeNets, BINe and CanComSq. He looks forward to continuing to work with the Lage Lab from his new position at Hopkins, and we wish him the best of luck!
The presentation can be viewed by creating an account here: http://theleadingstrand.cshl.edu/activate/160970/2017/GENOME
Expanding discovery from cancer genomes by integrating network analyses with in vivo tumorigenesis experiments
Heiko Horn (1,2), Michael Lawrence (2,3), Candace Chouinard (2), Yashaswi Shrestha (2), Jessica Hu (1,2), Elizabeth Worstell (1,2), Emily Shea (2), Nina Ilic (2,4), Eejung Kim (2,4), Atanas Kamburov (2,3), Alireza Kashani (1,2), William Hahn (2,4), Joshua Campbell (2,5), Jesse Boehm (2), Gad Getz (2,3), Kasper Lage (1,2)
1) Massachusetts General Hospital, Department of Surgery, Boston, MA. 2) Broad Institute of MIT and Harvard, Cancer Program, Cambridge, MA. 3) Massachusetts General Hospital, Department of Pathology, Boston, MA. 4) Dana-Farber Cancer Institute, Department of Medical Oncology, Boston, MA. 5) Boston University, Department of Medicine, Boston, MA.
Gene-based statistical tests to find cancer genes look for increased rates of somatic mutations or genomic copy number changes in cancer genomes. However, considerable sample sizes are required to find driver genes with intermediate or low mutation frequencies, and additional cancer genes remain to be discovered. Previous analyses have shown that cancer mutations in some cases converge on specific functional genomics subnetworks. This suggests that mutations in a gene’s functional network can be predictive of whether it is a cancer gene itself. However, this hypothesis has never been systematically explored across hundreds of known cancer genes, tens of tumor types, and thousands of cancer genomes. More importantly, analyses of cancer gene networks have not previously been coupled to systematic experimental validation assays, and their predictive power to provide new insight into tumor biology remains unclear. We develop a statistic (NetSig) that combines molecular protein network information and existing cancer sequencing data to identify genes with a significantly mutated gene network (excluding data on the gene itself). We apply NetSig to data from 4,742 tumors spanning 21 tumor types and identify known and recently proposed driver genes in most (~60%) tumor types. NetSig also identifies 62 other genes with a significantly mutated gene network, many suggesting new cancer biology. We test 25 known driver genes (positive controls), 33 NetSig candidates, and 79 random genes (random controls) in a massively parallel in vivo tumorigenesis cell assay. We demonstrate that the NetSig candidates induce tumors at rates that are comparable to the known driver genes and eightfold higher than random genes when injected into mouse models.
Guided by the NetSig results and functional validation experiments, we looked for mutations and copy number changes in these genes that could explain 242 (out of a total of 660) lung adenocarcinomas without any known driver event; the analysis identified significant amplifications of several NetSig candidates in this patient subgroup. Overall, we present an integrated workflow that complements gene-based statistical tests by combining molecular network information, cancer sequencing data, and in vivo tumorigenesis assays to find and validate new driver genes in existing cancer genome data. The framework we describe is scalable to the rapid production of data and should become increasingly powerful as more tumors are sequenced in the future.
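The idea of scoring a gene by the mutation burden of its network neighbors, while excluding data on the gene itself, can be sketched in a few lines. This is not the NetSig implementation: the abstract does not give the formula, so the sketch assumes Fisher's method to combine neighbor p-values and a simple random-gene-set permutation as the null model. All names and the toy data are hypothetical.

```python
import math
import random


def fisher_statistic(pvalues):
    """Fisher's combined statistic: -2 * sum(ln p). Larger means more signal."""
    return -2.0 * sum(math.log(p) for p in pvalues)


def network_score(gene, neighbors, gene_pvalues, n_permutations=1000, seed=0):
    """Empirical p-value for the combined mutation signal of a gene's neighbors.

    The gene's own p-value is never used; only its network neighbors contribute.
    Assumed null model: random gene sets of the same size as the neighborhood.
    """
    rng = random.Random(seed)
    nbrs = sorted(n for n in neighbors.get(gene, ()) if n != gene and n in gene_pvalues)
    if not nbrs:
        return None  # no scorable neighbors
    observed = fisher_statistic(gene_pvalues[n] for n in nbrs)
    background = sorted(g for g in gene_pvalues if g != gene)
    hits = 0
    for _ in range(n_permutations):
        sample = rng.sample(background, len(nbrs))
        if fisher_statistic(gene_pvalues[g] for g in sample) >= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Toy data: 20 unremarkable genes, 3 strongly mutated genes, 2 query genes.
gene_pvalues = {f"G{i}": 0.5 for i in range(20)}
gene_pvalues.update({"D1": 1e-6, "D2": 1e-6, "D3": 1e-6, "CAND": 0.9, "CTRL": 0.9})
neighbors = {"CAND": {"D1", "D2", "D3"}, "CTRL": {"G0", "G1", "G2"}}

cand_p = network_score("CAND", neighbors, gene_pvalues)
ctrl_p = network_score("CTRL", neighbors, gene_pvalues)
isolated_p = network_score("G5", neighbors, gene_pvalues)
```

In this toy setup `CAND` is not itself significantly mutated, but its neighbors are, so it receives a small network p-value, while `CTRL` does not. A realistic null model would probably need to account for node degree, since highly connected proteins have more neighbors; uniform sampling is used here only to keep the sketch short.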