Computational analysis of genomic data to aid medical decision making.

In the new "post-genome" era of personalized medicine, many variants critical to disease susceptibilities and drug sensitivities will be identified, and increasing numbers of people will undergo genetic testing. We are developing algorithms and tools intended to facilitate this process.

Postdoc David Masica discusses a new algorithm to identify disease-causing mutations in CFTR (Cystic Fibrosis Transmembrane Conductance Regulator).

The lab talks about an algorithm under development to predict the probability of a complex phenotype from human genomic data.

Selected Publications: PLoS Compbio 2016, Gastroenterology 2015, Human Mol. Genetics 2014, BMC Genomics 2013, Human Mutation 2012, Methods Mol. Biol 2011, Cancer Informatics 2008, PLoS Compbio 2007


Identifying functionally important variation in cancer genomes

The genomes of tumors acquire somatic mutations that may provide insights into their mechanisms of action and potential cancer treatments. A key challenge is identifying biologically important sequence variation in these genomes.

CRAVAT. Web tool and services for high-throughput scoring and annotation of cancer mutations.

CHASM. A machine learning method that predicts missense mutations likely to drive tumor growth and progression. Read a JHU magazine article about CHASM.

MOCA. A model-free approach to find patterns of coordinated alterations in cancer genomics data sets. See this TCGA research highlight.

Selected Publications: bioRxiv 2016, Annals of Oncology 2015, Bioinformatics 2013, PNAS 2011, Leukemia 2011, Nature 2011, Cancer Research 2011, Science 2011, Cancer Biology Therapy 2010, Cancer Research 2009


Protein evolution and antibiotic resistance

Understanding how novel functions evolve (genetic adaptation) is a critical goal of evolutionary biology. Among asexual organisms, genetic adaptation involves multiple mutations that frequently interact in a non-linear fashion (epistasis). Non-linear interactions pose a formidable challenge for computational prediction of mutation effects. We are exploring methods to predict epistatic effects and their impact on fitness, using the recent evolution of β-lactamase under antibiotic selection as a model for genetic adaptation.
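
As a rough illustration of what "non-linear interaction" means here, the sketch below computes pairwise epistasis as the deviation of a double mutant's log fitness from the sum of the two single-mutant effects. This is one common textbook definition, not necessarily the measure used in our work, and the function name and fitness values are hypothetical.

```python
import math

def pairwise_epistasis(w_wt, w_a, w_b, w_ab):
    """Pairwise epistasis on a log-fitness scale.

    w_wt, w_a, w_b, w_ab are fitness values (must be > 0) for the wild type,
    the two single mutants, and the double mutant. A result of 0 means the
    two mutations combine multiplicatively (no epistasis); a positive value
    means the double mutant is fitter than expected, a negative value less fit.
    """
    return math.log(w_ab) - math.log(w_a) - math.log(w_b) + math.log(w_wt)

# Hypothetical fitness values, chosen only for illustration.
no_interaction = pairwise_epistasis(1.0, 2.0, 3.0, 6.0)   # double mutant = 2 * 3
synergy = pairwise_epistasis(1.0, 2.0, 3.0, 12.0)         # double mutant fitter than 2 * 3

print(f"no interaction: {no_interaction:.3f}")
print(f"positive epistasis: {synergy:.3f}")
```

When epistasis is non-zero, the effect of a mutation cannot be predicted from its effect in the wild-type background alone, which is what makes prediction difficult.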

Publications: PLoS Compbio 2011


Finding disease-associated rare variants, genes, and pathways from case-control studies and genomic sequencing data.

In the past few years, case-control studies of common diseases have shifted their focus from single genes to whole exomes. New sequencing technologies now routinely detect hundreds of thousands of sequence variants in a single study, many of which are rare or even novel. Classical single-marker association analysis has limited power for rare variants, which has been a challenge in such studies. A new generation of methods is being developed to meet this challenge. We are working to develop scalable algorithms that incorporate biological knowledge as well as allele frequencies to find causal variants, genes, and pathways.
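
To make the contrast with single-marker analysis concrete, here is a minimal sketch of a generic collapsing ("burden") test: rare variants in a gene are pooled into one carrier/non-carrier indicator per person, and carrier counts in cases and controls are compared with a one-sided Fisher's exact test. This is a standard baseline, not the algorithm we are developing; the function names, the 1% frequency cutoff, and the toy genotypes are all assumptions made for illustration.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are cases/controls, columns are carriers/non-carriers. Returns the
    hypergeometric probability of seeing at least `a` carriers among cases.
    """
    n_cases = a + b
    n_carriers = a + c
    n = a + b + c + d
    upper = min(n_cases, n_carriers)
    total = sum(
        comb(n_cases, k) * comb(n - n_cases, n_carriers - k)
        for k in range(a, upper + 1)
    )
    return total / comb(n, n_carriers)

def burden_test(genotypes, is_case, allele_freqs, maf_cutoff=0.01):
    """Collapse rare variants in one gene and test for enrichment in cases.

    genotypes: one list per individual of minor-allele counts per variant.
    is_case: one boolean per individual.
    allele_freqs: one minor-allele frequency per variant.
    """
    rare = [j for j, f in enumerate(allele_freqs) if f < maf_cutoff]
    carrier = [any(row[j] > 0 for j in rare) for row in genotypes]
    a = sum(1 for c, s in zip(carrier, is_case) if c and s)
    b = sum(1 for c, s in zip(carrier, is_case) if not c and s)
    c_ = sum(1 for c, s in zip(carrier, is_case) if c and not s)
    d = sum(1 for c, s in zip(carrier, is_case) if not c and not s)
    return fisher_exact_greater(a, b, c_, d)

# Toy data: three variants; the third is common and is ignored by the cutoff.
freqs = [0.001, 0.002, 0.30]
genos = [
    [1, 0, 1], [0, 1, 0], [1, 0, 2], [0, 0, 1],   # four cases
    [0, 0, 1], [0, 0, 0], [0, 0, 2], [0, 0, 1],   # four controls
]
status = [True] * 4 + [False] * 4
p_value = burden_test(genos, status, freqs)
print(f"burden test p-value: {p_value:.4f}")
```

No single rare variant here is carried by more than two people, so per-variant tests would have little power; pooling them gives one test per gene. Methods that weight variants by predicted functional impact refine this basic idea.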

Selected Publications: Human Mutation 2016, PLoS Genetics 2013, BMC Genomics 2013


Mapping variants onto protein structures

Visualizing where variants and mutations occur within the tertiary structure of a protein can help identify biologically important mutations in an intuitive way that is accessible to biologists.

LS-SNP/PDB. Systematic mapping of non-synonymous SNPs in dbSNP onto experimentally-derived structures in the Protein Data Bank.

MUPIT Interactive. On-demand mapping of any non-synonymous mutation onto an experimentally-derived structure in the Protein Data Bank.

Selected Publications: Cancer Research 2016, Human Genetics 2013, Nature 2011, Bioinformatics 2009, PLoS Compbio 2007