Next-generation DNA sequencing of the exome has detected millions of small somatic variants (SSVs) in cancer. However, distinguishing genes that contain driver mutations from genes that carry only passenger SSVs in a cohort of sequenced cancer samples requires sophisticated computational approaches. 20/20+ integrates many features indicative of positive selection to predict oncogenes and tumor suppressor genes from small somatic variants. The features capture mutational clustering, conservation, in silico pathogenicity scores of mutations, mutation consequence types, protein interaction network connectivity, and other covariates (e.g., replication timing). In contrast to methods based on mutation rate, 20/20+ uses ratiometric features of mutations, normalizing by the total number of mutations in a gene. This decouples the features from gene-level differences in background mutation rate.
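The idea of a ratiometric feature can be illustrated with a minimal Python sketch. This is not the 20/20+ implementation; the function name `ratiometric_features`, the gene names, and the toy counts are hypothetical and chosen only for illustration. It divides each gene's count of a given mutation consequence type by that gene's total number of mutations.

```python
from collections import Counter, defaultdict


def ratiometric_features(mutations):
    """Return, per gene, the fraction of its mutations of each consequence type.

    `mutations` is an iterable of (gene, consequence) pairs.
    Hypothetical helper for illustration; not part of 20/20+.
    """
    counts = defaultdict(Counter)
    for gene, consequence in mutations:
        counts[gene][consequence] += 1

    features = {}
    for gene, by_type in counts.items():
        total = sum(by_type.values())  # total mutations observed in this gene
        features[gene] = {c: n / total for c, n in by_type.items()}
    return features


# Toy data (made up): GENE_B has ten times as many mutations as GENE_A,
# but the same mix of consequence types.
toy = (
    [("GENE_A", "missense")] * 8
    + [("GENE_A", "silent")] * 2
    + [("GENE_B", "missense")] * 80
    + [("GENE_B", "silent")] * 20
)

feats = ratiometric_features(toy)
print(feats["GENE_A"]["missense"], feats["GENE_B"]["missense"])
```

Both toy genes end up with the same missense fraction even though one has ten times as many mutations, which is the sense in which a ratio is insensitive to a gene's overall mutation count.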
The current stable release is 2020plus.1.0.1, last updated on 06/26/2016.
You can view the current source code on GitHub.
2020plus-1.0.1.tar.gz (06/26/2016): added pipeline code.
Please consult the software documentation web page for installation and usage details.
Collin Tokheim: collintokheim at gmail dot com