CLUMP (CLustering by Mutation Position)

CLUMP is a method that performs unsupervised clustering of the amino acid residue positions where variants occur, without any prior knowledge of their functional importance.

Visualizing the distribution of missense variants along a protein sequence can help identify potentially causal variants. However, visualization alone does not quantify clustering patterns, and it cannot be applied in a high-throughput setting. CLUMP rapidly determines mutation clustering patterns and their statistical significance.
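To make the idea of quantifying clustering concrete, here is a minimal Python sketch. It is not the CLUMP implementation. It assumes a toy one-dimensional k-medoids procedure with a fixed number of clusters, and a hypothetical dispersion score defined as the mean of log(1 + distance to the nearest medoid). The function names, the starting-medoid rule, the score and the example positions are all illustrative assumptions. Lower scores mean variants sit closer to their cluster centers.

```python
import math


def initial_medoids(positions, k):
    """Pick k distinct, evenly spread starting medoids (an illustrative choice)."""
    unique = sorted(set(positions))
    if not 1 <= k <= len(unique):
        raise ValueError("k must be between 1 and the number of distinct positions")
    n = len(unique)
    return [unique[(2 * i + 1) * n // (2 * k)] for i in range(k)]


def k_medoids_1d(positions, k, max_iter=100):
    """Toy alternating k-medoids on residue positions; not CLUMP's own code."""
    medoids = initial_medoids(positions, k)
    for _ in range(max_iter):
        # Assignment step: each position joins its nearest medoid.
        clusters = {m: [] for m in medoids}
        for p in positions:
            nearest = min(medoids, key=lambda m: abs(p - m))
            clusters[nearest].append(p)
        # Update step: the new medoid is the member with the smallest
        # total absolute distance to the other members of its cluster.
        new_medoids = [
            min(members, key=lambda c: sum(abs(c - x) for x in members))
            for members in clusters.values()
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return sorted(medoids)


def dispersion_score(positions, medoids):
    """Hypothetical score: mean log(1 + distance to the nearest medoid)."""
    total = sum(
        math.log(1 + min(abs(p - m) for m in medoids)) for p in positions
    )
    return total / len(positions)


# Made-up residue positions: one tightly clustered set, one evenly spread set.
clustered = [10, 11, 12, 13, 200, 201, 202]
spread = [5, 40, 75, 110, 145, 180, 215]

for name, positions in (("clustered", clustered), ("spread", spread)):
    medoids = k_medoids_1d(positions, k=2)
    print(name, medoids, round(dispersion_score(positions, medoids), 3))
```

The tightly clustered set yields a lower score than the evenly spread set, which is the kind of contrast a quantitative clustering measure is meant to capture. Assessing statistical significance would need an additional null model, which this sketch does not attempt.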

CLUMP is free for non-commercial use. For more details please refer to our Software License. Commercial users should contact the Johns Hopkins Technology Transfer office.

The current release is CLUMP 1.0.0, last updated on 04/26/2015.

Source Code Releases

CLUMP-1.0.0.tar.gz    04/26/2015    Initial release with the manuscript

Documentation for the user

Please consult the readme page for usage details and a sample workflow.


Figure: Variants in the SH3BP2 gene in healthy individuals from the 1000 Genomes Project and in individuals with cherubism.


CLUMP scripts are written in Python and should run on all platforms.

CLUMP requires:






R library package 'fpc'.

Primary citations

If you use our software for a publication, please cite the following:

Turner TN, Douville C, Kim D, Stenson PD, Cooper DN, Chakravarti A, Karchin R (2015) Proteins linked to autosomal dominant and autosomal recessive disorders harbor characteristic rare missense mutation distribution patterns. Hum Mol Genet. 24(21):5995-6002

Software primary contact/developer

Christopher Douville:  cdouvil1 at jhmi dot edu