Consensus Finder

Consensus protein sequences are useful for numerous applications. Mutating a protein to be more like the consensus of its homologs often increases its stability, allowing it to function at higher temperatures and to express in more soluble form when produced recombinantly in various hosts. Consensus Finder will help identify the consensus sequence and find potentially stabilizing mutations.

Consensus Finder will take your protein sequence, find similar sequences from the NCBI database, align them, remove redundant/highly similar sequences, trim the alignment to the size of the original query, and analyze the consensus. The output is the trimmed alignment, the consensus sequence, frequency and count tables for the amino acids at each position, and a list of suggested mutations to consensus that may be stabilizing.
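As an illustration of the final analysis step only, the count table, frequency table, and consensus sequence can be sketched in Python as below. This is a minimal sketch, not the Consensus Finder source: the function names, the treatment of gaps ('-'), and the alphabetical tie-break are assumptions made for the example, and the toy alignment is invented.

```python
from collections import Counter


def column_counts(alignment):
    """Count the symbols (residues and gaps) in each alignment column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [Counter(seq[i] for seq in alignment) for i in range(length)]


def column_frequencies(counts, n_sequences):
    """Convert per-column counts into per-column fractions."""
    return [{aa: c / n_sequences for aa, c in col.items()} for col in counts]


def consensus_sequence(counts):
    """Pick the most common non-gap residue in each column.

    Assumption: ties go to the alphabetically first residue, and a column
    containing only gaps is reported as '-'.
    """
    consensus = []
    for col in counts:
        residues = {aa: c for aa, c in col.items() if aa != "-"}
        if residues:
            consensus.append(max(sorted(residues), key=residues.get))
        else:
            consensus.append("-")
    return "".join(consensus)


# Invented toy alignment, already trimmed to the query length.
alignment = ["MKTAY", "MKSAY", "MRTA-", "MKTGY"]
counts = column_counts(alignment)
frequencies = column_frequencies(counts, len(alignment))
print(consensus_sequence(counts))
print(frequencies[1])
```

Keeping counts and frequencies as separate tables mirrors the two tables listed in the output description above.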

For help and further instruction, please visit the help guide.


Input PDB code:
--OR--
Input FASTA file:
Note: Upload your protein (not DNA sequence) as a FASTA formatted text file with no spaces in the file name.
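The requirements in the note above can be checked before uploading. The Python sketch below is only a hypothetical pre-check, not something Consensus Finder is known to run: the function names are invented, and the DNA test is a rough heuristic (a sequence made almost entirely of A, C, G, T, U, or N is probably nucleotide) with an arbitrary 0.9 cutoff.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            if header is None:
                raise ValueError("sequence data found before the first '>' header")
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def looks_like_dna(sequence, cutoff=0.9):
    """Heuristic (assumption): mostly A/C/G/T/U/N letters suggests nucleotides."""
    if not sequence:
        return False
    nucleotide_like = sum(1 for ch in sequence if ch in "ACGTUN")
    return nucleotide_like / len(sequence) >= cutoff


def filename_ok(name):
    """The upload note asks for a file name without spaces."""
    return " " not in name


records = read_fasta(">query\nMKTAYIAK\nQRQ\n")
for header, sequence in records:
    print(header, len(sequence), "DNA-like" if looks_like_dna(sequence) else "protein-like")
```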

Email Address: You will be notified upon completion.

Optional operations
Set maximum sequences for BLAST search (Range: 10 - 10000; Default: 2000)
Set maximum e value for BLAST search (Range: 1e-30 - 1e-1; Default: 1e-3)
Conservation threshold for suggesting mutations (Range: .05 - .99 or blank to only use ratio; Default: blank)
Minimum ratio for determining consensus (Range: 1-100; Default: 7)
Use only matched portions, not complete sequences
CD-HIT redundancy (Range: .5 - 1.0; Default: 0.9)
Use options below to avoid mutations in or near the active site
PDB chain to use to define the active site (Range: A,B,etc...; Default A)
Amino acid number used to identify the center of the active site, e.g. the primary catalytic residue (Range: 1 - [number of residues in the protein])
Size of active site defined by distance in angstroms to the active-site amino acid (Range: 2 - 20; Default: 5)
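How the ratio, the conservation threshold, and the active-site options might combine can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the tool's actual implementation: it assumes the ratio compares the frequency of the consensus residue with the frequency of the query residue at the same position, that the threshold is a minimum frequency for the consensus residue, that one representative coordinate per residue stands in for the structure, and that query positions match PDB residue numbers. All function names and data are hypothetical.

```python
import math


def residues_near(coords, center, radius=5.0):
    """Return residue numbers within `radius` angstroms of the center residue.

    coords maps residue number -> (x, y, z); using one point per residue
    is a simplification made for this sketch.
    """
    origin = coords[center]
    return {num for num, xyz in coords.items() if math.dist(xyz, origin) <= radius}


def suggest_mutations(query, frequencies, ratio=7.0, threshold=None, excluded=frozenset()):
    """List substitutions such as 'S3T' that move the query toward consensus."""
    suggestions = []
    for index, (query_aa, column) in enumerate(zip(query, frequencies)):
        position = index + 1  # 1-based residue numbering
        if position in excluded:
            continue  # skip positions in or near the active site
        residues = {aa: f for aa, f in column.items() if aa != "-"}
        if not residues:
            continue
        consensus_aa = max(sorted(residues), key=residues.get)
        if consensus_aa == query_aa:
            continue
        consensus_freq = residues[consensus_aa]
        query_freq = residues.get(query_aa, 0.0)
        # Assumed ratio test: consensus must be `ratio` times as frequent as the query residue.
        if consensus_freq < ratio * query_freq:
            continue
        # Assumed threshold test: applied only when a threshold is given.
        if threshold is not None and consensus_freq < threshold:
            continue
        suggestions.append(f"{query_aa}{position}{consensus_aa}")
    return suggestions


# Invented example data.
query = "MKSA"
frequencies = [
    {"M": 1.0},
    {"K": 0.6, "R": 0.4},
    {"T": 0.9, "S": 0.1},
    {"G": 0.8, "A": 0.1, "-": 0.1},
]
coords = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0), 4: (11.4, 0.0, 0.0)}

print(suggest_mutations(query, frequencies))
print(suggest_mutations(query, frequencies, threshold=0.85))
print(suggest_mutations(query, frequencies, excluded=residues_near(coords, center=4, radius=2.0)))
```

Leaving the threshold as None corresponds to the "blank to only use ratio" default in the options list.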




To cite Consensus Finder:
B. J. Jones, H. Y. Lim, J. Huang, R. J. Kazlauskas (2017) Comparison of five protein engineering strategies to stabilize an α/β-hydrolase. Biochemistry 56, 6521–32; doi:10.1021/acs.biochem.7b00571

Consensus Finder uses the following tools:
blastp (2.2.31+): C. Camacho, G. Coulouris, V. Avagyan, N. Ma, J. Papadopoulos, K. Bealer, T. L. Madden (2009) BLAST+: architecture and applications. BMC Bioinformatics 10, 421; doi:10.1186/1471-2105-10-421

CD-HIT (4.6.4): W. Li, L. Jaroszewski, A. Godzik (2001) Clustering of highly homologous sequences to reduce the size of large protein database. Bioinformatics 17, 282-3; idem (2002) Tolerating some redundancy significantly speeds up clustering of large protein databases. Bioinformatics 18, 77-82; doi:10.1093/bioinformatics/18.1.77

Clustal Omega (1.2.0): F. Sievers et al. (2011) Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 7, 539