Help




1. What is FACT?

2. Which features will be annotated?

3. How does the similarity scoring work?

4. Example output

5. Data sources





What is FACT?
FACT compares the architecture of features, such as functional domains, secondary structure motifs and compositional properties, between pairs of proteins. A feature dotplot (FDP) allows a rapid and intuitive assessment of the extent to which two proteins agree in their feature architecture and thus may share a similar function. An automated scoring routine complements the FDP and is used to search entire proteomes for proteins whose function is potentially similar to that of a given query sequence.

Which features will be annotated?
We annotate the features listed below. Unless otherwise noted, the programs used for feature prediction are embedded in the SFINX package (Sonnhammer and Wootton (2001) Proteins: Structure, Function, and Genetics 45: 262-273).



How does the similarity scoring work?
FACT calculates the similarity between a query protein and every protein in the chosen search proteome. The similarity is based on the feature architectures of the proteins. Four scoring schemes are implemented (FACT, MS_uni, MS_st and Lin), and all four scores are calculated simultaneously. Details about the FACT, MS_uni and MS_st scoring will follow soon. The Lin score is a modified version of the score from Lin et al. (2006) Bioinformatics 22(17): 2081-2086.
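
The search procedure described above (score the query against every protein in the search proteome, then rank the results) can be sketched as follows. This is only an illustration under stated assumptions: the feature names and protein identifiers are made up, and the Jaccard index over feature sets is a simple placeholder, not one of the scoring schemes implemented in FACT, whose details are not given here.

```python
from typing import Callable, Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Placeholder similarity: shared features / all features.

    NOT the FACT, MS_uni, MS_st or Lin score; it only stands in for
    "some score computed from two feature architectures".
    """
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_proteome(
    query_features: Set[str],
    proteome: Dict[str, Set[str]],
    score: Callable[[Set[str], Set[str]], float] = jaccard,
) -> List[Tuple[str, float]]:
    """Score the query against every protein and sort best-first."""
    scored = [(pid, score(query_features, feats)) for pid, feats in proteome.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical feature annotations, invented for this example.
query = {"GST_N", "GST_C"}
proteome = {
    "yeast_protein_A": {"GST_N", "GST_C", "coiled_coil"},
    "yeast_protein_B": {"kinase_domain"},
    "yeast_protein_C": {"GST_N"},
}

ranking = rank_proteome(query, proteome)
for protein_id, value in ranking:
    print(f"{protein_id}\t{value:.3f}")
```

Passing the scoring function as a parameter mirrors the idea that several scoring schemes can be applied to the same query and search proteome, each producing its own ranking.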

Example output
To illustrate FACT, we used human glutathione S-transferase to search for a functionally equivalent protein in the proteome of the yeast Saccharomyces cerevisiae. The FACT, MS_uni, MS_st and Lin scores are computed between the query protein and every protein in the search proteome. The sequences from the search proteome are then ranked according to their score. From the resulting list, any pairwise comparison can be extracted and displayed in the feature dotplot. Finally, a histogram of all FACT scores is displayed.

Click the links below to view the example output.

If you want to perform the search yourself, go to the FACT search page and enter the following sequence in FASTA format into the text box:

>HUMAN_glutathione_S_transferase
RWSFAAAVFATMPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTVETWQEGSLKAS
CLYGQLPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYI
SLIYTNYEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIH
EVLAPGCLDAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ


Choose Saccharomyces cerevisiae from the menu and click the "run scoring" button.
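
Before pasting a query into the text box, it can help to check that it is a well-formed FASTA record. The sketch below is a minimal, assumed check and is not part of FACT: the function name parse_fasta is hypothetical, and the restriction to a single record is an assumption based on the instruction to enter one query sequence. For brevity it uses only the header and the first two sequence lines of the example above.

```python
from typing import Tuple


def parse_fasta(text: str) -> Tuple[str, str]:
    """Parse a single FASTA record into (header, sequence).

    The header is the text after '>' on the first non-empty line;
    the sequence lines that follow are joined into one string.
    """
    header = None
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                raise ValueError("expected a single FASTA record")
            header = line[1:]
        else:
            if header is None:
                raise ValueError("sequence data found before a '>' header line")
            chunks.append(line)
    if header is None:
        raise ValueError("no FASTA header found")
    return header, "".join(chunks)


# Header and first two sequence lines of the example query.
example = """>HUMAN_glutathione_S_transferase
RWSFAAAVFATMPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTVETWQEGSLKAS
CLYGQLPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYI
"""

header, sequence = parse_fasta(example)
print(header)
print(len(sequence))
```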

Another example is the search for a functional equivalent of the human GolgA5 protein in Trypanosoma brucei. The individual feature architecture similarity scores and BLAST yield different T. brucei proteins as the top hit. Comparing the different query/top-hit pairs in the feature dotplot gives further insight into which T. brucei protein is most likely a functional equivalent of human GolgA5.

>Human_GolgA5_ENSP00000163416
MSWFVDLAGKAEDLLNRVDQGAATALSRKDNASNIYSKNTDYTELHQQNTDLIYQTGPKSTYISSAADNIRNQKATILAG
TANVKVGSRTPVEASHPVENASVPRPSSHFVRRKKSEPDDELLFDFLNSSQKEPTGRVEIRKEKGKTPVFQSSQTSSVSS
VNPSVTTIKTIEENSFGSQTHEAASNSDSSHEGQEESSKENVSSNAACPDHTPTPNDDGKSHELSNLRLENQLLRNEVQS
LNQEMASLLQRSKETQEELNKARARVEKWNADHSKSDRMTRGLRAQVDDLTEAVAAKDSQLAVLKVRLQEADQLLSTRTE
ALEALQSEKSRIMQDQSEGNSLQNQALQTFQERLHEADATLKREQESYKQMQSEFAARLNKVEMERQNLAEAITLAERKY
SDEKKRVDELQQQVKLYKLNLESSKQELIDYKQKATRILQSKEKLINSLKEGSGFEGLDSSTASSMELEELRHEKEMQRE
EIQKLMGQIHQLRSELQDMEAQQVNEAESAREQLQDLHDQIAGQKASKQELETELERLKQEFHYIEEDLYRTKNTLQSRI
KDRDEEIQKLRNQLTNKTLSNSSQSELENRLHQLTETLIQKQTMLESLSTEKNSLVFQLERLEQQMNSASGSSSNGSSIN
MSGIDNGEGTRLRNVPVLFNDTETNLAGMYGKVRKAASSIDQFSIRLGIFLRRYPIARVFVIIYMALLHLWVMIVLLTYT
PEMHHDQPYGK

FACT output page for the GolgA5 example: GolgA5 result.

Data sources