
Orthologue Alignments


By default, the orthologues aligned and displayed in Alamut are taken from the Ensembl Compara database. As of March 2010, the only non-Ensembl-based alignment available was the manually curated alignment of BRCA1 NM_007294, provided by IARC with Align GVGD.

Building new alignments

Although Ensembl Compara is a very valuable information source, manual selection of orthologues and manual curation of alignments are necessary for optimal missense interpretation and for scoring systems such as SIFT, PolyPhen, and Align GVGD.

This is why we have designed a semi-automatic procedure for orthologue alignment construction, briefly described here.

Orthologous sequences are searched for with the BlastP program, first against the UniProt/Swiss-Prot database. If sequences from distant species are not found, BlastP is then run against the RefSeq database and finally, if needed, against the NCBI non-redundant protein sequence database.
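The fallback order described above can be sketched as a small loop. This is only an illustration: `cascade_search` and the `search` callable are hypothetical names standing in for an actual BlastP run, and the database labels are plain strings, not real BLAST database identifiers.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Search order taken from the text above; labels only.
DATABASES = ["UniProt/Swiss-Prot", "RefSeq", "NCBI nr"]

Hit = Dict[str, str]  # assumed shape: {"species": ..., "accession": ...}


def cascade_search(
    query: str,
    search: Callable[[str, str], Iterable[Hit]],
    wanted_species: Iterable[str],
    databases: List[str] = DATABASES,
) -> Tuple[Dict[str, Hit], Set[str]]:
    """Query each database in turn until every wanted species has a hit.

    `search(query, database)` is a hypothetical stand-in for a BlastP run.
    Returns the first hit kept per species and the species still missing.
    """
    hits: Dict[str, Hit] = {}
    missing = set(wanted_species)
    for database in databases:
        if not missing:
            break  # all species covered, no need for a broader database
        for hit in search(query, database):
            species = hit["species"]
            if species in missing:
                hits[species] = {**hit, "database": database}
                missing.discard(species)
    return hits, missing
```

The broader databases are only consulted for species that the curated one did not cover, which mirrors the order of preference given in the text.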

The set of orthologues is then filtered manually, based on sequence length, identity with the human sequence, and available annotations.
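The filtering itself is done by hand, but a simple pre-screen on length and identity could narrow the candidate list before manual review. The sketch below is an assumption, not part of the documented procedure: the `Candidate` record, the `prescreen` function, and all threshold values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    accession: str
    species: str
    length: int        # protein length in residues
    identity: float    # percent identity with the human sequence
    annotation: str = ""


def prescreen(
    candidates: List[Candidate],
    human_length: int,
    min_length_ratio: float = 0.8,   # hypothetical threshold
    max_length_ratio: float = 1.2,   # hypothetical threshold
    min_identity: float = 30.0,      # hypothetical threshold
) -> List[Candidate]:
    """Keep candidates whose length and identity fall within the given bounds."""
    kept = []
    for candidate in candidates:
        ratio = candidate.length / human_length
        if not (min_length_ratio <= ratio <= max_length_ratio):
            continue  # much shorter or longer than the human protein
        if candidate.identity < min_identity:
            continue  # too divergent from the human sequence
        kept.append(candidate)
    return kept
```

Annotations are carried along in the record but not tested here, since judging them remains a manual step.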

Selected sequences are then aligned with M-Coffee, a meta-multiple sequence alignment program.

To adjust the alignment depth, two quality criteria are calculated. These criteria are based on work published by Tavtigian et al. (2008, 2009) and on recommendations published on the web site of SIFT (Sorting Intolerant From Tolerant, a mutation classification system). A suitable alignment should contain on average three substitutions per position, and its median information content should be less than or equal to 3.25. If the alignment does not satisfy these criteria, sequences that create large gaps are removed, and new sequences are added if needed to raise information content.
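As a rough illustration of the two criteria, the sketch below computes a per-column information content and a substitution count from an alignment given as equal-length strings. Both definitions are assumptions: information content is taken as log2(20) minus the Shannon entropy of the observed residues, which may differ from the exact computation used by SIFT, and the substitution count is only a lower bound (distinct residues minus one per column), not a tree-based estimate. The function names are hypothetical.

```python
import math
from collections import Counter
from statistics import median
from typing import List, Sequence

MAX_INFO = math.log2(20)   # information content of a fully conserved column
GAP_CHARS = {"-", "."}


def column_information(column: Sequence[str]) -> float:
    """log2(20) minus the Shannon entropy of the residues in one column.

    The column must contain at least one non-gap residue.
    """
    residues = [r for r in column if r not in GAP_CHARS]
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return MAX_INFO - entropy


def median_information(alignment: List[str]) -> float:
    """Median information content over columns with at least one residue."""
    columns = [col for col in zip(*alignment) if any(r not in GAP_CHARS for r in col)]
    return median(column_information(col) for col in columns)


def min_substitutions_per_position(alignment: List[str]) -> float:
    """Average lower bound on substitutions: distinct residues minus one per column."""
    columns = list(zip(*alignment))
    total = sum(
        max(len({r for r in col if r not in GAP_CHARS}) - 1, 0) for col in columns
    )
    return total / len(columns)
```

With these helpers, `median_information(alignment) <= 3.25` expresses the second criterion directly. Because the substitution count is a lower bound, a value below three does not by itself show that an alignment fails the first criterion.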

As a final step, alignments are optimized manually.

Available manually-curated alignments (October 04, 2011)

Gene      Transcript        Origin
ABCA4     NM_000350.2       IBS
ACVRL1    NM_000020.2       IBS
ATM       NM_000051.3       Tavtigian et al. (2009)
ATM       U82828.1          Tavtigian et al. (2009)
BBS1      NM_024649.4       IBS
BRCA1     NM_007294.2       IARC - Align GVGD
BRCA1     NM_007294.3       IARC - Align GVGD
BRCA2     NM_000059.3       IBS
COL3A1    NM_000090.3       IBS
CYP21A2   NM_000500.5       IBS
DMD       NM_004006.2       IBS
ENG       NM_001114753.1    IBS
F8        NM_000132.3       IBS
FAT4      NM_024582.4       IBS
GCK       NM_033507.1       IBS
GJB2      NM_004004.5       IBS
GLA       NM_000169.2       IBS
KIT       NM_000222.2       IBS
KRAS      NM_033360.2       IBS
L1CAM     NM_000425.3       IBS
LDLR      NM_000527.3       IBS
LDLR      NM_000527.4       IBS
LMNA      NM_170707.2       IBS
MECP2     NM_001110792.1    IBS
MEN1      NM_000244.3       IBS
MLH1      NM_000249.2       IBS
MLH1      NM_000249.3       IBS
MSH2      NM_000251.1       IBS
MSH6      NM_000179.1       IBS
MSH6      NM_000179.2       IBS
MUTYH     NM_001128425.1    IBS
MYBPC3    NM_000256.3       IBS
MYH7      NM_000257.2       IBS
MYL2      NM_000432.3       IBS
NF1       NM_001042492.2    IBS
NOTCH3    NM_000435.2       IBS
ORC1      NM_004153.3       IBS
PKD1      NM_001009944.2    IBS
PKP2      NM_004572.3       IBS
PMS2      NM_000535.5       IBS
RB1       NM_000321.2       IBS
SCN1A     AB093548.1        IBS
SCN1A     NM_001165963.1    IBS
SCN5A     NM_198056.2       IBS
SDHB      NM_003000.2       IBS
SH3TC2    NM_024577.3       IBS
SPRED1    NM_152594.2       IBS
SRD5A2    NM_000348.3       IBS
TARDBP    NM_007375.3       IBS
TP53      NM_000546.4       IBS
TTN       NM_133378.4       IBS
VHL       NM_000551.2       IBS

We intend to regularly add new alignments for the most frequently studied genes. If you would like an alignment for a gene that is not in the above list, please send us a request.

Acknowledgments

We would like to express our thanks to the Genetic Cancer Susceptibility Group at IARC for their kind help in defining our alignment protocol.

References

Tavtigian, SV., Greenblatt, MS., Lesueur, F., Byrnes, GB. (2008). In silico analysis of missense substitutions using sequence-alignment based methods. Hum Mutat. 29(11): 1327-36.

Tavtigian, SV., Oefner, PJ., Babikyan, D. et al. (2009). Rare, evolutionarily unlikely missense substitutions in ATM confer increased risk of breast cancer. Am J Hum Genet. 85: 427-46.


© 2010 Interactive Biosoftware - Last modified: Oct 10, 2011