MAFFT-DASH: integrated protein sequence and structural alignment

J Rozewicki, S Li, KM Amada, DM Standley… - Nucleic acids …, 2019 - academic.oup.com
Nucleic Acids Research, 2019 - academic.oup.com
Abstract
Here, we describe a web server that integrates structural alignments with the MAFFT multiple sequence alignment (MSA) tool. For this purpose, we have prepared a web-based Database of Aligned Structural Homologs (DASH), which provides structural alignments at the domain and chain levels for all proteins in the Protein Data Bank (PDB), and can be queried interactively or by a simple REST-like API. MAFFT-DASH integration can be invoked with a single flag on either the web (https://mafft.cbrc.jp/alignment/server/) or command-line versions of MAFFT. In our benchmarks using 878 cases from the BAliBase, HomFam, OXFam, Mattbench and SISYPHUS datasets, MAFFT-DASH showed 10–20% improvement over standard MAFFT for MSA problems with weak similarity, in terms of Sum-of-Pairs (SP), a measure of how well a program succeeds at aligning input sequences in comparison to a reference alignment. When MAFFT alignments were supplemented with homologous sequences, further improvement was observed. Potential applications of DASH beyond MSA enrichment include functional annotation through detection of remote homology and assembly of template libraries for homology modeling.
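The Sum-of-Pairs (SP) measure mentioned in the abstract can be illustrated with a minimal sketch. The version below assumes a common definition: the fraction of residue pairs aligned in the reference alignment that are also aligned in the test alignment. The function names (`aligned_pairs`, `sum_of_pairs`) and the toy sequences are hypothetical and are not taken from the paper or from MAFFT. Published benchmarks may differ in detail, for example by scoring only selected core columns of the reference.

```python
from itertools import combinations


def aligned_pairs(alignment, gap="-"):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length gapped strings (an assumption of
    this sketch). Each residue is identified by (sequence index, position
    in the ungapped sequence), so pairs are comparable across alignments
    of the same input sequences.
    """
    counters = [0] * len(alignment)  # next ungapped position per sequence
    pairs = set()
    for col in range(len(alignment[0])):
        present = []
        for seq_idx, row in enumerate(alignment):
            if row[col] != gap:
                present.append((seq_idx, counters[seq_idx]))
                counters[seq_idx] += 1
        # Every pair of residues in this column counts as aligned.
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs(test, reference):
    """Fraction of reference-aligned residue pairs reproduced by `test`."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(test)) / len(ref_pairs)


# Toy example: the same two sequences (ACGT and AGT) aligned two ways.
reference = ["ACGT", "A-GT"]
test = ["ACGT", "AG-T"]
score = sum_of_pairs(test, reference)
```

In this toy case the reference aligns three residue pairs and the test alignment reproduces two of them, so the score is 2/3. Scoring an alignment against itself gives 1.0.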
Oxford University Press