InParanoiDB 9

What is InParanoid?
InParanoiDB is a database containing groups of orthologs and inparalogs for 640 species. The InParanoid algorithm was originally developed at the Center for Genomics and Bioinformatics to address the need to identify orthologs. Homologs that originate from a speciation event are called orthologs, and homologs that originate from a gene duplication event are called paralogs. If a duplication event predates the speciation event, the paralogs are called outparalogs, and they can be present in different species. If instead an ortholog undergoes one or several duplication events, the resulting paralogs are called inparalogs, and they are co-orthologs to one or more orthologs in another species. Since a pair of outparalogs is expected to have diverged more in function than a pair of inparalogs, it is useful to distinguish between the two. Furthermore, clustering inparalogs together allows proper identification of both one-to-one and many-to-many orthology cases.
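The definitions above can be summarised as a small decision rule. The following Python sketch is only an illustration and is not part of InParanoid; the function name `classify_homologs` and its parameters are hypothetical, and it assumes the event that separated the two genes and the relevant ages are already known.

```python
def classify_homologs(separating_event, event_age=None, speciation_age=None):
    """Classify a pair of homologous genes (illustrative helper only).

    separating_event: "speciation" or "duplication", the event at which
        the two genes diverged.
    event_age, speciation_age: ages of the duplication and of the
        speciation that separated the two species, in the same unit
        (a larger value means older). Only needed for duplications.
    """
    if separating_event == "speciation":
        return "ortholog"
    if separating_event != "duplication":
        raise ValueError("separating_event must be 'speciation' or 'duplication'")
    if event_age is None or speciation_age is None:
        raise ValueError("both ages are required to classify a paralog")
    # A duplication older than the speciation gives outparalogs;
    # a duplication after the speciation gives inparalogs.
    return "outparalog" if event_age > speciation_age else "inparalog"


print(classify_homologs("speciation"))
print(classify_homologs("duplication", event_age=500, speciation_age=90))
print(classify_homologs("duplication", event_age=20, speciation_age=90))
```

In practice these ages are not known in advance, which is why InParanoid infers the relationships from sequence similarity instead.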

How does InParanoid detect orthologs?
The InParanoid program constructs orthology groups from pairwise similarity scores between two complete proteomes, calculated using DIAMOND. An orthology group is initially composed of two so-called seed orthologs, which are found as two-way best hits between the two proteomes. More sequences are added to the group if there are sequences in the two proteomes that are closer to the corresponding seed ortholog than to any sequence in the other proteome. These additional members of an orthology group are called inparalogs. A confidence value is provided for each inparalog, showing how closely related it is to its seed ortholog.
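The steps described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the InParanoid implementation: the function names, the dictionary-based score tables, and the toy scores are all hypothetical, scores are assumed to be symmetric, and the sketch omits confidence values, tie handling, score cut-offs, and the resolution of overlapping groups.

```python
def similarity(scores, x, y):
    """Look up a pairwise score regardless of key order (0 if absent)."""
    return scores.get((x, y), scores.get((y, x), 0))


def build_groups(between, within_a, within_b):
    """Seed groups with two-way best hits, then add inparalogs.

    between:  {(protein_a, protein_b): score} across the two proteomes.
    within_a: {(protein_a1, protein_a2): score} inside proteome A.
    within_b: {(protein_b1, protein_b2): score} inside proteome B.
    """
    # Best cross-proteome hit for every protein, in both directions.
    best_a, best_b = {}, {}
    for (a, b), score in between.items():
        if a not in best_a or score > best_a[a][1]:
            best_a[a] = (b, score)
        if b not in best_b or score > best_b[b][1]:
            best_b[b] = (a, score)

    def inparalogs(seed, best, within):
        # A protein joins the group if it is closer to the seed of its own
        # proteome than to any sequence in the other proteome.
        candidates = set(best) | {p for pair in within for p in pair}
        return sorted(
            x for x in candidates
            if x != seed
            and similarity(within, seed, x) > best.get(x, (None, 0))[1]
        )

    groups = []
    for a in sorted(best_a):
        b, score = best_a[a]
        if best_b[b][0] != a:
            continue  # not a two-way best hit, so not a seed pair
        groups.append({
            "seeds": (a, b),
            "score": score,
            "inparalogs_a": inparalogs(a, best_a, within_a),
            "inparalogs_b": inparalogs(b, best_b, within_b),
        })
    return groups


# Toy scores, invented for illustration only.
between = {("A1", "B1"): 100, ("A2", "B1"): 80, ("A3", "B2"): 90, ("A1", "B2"): 20}
within_a = {("A1", "A2"): 95, ("A1", "A3"): 10}
within_b = {("B1", "B2"): 15}

groups = build_groups(between, within_a, within_b)
for group in groups:
    print(group)
```

In this toy data, A2 is more similar to the seed A1 than to its best hit in the other proteome, so it joins the A1/B1 group as an inparalog, while A3 and B2 form a separate group.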

Why should I use InParanoid?
By definition, orthologs between two species have evolved from a single gene in their common ancestor. Thus, orthologs are likely to have the same function in both species. Another way to detect orthologs is from phylogenetic trees. This approach is widely used for single gene families, but it is slow and difficult to automate. Moreover, it requires preliminary steps such as clustering genes into homologous families and creating multiple alignments. In addition, the topology of a phylogenetic tree depends strongly on the choice of tree-building method.

Automatic clustering methods based on two-way best genome-wide matches, on the other hand, have so far not effectively separated inparalogs from outparalogs. The problem of inparalog clustering is more important for analyzing eukaryotic genomes, because eukaryotic genes form large homologous families that cannot be classified by simple best-best hit methods. InParanoid is a fully automatic method for finding orthologs and inparalogs between two species. Ortholog clusters in InParanoid are seeded with a two-way best pairwise match, after which an algorithm for adding inparalogs is applied. The method bypasses multiple alignments and phylogenetic trees, which can be slow and error-prone steps in classical ortholog detection. Still, it robustly detects complex orthologous relationships and assigns confidence values to inparalogs.

Download
The ortholog groups in InParanoiDB 9 are inferred using the InParanoid-DIAMOND algorithm, available for download here
The orthologs are complemented with orthology information at the protein domain level, inferred using Domainoid

Contact
For general questions about the InParanoid program, please sign up for the InParanoid mailing list here
If you want to report a bug, or if you are having issues with the website, please send us an email at
emma.persson@scilifelab.se

Cite

InParanoiDB 9
Persson, E. and Sonnhammer, E.L.L. InParanoiDB 9: Ortholog groups for protein domains and full-length proteins. Journal of Molecular Biology. (2023) https://doi.org/10.1016/j.jmb.2023.168001
InParanoid-DIAMOND
Persson, E. and Sonnhammer, E.L.L. InParanoid-DIAMOND: faster orthology analysis with the InParanoid algorithm. Bioinformatics, Volume 38, Issue 10. (2022) https://doi.org/10.1093/bioinformatics/btac194
Domainoid
Persson, E., Kaduk, M., Forslund, S.K. et al. Domainoid: domain-oriented orthology inference. BMC Bioinformatics 20, 523 (2019). https://doi.org/10.1186/s12859-019-3137-2
InParanoid 8
Erik L.L. Sonnhammer and Gabriel Östlund. InParanoid 8: orthology analysis between 273 proteomes, mostly eukaryotic. Nucleic Acids Res. 43:D234-D239 (2015) https://doi.org/10.1093/nar/gku1203
InParanoid 7
Ostlund G, Schmitt T, Forslund K, Kostler T, Messina DN, Roopra S, Frings O and Sonnhammer ELL. InParanoid 7: new algorithms and tools for eukaryotic orthology analysis. Nucleic Acids Res. 38:D196-D203 (2010)
InParanoid 6
Berglund AC, Sjolund E, Ostlund G and Sonnhammer ELL. InParanoid 6: eukaryotic ortholog clusters with inparalogs. Nucleic Acids Res. 36:D263-266 (2008)
InParanoid database
O'Brien Kevin P, Remm Maido and Sonnhammer Erik L.L. InParanoid: A Comprehensive Database of Eukaryotic Orthologs. Nucleic Acids Res. 33:D476-D480 (2005)
InParanoid algorithm
Maido Remm, Christian E. V. Storm, and Erik L. L. Sonnhammer. Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. J. Mol. Biol. 314:1041-1052 (2001)

License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License