Journal of Molecular Biology

Automatic clustering of orthologs and in-paralogs from pairwise species comparisons.



Abstract

Orthologs are genes in different species that originate from a single gene in the last common ancestor of these species. Such genes have often retained identical biological roles in the present-day organisms. It is hence important to identify orthologs for transferring functional information between genes in different organisms with a high degree of reliability. For example, orthologs of human proteins are often functionally characterized in model organisms. Unfortunately, orthology analysis between human and e.g. invertebrates is often complex because of large numbers of paralogs within protein families. Paralogs that predate the species split, which we call out-paralogs, can easily be confused with true orthologs. Paralogs that arose after the species split, which we call in-paralogs, however, are bona fide orthologs by definition.

Orthologs and in-paralogs are typically detected with phylogenetic methods, but these are slow and difficult to automate. Automatic clustering methods based on two-way best genome-wide matches, on the other hand, have so far not separated in-paralogs from out-paralogs effectively.

We present a fully automatic method for finding orthologs and in-paralogs from two species. Ortholog clusters are seeded with a two-way best pairwise match, after which an algorithm for adding in-paralogs is applied. The method bypasses multiple alignments and phylogenetic trees, which can be slow and error-prone steps in classical ortholog detection. Still, it robustly detects complex orthologous relationships and assigns confidence values for both orthologs and in-paralogs. The program, called INPARANOID, was tested on all completely sequenced eukaryotic genomes. To assess the quality of INPARANOID results, ortholog clusters were generated from a dataset of worm and mammalian transmembrane proteins, and were compared to clusters derived by manual tree-based ortholog detection methods. This study led to the identification with a high degree of confidence of over a dozen novel worm-mammalian ortholog assignments that were previously undetected because of shortcomings of phylogenetic methods.

A WWW server that allows searching for orthologs between human and several fully sequenced genomes is installed at http://www.cgb.ki.se/inparanoid/. This is the first comprehensive resource with orthologs of all fully sequenced eukaryotic genomes. Programs and tables of orthology assignments are available from the same location. Copyright 2001 Academic Press.
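
The two-step idea described in the abstract (seed each cluster with a two-way best match, then add in-paralogs) can be illustrated with a minimal Python sketch. This is not the INPARANOID program. The function names, the score table layout, and the in-paralog rule used here are assumptions made for illustration: a gene is treated as an in-paralog when it is at least as similar to the seed ortholog of its own species as that seed is to its partner in the other species. The abstract does not state the exact criterion, and the confidence values it mentions are not modeled.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical score table: (gene_x, gene_y) -> pairwise similarity score.
Scores = Dict[Tuple[str, str], float]


def sim(scores: Scores, x: str, y: str) -> float:
    """Symmetric lookup; a missing pair counts as no detectable similarity."""
    return scores.get((x, y), scores.get((y, x), 0.0))


def best_match(query: str, candidates: List[str], scores: Scores) -> Optional[str]:
    """Return the highest-scoring candidate for query, or None if all scores are 0."""
    best, best_score = None, 0.0
    for cand in candidates:
        s = sim(scores, query, cand)
        if s > best_score:
            best, best_score = cand, s
    return best


def seed_orthologs(genes_a: List[str], genes_b: List[str],
                   scores: Scores) -> List[Tuple[str, str]]:
    """Find two-way best matches: a's best hit is b, and b's best hit is a."""
    seeds = []
    for a in genes_a:
        b = best_match(a, genes_b, scores)
        if b is not None and best_match(b, genes_a, scores) == a:
            seeds.append((a, b))
    return seeds


def add_inparalogs(seed: Tuple[str, str], genes_a: List[str],
                   genes_b: List[str], scores: Scores) -> dict:
    """Assumed rule: keep same-species genes that score at least as high
    against the seed ortholog as the two seed orthologs score against each other."""
    a, b = seed
    cutoff = sim(scores, a, b)
    in_a = [g for g in genes_a if g != a and sim(scores, a, g) >= cutoff]
    in_b = [g for g in genes_b if g != b and sim(scores, b, g) >= cutoff]
    return {"seed": seed, "inparalogs_a": in_a, "inparalogs_b": in_b}


if __name__ == "__main__":
    # Made-up toy scores, for illustration only.
    genes_a = ["A1", "A2", "A3"]
    genes_b = ["B1", "B2"]
    scores = {
        ("A1", "B1"): 100.0, ("A2", "B1"): 90.0, ("A1", "A2"): 150.0,
        ("A3", "B2"): 80.0, ("A3", "B1"): 20.0, ("A1", "A3"): 30.0,
        ("B1", "B2"): 25.0,
    }
    for seed in seed_orthologs(genes_a, genes_b, scores):
        print(add_inparalogs(seed, genes_a, genes_b, scores))
```

In the toy data, A2 is closer to A1 than A1 is to B1, so under the assumed rule it joins the A1/B1 cluster as an in-paralog, while A3 and B2 form a separate cluster with no in-paralogs.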
