Frontiers in Microbiology

VirionFinder: Identification of Complete and Partial Prokaryote Virus Virion Protein From Virome Data Using the Sequence and Biochemical Properties of Amino Acids



Abstract

Viruses are among the most abundant biological entities on Earth, and prokaryote viruses are the dominant members of the viral community. Because prokaryote viruses are so diverse, many genes from newly discovered prokaryote viruses cannot be functionally annotated by searching current databases; an alignment-free algorithm for the functional annotation of prokaryote virus proteins is therefore important for understanding the viral community. The identification of prokaryote virus virion proteins (PVVPs) is a critical step in many viral analyses, such as species classification, phylogenetic analysis, and the exploration of how prokaryote viruses interact with their hosts. Although a series of PVVP prediction tools have been developed, their performance is still not satisfactory. Moreover, viral metagenomic data contain fragmented sequences, so some genes are incomplete, and a tool that can identify partial PVVPs is also needed. In this work, we present a novel algorithm, called VirionFinder, to distinguish complete and partial PVVPs from non-prokaryote virus virion proteins (non-PVVPs). VirionFinder encodes protein sequences using the sequence and biochemical properties of the 20 amino acids as its mathematical model, and uses a deep learning technique to decide whether a given protein is a PVVP. In comparisons with state-of-the-art tools on artificial benchmark datasets, the sensitivity (Sn) of VirionFinder is approximately 10–34% higher than the Sn of these tools at the same specificity (Sp), on both complete and partial proteins. When the tools are evaluated on real virome data, VirionFinder's recognition rate for PVVP-like sequences is also much higher than that of the other tools.
We expect that VirionFinder will be a powerful tool for identifying novel virion proteins from both complete prokaryote virus genomes and viral metagenomic data. VirionFinder is freely available at https://github.com/zhenchengfang/VirionFinder .
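
The abstract describes encoding each protein by the sequence and biochemical properties of its amino acids before passing it to a deep learning model. The following is a minimal sketch of what such an encoding could look like, not VirionFinder's actual implementation. The function name `encode_protein`, the choice of two properties (Kyte-Doolittle hydropathy and a simplified side-chain charge), the zero-padding to a fixed length, and the all-zero row for unknown residues are all assumptions made for illustration; the abstract does not specify which properties or layout the tool uses.

```python
# Illustrative encoding only: the property set and layout are assumptions,
# not the features used by VirionFinder itself.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

# Kyte-Doolittle hydropathy index for each amino acid.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified side-chain charge near neutral pH (histidine treated as neutral).
CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}

ROW_WIDTH = len(AMINO_ACIDS) + 2  # 20 one-hot columns + 2 property columns


def encode_protein(seq, max_len=None):
    """Return one row per residue: a 20-column one-hot vector followed by
    hydropathy and charge. Unknown residues (e.g. 'X') become all-zero rows.
    If max_len is given, the result is truncated or zero-padded to that length,
    so complete and partial proteins share one input shape."""
    rows = []
    for aa in seq.upper():
        one_hot = [1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS]
        props = [KD_HYDROPATHY.get(aa, 0.0), CHARGE.get(aa, 0.0)]
        rows.append(one_hot + props)
    if max_len is not None:
        rows = rows[:max_len]
        rows += [[0.0] * ROW_WIDTH for _ in range(max_len - len(rows))]
    return rows


if __name__ == "__main__":
    matrix = encode_protein("MKX", max_len=5)
    print(len(matrix), "rows x", len(matrix[0]), "columns")
```

A matrix of this shape is the kind of input a convolutional or recurrent network can consume directly, which is one reason a fixed-length, per-residue encoding is a common design choice for alignment-free protein classifiers.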
