Journal: Applied Soft Computing

Type II fuzzy set-based data analytics to explore amino acid associations in protein sequences of Swine Influenza Virus


Abstract

The veracity issues present in molecular data available in biological databases pose new challenges for data analytics. Analyzing the molecular data of various diseases can provide vital information for better understanding the molecular mechanism of a disease. This paper proposes a model that addresses the issue of veracity in data analytics for amino acid association patterns in protein sequences of Swine Influenza Virus. The veracity issue is caused by intra-sequential and inter-sequential biases present in the sequences due to varying degrees of relationships among amino acids. A complete dataset of 63,682 protein sequences was downloaded from NCBI and refined. The refined dataset, consisting of 26,594 sequences, is used in the present study. A type I fuzzy set is employed to explore amino acid association patterns in the dataset. The type I fuzzy support is refined to partially remove the inter-sequential biases causing veracity issues in the data. The remaining inter-sequential biases present in the refined fuzzy support are evaluated and eliminated using a type II fuzzy set. Hence, it is concluded that a combination of the type II fuzzy and refined fuzzy approaches is the optimal approach for extracting a better picture of amino acid association patterns in the molecular dataset. (C) 2019 Elsevier B.V. All rights reserved.
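The abstract does not define the membership function or the fuzzy support it uses, so the following is only a minimal sketch of what a type I fuzzy support for an amino acid association might look like, not the authors' method. It assumes (hypothetically) that the membership grade of an amino acid in a sequence is its relative frequency in that sequence, and that the fuzzy support of a group of amino acids is the average, over all sequences, of the minimum membership grade in the group. The names `membership`, `fuzzy_support`, and `toy_sequences` are illustrative, and the toy sequences are not real Swine Influenza Virus data.

```python
from typing import Dict, List

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def membership(sequence: str) -> Dict[str, float]:
    """Assumed membership grade: relative frequency of each residue.

    Assumes a non-empty sequence; the paper's actual membership
    function is not given in the abstract.
    """
    n = len(sequence)
    return {aa: sequence.count(aa) / n for aa in AMINO_ACIDS}


def fuzzy_support(sequences: List[str], itemset: str) -> float:
    """Mean over sequences of the minimum membership grade in `itemset`.

    The minimum is a common choice for combining fuzzy memberships;
    using it here is an assumption, not something stated in the source.
    """
    total = 0.0
    for seq in sequences:
        grades = membership(seq)
        total += min(grades[aa] for aa in itemset)
    return total / len(sequences)


# Toy sequences for illustration only.
toy_sequences = ["AAGL", "AGGG", "LLLL"]
print(fuzzy_support(toy_sequences, "AG"))
```

Under these assumptions, a sequence contributes to the support of the pair only as much as its rarer member does, so a sequence dominated by a single residue cannot inflate the support of an association on its own. The refinement and type II steps described in the abstract are not sketched here because the source gives no detail on them.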
