Pacific Symposium on Biocomputing (PSB); January 4-8, 2005; Hawaii, HI (US)

IMPROVING FUNCTIONAL ANNOTATION OF NON-SYNONOMOUS SNPs WITH INFORMATION THEORY


Abstract

Automated functional annotation of nsSNPs requires that amino-acid residue changes be represented by a set of descriptive features, such as evolutionary conservation, side-chain volume change, effect on ligand binding, and residue structural rigidity. Identifying the most informative combinations of features is critical to the success of a computational prediction method. We rank 32 features according to their mutual information with functional effects of amino-acid substitutions, as measured by in vivo assays. In addition, we use a greedy algorithm to identify a subset of highly informative features. The method is simple to implement and provides a quantitative measure for selecting the best predictive features given a set of features that a human expert believes to be informative. We demonstrate the usefulness of the selected highly informative features by cross-validated tests of a computational classifier, a support vector machine (SVM). The SVM's classification accuracy is highly correlated with the ranking of the input features by their mutual information. Two features describing the solvent accessibility of "wild-type" and "mutant" amino-acid residues and one evolutionary feature based on superfamily-level multiple alignments produce comparable overall accuracy and 6% fewer false positives than a 32-feature set that considers physicochemical properties of amino acids, protein electrostatics, amino-acid residue flexibility, and binding interactions.
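The ranking and greedy selection described in the abstract can be illustrated with a minimal sketch. The abstract does not give the estimator, the discretization, or the greedy criterion, so everything below is an assumption made for illustration: features are already discretized, mutual information is estimated from raw counts, and the greedy step adds whichever feature most increases the joint mutual information of the selected set with the label. The feature names and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def rank_features(features, labels):
    """Return (name, score) pairs sorted by decreasing mutual information with labels."""
    scores = {name: mutual_information(col, labels) for name, col in features.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def greedy_select(features, labels, k):
    """Forward selection (assumed criterion): repeatedly add the feature that
    maximizes the joint mutual information of the selected set with the labels."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        def joint_mi(name):
            cols = [features[f] for f in selected + [name]]
            joint = list(zip(*cols))  # one tuple of feature values per sample
            return mutual_information(joint, labels)

        best = max(remaining, key=joint_mi)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 8 substitutions, label 1 = functional effect observed.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "conservation":  [0, 0, 0, 1, 1, 1, 1, 1],
    "accessibility": [1, 1, 0, 0, 1, 1, 1, 1],
    "volume_change": [0, 0, 1, 1, 0, 0, 1, 1],
}

if __name__ == "__main__":
    for name, score in rank_features(features, labels):
        print(f"{name}: {score:.3f} bits")
    print(greedy_select(features, labels, 2))
```

In this toy data, no single feature determines the label, but the two features chosen by the greedy step do so jointly. A count-based estimate of joint mutual information becomes unreliable as the selected set grows, because the number of distinct value combinations quickly approaches the number of samples, so a real analysis would need a more careful estimator or stopping rule.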
