Journal of Molecular Biology

Proteins and domains vary in their tolerance of non-synonymous single nucleotide polymorphisms (nsSNPs)

Abstract

The widespread application of whole-genome sequencing is identifying numerous non-synonymous single nucleotide polymorphisms (nsSNPs), many of which are associated with disease. We analyzed nsSNPs from Humsavar and the 1000 Genomes Project to investigate why some proteins and domains are more tolerant of mutations than others. We identified 311 proteins and 112 Pfam families, corresponding to 2910 domains, as disease susceptible and 32 proteins and 67 Pfam families (10,783 domains) as disease resistant based on the relative numbers of disease-associated and neutral polymorphisms. Proteins with no significant difference from expected numbers of disease and polymorphism nsSNPs are classified as other. This classification takes into account the phenotypes of all known mutations in the protein or domain rather than simply classifying based on the presence or absence of disease nsSNPs. Of the two hypotheses suggested, our results support the model that disease-resistant domains and proteins are more able to tolerate mutations rather than having more lethal mutations that are not observed. Disease-resistant proteins and domains show significantly higher mutation rates and lower sequence conservation than disease-susceptible proteins and domains. Disease-susceptible proteins are more likely to be encoded by essential genes, are more central in protein-protein interaction networks and are less likely to contain loss-of-function mutations in healthy individuals. We use this classification for nsSNP phenotype prediction, predicting nsSNPs in disease-susceptible domains to be disease and those in disease-resistant domains to be polymorphism. In this way, we achieve higher accuracy than SIFT, a state-of-the-art algorithm.
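The classification and prediction rule described in the abstract can be illustrated with a minimal sketch. The abstract does not state which statistical test or significance threshold the authors used, so the exact one-sided binomial test, the `alpha` cutoff of 0.05, the absence of multiple-testing correction, and all function and parameter names below are assumptions made for illustration only. The idea shown is comparing a protein's (or domain's) observed count of disease nsSNPs against the count expected from a background disease fraction.

```python
from math import comb


def binom_tail_ge(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def binom_tail_le(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(0, k + 1))


def classify(n_disease, n_neutral, background_disease_fraction, alpha=0.05):
    """Label a protein or domain by its excess or deficit of disease nsSNPs.

    Assumption: an exact one-sided binomial test against a background
    disease fraction; the source does not specify the test or threshold.
    """
    n = n_disease + n_neutral
    if n == 0:
        return "other"  # no observed nsSNPs, so no evidence either way
    p = background_disease_fraction
    if binom_tail_ge(n_disease, n, p) < alpha:
        return "disease susceptible"  # more disease nsSNPs than expected
    if binom_tail_le(n_disease, n, p) < alpha:
        return "disease resistant"  # fewer disease nsSNPs than expected
    return "other"


def predict_nssnp(region_class):
    """Predict an nsSNP phenotype from the class of the region it falls in.

    Returning None for "other" is an assumption: the abstract only states
    predictions for susceptible and resistant domains.
    """
    if region_class == "disease susceptible":
        return "disease"
    if region_class == "disease resistant":
        return "polymorphism"
    return None


if __name__ == "__main__":
    # Hypothetical counts, not data from the study.
    label = classify(n_disease=20, n_neutral=0, background_disease_fraction=0.4)
    print(label, predict_nssnp(label))
```

In a real analysis covering thousands of proteins and Pfam families, a correction for multiple testing would normally be applied before assigning labels; that step is omitted here for brevity.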
