Systematic Biology

Type I error and the power of the s-test: old lessons from a new, analytically justified statistical test for phylogenies

Abstract

We present a new procedure for assessing the statistical significance of the most likely unrooted dichotomous topology inferrable from four DNA sequences. The procedure calculates directly a P-value for the support given to this topology by the informative sites congruent with it, assuming the most likely star topology as the null hypothesis. Informative sites are crucial in the determination of the maximum likelihood dichotomous topology and are therefore an obvious target for a statistical test of phylogenies. Our P-value is the probability of producing through parallel substitutions on the branches of the star topology at least as much support as that given to the maximum likelihood dichotomous topology by the aforementioned informative sites, for any of the three possible dichotomous topologies. The degree of statistical significance is simply the complement of this P-value. Ours is therefore an a posteriori testing approach, in which no dichotomous topology is specified in advance. We implement the test for the case in which all sites behave identically and the substitution model has a single parameter. Under these conditions, the P-value can be easily calculated on the basis of the probabilities of change on the branches of the most likely star topology, because under these assumptions, each site can become informative independently from every other site; accordingly, the total number of informative sites of each kind is binomially distributed. We explore the test's type I error by applying it to data produced in star topologies having all branches equally long, or having two short and two long branches, and various degrees of homoplasy. The test is conservative, but we demonstrate, by means of a discreteness correction and progressively assumption-free calculations of the P-values, that the conservativeness is mostly due to the discrete nature of informative sites; moreover, the P-values calculated empirically are mostly quite accurate in absolute terms.
Applying the test to data produced in dichotomous topologies with increasing internal branch length shows that, despite the test's "conservativeness," its power is much higher than that of the bootstrap, especially when the relevant informative sites are few.
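
The binomial reasoning in the abstract can be illustrated with a short sketch. It is not the authors' implementation, and several things in it are assumptions made for illustration: the single-parameter model is taken to be Jukes-Cantor, the star topology is given four equal branches of length `t` (the abstract instead uses the most likely star topology), and the probability that any of the three dichotomous topologies reaches the observed support is replaced by a simple union (Bonferroni) upper bound. All function names are hypothetical.

```python
from math import comb, exp


def jc_probs(t):
    """Jukes-Cantor transition probabilities after branch length t
    (expected substitutions per site). Returns (s, d): the probability of
    ending on the same base, and on one specific different base."""
    e = exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e, 0.25 - 0.25 * e


def informative_site_prob(t):
    """Per-site probability of the pattern xxyy (x != y) for ONE fixed
    pairing of the four tips, on a star with four equal branches.
    Assumption: uniform base frequencies at the central node.
    Summing over the central base and the 12 ordered (x, y) pairs gives
    6 * d^2 * (s^2 + d^2)."""
    s, d = jc_probs(t)
    return 6.0 * d * d * (s * s + d * d)


def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i)
               for i in range(k, n + 1))


def p_value_upper_bound(n, k, t):
    """Union bound on the probability that at least one of the three
    pairings collects k or more supporting sites among n sites.
    This is an upper bound, not the exact joint probability."""
    p = informative_site_prob(t)
    return min(1.0, 3.0 * binom_tail(n, k, p))


if __name__ == "__main__":
    # Hypothetical numbers: 500 sites, 8 sites supporting the best topology.
    print(p_value_upper_bound(n=500, k=8, t=0.1))
```

Direct summation with `math.comb` is adequate for alignments of a few hundred sites; for much longer alignments the binomial coefficients exceed floating-point range, and a log-space or library-based tail would be the safer choice.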