首页> 外文会议>IEEE Annual Symposium on Foundations of Computer Science >An Automatic Inequality Prover and Instance Optimal Identity Testing
【24h】

An Automatic Inequality Prover and Instance Optimal Identity Testing

机译:自动不等式证明和实例最佳身份测试

获取原文

摘要

We consider the problem of verifying the identity of a distribution: Given the description of a distribution over a discrete support p = (p1, p2,, pn) how many samples (independent draws) must one obtain from an unknown distribution, q, to distinguish, with high probability, the case that p = q from the case that the total variation distance (L1 distance) ||p -- q|| 1≥ ε? We resolve this question, up to constant factors, on an instance by instance basis: there exist universal constants c, c' and a function f(p, ε) on distributions and error parameters, such that our tester distinguishes the case that p = q from ||p -- q|| 1≥ ε using f(p, ε) samples with success probability >2/3, but no tester can distinguish p = q ||p -- q|| 1≥ ε c ⋅ ε when given c' f(p, ε) samples. The function f(p, ε) is upper-bounded by a multiple of the ||p||2/3ε2, but is more complicated, and is significantly smaller in some cases when p has many small domain elements, or a single large one. This result significantly generalizes and tightens previous results: since distributions of support at most n have L2/3 norm bounded by √(n), this result immediately shows that for such distributions, O(√(n)/ε2) samples suffice, tightening the previous bound of O(√(n)polylog n/ε4) for this class of distributions, and matching the (tight) known results for the case that p is the uniform distribution over support n. The analysis of our very simple testing algorithm involves several hairy inequalities. To facilitate this analysis, we give a complete characterization of a general class of inequalities -- generalizing Cauchy-Schwarz, Holder's inequality, and the monotonicity of Lp norms. Specifically, we characterize the set of sequences (- )i = a1,, ar, (b) i = b1,, br, (c)i = c1,, cr, for which it holds that for all finite sequences of positive numbers (x)j = x1, and (y)j = y1, rΠi=1(Σj xai, j yjbi) ci ≥1. For example, the standard Cauchy-Schwarz inequality corresponds to the sequences a = (1,0,1/2), b = (0,1,1/2), c = (1/2, 1/2, -- 1). Our characterization is of a non-traditional nature in that it uses linear programming to compute a derivation that may otherwise have to be sought through trial and error, by hand. We do not believe such a characterization has appeared in the literature, and hope its computational nature will be useful to others, and facilitate analyses like the one here.
机译:我们考虑验证分布身份的问题:鉴于对离散支持p =(p1,p2 ,, pn)上的分布的描述,必须从未知分布q中获得多少个样本(独立抽取)才能p = q与总变化距离(L1距离)|| p-q ||的可能性极高的区别1≥&?我们根据实例逐个实例地解决这个问题,直到常量因子为止:存在通用常量c,c'和分布和误差参数的函数f(p,ε),因此我们的测试人员可以区分出p =来自|| p-q ||的q 1≥ε使用成功概率> 2/3的f(p,ε)样本,但是没有测试者可以区分p = q || p-q || 1≥ε c⋅ε给定c'f(p,ε)样本时。函数f(p,ε)由|| p || 2/3ε 2的倍数上限,但更为复杂,在某些情况下(当p具有许多小的域元素时)变小得多;或者一个大的。该结果显着概括并收紧了先前的结果:由于最多n个支持的分布具有以√(n)为边界的L2 / 3范数,因此该结果立即表明,对于这种分布,O(√(n)/ε 2)样本就足够了,收紧此类分布的O(√(n)polylog n /ε 4)的上界,并在p是支撑n上的均匀分布的情况下匹配(紧)已知结果。我们非常简单的测试算法的分析涉及多个毛发不等式。为便于进行此分析,我们对不等式的一般类别进行了完整的刻画-概括了Cauchy-Schwarz,Holder不等式和Lp范式的单调性。具体来说,我们表征序列集(-)i = a1,ar,(b)i = b1,br,(c)i = c1,cr,对于所有有限的正数序列, (x)j = x1,(y)j = y1,rΠi= 1(Σjxai,j y​​jbi)ci≥1。例如,标准的Cauchy-Schwarz不等式对应于以下序列a =(1,0,1 / 2),b =(0,1,1 / 2),c =(1/2,1/2,- 1)。我们的表征是非传统的,因为它使用线性编程来计算导数,否则可能需要通过反复试验来手工寻找。我们不认为这种特征已经出现在文献中,希望它的计算性质对其他人有用,并有助于像此处的分析那样进行分析。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号