Better Autologistic Regression

Abstract

Autologistic regression is an important probability model for dichotomous random variables observed along with covariate information. It has been used in various fields for analyzing binary data possessing spatial or network structure. The model can be viewed as an extension of the autologistic model (also known as the Ising model, quadratic exponential binary distribution, or Boltzmann machine) to include covariates. It can also be viewed as an extension of logistic regression to handle responses that are not independent. Not all authors use exactly the same form of the autologistic regression model. Variations of the model differ in two respects. First, the variable coding (the two numbers used to represent the two possible states of the variables) might differ. Common coding choices are (zero, one) and (minus one, plus one). Second, the model might appear in either of two algebraic forms: a standard form, or a recently proposed centered form. Little attention has been paid to the effect of these differences, and the literature shows ambiguity about their importance. It is shown here that changes to either coding or centering in fact produce distinct, non-nested probability models. Theoretical results, numerical studies, and analysis of an ecological data set all show that the differences among the models can be large and practically significant. Understanding the nature of the differences and making appropriate modelling choices can lead to significantly improved autologistic regression analyses. The results strongly suggest that the standard model with plus/minus coding, which we call the symmetric autologistic model, is the most natural choice among the autologistic variants.
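
To make the role of the variable coding concrete, the following is a minimal sketch, not the authors' code. It assumes the common pairwise (Ising-type) form of the standard model, in which the probability of a configuration y is proportional to exp(sum_i alpha_i y_i + lambda * sum over edges of y_i y_j). That functional form, the function name autologistic_pmf, the three-node path graph, and all parameter values are illustrative assumptions and do not come from the abstract. The sketch enumerates every configuration of a toy graph and compares the distributions obtained from the same parameter values under (0, 1) and (-1, +1) coding.

```python
from itertools import product
from math import exp


def autologistic_pmf(alpha, edges, lam, coding):
    """Joint pmf by brute-force enumeration under an ASSUMED pairwise form.

    Assumption (not stated in the abstract): P(y) is proportional to
    exp(sum_i alpha[i] * y_i + lam * sum_{(i, j) in edges} y_i * y_j),
    where each y_i takes one of the two values in `coding`.
    """
    low, high = coding
    n = len(alpha)
    weights = {}
    # Index states by 0/1 labels so the two codings can be compared directly.
    for state in product((0, 1), repeat=n):
        y = [high if s else low for s in state]
        unary = sum(a * yi for a, yi in zip(alpha, y))
        pairwise = lam * sum(y[i] * y[j] for i, j in edges)
        weights[state] = exp(unary + pairwise)
    total = sum(weights.values())  # normalizing constant
    return {state: w / total for state, w in weights.items()}


# Hypothetical toy setup: 3 nodes on a path graph, arbitrary parameter values.
alpha = [0.5, -0.2, 0.1]
edges = [(0, 1), (1, 2)]
lam = 0.8

pmf_01 = autologistic_pmf(alpha, edges, lam, coding=(0, 1))
pmf_pm = autologistic_pmf(alpha, edges, lam, coding=(-1, 1))

# Largest discrepancy between the two distributions over all 8 states.
max_gap = max(abs(pmf_01[s] - pmf_pm[s]) for s in pmf_01)
print(f"largest difference in state probabilities: {max_gap:.3f}")
```

This only shows that identical parameter values describe different distributions once the coding changes. It does not by itself establish the stronger non-nestedness result stated in the abstract, which concerns whether any reparameterization within the regression structure can recover one model from the other.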
