
Exploring the item difficulty and other psychometric properties of the core perceptual, verbal, and working memory subtests of the WAIS-IV using item response theory


Abstract

The ceiling and basal rules of the Wechsler Adult Intelligence Scale – Fourth Edition (WAIS-IV; Wechsler, 2008) only function as intended if subtest items proceed in order of difficulty. While many aspects of the WAIS-IV have been researched, there is no literature on subtest item difficulty, and precise item difficulty values are not available. The WAIS-IV was developed within the framework of Classical Test Theory (CTT), in which item difficulty is most often determined using p-values. One limitation of this method is that item difficulty values are sample dependent: both the standard error of measurement, an important indicator of reliability, and p-values change when the sample changes. A different framework within which psychological tests can be created, analyzed and refined is Item Response Theory (IRT). IRT places items and person ability on the same scale using linear transformations and links item difficulty to person ability. As a result, IRT is said to produce sample-independent statistics. Rasch modeling, a form of IRT, is a one-parameter logistic model appropriate for items with only two response options. It assumes that the only factors affecting test performance are characteristics of the items, such as their difficulty level or their relationship to the construct being measured, and characteristics of the participants, such as their ability levels. The partial credit model is similar to the standard dichotomous Rasch model, except that it is appropriate for items with more than two response options.
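The two models described above can be made concrete with a minimal sketch. This is an illustrative Python implementation of the standard textbook response functions, not code from the study; the function names and threshold values are hypothetical:

```python
import math

def rasch_p(theta, b):
    """Dichotomous Rasch model: probability of a correct response,
    P(X=1 | theta, b) = exp(theta - b) / (1 + exp(theta - b)),
    where theta is person ability and b is item difficulty."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def pcm_probs(theta, thresholds):
    """Partial credit model: probabilities of each score category 0..m
    for a polytomous item with step difficulties `thresholds`."""
    # Cumulative sums of (theta - tau_k) give the unnormalized log-weights
    # for each category; category 0 has log-weight 0 by convention.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (theta - tau))
    weights = [math.exp(l) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# A person whose ability equals the item's difficulty has a 50% chance:
# rasch_p(0.0, 0.0) == 0.5, and pcm_probs always sums to 1 across categories.
```

When an item has exactly one step, the partial credit model reduces to the dichotomous Rasch model, which is why the study could apply one or the other depending on each subtest's scoring.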
Proponents of the standard dichotomous Rasch model argue that it has distinct advantages over both CTT-based methods and other IRT models (Bond & Fox, 2007; Embretson & Reise, 2000; Furr & Bacharach, 2013; Hambleton & Jones, 1993) because of the principle of monotonicity, also referred to as specific objectivity or the principle of additivity (double cancellation), which “establishes that two parameters are additively related to a third variable” (Embretson & Reise, 2000, p. 148). In other words, because of the principle of monotonicity, the probability of correctly answering an item in Rasch modeling is an additive function of the individual’s ability, or trait level, and the item’s degree of difficulty. As ability increases, so does an individual’s probability of answering that item correctly. Because only item difficulty and person ability affect an individual’s chance of correctly answering an item, inter-individual comparisons can be made even if individuals did not receive identical items or items of the same difficulty level. This is why Rasch modeling is referred to as test-free measurement. The purpose of this study was to apply a standard dichotomous Rasch model or partial credit model to the individual items of the seven core perceptual, verbal and working memory subtests of the WAIS-IV: Block Design, Matrix Reasoning, Visual Puzzles, Similarities, Vocabulary, Information, Arithmetic, Digits Forward, Digits Backward and Digit Sequencing. Results revealed that WAIS-IV subtests fall into one of three categories: optimally ordered, near-optimally ordered and sub-optimally ordered. The optimally ordered subtests, Digits Forward and Digits Backward, had no disordered items. Near-optimally ordered subtests were those with one to three disordered items and included Digit Sequencing, Arithmetic, Similarities and Block Design.
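The test-free property described above follows directly from the model's additivity: on the log-odds scale, the gap between two people is the difference in their abilities, whatever item they face. A short sketch, with purely hypothetical ability and difficulty values, makes this concrete:

```python
import math

def rasch_p(theta, b):
    """Rasch probability of success; depends only on the difference theta - b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def log_odds(theta, b):
    """Log-odds of success, ln(P / (1 - P)), which reduces to theta - b."""
    p = rasch_p(theta, b)
    return math.log(p / (1.0 - p))

# Two hypothetical examinees and three items of very different difficulty.
theta_1, theta_2 = 1.2, 0.4
for b in (-1.0, 0.0, 2.5):
    gap = log_odds(theta_1, b) - log_odds(theta_2, b)
    # The log-odds gap equals theta_1 - theta_2 no matter which item is used,
    # so the two examinees can be compared without sharing any items.
    assert abs(gap - (theta_1 - theta_2)) < 1e-9
```

This invariance across items is exactly what allows inter-individual comparison even when examinees did not receive identical items.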
Sub-optimally ordered subtests consisted of Matrix Reasoning, Visual Puzzles, Information and Vocabulary, with the number of disordered items ranging from six to 16. Two major implications of these results were considered: the impact on individuals’ scores and the impact on overall test administration time. While the number of disordered items ranged from 0 to 16, the overall impact on raw scores was deemed minimal. Because of where the disordered items occur within each subtest, most individuals are administered all the items they would be expected to answer correctly. A one-point reduction in any one subtest is unlikely to significantly affect overall index scores, which are the scores most commonly interpreted in the WAIS-IV. However, if an individual received a one-point reduction across all subtests, the impact on index scores could be more noticeable. In cases where individuals discontinue before having a chance to answer easier items, clinicians may consider testing the limits. While this would have no impact on raw scores, it may give clinicians a better understanding of individuals’ true abilities. Based on the findings of this study, clinicians may consider administering only certain items, selected by difficulty value, in order to test the limits. This study found that the start point for most subtests is too easy for most individuals. For some subtests, most individuals may be administered more than 10 items that are too easy for them. Other than increasing overall administration time, it is not clear what impact, if any, this has. However, it does suggest the need to reevaluate current start items so that they serve as a true basal for most people. Future studies should break standard test administration by ignoring basal and ceiling rules to collect data on more items.
To help clarify why some items are more or less difficult than their ordinal rank would suggest, future studies should include a qualitative component in which, after each subtest, individuals are asked to describe what they found easy and difficult about each item. Finally, future research should examine the effects of item ordering on participant performance. While this study revealed that only minimal reductions in index scores are likely to result from prematurely stopping test administration, it is not known whether disordering has other impacts on performance, perhaps by increasing or decreasing an individual’s confidence.

Bibliographic Information

  • Author

    Schleicher-Dilks Sara Ann;

  • Affiliation
  • Year: 2015
  • Pages
  • Format: PDF
  • Language
  • CLC Classification

Similar Literature

  • Foreign-language literature
  • Chinese literature
  • Patents
