Understanding Topic Models in Context: A Mixed-Methods Approach to the Meaningful Analysis of Large Document Collections

Abstract

Recent years have seen an unprecedented proliferation of large document collections, which has created a need for appropriate analytical means. In particular, to capture the thematic composition of such collections, researchers increasingly draw on quantitative topic models, among the most prominent of which is Latent Dirichlet Allocation (LDA). Yet these models have significant drawbacks; for example, the generated topics lack context and, as a result, meaning. Prior research has rarely addressed this limitation through the lens of mixed-methods research. We address this gap by proposing a structured mixed-methods approach to the meaningful analysis of large document collections. Specifically, we draw on qualitative coding and quantitative hierarchical clustering to validate and enhance topic models through re-contextualization. To illustrate the proposed approach, we conduct a case study of the thematic composition of the AIS Senior Scholars' Basket of Journals.
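
The abstract does not specify how the hierarchical clustering step is carried out, so the following is only a minimal sketch of one way topics could be grouped once a topic model has produced topic-word distributions. Everything specific here is an assumption made for illustration: the toy distributions in `TOPICS`, the use of cosine distance, average linkage, and the helper names `cosine_distance` and `agglomerate` are hypothetical and are not taken from the paper.

```python
import math


def cosine_distance(p, q):
    """Cosine distance between two topic-word weight vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / (norm_p * norm_q)


def agglomerate(topics, n_clusters):
    """Average-linkage agglomerative clustering of topics.

    `topics` maps a topic name to its word-weight vector. Starting from
    one cluster per topic, the two closest clusters are merged until
    only `n_clusters` remain.
    """
    clusters = [[name] for name in topics]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                dists = [
                    cosine_distance(topics[a], topics[b])
                    for a in clusters[i]
                    for b in clusters[j]
                ]
                avg = sum(dists) / len(dists)
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical topic-word distributions over a four-word vocabulary
# (illustrative values only, not results from the paper).
TOPICS = {
    "topic_0": [0.70, 0.20, 0.05, 0.05],
    "topic_1": [0.60, 0.30, 0.05, 0.05],
    "topic_2": [0.05, 0.05, 0.50, 0.40],
    "topic_3": [0.05, 0.05, 0.40, 0.50],
}

groups = agglomerate(TOPICS, n_clusters=2)
print(groups)
```

In a mixed-methods workflow of the kind the abstract describes, groupings like these would presumably be compared against categories derived from qualitative coding; that comparison is not shown here.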
