Information Processing & Management

Unsupervised Latent Dirichlet Allocation for supervised question classification

Abstract

Question answering systems help users satisfy their information needs more precisely by providing focused responses to their questions. Among the systems developed for this purpose, community-based question answering has recently attracted researchers' attention because of the large volume of user-generated questions and answers on social question-and-answer platforms. Reusing such data sources requires an accurate information retrieval component enhanced by a question classifier. Question classification tells the system which category an input question belongs to, so that retrieval can focus on questions and answers from the relevant categories. In this paper, we propose a new method based on unsupervised Latent Dirichlet Allocation for classifying questions in community-based question answering. Our method first uses unsupervised topic modeling to extract topics from a large amount of unlabeled data. In the training phase, the learned topics are then associated with the category labels available in the training data. Finally, each category's mixture of topics is used to predict the labels of unseen data.
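
The three steps described in the abstract (learn topics from unlabeled questions, associate topics with category labels, predict from each category's topic mixture) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say how the topic-category association is estimated, so the sketch assumes each category's mixture is the mean topic distribution of its labeled questions and that prediction picks the category whose mixture has the largest dot product with the question's topic distribution. The function names (`fit_lda`, `infer_topics`, `train_category_mixtures`, `predict`), the hyperparameters, and the toy data are all hypothetical, and topic inference for a single question uses a simple per-word posterior average in place of full inference.

```python
import random
from collections import Counter, defaultdict

def fit_lda(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized, unlabeled documents.
    Returns a function giving the smoothed probability of a word under a topic."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    doc_topic = [[0] * n_topics for _ in docs]
    # Random initial topic assignment for every token.
    assign = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, z in zip(doc, assign[d]):
            topic_word[z][w] += 1
            topic_total[z] += 1
            doc_topic[d][z] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                # Remove the token's current assignment from the counts.
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                doc_topic[d][z] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                topic_word[z][w] += 1
                topic_total[z] += 1
                doc_topic[d][z] += 1

    def word_given_topic(w, k):
        return (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)

    return word_given_topic

def infer_topics(doc, word_given_topic, n_topics):
    """Approximate topic distribution: average per-word topic posterior
    under a uniform topic prior (a simplification, assumed for brevity)."""
    if not doc:
        return [1.0 / n_topics] * n_topics
    theta = [0.0] * n_topics
    for w in doc:
        probs = [word_given_topic(w, k) for k in range(n_topics)]
        total = sum(probs)
        for k in range(n_topics):
            theta[k] += probs[k] / total
    return [t / len(doc) for t in theta]

def train_category_mixtures(labeled, word_given_topic, n_topics):
    """Assumed association step: mean topic distribution per category."""
    sums = defaultdict(lambda: [0.0] * n_topics)
    counts = Counter()
    for doc, label in labeled:
        theta = infer_topics(doc, word_given_topic, n_topics)
        sums[label] = [s + t for s, t in zip(sums[label], theta)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict(doc, mixtures, word_given_topic, n_topics):
    """Pick the category whose topic mixture best matches the question."""
    theta = infer_topics(doc, word_given_topic, n_topics)
    return max(mixtures, key=lambda c: sum(t * m for t, m in zip(theta, mixtures[c])))

# Toy data (hypothetical): unlabeled questions, then a small labeled set.
unlabeled = [q.split() for q in [
    "python import error module not found", "fix python syntax error in loop",
    "bake bread oven temperature", "recipe for bread dough with yeast",
    "python loop over list", "oven time for baking cake recipe",
]]
labeled = [("python error in module".split(), "programming"),
           ("bread recipe oven".split(), "cooking")]
N_TOPICS = 2
phi = fit_lda(unlabeled, N_TOPICS)
mixtures = train_category_mixtures(labeled, phi, N_TOPICS)
label = predict("how to bake cake in oven".split(), mixtures, phi, N_TOPICS)
```

Because the topics come from a sampler run on a tiny toy corpus, the predicted label here is not guaranteed to be meaningful; the point is the data flow, in which the labeled set is used only after the topics have been learned from unlabeled text.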
