Phrase2VecGLM: Neural generalized language model-based semantic tagging for complex query reformulation in medical IR


Abstract

In fact-based information retrieval, state-of-the-art performance is traditionally achieved by knowledge graphs driven by knowledge bases, as they represent facts about entities and capture the relationships between them very well. However, in domains such as medical information retrieval, where addressing the specific information needs of complex queries may require understanding query intent by capturing novel associations between potentially latent concepts, these systems can fall short. In this work, we develop a novel, completely unsupervised, neural language model-based ranking approach for semantic tagging of documents. The document to be tagged is used as a query into the model to retrieve candidate phrases from top-ranked related documents, thus associating every document with novel related concepts extracted from the text. For this we extend the word embedding-based generalized language model (GLM) of Ganguly et al. (2015) to employ phrasal embeddings, and use the semantic tags thus obtained for downstream query expansion, both directly and in feedback-loop settings. Our method, evaluated on the TREC 2016 clinical decision support challenge dataset, shows statistically significant improvements not only over various baselines that use standard MeSH terms and UMLS concepts for query expansion, but also over baselines that use human expert-assigned concept tags for the queries, all on top of a standard Okapi BM25-based document retrieval system.
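
The tagging idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the phrase embeddings, the toy corpus, the mixing weights `lam` and `alpha`, and all function names are assumptions made for illustration. The scoring function is a simplified GLM-style mixture (direct match, embedding-based transformation from phrases in the document, collection background), not the exact formulation of Ganguly et al. (2015). The document to be tagged acts as the query, the other documents are ranked by query likelihood, and phrases from the top-ranked documents that are absent from the source document become candidate tags.

```python
import math
from collections import Counter

# Toy phrase embeddings: hypothetical values chosen for illustration only.
EMB = {
    "chest pain": (0.9, 0.1, 0.0),
    "myocardial infarction": (0.8, 0.3, 0.1),
    "aspirin": (0.6, 0.5, 0.2),
    "skin rash": (0.0, 0.2, 0.9),
    "antihistamine": (0.1, 0.3, 0.8),
}

# Toy corpus: each document is a bag of phrases (hypothetical data).
DOCS = {
    "d1": Counter(["chest pain", "aspirin"]),
    "d2": Counter(["chest pain", "myocardial infarction", "aspirin"]),
    "d3": Counter(["skin rash", "antihistamine"]),
}


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def transform_prob(t, p):
    """P(t | p): similarity of t to p, normalized over all phrases other than p."""
    total = sum(cosine(EMB[v], EMB[p]) for v in EMB if v != p)
    return cosine(EMB[t], EMB[p]) / total if total else 0.0


def phrase_prob(t, doc, coll, lam=0.5, alpha=0.3):
    """Simplified GLM-style P(t | doc); lam and alpha are assumed weights."""
    n = sum(doc.values())
    direct = doc[t] / n  # maximum-likelihood estimate from the document
    # Mass contributed by phrases in the document that are similar to t.
    via_doc = sum(transform_prob(t, p) * c / n for p, c in doc.items() if p != t)
    background = coll[t] / sum(coll.values())  # collection smoothing
    return lam * direct + alpha * via_doc + (1.0 - lam - alpha) * background


def query_log_likelihood(query_doc, doc, coll):
    """Log-likelihood of generating the query document's phrases from doc."""
    return sum(c * math.log(phrase_prob(t, doc, coll)) for t, c in query_doc.items())


def tag_document(doc_id, docs, k_docs=1, n_tags=2):
    """Use doc_id as the query and return candidate tags from top-ranked documents."""
    coll = Counter()
    for d in docs.values():
        coll.update(d)
    source = docs[doc_id]
    others = [d for d in docs if d != doc_id]
    ranked = sorted(
        others,
        key=lambda d: query_log_likelihood(source, docs[d], coll),
        reverse=True,
    )
    # Candidate tags: phrases in the top-ranked documents that the source lacks.
    candidates = set()
    for d in ranked[:k_docs]:
        candidates.update(t for t in docs[d] if t not in source)
    # Rank candidates by how likely the source document is to generate them.
    return sorted(
        candidates, key=lambda t: phrase_prob(t, source, coll), reverse=True
    )[:n_tags]


print(tag_document("d1", DOCS))  # ['myocardial infarction']
```

On this toy data, `d2` is the top-ranked neighbor of `d1`, so the only candidate tag is the phrase `d2` contains and `d1` does not. Tags obtained this way could then be appended to a query for expansion, which is the downstream use the abstract describes.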


Bibliographic details

Source
Annual Meeting of the Association for Computational Linguistics | 2018 | xii 193 p. | 11 pages
Authors
Manirupa Das; Eric Fosler-Lussier; Simon Lin; Soheil Moosavinasab; David Chen; Steve Rust; Yungui Huang; Rajiv Ramnath
Original format: PDF
Chinese Library Classification: Programming; software engineering

Similar literature


1. An ambiguous tag-based query reformulation technique for an effective semantic-based social image research [J]. Mariam Bouchakwa, Yassine Ayadi, Ikram Amous. Procedia Computer Science. 2020, No. 5
2. A query language for semantic complex event processing: Syntax, semantics and implementation [J]. Gillani Syed, Zimmermann Antoine, Picard Gauthier. Semantic Web. 2019, No. 1
3. Training Neural Language Models with SPARQL queries for Semi-Automatic Semantic Mapping [J]. Giuseppe Futia, Antonio Vetro, Alessio Melandri. Procedia Computer Science. 2018, No. 22
4. Phrase2VecGLM: Neural generalized language model-based semantic tagging for complex query reformulation in medical IR [C]. Manirupa Das, Eric Fosler-Lussier, Simon Lin. Annual Meeting of the Association for Computational Linguistics; Workshop on Biomedical Natural Language Processing. 2018
5. Well-definedness, semantic type-checking, and type inference for database query languages [D]. Vansummeren, Stijn. 2005
6. Model-based semantic dictionaries for medical language understanding [O]. A. M. Rassinoux, R. H. Baud, P. Ruch. 1999
7. Phrase2VecGLM: Neural generalized language model–based semantic tagging for complex query reformulation in medical IR [O]. Manirupa Das, Eric Fosler-Lussier, Simon Lin. 2018
