Phrase2VecGLM: Neural generalized language model-based semantic tagging for complex query reformulation in medical IR

Abstract

In fact-based information retrieval, state-of-the-art performance is traditionally achieved by knowledge graphs driven by knowledge bases, as they represent facts about entities and capture relationships between them very well. However, in domains such as medical information retrieval, where addressing the specific information needs of complex queries may require understanding query intent by capturing novel associations between potentially latent concepts, these systems can fall short. In this work, we develop a novel, completely unsupervised, neural language model-based ranking approach for semantic tagging of documents. The document to be tagged is used as a query into the model to retrieve candidate phrases from top-ranked related documents, thus associating every document with novel related concepts extracted from the text. For this we extend the word embedding-based generalized language model (GLM) of Ganguly et al. (2015) to employ phrasal embeddings, and use the semantic tags thus obtained for downstream query expansion, both directly and in feedback-loop settings. Our method, evaluated on the TREC 2016 clinical decision support challenge dataset, shows statistically significant improvement not only over various baselines that use standard MeSH terms and UMLS concepts for query expansion, but also over baselines that use human expert-assigned concept tags for the queries, on top of a standard Okapi BM25-based document retrieval system.
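
The tagging idea described in the abstract (use the document itself as a query, rank related documents, and take candidate phrases from the top-ranked ones as tags) can be illustrated with a minimal sketch. This is not the authors' Phrase2VecGLM model: it swaps in plain cosine similarity over averaged phrase vectors in place of the generalized language model's scoring, and the embeddings, corpus, and function names (`EMB`, `CORPUS`, `tag_document`) are all hypothetical toy values chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy phrase embeddings (made-up 3-d vectors, not from the paper).
EMB = {
    "chest_pain": [0.9, 0.1, 0.0],
    "myocardial_infarction": [0.8, 0.2, 0.1],
    "troponin": [0.7, 0.3, 0.0],
    "skin_rash": [0.0, 0.9, 0.2],
    "antihistamine": [0.1, 0.8, 0.3],
}
DIM = 3


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def doc_vector(phrases):
    """Average the embeddings of the phrases that have a known vector."""
    vecs = [EMB[p] for p in phrases if p in EMB]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def tag_document(doc_id, corpus, top_k=1, n_tags=2):
    """Use the document as a query and return phrases from its nearest documents.

    Assumption: similarity is cosine over averaged phrase vectors, a stand-in
    for the GLM-based ranking described in the abstract.
    """
    query_vec = doc_vector(corpus[doc_id])
    ranked = sorted(
        (d for d in corpus if d != doc_id),
        key=lambda d: cosine(query_vec, doc_vector(corpus[d])),
        reverse=True,
    )
    own = set(corpus[doc_id])
    # Count candidate phrases in the top-k documents that the document lacks.
    counts = Counter(p for d in ranked[:top_k] for p in corpus[d] if p not in own)
    return [p for p, _ in counts.most_common(n_tags)]


# Hypothetical three-document corpus, each document already split into phrases.
CORPUS = {
    "d1": ["chest_pain", "troponin"],
    "d2": ["chest_pain", "myocardial_infarction"],
    "d3": ["skin_rash", "antihistamine"],
}

tags = tag_document("d1", CORPUS)
# Direct query expansion: append the tags to the original query phrases.
expanded_query = CORPUS["d1"] + tags
```

In this toy setting the tags for `d1` come from `d2`, its nearest neighbour, and the expanded query would then be passed to whatever retrieval system is in use (Okapi BM25 in the paper's experiments).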


Bibliographic record

Source: Annual meeting of the Association for Computational Linguistics; Workshop on biomedical natural language processing, 2018, pp. 118-128 (11 pages)
Authors: Manirupa Das; Eric Fosler-Lussier; Simon Lin; Soheil Moosavinasab; David Chen; Steve Rust; Yungui Huang; Rajiv Ramnath
Original format: PDF

Similar literature

1. An ambiguous tag-based query reformulation technique for an effective semantic-based social image research [J]. Mariam Bouchakwa, Yassine Ayadi, Ikram Amous. Procedia Computer Science. 2020, No. 5
2. A query language for semantic complex event processing: Syntax, semantics and implementation [J]. Gillani Syed, Zimmermann Antoine, Picard Gauthier. Semantic web. 2019, No. 1
3. Training Neural Language Models with SPARQL queries for Semi-Automatic Semantic Mapping [J]. Giuseppe Futia, Antonio Vetro, Alessio Melandri. Procedia Computer Science. 2018, No. 22
4. Phrase2VecGLM: Neural generalized language model-based semantic tagging for complex query reformulation in medical IR [C]. Manirupa Das, Eric Fosler-Lussier, Simon Lin. Annual meeting of the Association for Computational Linguistics. 2018
5. Well-definedness, semantic type-checking, and type inference for database query languages [D]. Vansummeren, Stijn. 2005
6. Model-based semantic dictionaries for medical language understanding [O]. A. M. Rassinoux, R. H. Baud, P. Ruch. 1999
7. Phrase2VecGLM: Neural generalized language model-based semantic tagging for complex query reformulation in medical IR [O]. Manirupa Das, Eric Fosler-Lussier, Simon Lin. 2018
