General Representation Model for Text Similarity

Abstract

Text similarity is a central issue in multiple information access tasks. Generally speaking, most existing similarity models focus on one particular kind of text feature, such as words, n-grams, linguistic features, or distributional semantics units. In this paper, we introduce a general theoretical model for integrating multiple sources in the text feature representation, called the Feature Projection Information model. The proposed model allows us to integrate traditional features such as words with other sources, such as the output of classifiers over different categories or distributional semantics information. The theoretical analysis shows that traditional approaches can be seen as particularizations of the model. Our first empirical results support the idea that additional features in the representation step outperform the predictive power of similarity measures.

Bibliographic details

Source
International Workshop on Future and Emergent Trends in Language Technology | 2017 | pp. 158-169 | 12 pages
Authors
Fernando Giner; Enrique Amigo
Original format
PDF
Keywords
Enrichment; Features; Similarity; Text representation

Similar literature

1. Text Document Categorization using Enhanced Sentence Vector Space Model and Bi-Gram Text Representation Model Based on Novel Fusion Techniques [J]. Abdisa Demissie Amensisa. New Media and Mass Communication, 2020, No. 4.
2. Gaze patterns reveal how situation models and text representations contribute to episodic text memory [J]. Johansson Roger, Oren Franziska, Holmqvist Kenneth. Cognition: International Journal of Cognitive Psychology, 2018.
3. SimiT: A Text Similarity Method Using Lexicon and Dependency Representations [J]. Emrah Inan. New Generation Computing, 2020, No. 3.
4. Calculating similarity between texts using graph-based text representation model [C]. Junji Tomita, Hidekazu Nakawatase, Megumi Ishii. ACM international conference on Information and knowledge management, 2004.
5. An Automatic Similarity Detection Engine Between Sacred Texts Using Text Mining and Similarity Measures [D]. Qahl, Salha Hassan Muhammed. 2014.
6. Approach for Text Classification Based on the Similarity Measurement between Normal Cloud Models [O]. Jin Dai, Xin Liu.
7. Comparative Analysis of N-gram Text Representation on Igbo Text Document Similarity [O]. Ifeanyi-Reuben Nkechi J., Ugwu Chidiebere, Nwachukwu E. O. 2017.
