A New Term-Term Similarity Measure for Selecting Expansion Features in Big Data

Abstract

The massive growth of information and the exponential increase in the number of documents published and uploaded online each day have led to the appearance of new words on the Internet. Because the meanings of these new terms, which play a central role in retrieving the desired information, are difficult to determine, it becomes necessary to give more importance to the sites and topics where these new words appear, or rather, to give value to the words that occur frequently with them. For this purpose, in this paper, we propose a new term-term similarity measure based on the co-occurrence and closeness of words. For each query feature, it searches for the locations where the feature appears, then selects from these locations the words that often neighbor and co-occur with the query features, and finally uses the selected words in the retrieval process. Our experiments were performed using the OHSUMED test collection and show significant performance enhancement over the state-of-the-art.
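The pipeline described in the abstract (locate each query feature, collect the words that neighbor it, reuse the best ones for retrieval) can be illustrated with a minimal Python sketch. This is not the authors' measure: the function name `expansion_candidates`, the `window` and `top_k` parameters, and the inverse-distance weighting are all assumptions chosen only to show how co-occurrence and closeness could be combined.

```python
from collections import defaultdict


def expansion_candidates(docs, query_terms, window=2, top_k=3):
    """Rank terms that appear within `window` positions of a query term.

    Hypothetical scoring (an assumption, not the paper's formula): each
    co-occurrence adds 1/distance, so frequent and close neighbors rank high.
    """
    query = {t.lower() for t in query_terms}
    scores = defaultdict(float)
    for doc in docs:
        tokens = doc.lower().split()
        # Locations where any query feature appears in this document.
        positions = [i for i, tok in enumerate(tokens) if tok in query]
        for i in positions:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                term = tokens[j]
                if j == i or term in query:
                    continue
                # Closer neighbors contribute more to the score.
                scores[term] += 1.0 / abs(i - j)
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


docs = ["aspirin reduces fever", "aspirin reduces pain", "fever needs rest"]
expanded = ["aspirin"] + expansion_candidates(docs, ["aspirin"], window=2, top_k=2)
```

In this toy run the original query would be extended with the returned neighbors before being passed to whatever retrieval model is in use; a realistic version would also need tokenization, stop-word removal, and an inverted index rather than a linear scan.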

Bibliographic details

Source
2014 International Conference on Advanced Networking Distributed Systems and Applications | 2014 | pp. 87-92 | 6 pages
Conference location: Bejaia (DZ)
Authors
Khennak I.; Drias H.
Author affiliation

Lab. for Res. in Artificial Intell. Comput. Sci. Dept., USTHB, Algiers, Algeria;

Original format: PDF
Text language: English
Keywords
Big Data; Internet; query processing; OHSUMED test collection; expansion features; information retrieval; query feature; term-term similarity measure; Dictionaries; Google; Indexing; Probabilistic logic; Q measurement; Vocabulary; query expansion; term co-occurrence; term proximity

Similar documents

1. Adapting Measures of Clumping Strength to Assess Term-Term Similarity [J]. Abraham Bookstein, Vladimir Kulyukin, Timo Raita. Journal of the American Society for Information Science and Technology. 2003, No. 7
2. A combination of fuzzy similarity measures and fuzzy entropy measures for supervised feature selection [J]. Lohrmann Christoph, Luukka Pasi, Jablonska-Sabuka Matylda. Expert Systems with Applications. 2018, No. NOVa
3. Selecting a semantic similarity measure for concepts in two different CAD model data ontologies [J]. Wenlong Lu, Yuchu Qin, Qunfen Qi. Advanced Engineering Informatics. 2016, No. 3
4. A New Term-Term Similarity Measure for Selecting Expansion Features in Big Data [C]. Khennak I., Drias H. International Conference on Advanced Networking Distributed Systems and Applications. 2014
5. Psychophysical similarity based feature selection for nodule retrieval in CT [D]. Samala, Ravi K. 2011
6. ClusTrack: Feature Extraction and Similarity Measures for Clustering of Genome-Wide Data Sets [O]. Halfdan Rydbeck, Geir Kjetil Sandve, Egil Ferkingstad
7. Feature selection using Fuzzy Entropy measures with Yu's Similarity measure [O]. Cesar Iyakaremye. 2012
