Preceding Document Clustering by Graph Mining Based Maximal Frequent Termsets Preservation

Shah Syed; Amjad Mohammad


Abstract

This paper presents an approach to document clustering. It introduces a novel graph mining based algorithm that finds the frequent termsets present in a document set. The document set is first mapped onto a bipartite graph. Based on the results of the algorithm, the document set is modified to reduce its dimensionality. The Bisecting K-means algorithm is then run over the modified document set to obtain a set of meaningful clusters. The proposed approach, Clustering preceded by Graph Mining based Maximal Frequent Termsets Preservation (CGFTP), is shown to produce better-quality clusters than those produced by some classical document clustering algorithms. The produced clusters are also shown to be easily interpretable. Cluster quality is measured in terms of F-measure.
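The abstract does not spell out the mining algorithm itself, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' CGFTP implementation. It represents the bipartite graph as a mapping from term nodes to the document nodes they touch, mines frequent termsets by intersecting those neighbour sets (a plain depth-first, Eclat-style search swapped in for the paper's algorithm), keeps only the maximal termsets, and drops every term that appears in none of them. The function names, the whitespace tokenizer, the toy documents, and the `min_support` value are all hypothetical.

```python
def build_bipartite(docs):
    """Map documents onto a bipartite graph: term node -> set of document ids."""
    graph = {}
    for doc_id, text in enumerate(docs):
        for term in set(text.lower().split()):  # naive whitespace tokenizer (assumption)
            graph.setdefault(term, set()).add(doc_id)
    return graph


def frequent_termsets(graph, min_support):
    """Return {termset: supporting doc ids} for termsets shared by >= min_support docs."""
    items = sorted(t for t, d in graph.items() if len(d) >= min_support)
    found = {}

    def extend(prefix, prefix_docs, start):
        for i in range(start, len(items)):
            term_docs = graph[items[i]]
            shared = (prefix_docs & term_docs) if prefix else term_docs
            if len(shared) >= min_support:
                termset = prefix + (items[i],)
                found[frozenset(termset)] = shared
                extend(termset, shared, i + 1)

    extend((), set(), 0)
    return found


def maximal_only(found):
    """Keep termsets that are not a proper subset of another frequent termset."""
    return [s for s in found if not any(s < other for other in found)]


def reduce_documents(docs, kept_terms):
    """Drop every term that is not part of a maximal frequent termset."""
    return [" ".join(w for w in text.lower().split() if w in kept_terms)
            for text in docs]


# Toy corpus, purely illustrative.
docs = [
    "graph mining finds frequent patterns",
    "graph mining for text clustering",
    "stock market prices fall",
    "stock market prices rise",
]
graph = build_bipartite(docs)
maximal = maximal_only(frequent_termsets(graph, min_support=2))
kept_terms = set().union(*maximal)
reduced = reduce_documents(docs, kept_terms)
print(sorted(sorted(s) for s in maximal))
print(reduced)
```

On this toy corpus the vocabulary shrinks from thirteen terms to five. In a fuller pipeline the reduced documents would then be vectorized and passed to a bisecting K-means implementation (for instance scikit-learn's `BisectingKMeans`); that step is left out here to keep the sketch dependency-free.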


Bibliographic details

Source
The International Arab Journal of Information Technology | 2019, Issue 3 | pp. 364-370 | 7 pages
Authors
Shah Syed; Amjad Mohammad;
Affiliations

Jamia Millia Islamia, Dept Comp Engn, New Delhi, India;

Jamia Millia Islamia, Dept Comp Engn, New Delhi, India;

Original format: PDF
Language: English
Keywords
Bipartite graph; graph mining; frequent termsets mining; bisecting K-means;


