An Improved Text Clustering Method based on Maximal Frequent Itemsets and K-means



Abstract

In this paper, we present an improved text clustering method that integrates K-means with a novel clustering method based on maximal frequent itemsets. The approach automatically extracts the clusters in two stages; in each stage, maximal frequent itemsets are used to locate the initial centroids in the dense area. Experiments show a 3-10% improvement in clustering accuracy compared with the traditional MEI K-means clustering method. Another advantage of our method is that it yields an approximate number of clusters without requiring a k value to be supplied to K-means.
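The abstract does not spell out the algorithm, so the following is only a minimal single-stage sketch of the general idea, not the authors' two-stage method. It assumes documents are represented as sets of terms, mines maximal frequent itemsets by brute force, uses the documents covering each itemset to form an initial centroid, and then runs plain K-means. All function names, the `min_support` and `max_size` parameters, and the toy documents are hypothetical.

```python
from itertools import combinations

def maximal_frequent_itemsets(docs, min_support, max_size=3):
    """Brute-force miner (assumption: tiny vocabulary, itemsets up to max_size).
    Returns frequent term sets that have no frequent proper superset."""
    vocab = sorted(set().union(*docs))
    frequent = []
    for size in range(1, max_size + 1):
        for items in combinations(vocab, size):
            s = frozenset(items)
            if sum(s <= d for d in docs) >= min_support:
                frequent.append(s)
    return [s for s in frequent if not any(s < t for t in frequent)]

def mean(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, centroids, iterations=20):
    """Standard Lloyd iterations starting from the given centroids."""
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        for c in range(len(centroids)):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = mean(members)
    return labels, centroids

def cluster(docs, min_support):
    """Seed K-means with one centroid per maximal frequent itemset,
    so k is derived from the data rather than supplied by the caller."""
    vocab = sorted(set().union(*docs))
    vectors = [[1.0 if t in d else 0.0 for t in vocab] for d in docs]
    seeds = maximal_frequent_itemsets(docs, min_support)
    centroids = [
        mean([v for v, d in zip(vectors, docs) if s <= d]) for s in seeds
    ]
    return kmeans(vectors, centroids)

# Toy corpus (hypothetical): two topics sharing no terms.
docs = [
    {"stock", "market", "trade"},
    {"stock", "market", "price"},
    {"stock", "market", "bank"},
    {"goal", "match", "team"},
    {"goal", "match", "score"},
    {"goal", "match", "coach"},
]
labels, centroids = cluster(docs, min_support=3)
print(len(centroids), labels)
```

In this sketch the number of clusters equals the number of maximal frequent itemsets found, which mirrors the abstract's point that an approximate cluster count can be obtained without choosing k in advance. A practical implementation would replace the brute-force miner with a dedicated maximal-itemset algorithm and would likely use weighted term vectors.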


Bibliographic details

Source: IEEE International Conference on Signal Processing Systems | 2011 | 5 pages
Authors: Yuyan Huang; Xuan Wang; Xinxin Li
Original format: PDF
Chinese Library Classification: Communication theory
Keywords: text clustering; maximal frequent itemset; K-means


