Comment-based Multi-View Clustering of Web 2.0 Items


Abstract

Clustering Web 2.0 items (i.e., web resources like videos, images) into semantic groups benefits many applications, such as organizing items, generating meaningful tags and improving web search. In this paper, we systematically investigate how user-generated comments can be used to improve the clustering of Web 2.0 items. In our preliminary study of Last.fm, we find that the two data sources extracted from user comments - the textual comments and the commenting users - provide complementary evidence to the items' intrinsic features. These sources have varying levels of quality, but, importantly, we find that incorporating all three sources improves clustering. To accommodate such quality imbalance, we invoke multi-view clustering, in which each data source represents a view, aiming to best leverage the utility of different views. To combine multiple views under a principled framework, we propose CoNMF (Co-regularized Non-negative Matrix Factorization), which extends NMF for multi-view clustering by jointly factorizing the multiple matrices through co-regularization. Under our CoNMF framework, we devise two paradigms - pair-wise CoNMF and cluster-wise CoNMF - and propose iterative algorithms for their joint factorization. Experimental results on Last.fm and Yelp datasets demonstrate the effectiveness of our solution. On Last.fm, CoNMF betters k-means with a statistically significant F_1 increase of 14%, while achieving comparable performance with the state-of-the-art multi-view clustering method CoSC [24]. On the Yelp dataset, CoNMF outperforms the best baseline, CoSC, with a statistically significant performance gain of 7%.
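
The joint factorization named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the objective, the multiplicative update rules, the function name pairwise_conmf_sketch, the parameter lam, and the toy data are all assumptions made for illustration. Each view V_s is approximated by W_s H_s, and a penalty on the squared distance between every pair of item-cluster matrices W_s, W_t pulls the views toward a shared clustering.

```python
import numpy as np


def pairwise_conmf_sketch(views, k, lam=1.0, n_iter=200, seed=0, eps=1e-9):
    """Jointly factorize non-negative views V_s ~ W_s @ H_s.

    Hypothetical simplified objective (an assumption, not taken from the paper):
        sum_s ||V_s - W_s H_s||_F^2 + lam * sum_{s<t} ||W_s - W_t||_F^2
    Each V_s has shape (n_items, d_s); all views list the items in the same order.
    """
    rng = np.random.default_rng(seed)
    n, m = views[0].shape[0], len(views)
    Ws = [rng.random((n, k)) + eps for _ in views]
    Hs = [rng.random((k, V.shape[1])) + eps for V in views]
    for _ in range(n_iter):
        for s, V in enumerate(views):
            W, H = Ws[s], Hs[s]
            # Sum of the other views' item-cluster matrices (co-regularization pull).
            others = sum((Ws[t] for t in range(m) if t != s), np.zeros((n, k)))
            # Multiplicative updates keep every entry non-negative.
            W *= (V @ H.T + lam * others) / (W @ (H @ H.T) + lam * (m - 1) * W + eps)
            H *= (W.T @ V) / (W.T @ W @ H + eps)
    # Consensus assignment: average the per-view matrices, pick the largest entry.
    labels = np.mean(Ws, axis=0).argmax(axis=1)
    return labels, Ws, Hs


# Toy usage: 6 items in 2 planted groups, seen through two synthetic views.
rng = np.random.default_rng(1)
blocks = np.repeat(np.eye(2), 3, axis=0)
views = [blocks @ rng.random((2, d)) + 0.01 * rng.random((6, d)) for d in (4, 3)]
labels, Ws, Hs = pairwise_conmf_sketch(views, k=2)
print(labels)
```

The sketch leaves out any normalization of W_s and H_s, so the scales of the per-view factors are not directly comparable, and it does not attempt the cluster-wise paradigm mentioned in the abstract.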


Bibliographic record

Source
International Conference on World Wide Web | 2014 | pp. 771-781 | 11 pages
Authors
Xiangnan He; Min-Yen Kan; Peichu Xie; Xiao Chen
Original format: PDF
Keywords
Comment-based clustering; Multi-view clustering; Co-regularized NMF; CoNMF

Similar literature

1. web-rMKL: a web server for dimensionality reduction and sample clustering of multi-view data based on unsupervised multiple kernel learning [J]. Benedict Röder, Nicolas Kersten, Marius Herr. Nucleic Acids Research. 2019, Issue W1
2. Social web video clustering based on multi-view clustering via nonnegative matrix factorization [J]. Mekthanavanh Vinath, Li Tianrui, Meng Hua. International Journal of Machine Learning and Cybernetics. 2019, Issue 10
3. ChemBioServer 2.0: an advanced web server for filtering, clustering and networking of chemical compounds facilitating both drug discovery and repurposing [J]. Karatzas Evangelos, Zamora Juan Eiros, Athanasiadis Emmanouil. Bioinformatics. 2020, Issue 8
4. Comment-based Multi-View Clustering of Web 2.0 Items [C]. Xiangnan He, Min-Yen Kan, Peichu Xie. International Conference on World Wide Web. 2014
5. General Studies Writing (GSW) Digital Communication at Bowling Green State University: To Web 2.0 or not to Web 2.0? [D]. Mauk, Brianna. 2017
6. web-rMKL: a web server for dimensionality reduction and sample clustering of multi-view data based on unsupervised multiple kernel learning [O]. Benedict Röder, Nicolas Kersten, Marius Herr. 2019
7. Comment-based Multi-View Clustering of Web 2.0 Items [O]. Xiangnan He, Min-Yen Kan, Peichu Xie. 2014
