Journal of Intelligent Information Systems

RedTweet: recommendation engine for reddit

Abstract

Twitter and Reddit are two of the most popular social media sites in use today. In this paper, we study the use of machine learning and WordNet-based classifiers to generate an interest profile from a user's tweets and use this profile to recommend loosely related Reddit threads that the user is most likely to be interested in. We introduce a genre classification algorithm that uses a similarity measure derived from the WordNet lexical database for English to assign genre labels to nouns in tweets. The proposed algorithm generates a user's interest profile from their tweets based on a reference taxonomy of genres derived from the genre-tagged Brown Corpus, augmented with a technology genre. The top K genres of a user's interest profile can then be used to recommend subreddit articles in those genres. We ran experiments on real-life test cases collected from Twitter to compare the genre classification performance of the WordNet classifier with that of machine learning classifiers such as SVM, Random Forests, and an ensemble of Bayesian classifiers. Empirically, the two approaches yield similar results when a sufficient number of tweets is available. Both machine learning algorithms and the WordNet ontology therefore appear to be viable tools for developing a recommendation engine based on genre classification. One advantage of the WordNet approach is its simplicity: no learning is required. However, the WordNet classifier tends to have poor precision for users with very few tweets.
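
The pipeline described in the abstract (label each noun with a genre by lexical similarity, aggregate the labels into an interest profile, keep the top K genres) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation. The abstract does not say which WordNet similarity measure is used, so the sketch assumes a path-based similarity, 1 / (1 + path length), computed over a tiny invented hypernym tree that stands in for WordNet. The seed nouns per genre, the `min_similarity` threshold, and all function names are hypothetical, and noun extraction from tweets is assumed to have happened already.

```python
from collections import Counter

# Toy stand-in for WordNet: each noun maps to its hypernym path from the root.
# These entries are invented for illustration and are not taken from WordNet.
HYPERNYM_PATHS = {
    "computer": ("entity", "artifact", "machine", "computer"),
    "laptop": ("entity", "artifact", "machine", "computer", "laptop"),
    "server": ("entity", "artifact", "machine", "computer", "server"),
    "church": ("entity", "artifact", "building", "church"),
    "chapel": ("entity", "artifact", "building", "church", "chapel"),
    "senate": ("entity", "group", "legislature", "senate"),
    "parliament": ("entity", "group", "legislature", "parliament"),
}

# Hypothetical seed nouns per genre (assumption: one seed each, three genres).
GENRE_SEEDS = {
    "technology": ["computer"],
    "religion": ["church"],
    "government": ["senate"],
}


def path_similarity(a, b):
    """Return 1 / (1 + tree distance) between two nouns, or 0.0 if unknown."""
    pa, pb = HYPERNYM_PATHS.get(a), HYPERNYM_PATHS.get(b)
    if pa is None or pb is None:
        return 0.0
    # Count the shared prefix of the two hypernym paths.
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    distance = (len(pa) - common) + (len(pb) - common)
    return 1.0 / (1.0 + distance)


def classify_noun(noun, min_similarity=0.2):
    """Pick the genre whose seeds are most similar to the noun, or None."""
    best_genre, best_score = None, 0.0
    for genre, seeds in GENRE_SEEDS.items():
        score = max(path_similarity(noun, seed) for seed in seeds)
        if score > best_score:
            best_genre, best_score = genre, score
    return best_genre if best_score >= min_similarity else None


def interest_profile(nouns):
    """Normalise per-genre label counts into a distribution over genres."""
    counts = Counter(g for g in map(classify_noun, nouns) if g is not None)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}


def top_k_genres(profile, k):
    """Return the k highest-weighted genres, breaking ties alphabetically."""
    ranked = sorted(profile.items(), key=lambda kv: (-kv[1], kv[0]))
    return [genre for genre, _ in ranked[:k]]


# Nouns assumed to have been extracted from one user's tweets.
nouns = ["laptop", "server", "laptop", "chapel", "parliament", "unknownword"]
profile = interest_profile(nouns)
print(top_k_genres(profile, 2))
```

Nouns with no entry in the lexicon are skipped rather than forced into a genre. With only a handful of tweets, few nouns receive any label at all, which gives one plausible intuition for why such a classifier could be unreliable for users with very few tweets.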
