Effective Opinion Spam Detection: A Study on Review Metadata Versus Content

Ajay Rastogi; Monica Mehrotra; Syed Shafat Ali

Abstract

Purpose: This paper analyzes the effectiveness of two major types of features, metadata-based (behavioral) and content-based (textual), in opinion spam detection.

Design/methodology/approach: Following the main spam-detection perspectives, our approach works in three settings: review-centric (spam detection), reviewer-centric (spammer detection) and product-centric (spam-targeted product detection). To counter classifier bias, we employ four classifiers so that the results do not depend on a single model. We also propose a new set of features and compare them against those used in well-known related works. Experiments on two real-world datasets show the effectiveness of the different features in opinion spam detection.

Findings: Behavioral features are both more efficient and more effective than textual features at detecting opinion spam across all three settings. Models trained on hybrid features produce results much closer to those of models trained on behavioral features than to those of models trained on textual features, which further establishes behavioral features as the dominant indicators of opinion spam. The features used in this work improve on the existing features used in related works. Furthermore, the computation-time analysis of the feature extraction phase shows that behavioral features are more cost-efficient than textual ones.

Research limitations: The analyses in this paper are limited to two well-known datasets from Yelp.com, YelpZip and YelpNYC.

Practical implications: The results can be used to improve the detection of opinion spam. In particular, researchers may focus feature engineering and feature selection techniques more on metadata information.

Originality/value: To the best of our knowledge, this study is the first to consider three perspectives (review-, reviewer- and product-centric) and four classifiers in analyzing the effectiveness of two major types of features for opinion spam detection. It also introduces some novel features that help improve the performance of opinion spam detection methods.
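
The three detection settings and the two feature families named in the abstract can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the abstract does not list its features, so the rating deviation, the maximum number of reviews per day and the word count are stand-ins of the kind commonly used in this area, the records are invented toy data (not YelpZip or YelpNYC), and the 1 to 5 star scale is assumed. No classifier is trained; the sketch only shows how one metadata table can feed review-, reviewer- and product-centric feature sets.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical toy records: (reviewer_id, product_id, rating, date, text).
REVIEWS = [
    ("u1", "p1", 5, "2020-01-01", "Great food, great service!"),
    ("u1", "p2", 5, "2020-01-01", "Best place ever"),
    ("u1", "p3", 5, "2020-01-01", "Amazing"),
    ("u2", "p1", 2, "2020-01-03", "Slow service and the soup was cold."),
    ("u3", "p1", 3, "2020-01-05", "Decent portions, fair prices."),
    ("u3", "p2", 4, "2020-02-10", "Friendly staff and a quiet room."),
]


def product_mean_ratings(reviews):
    """Average star rating of each product."""
    ratings = defaultdict(list)
    for _, product, rating, _, _ in reviews:
        ratings[product].append(rating)
    return {product: mean(values) for product, values in ratings.items()}


def review_features(reviews):
    """Review-centric setting: one feature row per review."""
    avg = product_mean_ratings(reviews)
    rows = []
    for reviewer, product, rating, _, text in reviews:
        rows.append({
            "reviewer": reviewer,
            "product": product,
            # Behavioral (metadata): distance from the product's mean rating,
            # scaled by 4, the largest gap on an assumed 1-5 star scale.
            "rating_deviation": abs(rating - avg[product]) / 4,
            # Textual (content): review length in whitespace-separated words.
            "word_count": len(text.split()),
        })
    return rows


def reviewer_features(reviews):
    """Reviewer-centric setting: maximum reviews posted in a single day."""
    per_day = Counter((reviewer, date) for reviewer, _, _, date, _ in reviews)
    busiest = defaultdict(int)
    for (reviewer, _), count in per_day.items():
        busiest[reviewer] = max(busiest[reviewer], count)
    return dict(busiest)


def product_features(reviews):
    """Product-centric setting: mean rating deviation over a product's reviews."""
    deviations = defaultdict(list)
    for row in review_features(reviews):
        deviations[row["product"]].append(row["rating_deviation"])
    return {product: mean(values) for product, values in deviations.items()}


if __name__ == "__main__":
    print(review_features(REVIEWS))
    print(reviewer_features(REVIEWS))
    print(product_features(REVIEWS))
```

In this toy data the reviewer u1 posts three reviews on one day, which is the kind of metadata signal that needs no text processing at all; a hybrid feature set would simply merge the behavioral and textual columns of each row.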


Bibliographic details

Source: Journal of Data and Information Science, 2020, Issue 2, 35 pages
Authors: Ajay Rastogi; Monica Mehrotra; Syed Shafat Ali
Original format: PDF
Keywords: Opinion spam; Behavioral features; Textual features; Review spammers; Spam-targeted products

Similar literature

1. Effective Opinion Spam Detection: A Study on Review Metadata Versus Content [J]. Ajay Rastogi, Monica Mehrotra, Syed Shafat Ali. Journal of Data and Information Science (English edition), 2020, Issue 2.
2. Opinion Spam Detection in Online Reviews [J]. Ajay Rastogi, Monica Mehrotra. Journal of Information & Knowledge Management, 2017, Issue 4.
3. Opinion spam detection by incorporating multimodal embedded representation into a probabilistic review graph [J]. Liu Yuanchao, Pang Bo, Wang Xiaolong. Neurocomputing, 2019, Issue Nova13.
4. Opinion Spam Detection in Product Reviews Using Self-Training Semi-Supervised Learning Approach [C]. Dini Adni Navastara, Ana Alimatus Zaqiyah, Chastine Fatichah. International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, 2019.
5. A Semantic Metadata Enrichment Software Ecosystem (SMESE): Its Prototypes for Digital Libraries, Metadata Enrichments and Assisted Literature Reviews [D]. Brisebois, Ronald. 2017.
6. Capturing Public Opinion on Public Health Topics: A Comparison of Experiences from a Systematic Review Focus Group Study and Analysis of Online User-Generated Content [O]. Emma Louise Giles, Jean M. Adams. 2015.
7. Effective Opinion Spam Detection: A Study on Review Metadata Versus Content [O]. Ajay Rastogi, Monica Mehrotra, Syed Shafat Ali. 2020.
