Veracity of information in twitter data: A case study

Abstract

Twitter is a powerful real-time micro-blogging service and a platform where users communicate with each other instantaneously. Tweets therefore form an integral part of the big data ecosystem. While this platform serves as an efficient medium for information diffusion, it can also be used to spread misinformation, intentionally or unintentionally, which can damage the reputation of an individual or a corporation. Misinformation can also be harmful to society in general. As veracity in big data gains more attention, it is important to develop methods to estimate the veracity of tweets. There are no definitive measures to determine the veracity of tweets from the tweets themselves, and other information required to verify tweets may not be readily available. Hence, there is a need for mechanisms that determine the level of accuracy of tweets from the available data. In this paper, we propose three quantitative measures, which we call topic diffusion, geographic dispersion, and spam index, as indicators of the veracity of tweets. These measures are derived from the tweets themselves, independent of any corroborating data. The proposed measures are tested using tweets about oil companies as validation data. To validate the proposed measures, information extracted from the tweets is compared with information collected from official data sources. Our experiments show that the proposed measures were able to estimate the level of veracity among tweets in most of the topics we tested. We also found the measures useful for comparing the veracity of different topics as points in a 3-dimensional space. An additional application of the veracity indices, to the positions of political candidates, is also described.
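The idea of comparing topics as points in a 3-dimensional space can be sketched as follows. The abstract does not define how the three measures are computed or scaled, so this sketch simply takes them as given numbers. The names `TopicVeracity`, `distance`, and `nearest`, the use of Euclidean distance, and all example values are assumptions made for illustration, not the authors' method.

```python
import math
from typing import Dict, NamedTuple


class TopicVeracity(NamedTuple):
    """One topic as a point in the space spanned by the three measures."""
    topic_diffusion: float
    geographic_dispersion: float
    spam_index: float


def distance(a: TopicVeracity, b: TopicVeracity) -> float:
    """Euclidean distance between two topics (an assumed choice of metric)."""
    return math.dist(a, b)


def nearest(name: str, topics: Dict[str, TopicVeracity]) -> str:
    """Return the name of the other topic closest to `name`."""
    others = (t for t in topics if t != name)
    return min(others, key=lambda t: distance(topics[name], topics[t]))


# Hypothetical measure values; real values would come from the tweet data.
topics = {
    "topic_a": TopicVeracity(0.2, 0.5, 0.1),
    "topic_b": TopicVeracity(0.3, 0.5, 0.1),
    "topic_c": TopicVeracity(0.9, 0.1, 0.8),
}

if __name__ == "__main__":
    print(nearest("topic_a", topics))
```

Treating each topic as a tuple keeps the comparison independent of how the individual measures are derived; a different metric could be substituted in `distance` if the measures turn out to have different scales.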

Bibliographic details

Source
International Conference on Big Data and Smart Computing | 2016 | pp. 129-136 | 8 pages
Authors
Kumar TK Ashwin; Prashanth Kammarpally; KM George;
Original format: PDF
Keywords
Twitter; geographic dispersion; micro-blog; misinformation; reputation; spam rate; topic diffusion; veracity;

Similar literature

1. Construction and dissemination of information veracity on French social media during crises: Comparison of Twitter and Wikipedia [J]. Bubendorff Sandrine, Rizza Caroline, Prieur Christophe. Journal of contingencies and crisis management. 2021, No. 2
2. Rumour veracity detection on twitter using particle swarm optimized shallow classifiers [J]. Kumar Akshi, Sangwan Saurabh Raj, Nayyar Anand. Multimedia Tools and Applications. 2019, No. 17
3. Veracity of information in twitter data: A case study [C]. Kumar TK Ashwin, Prashanth Kammarpally, KM George. International Conference on Big Data and Smart Computing. 2016
4. A Case Study on Determining the Big Data Veracity: A Method to Compute the Relevance of Twitter Data [D]. Paryani, Jyotsna. 2017
5. The Story of Goldilocks and Three Twitter’s APIs: A Pilot Study on Twitter Data Sources and Disclosure [O]. Yoonsang Kim, Rachel Nordgren, Sherry Emery. 2020
6. Data Veracity of Patients and Health Consumers Reported Adverse Drug Reactions on Twitter: Key Linguistic Features, Twitter Variables, and Association Rules [O]. Tianchu Lyu, Andrew Eidson, Jungmi Jun, 2020
