Conference: International Joint Conference on Natural Language Processing

The Trumpiest Trump? Identifying a Subject's Most Characteristic Tweets

Abstract

The sequence of documents produced by any given author varies in style and content, but some documents are more typical or representative of the source than others. We quantify the extent to which a given short text is characteristic of a specific person, using a dataset of tweets from fifteen celebrities. Such analysis is useful for generating excerpts of high-volume Twitter profiles and for understanding how representativeness relates to tweet popularity. We first consider the related task of binary author detection (is x the author of text T?) and report a test accuracy of 90.37% for the best of five approaches to this problem. We then use these models to compute characterization scores among all of an author's texts. A user study shows that human evaluators agree with our characterization model for all 15 celebrities in our dataset, each with p-value < 0.05. We use these classifiers to show surprisingly strong correlations between characterization scores and the popularity of the associated texts. Indeed, we demonstrate a statistically significant correlation between this score and tweet popularity (likes/replies/retweets) for 13 of the 15 celebrities in our study.
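The abstract does not say which five approaches were compared or how the characterization score is computed, so the following is only an illustrative sketch of the general idea: train a binary "author vs. everyone else" classifier, then reuse its confidence as a score for ranking that author's own texts. It assumes a simple naive Bayes model with add-one smoothing. The class name `BinaryAuthorModel`, the helper `rank_by_characterization`, and the toy sentences are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real tweets need more care)."""
    return text.lower().split()


class BinaryAuthorModel:
    """Naive Bayes with add-one smoothing: target author vs. all other authors.

    Hypothetical stand-in for the unspecified classifiers in the abstract.
    """

    def __init__(self, author_texts, other_texts):
        self.counts = {True: Counter(), False: Counter()}
        for text in author_texts:
            self.counts[True].update(tokenize(text))
        for text in other_texts:
            self.counts[False].update(tokenize(text))
        self.vocab = set(self.counts[True]) | set(self.counts[False])
        self.totals = {label: sum(c.values()) for label, c in self.counts.items()}
        # Log prior odds of the author class versus the rest.
        self.prior = math.log(len(author_texts) / len(other_texts))

    def _log_prob(self, token, label):
        # The extra +1 in the denominator reserves mass for unseen tokens.
        denom = self.totals[label] + len(self.vocab) + 1
        return math.log((self.counts[label][token] + 1) / denom)

    def score(self, text):
        """Log-odds that the target author wrote `text` (higher = more typical)."""
        total = self.prior
        for token in tokenize(text):
            total += self._log_prob(token, True) - self._log_prob(token, False)
        return total

    def is_author(self, text):
        """Binary author detection: is the target author the author of `text`?"""
        return self.score(text) > 0


def rank_by_characterization(model, texts):
    """Sort texts from most to least characteristic under the model's score."""
    return sorted(texts, key=model.score, reverse=True)


# Toy, made-up data purely for illustration.
author_texts = ["huge win tremendous", "tremendous crowd huge", "fake news sad"]
other_texts = ["lovely day for tea", "new album out today", "tea with friends today"]
model = BinaryAuthorModel(author_texts, other_texts)
ranked = rank_by_characterization(
    model, ["tea today", "huge tremendous win", "fake day"]
)
```

In this sketch the score is an unnormalized sum of per-token log-odds, so longer texts can accumulate larger magnitudes; dividing by token count is one possible adjustment. Relating such scores to likes, replies, or retweets, as the abstract describes, would additionally require popularity counts and a correlation test, neither of which is specified here.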
