Journal: Malaysian Journal of Computer Science

Combining Social-Based Data Mining Techniques To Extract Collective Trends From Twitter

Abstract

Social Networks have become an important environment for extracting Collective Trends. Interactions among users provide information about their preferences and relationships, which can be used to measure the influence of ideas or opinions and how they spread within the Network. Twitter, currently one of the most relevant and popular Social Networks, was created to share comments and opinions. The information its users provide is especially useful in fields and research areas such as marketing. This data takes the form of short text strings containing ideas expressed by real people. Given this representation, Data Mining techniques such as classification and clustering can be used to extract knowledge and distinguish the meaning of the opinions. Complex Network techniques also help to discover influential actors and to study how information propagates inside the Social Network. This work focuses on how clustering and classification techniques can be combined to extract collective knowledge from Twitter. In an initial phase, clustering techniques are applied to extract the main topics from user opinions. The collective knowledge extracted is then used to relabel the dataset according to the clusters obtained, with the aim of improving the classification results. Finally, these results are compared against a dataset manually labelled by human experts to analyse the accuracy of the proposed method.
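
The three phases described in the abstract (cluster, relabel, classify, then compare with expert labels) can be illustrated with a minimal sketch. The abstract does not name the algorithms used, so everything here is an assumption made for illustration: the toy tweets and their "expert" labels are invented, k-means with cosine similarity stands in for the clustering step, a 1-nearest-neighbour rule stands in for the classifier, and the seed indices are chosen by hand.

```python
import math
from collections import Counter

# Invented toy tweets with hypothetical expert labels (not data from the paper).
TWEETS = [
    ("great match tonight the team played well", "sports"),
    ("the team won the match with a late goal", "sports"),
    ("what a goal great win for the team", "sports"),
    ("new phone has a great camera and battery", "tech"),
    ("the phone battery lasts all day", "tech"),
    ("camera on the new phone is amazing", "tech"),
]
STOPWORDS = {"the", "a", "and", "on", "is", "for", "with", "all", "what", "has"}

def tokenize(text):
    return [w for w in text.lower().split() if w not in STOPWORDS]

VOCAB = sorted({w for text, _ in TWEETS for w in tokenize(text)})

def vectorize(text):
    """Bag-of-words counts over the training vocabulary."""
    counts = Counter(tokenize(text))
    return [counts[w] for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def kmeans(vectors, seeds, max_iters=20):
    """Phase 1: k-means with cosine similarity, seeded from the given indices."""
    centroids = [list(vectors[i]) for i in seeds]
    assign = None
    for _ in range(max_iters):
        new_assign = [
            max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        if new_assign == assign:  # assignments stable: converged
            break
        assign = new_assign
        for c in range(len(centroids)):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

vectors = [vectorize(text) for text, _ in TWEETS]

# Phase 1: extract topics by clustering (seeds 0 and 3 are an arbitrary choice).
assign = kmeans(vectors, seeds=[0, 3])

# Phase 2: relabel the dataset with cluster ids and classify new text against it.
def classify(text):
    """1-nearest-neighbour over the cluster-relabelled tweets."""
    v = vectorize(text)
    best = max(range(len(vectors)), key=lambda i: cosine(v, vectors[i]))
    return assign[best]

# Phase 3: compare with expert labels by mapping each cluster to its majority label.
majority = {
    c: Counter(lbl for (_, lbl), a in zip(TWEETS, assign) if a == c).most_common(1)[0][0]
    for c in set(assign)
}
agreement = sum(majority[a] == lbl for (_, lbl), a in zip(TWEETS, assign)) / len(TWEETS)

print("cluster assignments:", assign)
print("cluster -> majority expert label:", majority)
print("agreement with expert labels:", agreement)
print("new tweet cluster:", classify("the team scored a late goal"))
```

Mapping each cluster to its majority expert label is one simple way to score cluster-derived labels against a manually labelled set; the evaluation measure used in the paper may differ.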
