International conference on advances in computing, communications and informatics

Author identification based on word distribution in word space


Abstract

Author attribution has become an increasingly challenging area over the past decade. It is now an essential task in sectors such as forensic analysis, law, and journalism, because it helps identify the author of a document. In this work, unigram/bigram features together with latent semantic features from word space were extracted, and the similarity of a given document was tested using Random Forest, Logistic Regression, and Support Vector Machine classifiers in order to create a global model. The dataset from the PAN Author Identification shared task 2014 was used for the experiments. The proposed model was observed to reach state-of-the-art accuracy of 80%, which is significantly higher than the Author Identification PAN results of 2014.
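As a rough illustration of the unigram/bigram part of this pipeline, the sketch below counts word unigrams and bigrams and compares two documents by cosine similarity. It is a minimal standard-library sketch, not the authors' implementation: the function names `ngram_counts` and `cosine_similarity`, the tokenization rule, and the toy texts are all assumptions, and the latent semantic features and the three classifiers named in the abstract are deliberately left out.

```python
import math
import re
from collections import Counter


def ngram_counts(text):
    """Count word unigrams and bigrams in a lowercased text (hypothetical helper)."""
    # Assumed tokenization: runs of ASCII letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)                 # unigrams, keyed by the word
    counts.update(zip(tokens, tokens[1:]))   # bigrams, keyed by (w1, w2) tuples
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Toy texts, invented for illustration only.
known = ngram_counts("the cat sat on the mat")
candidate = ngram_counts("the cat sat on the rug")
unrelated = ngram_counts("stock prices fell sharply today")

print(cosine_similarity(known, candidate))
print(cosine_similarity(known, unrelated))
```

In a fuller setup, similarity scores like these (or the n-gram vectors themselves, after a dimensionality reduction step) would be passed as features to a classifier; how the paper combines them is not specified in the abstract.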
