Conference: International Conference on Neural Information Processing

Authorship Attribution of Electronic Documents Comparing the Use of Normalized Compression Distance and Support Vector Machine in Authorship Attribution

Abstract

Classifiers make it possible to attribute a text's subject, and even its author, automatically. Previous studies used function words and a Support Vector Machine (SVM) to accomplish this task. We instead use a data-compressor-based approach and a document similarity metric called Normalized Compression Distance (NCD). Tests were performed on the same database as a previous work, composed of 3,000 documents by 100 different authors, to allow direct comparison of the results. The results show that NCD can perform slightly better on this task, depending on the compressor used.
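As a rough illustration of the compression-based idea, NCD is usually defined as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed length in bytes and xy is the concatenation of the two documents. The sketch below is not the authors' implementation: the abstract does not name the compressors or the decision rule, so the use of zlib, the nearest-neighbour rule in the hypothetical `attribute` helper, and the toy texts in the usage example are all assumptions made for illustration.

```python
import zlib
from typing import Callable, Dict

Compressor = Callable[[bytes], bytes]


def ncd(x: bytes, y: bytes, compress: Compressor = zlib.compress) -> float:
    """Normalized Compression Distance between two byte strings.

    The compressor is an assumption; any function mapping bytes to
    compressed bytes (e.g. bz2.compress, lzma.compress) can be passed in.
    """
    cx = len(compress(x))
    cy = len(compress(y))
    cxy = len(compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)


def attribute(unknown: bytes,
              known: Dict[str, bytes],
              compress: Compressor = zlib.compress) -> str:
    """Hypothetical helper: return the author whose reference text is
    closest to `unknown` under NCD (a simple nearest-neighbour rule)."""
    return min(known, key=lambda author: ncd(unknown, known[author], compress))


if __name__ == "__main__":
    # Toy reference texts, invented for this example only.
    known = {
        "A": b"the cat sat on the mat and the cat ate the rat " * 30,
        "B": b"stock markets fell sharply as investors weighed inflation data " * 30,
    }
    unknown = b"the cat ate the rat and the cat sat on the mat " * 10
    print(attribute(unknown, known))
```

Passing a different `compress` function is the only change needed to compare compressors, which is the variable the abstract says the results depend on.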
