Journal: Computer Speech and Language

Siamese networks for large-scale author identification


Abstract

Authorship attribution is the process of identifying the author of a text. Approaches to it have conventionally been divided into classification-based methods, which work well for small numbers of candidate authors, and similarity-based methods, which are applicable to larger numbers of authors or to authors beyond the training set; existing similarity-based methods, however, have embodied only static notions of similarity. Deep learning methods, which blur the boundary between classification-based and similarity-based approaches, are promising in their ability to learn a notion of similarity, but have previously been used only in a conventional small-closed-class classification setup.

Siamese networks have been used to develop learned notions of similarity in one-shot image tasks, and also for NLP tasks that mostly concern semantic relatedness. We examine their application to the stylistic task of authorship attribution on datasets with large numbers of authors, looking at multiple energy functions and neural network architectures, and show that they can substantially outperform previous approaches.
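To make the Siamese idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes a toy character-trigram featurizer, a single untrained tanh layer as the shared encoder, Euclidean and cosine distances as two possible energy functions, and one common form of the contrastive loss; the abstract specifies none of these, and every name here (`featurize`, `make_encoder`, `attribute`, and so on) is hypothetical.

```python
import math
import random
import zlib

def featurize(text, n=3, dim=64):
    """Hash character n-grams into a fixed-length, length-normalised count vector."""
    vec = [0.0] * dim
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    for g in grams:
        vec[zlib.crc32(g.encode("utf-8")) % dim] += 1.0
    total = max(len(grams), 1)
    return [v / total for v in vec]

def make_encoder(in_dim=64, out_dim=16, seed=0):
    """Build one encoder; a Siamese network applies the SAME weights to both inputs.
    The weights here are random (untrained) and serve only as an illustration."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

    def encode(text):
        x = featurize(text, dim=in_dim)
        return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

    return encode

def euclidean_energy(u, v):
    """Energy as Euclidean distance between the two embeddings (assumed choice)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_energy(u, v):
    """Energy as one minus cosine similarity (assumed choice)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm > 0 else 1.0

def contrastive_loss(energy, same_author, margin=1.0):
    """Pull same-author pairs together; push different-author pairs past the margin."""
    if same_author:
        return energy ** 2
    return max(0.0, margin - energy) ** 2

def attribute(encode, query, candidates, energy=cosine_energy):
    """Return the candidate author whose reference text has the lowest energy to the query."""
    q = encode(query)
    return min(candidates, key=lambda author: energy(q, encode(candidates[author])))
```

Because attribution reduces to comparing a query against one reference text per author, a setup of this kind can in principle score authors never seen in training, which is the property the abstract attributes to similarity-based methods. In practice the encoder weights would be trained on labelled pairs by minimising a loss such as `contrastive_loss`.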
