Author identification based on word distribution in word space

Abstract

Author attribution has become an increasingly challenging area over the past decade. It is now an essential task in sectors such as forensic analysis, law, and journalism, among others, because it helps identify the author of a document. Here, unigram and bigram features, together with latent semantic features from word space, are extracted, and the similarity of a given document is tested using Random Forest, Logistic Regression, and Support Vector Machine classifiers in order to create a global model. The dataset from the PAN Author Identification shared task 2014 is used for processing. The proposed model is reported to reach a state-of-the-art accuracy of 80%, which is significantly higher than the results of the PAN Author Identification task of 2014.
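
As a rough illustration of the feature side of the approach described in the abstract, the sketch below counts word unigrams and bigrams and compares two documents by cosine similarity. It is an assumption-laden toy, not the authors' implementation: the helper names `ngram_counts` and `cosine_similarity`, the tokenization rule, and the sample texts are all hypothetical, and the latent semantic features and the Random Forest, Logistic Regression, and Support Vector Machine classifiers mentioned in the abstract are deliberately left out.

```python
from collections import Counter
import math
import re


def ngram_counts(text, n_values=(1, 2)):
    """Count word n-grams (unigrams and bigrams by default) in a text."""
    # Hypothetical tokenization: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy texts invented for illustration only (not taken from the PAN 2014 data).
known = "the ship sailed at dawn and the crew sang"
questioned_same = "the crew sang as the ship sailed at dusk"
questioned_other = "quarterly revenue exceeded analyst forecasts again"

sim_same = cosine_similarity(ngram_counts(known), ngram_counts(questioned_same))
sim_other = cosine_similarity(ngram_counts(known), ngram_counts(questioned_other))
```

In a fuller pipeline, similarity scores or feature vectors of this kind would presumably be passed to a classifier trained on labelled document pairs; how the paper actually combines them is not stated in the abstract.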

Bibliographic details

Source
International conference on advances in computing, communications and informatics, 2015, 5 pages
Authors
Barathi Ganesh H B; Reshma U; Anand Kumar M
Original format
PDF
Chinese Library Classification
Communications
Keywords
Author attribution; Logistic Regression; PAN Author Identification 2014; Random forest tree; Support Vector Machine

Similar literature

1. Automatic Detection of Words Associations in Texts Based on Joint Distribution of Words Occurrences [J]. Santoni Daniele, Pourabbas Elaheh. Computational Intelligence. 2016, No. 4
2. Identification of Words and Phrases Through a Phonemic-Based Haptic Display: Effects of Inter-Phoneme and Inter-Word Interval Durations [J]. Reed Charlotte M., Tan Hong Z., Jiao Yang. ACM Transactions on Applied Perception (TAP). 2021, No. 3
3. Words matter. Spinal Cord asks authors to choose their words carefully [J]. Harvey Lisa A. Spinal cord: the official journal of the International Medical Society of Paraplegia. 2019, No. 4
4. Author identification based on word distribution in word space [C]. Barathi Ganesh H B, Reshma U, Anand Kumar M. International conference on advances in computing, communications and informatics. 2015
5. The Identification of Words and Letters Within Words: A Levels of Processing Analysis [D]. Marmurek, Chil Harvey Howard. 1975
6. Spaced words and kmacs: fast alignment-free sequence comparison based on inexact word matches [O]. Sebastian Horwege, Sebastian Lindner, Marcus Boden. 2014
7. The Word-Space Model: using distributional analysis to represent syntagmatic and paradigmatic relations between words in high-dimensional vector spaces [O]. Sahlgren Magnus. 2006
