A Simple Approach to Classify Fictional and Non-Fictional Genres


Abstract

In this work, we deploy a logistic regression classifier to determine whether a given document belongs to the fiction or non-fiction genre. For genre identification, previous work proposed three classes of features: low-level (character-level and token counts), high-level (lexical and syntactic information), and derived features (type-token ratio, average word length, or average sentence length). Using the recursive feature elimination with cross-validation (RFECV) algorithm, we perform feature selection experiments on an exhaustive set of nineteen features (drawn from all the classes mentioned above) extracted from Brown corpus text. As a result, two simple features, namely the ratio of the number of adverbs to adjectives and the ratio of the number of adjectives to pronouns, turn out to be the most significant. Subsequently, our classification experiments on genre identification of documents from the Brown and Baby BNC corpora demonstrate that a classifier using just these two features performs on par with a classifier using the exhaustive feature set.
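
The two ratio features named in the abstract can be illustrated with a minimal sketch. It assumes each document is already part-of-speech tagged as (word, tag) pairs with coarse tags `ADV`, `ADJ`, and `PRON`. The tag set, the function name `ratio_features`, and the handling of empty denominators are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def ratio_features(tagged_tokens):
    """Return (adverb/adjective, adjective/pronoun) ratios for one document.

    tagged_tokens: iterable of (word, tag) pairs using the coarse tags
    'ADV', 'ADJ' and 'PRON' (an assumed tag set, not the paper's).
    """
    counts = Counter(tag for _, tag in tagged_tokens)
    adv, adj, pron = counts["ADV"], counts["ADJ"], counts["PRON"]
    # Guard against empty denominators; falling back to 0.0 is an
    # arbitrary choice made for this sketch only.
    adv_adj = adv / adj if adj else 0.0
    adj_pron = adj / pron if pron else 0.0
    return adv_adj, adj_pron


# Hand-tagged toy sentence (hypothetical data, for illustration only).
sample = [
    ("she", "PRON"), ("quickly", "ADV"), ("opened", "VERB"),
    ("the", "DET"), ("old", "ADJ"), ("door", "NOUN"),
    ("and", "CONJ"), ("he", "PRON"), ("waited", "VERB"),
    ("quietly", "ADV"), ("there", "ADV"),
]
features = ratio_features(sample)
print(features)
```

Each document would yield one such pair, and the pairs could then be passed as a two-column feature matrix to any logistic regression implementation.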


Bibliographic details

Source
Workshop on storytelling; Annual meeting of the Association for Computational Linguistics | 2019 | pp. 81-89 | 9 pages
Conference location: Florence (IT)
Authors
Mohammed Rameez Qureshi; Rajakrishnan P. Rajkumar; Sidharth Ranjan; Kushal Shah
Author affiliations

IISER Bhopal;

IIT Delhi;

Original format: PDF

