Gender Classification of Twitter Data Based on Textual Meta-Attributes Extraction

Abstract

With the growth of social media in recent years, there has been increasing interest in the automatic characterization of users based on the informal content they generate. In this context, labeling users by demographic categories such as age, ethnicity, origin and race, together with the investigation of other user attributes such as political preferences, personality and gender expression, has received a great deal of attention, especially on Twitter data. The present paper focuses on the task of gender classification, using 60 textual meta-attributes commonly used in text attribution tasks to extract linguistic cues of gender expression from tweets written in Portuguese. It takes into account the characters, syntax, words, structure and morphology of short, multi-genre, content-free texts posted on Twitter in order to classify the author's gender with three different machine-learning algorithms, and to evaluate the influence of the proposed meta-attributes on this process.
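As a rough illustration of what content-free textual meta-attributes can look like, the following minimal sketch computes a handful of character-level, word-level and structural counts for a single tweet. The function name `extract_meta_attributes` and the specific attributes are assumptions chosen for illustration; the abstract does not list the paper's 60 meta-attributes, and this is not the authors' implementation.

```python
import re
import string


def extract_meta_attributes(tweet: str) -> dict:
    """Compute a few illustrative, content-free attributes of one tweet.

    The attribute set here is hypothetical; it is NOT the set of 60
    meta-attributes used in the paper.
    """
    words = tweet.split()  # whitespace tokenization, a simplifying assumption
    n_chars = len(tweet)
    n_words = len(words)
    return {
        # Character-level attributes
        "n_chars": n_chars,
        "uppercase_ratio": sum(c.isupper() for c in tweet) / n_chars if n_chars else 0.0,
        "digit_ratio": sum(c.isdigit() for c in tweet) / n_chars if n_chars else 0.0,
        "n_punctuation": sum(c in string.punctuation for c in tweet),
        # Word-level attributes
        "n_words": n_words,
        "avg_word_length": sum(len(w) for w in words) / n_words if n_words else 0.0,
        # Structural attributes specific to tweets
        "n_hashtags": len(re.findall(r"#\w+", tweet)),
        "n_mentions": len(re.findall(r"@\w+", tweet)),
        "n_urls": len(re.findall(r"https?://\S+", tweet)),
    }


if __name__ == "__main__":
    for name, value in extract_meta_attributes("Bom dia @ana! #sol").items():
        print(f"{name}: {value}")
```

Each tweet becomes a fixed-length numeric vector that does not depend on topic vocabulary, which is the kind of input a standard classifier could then be trained on.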

Bibliographic details

Source
World Conference on Information Systems and Technologies | 2016 | 10 pages
Authors
José Ahirton Batista Lopes Filho; Rodrigo Pasti; Leandro Nunes de Castro;
Original format: PDF
Chinese Library Classification: TP14-532
Keywords
machine-learning; classification; gender; social media; Twitter; extraction; meta-attributes; Portuguese language;


