International Workshop on Semantic Evaluation

UPB at SemEval-2020 Task 12: Multilingual Offensive Language Detection on Social Media by Fine-tuning a Variety of BERT-based Models



Abstract

Offensive language detection is one of the most challenging problems in the natural language processing field, driven by the rising presence of this phenomenon in online social media. This paper describes our Transformer-based solutions for identifying offensive language on Twitter in five languages (i.e., English, Arabic, Danish, Greek, and Turkish), which were employed in Subtask A of the OffensEval 2020 shared task. Several neural architectures (i.e., BERT, mBERT, RoBERTa, XLM-RoBERTa, and ALBERT), pre-trained using both single-language and multilingual corpora, were fine-tuned and compared using multiple combinations of datasets. Finally, the highest-scoring models were used for our submissions in the competition, which ranked our team 21st of 85, 28th of 53, 19th of 39, 16th of 37, and 10th of 46 for English, Arabic, Danish, Greek, and Turkish, respectively.
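The approach described above amounts to placing a binary OFF/NOT classification head on top of a pretrained encoder and fine-tuning. As a minimal, self-contained sketch of that setup, the snippet below substitutes a small random linear "encoder" for the actual pretrained BERT/XLM-RoBERTa model (which the paper fine-tunes) so the example runs without downloading weights; the `OffensiveClassifier` name, toy data, and hyperparameters are illustrative assumptions, not the authors' code.

```python
import torch
from torch import nn

torch.manual_seed(0)

HIDDEN = 32  # stand-in for BERT's 768-dimensional hidden size


class OffensiveClassifier(nn.Module):
    """Binary OFF / NOT head on top of a (stand-in) sentence encoder.

    In the paper's setting the encoder would be a pretrained BERT-family
    model; here a single Linear layer plays that role so the sketch stays
    dependency-light and runnable.
    """

    def __init__(self, hidden=HIDDEN, num_labels=2):
        super().__init__()
        self.encoder = nn.Linear(HIDDEN, hidden)   # placeholder for BERT/XLM-R
        self.dropout = nn.Dropout(0.1)
        self.head = nn.Linear(hidden, num_labels)  # the layer fine-tuning adds

    def forward(self, features):
        pooled = torch.tanh(self.encoder(features))  # mimics a pooled [CLS] state
        return self.head(self.dropout(pooled))


model = OffensiveClassifier()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Toy "sentence embeddings" with linearly separable labels (0 = NOT, 1 = OFF).
x = torch.randn(64, HIDDEN)
y = (x[:, 0] > 0).long()

first_loss = None
for step in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    if first_loss is None:
        first_loss = loss.item()
    loss.backward()
    optimizer.step()

final_loss = loss.item()
```

In the actual systems, the encoder weights are also updated during fine-tuning (not frozen), and the input would be tokenized tweets rather than precomputed embeddings; the training loop structure is otherwise the same.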


