Conference: International Conference on Artificial Neural Networks

Named Entity Recognition for Chinese Social Media with Domain Adversarial Training and Language Modeling


Abstract

Recent years have seen a surge of interest in natural language processing (NLP) for social media, because the massive unstructured data from social media provide valuable information. However, NLP in this domain often suffers from a lack of the large-scale labeled data needed to build models. In this paper, we focus specifically on the task of named entity recognition (NER) for Chinese social media. We propose a neural network model for domain adaptation that builds on domain-adversarial training and language modeling. The model is capable of learning from multiple sources of training data: labeled in-domain data, labeled out-of-domain data, and large-scale unlabeled in-domain data. To demonstrate the effectiveness of our approach, we experiment on an enlarged Chinese social media corpus. Results show that the approach significantly outperforms the baselines.
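
The abstract does not spell out the training objective, so the following is only a minimal sketch of how domain-adversarial training is commonly combined with auxiliary losses, not the authors' implementation. It assumes a shared encoder whose parameters receive the NER gradient, a weighted language-modeling gradient, and the domain-classifier gradient with its sign flipped (the usual gradient-reversal idea). The function name `shared_encoder_gradient` and the weights `lm_weight` and `adv_weight` are hypothetical.

```python
from typing import List, Sequence


def shared_encoder_gradient(
    grad_ner: Sequence[float],
    grad_lm: Sequence[float],
    grad_domain: Sequence[float],
    lm_weight: float = 1.0,   # assumed weight for the language-modeling loss
    adv_weight: float = 0.1,  # assumed weight for the adversarial term
) -> List[float]:
    """Combine per-task gradients with respect to the shared encoder.

    The domain-classifier gradient is subtracted rather than added, so a
    descent step on the result makes the encoder features harder to tell
    apart by domain while still reducing the NER and LM losses.
    """
    if not (len(grad_ner) == len(grad_lm) == len(grad_domain)):
        raise ValueError("all gradients must have the same length")
    return [
        g_ner + lm_weight * g_lm - adv_weight * g_dom
        for g_ner, g_lm, g_dom in zip(grad_ner, grad_lm, grad_domain)
    ]


if __name__ == "__main__":
    # Toy gradients for a two-parameter encoder.
    print(shared_encoder_gradient([1.0, 2.0], [0.5, 0.5], [2.0, -2.0],
                                  lm_weight=1.0, adv_weight=0.5))
```

In a setup like this, a batch would contribute only the terms it has supervision for: for example, unlabeled in-domain text can supply the language-modeling gradient but no NER gradient. That mapping is also an assumption here, not something the abstract states.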
