Drop-out Conditional Random Fields for Twitter with Huge Mined Gazetteer


Abstract

In named entity recognition, especially on massive data such as Twitter, a large collection of high-quality gazetteers can alleviate the scarcity of training data. Large gazetteers with high coverage can be collected from knowledge graphs and phrase embeddings. However, large gazetteers cause a side effect called "feature under-training", in which the gazetteer features overwhelm the context features. To resolve this problem, we propose dropout conditional random fields, which decrease the influence of gazetteer features that carry a high weight. Our experiments on named entity recognition with Twitter data yield an F1 score of 69.38%, about 4% better than the strong baseline presented in Smith and Osborne (2006).
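
The abstract does not spell out how the dropout is applied, so the following is only a minimal sketch of one plausible reading: during training, each gazetteer feature of a token is randomly removed with some probability, so the model cannot rely on gazetteer matches alone and must also learn weights for the context features. The function name `drop_gazetteer_features`, the `gaz:` prefix, the example feature names, and the drop probability are all hypothetical and are not taken from the paper.

```python
import random


def drop_gazetteer_features(features, drop_prob, rng, prefix="gaz:"):
    """Return a copy of `features` in which each gazetteer feature
    (a key starting with `prefix`) is removed with probability `drop_prob`.

    Context features (all other keys) are always kept.
    This is an illustrative assumption, not the paper's exact procedure.
    """
    kept = {}
    for name, value in features.items():
        if name.startswith(prefix) and rng.random() < drop_prob:
            continue  # drop this gazetteer feature for the current training pass
        kept[name] = value
    return kept


# Hypothetical feature dictionary for one token of a tweet.
token_features = {
    "word=paris": 1.0,    # context feature
    "prev=in": 1.0,       # context feature
    "gaz:LOCATION": 1.0,  # gazetteer feature
    "gaz:PERSON": 1.0,    # gazetteer feature
}

rng = random.Random(0)  # seeded so that a training run is repeatable
noisy_features = drop_gazetteer_features(token_features, drop_prob=0.5, rng=rng)
```

In this sketch the corrupted dictionary would be passed to the CRF trainer in place of the original one, with a fresh mask drawn on each pass; at test time the full feature set would be used unchanged.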


Bibliographic record

Source: Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016, pp. 282-288 (7 pages)
Authors: Eunsuk Yang; Young-Bum Kim; Ruhi Sarikaya; Yu-Seop Kim
Original format: PDF

Similar literature


1. Named Entity Recognition for Kannada using Gazetteers list with Conditional Random Fields [J]. K.P. Pallavi, L. Sobha, M.M. Ramya. Journal of computer sciences. 2018, No. 5

2. Named Entity Recognition for Kannada using Gazetteers list with Conditional Random Fields [J]. Pallavi K. P., Sobha L., Ramya M. M. Journal of computer sciences. 2018, No. 5

3. A combination of active learning and self-learning for named entity recognition on Twitter using conditional random fields [J]. Van Cuong Tran, Ngoc Thanh Nguyen, Fujita Hamido. Knowledge-Based Systems. 2017, No. sepa15

4. Drop-out Conditional Random Fields for Twitter with Huge Mined Gazetteer [C]. Eunsuk Yang, Young-Bum Kim, Ruhi Sarikaya. Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016

5. Selected Topics in Spatial Statistical Analysis: Nonstationary Vector Kriging, Large Scale Conditional Simulation of Three-Dimensional Gaussian Random Fields, and Hypothesis Testing in a Correlated Random Field [D]. Quimby, William F. 1986

6. Semi-Supervised Bidirectional Long Short-Term Memory and Conditional Random Fields Model for Named-Entity Recognition Using Embeddings from Language Models Representations [O]. Min Zhang, Guohua Geng, Jing Chen. 2020

7. Named Entity Recognition for Kannada using Gazetteers list with Conditional Random Fields [O]. K. P. Pallavi, L. Sobha, M. M. Ramya. 2018
