Conference: AAAI Workshop on Analyzing Microtext

Hardtoparse: POS Tagging and Parsing the Twitter-verse

Abstract

We evaluate the statistical dependency parser, Malt, on a new dataset of sentences taken from tweets. We use a version of Malt which is trained on gold standard phrase structure Wall Street Journal (WSJ) trees converted to Stanford labeled dependencies. We observe a drastic drop in performance moving from our in-domain WSJ test set to the new Twitter dataset, much of which has to do with the propagation of part-of-speech tagging errors. Retraining Malt on dependency trees produced by a state-of-the-art phrase structure parser, which has itself been self-trained on Twitter material, results in a significant improvement. We analyse this improvement by examining in detail the effect of the retraining on individual dependency types.
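The abstract does not say how the per-dependency-type analysis was computed. As a minimal sketch, assuming each token is represented by a hypothetical `(head_index, label)` pair and that gold and predicted analyses are token-aligned, a labeled attachment score and a per-label breakdown could be computed along these lines. The function names and the data layout are illustrative assumptions, not the authors' evaluation code.

```python
from collections import defaultdict


def labeled_attachment_score(gold, pred):
    """Fraction of tokens whose predicted head AND label both match gold.

    gold, pred: token-aligned lists of (head_index, label) pairs
    (an assumed layout, chosen only for this illustration).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must be token-aligned")
    if not gold:
        return 0.0
    hits = sum(1 for g, p in zip(gold, pred) if g == p)
    return hits / len(gold)


def per_label_recall(gold, pred):
    """For each gold dependency label, the share of its tokens that the
    parser attached to the right head with the right label."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must be token-aligned")
    totals = defaultdict(int)
    correct = defaultdict(int)
    for (g_head, g_label), (p_head, p_label) in zip(gold, pred):
        totals[g_label] += 1
        if g_head == p_head and g_label == p_label:
            correct[g_label] += 1
    return {label: correct[label] / totals[label] for label in totals}


if __name__ == "__main__":
    # Toy three-token sentence; head index 0 stands for the artificial root.
    gold = [(2, "nsubj"), (0, "root"), (2, "dobj")]
    pred = [(2, "nsubj"), (0, "root"), (1, "dobj")]  # wrong head on token 3
    print(labeled_attachment_score(gold, pred))
    print(per_label_recall(gold, pred))
```

Comparing the per-label dictionary before and after retraining would show which dependency types account for an overall change, which is the kind of breakdown the abstract describes.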
