
A Framework for Developing and Evaluating Word Embeddings of Drug-named Entity


Abstract

We investigate the quality of task-specific word embeddings created from relatively small, targeted corpora. We present a comprehensive evaluation framework, covering both intrinsic and extrinsic evaluation, that can be extended to named entities beyond drug names. The intrinsic evaluation shows that drug name embeddings created from a domain-specific document corpus outperform previously published embeddings derived from a very large general-text corpus. The extrinsic evaluation applies the word embeddings to drug name recognition with a Bi-LSTM model; the results demonstrate the advantage of using domain-specific word embeddings as the only input feature, with the F1-score reaching 0.91. This work suggests that it may be advantageous to derive domain-specific embeddings for certain tasks even when the domain-specific corpus is of limited size.
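
As a rough illustration of what an intrinsic evaluation of drug name embeddings can look like, the sketch below checks whether each word's nearest neighbor under cosine similarity belongs to the same drug class. This is an assumed, simplified stand-in and not necessarily the metric used in the paper; the function names, the toy vectors, and the class labels are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbor_accuracy(embeddings, labels):
    """Fraction of words whose nearest neighbor (by cosine) shares their label."""
    words = list(embeddings)
    hits = 0
    for word in words:
        # Pick the most similar other word in the vocabulary.
        best = max(
            (other for other in words if other != word),
            key=lambda other: cosine(embeddings[word], embeddings[other]),
        )
        if labels[best] == labels[word]:
            hits += 1
    return hits / len(words)


# Hypothetical 3-dimensional toy vectors; NOT data from the paper.
toy_embeddings = {
    "ibuprofen": [0.9, 0.1, 0.0],
    "naproxen": [0.8, 0.2, 0.1],
    "amoxicillin": [0.1, 0.9, 0.2],
    "penicillin": [0.0, 0.8, 0.3],
}
# Hypothetical class labels used only for this illustration.
toy_labels = {
    "ibuprofen": "nsaid",
    "naproxen": "nsaid",
    "amoxicillin": "antibiotic",
    "penicillin": "antibiotic",
}

accuracy = nearest_neighbor_accuracy(toy_embeddings, toy_labels)
print(f"nearest-neighbor class accuracy: {accuracy:.2f}")
```

In practice the same check would be run once on embeddings trained on the domain-specific corpus and once on general-purpose embeddings, so that the two scores can be compared on an identical word list.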
