
A Bootstrapping-based Method to Automatically Identify Data-usage Statements in Publications


Abstract

Purpose: Our study proposes a bootstrapping-based method to automatically extract data-usage statements from academic texts.

Design/methodology/approach: The extraction method starts with seed entities and iteratively learns patterns and data-usage statements from unlabeled text. In each iteration, new patterns are constructed and added to the pattern list based on their calculated score. Three seed-selection strategies are also proposed in this paper.

Findings: The performance of the method is verified by experiments on real data collected from computer science journals. The results show that the method achieves satisfactory performance in terms of extraction precision and the extensibility of the obtained patterns.

Research limitations: While the triple representation of sentences is effective and efficient for extracting data-usage statements, it cannot handle complex sentences. Additional features that can address complex sentences should thus be explored in the future.

Practical implications: Data-usage statement extraction is beneficial for data-repository construction and facilitates research on data-usage tracking, dataset-based scholar search, and dataset evaluation.

Originality/value: To the best of our knowledge, this paper is among the first to address the important task of automatically extracting data-usage statements from real data.
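The iterative loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring formula, the pattern format, or the seed-selection strategies, so everything specific here is an assumption. Sentences are reduced to (subject, predicate, object) triples, a pattern is taken to be a (subject, predicate) pair, and a pattern's score is assumed to be the share of its objects that are already known dataset entities. The names `bootstrap` and `min_score` and the toy data are hypothetical.

```python
from collections import defaultdict

# Assumption: each sentence is already reduced to a (subject, predicate, object) triple.
Triple = tuple[str, str, str]


def bootstrap(triples, seeds, min_score=0.5, max_iters=5):
    """Grow patterns and data-usage statements from seed dataset names."""
    entities = set(seeds)   # known dataset names, starting with the seeds
    patterns = set()        # accepted (subject, predicate) patterns
    statements = set()      # triples accepted as data-usage statements

    # Group the objects observed with each candidate pattern.
    objects_by_pattern = defaultdict(set)
    for subj, pred, obj in triples:
        objects_by_pattern[(subj, pred)].add(obj)

    for _ in range(max_iters):
        # Assumed score: fraction of a pattern's objects that are known entities.
        new_patterns = set()
        for pattern, objs in objects_by_pattern.items():
            if pattern in patterns:
                continue
            score = len(objs & entities) / len(objs)
            if score >= min_score:
                new_patterns.add(pattern)
        patterns |= new_patterns

        # Apply all accepted patterns to collect statements not seen before.
        new_statements = {t for t in triples if (t[0], t[1]) in patterns} - statements
        if not new_patterns and not new_statements:
            break  # nothing new was learned, so stop iterating
        statements |= new_statements
        entities |= {obj for _, _, obj in new_statements}

    return patterns, statements, entities


# Toy corpus (made up for illustration only).
triples = [
    ("we", "evaluate on", "MNIST"),
    ("we", "evaluate on", "CIFAR-10"),
    ("we", "train on", "CIFAR-10"),
    ("we", "train on", "ImageNet"),
    ("we", "thank", "the reviewers"),
]
patterns, statements, entities = bootstrap(triples, seeds={"MNIST"})
print(sorted(patterns))
print(sorted(entities))
```

On this toy corpus the seed "MNIST" first admits the "evaluate on" pattern, which surfaces "CIFAR-10"; that in turn admits "train on" in the next iteration, while "thank" never reaches the threshold. A real system would need a more robust score and pattern representation than this sketch assumes.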
