INTRINSIC EVALUATION OF TEXT MINING TOOLS MAY NOT PREDICT PERFORMANCE ON REALISTIC TASKS

Abstract

Biomedical text mining and other automated techniques are beginning to achieve performance which suggests that they could be applied to aid database curators. However, few studies have evaluated how these systems might work in practice. In this article we focus on the problem of annotating mutations in Protein Data Bank (PDB) entries, and evaluate the relationship between performance of two automated techniques, a text-mining-based approach (MutationFinder) and an alignment-based approach, in intrinsic versus extrinsic evaluations. We find that high performance on gold standard data (an intrinsic evaluation) does not necessarily translate to high performance for database annotation (an extrinsic evaluation). We show that this is in part a result of lack of access to the full text of journal articles, which appears to be critical for comprehensive database annotation by text mining. Additionally, we evaluate the accuracy and completeness of manually annotated mutation data in the PDB, and find that it is far from perfect. We conclude that currently the most cost-effective and reliable approach for database annotation might incorporate manual and automatic annotation methods.

Bibliographic details

Source
Pacific Symposium on Biocomputing; January 4-8, 2008; Kohala Coast, HI (US) | 2008 | pp. 640-651 | 2 pages total
Conference location: Kohala Coast, HI (US)
Authors
J. GREGORY CAPORASO; NITA DESHPANDE; J. LYNN FINK; PHILIP E. BOURNE; K. BRETONNEL COHEN; LAWRENCE HUNTER
Author affiliations

Center for Computational Pharmacology University of Colorado Health Sciences Center, Aurora, CO, USA;

PrescientSoft Inc., San Diego, CA, USA;

Skaggs School of Pharmacy and Pharmaceutical Sciences University of California, San Diego, San Diego, CA, USA;

Original format: PDF
Language: English
Chinese Library Classification: Computer applications; Bioengineering (biotechnology)

