Identifying Key Sentences for Precision Oncology Using Semi-Supervised Learning

Abstract

We present a machine learning pipeline that identifies key sentences in abstracts of oncological articles to aid evidence-based medicine. This problem is characterized by the lack of gold standard datasets, data imbalance, and thematic differences between available silver standard corpora. Additionally, the available training data and the target data differ in domain (professional summaries vs. sentences in abstracts). This makes supervised machine learning inapplicable. We propose two semi-supervised machine learning approaches. To mitigate difficulties arising from heterogeneous data sources, overcome data imbalance, and create reliable training data, we propose transductive learning from positive and unlabelled data (PU Learning). To obtain a realistic classification model, we propose using abstracts summarised in relevant sentences as unlabelled examples through Self-Training. The best model achieves 84% accuracy and an F1 score of 0.84 on our dataset.
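The Self-Training idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say which classifier or confidence rule they use, so the tiny Naive Bayes model, the `threshold` parameter, the `self_train` helper, and the toy sentences below are all assumptions made for illustration. The loop trains on labelled sentences, labels the unlabelled sentences it is confident about, adds them to the training set, and retrains.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict_proba(self, text):
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    count = self.word_counts[c][word] + 1
                    score += math.log(count / (total_words + len(self.vocab)))
            log_scores[c] = score
        # Normalise log scores into probabilities in a numerically stable way.
        top = max(log_scores.values())
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp.values())
        return {c: v / norm for c, v in exp.items()}


def self_train(labeled, unlabeled, threshold=0.9, max_rounds=5):
    """Hypothetical helper: grow the training set with confident predictions."""
    texts = [t for t, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    model = NaiveBayes().fit(texts, labels)
    for _ in range(max_rounds):
        remaining, added = [], 0
        for text in pool:
            proba = model.predict_proba(text)
            label, confidence = max(proba.items(), key=lambda kv: kv[1])
            if confidence >= threshold:
                texts.append(text)
                labels.append(label)
                added += 1
            else:
                remaining.append(text)
        pool = remaining
        if added == 0:  # nothing confident enough: stop early
            break
        model = NaiveBayes().fit(texts, labels)
    return model, pool


# Toy data (invented): 1 = key sentence, 0 = not a key sentence.
labeled = [
    ("mutation confers resistance to therapy", 1),
    ("patients were enrolled at two sites", 0),
]
unlabeled = ["mutation predicts resistance", "patients were enrolled"]

model, leftover = self_train(labeled, unlabeled, threshold=0.75)
print(leftover)
print(model.predict_proba("mutation confers resistance"))
```

A lower `threshold` absorbs more unlabelled sentences per round at the risk of reinforcing early mistakes; a higher one keeps the training set cleaner but may leave most of the pool unused.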

Bibliographic details

Source
Annual meeting of the Association for Computational Linguistics; Workshop on biomedical natural language processing | 2018 | pp. 35-46 | 12 pages
Authors
Jurica Seva; Martin Wackerbauer; Ulf Leser
Original format: PDF

Similar literature

1. Combining active learning and semi-supervised learning techniques to extract protein interaction sentences [J]. Min Song, Hwanjo Yu, Wook-Shin Han. BMC Bioinformatics. 2011, Issue SUPPLEMENTa12
2. Effective semi-supervised learning strategies for automatic sentence segmentation [J]. Dalva Dogan, Guz Umit, Gurkan Hakan. Pattern Recognition Letters. 2018, Issue APRa1
3. Identifying Health Information Technology Needs of Oncologists to Facilitate the Adoption of Genomic Medicine: Recommendations From the 2016 American Society of Clinical Oncology Omics and Precision Oncology Workshop [J]. Hughes Kevin S., Ambinder Edward P., Hess Gregory P. Journal of Clinical Oncology. 2017, Issue 27
4. Identifying Key Sentences for Precision Oncology Using Semi-Supervised Learning [C]. Jurica Seva, Martin Wackerbauer, Ulf Leser. Annual meeting of the Association for Computational Linguistics. 2018
5. Semi-Supervised Learning for Electronic Phenotyping in Support of Precision Medicine [D]. Halpern, Yonatan. 2016
6. Combining active learning and semi-supervised learning techniques to extract protein interaction sentences [O]. Min Song, Hwanjo Yu, Wook-Shin Han. 2011
7. Combining active learning and semi-supervised learning techniques to extract protein interaction sentences [O]. 2011
