Journal: BMC Medical Informatics and Decision Making

Ordinal labels in machine learning: a user-centered approach to improve data validity in medical settings

Abstract

Vagueness and uncertainty are intrinsic to any medical act, interpretation and decision (including acts of data reporting and the representation of relevant medical conditions), yet little research has focused on how to take this uncertainty into account explicitly. In this paper, we focus on the representation of a general and widespread medical terminology, grounded in a traditional and well-established convention, that represents the severity of health conditions (for instance, pain or visible signs) on a scale ranging from Absent to Extreme. Specifically, we study how both potential patients and doctors perceive the different levels of the terminology in quantitative and qualitative terms, and whether the embedded user knowledge can improve the representation of ordinal values in the construction of machine learning models. To this aim, we conducted a questionnaire-based study involving a relatively large sample of 1,152 potential patients and 31 clinicians, who were asked to represent numerically the perceived meaning of standard and widely applied labels used to describe health conditions. Using the collected values, we then present and discuss different possible fuzzy-set-based representations that address the vagueness of medical interpretation by taking into account the perceptions of domain experts. We also apply the findings of this user study to evaluate the impact of different encodings on the predictive performance of common machine learning models on a real-world medical prognostic task. We found significant differences in the perception of pain levels between the two user groups. We also show that the proposed encodings can improve the performance of specific classes of models, and we discuss when this is the case. Looking ahead, we hope that the proposed techniques for ordinal scale representation and ordinal encoding will be useful to the research community, and that our methodology will be applied to other widely used ordinal scales to improve the validity of datasets and the results of machine learning tasks.
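
To make the idea of a perception-based ordinal encoding more concrete, the following is a minimal sketch and not the authors' implementation. It contrasts a plain rank encoding with an encoding built from numeric values that respondents assign to each label, and derives a simple triangular fuzzy set per label. The intermediate label names, the 0-100 scale, the response values, and all function names are assumptions made purely for illustration; none of them come from the study.

```python
from statistics import mean, median

# Hypothetical questionnaire responses: each respondent places a label on a
# 0-100 scale. These numbers are invented for illustration only; they are
# NOT the study's data, and the intermediate label names are assumed.
responses = {
    "Absent": [0, 0, 5, 0],
    "Mild": [20, 25, 15, 30],
    "Moderate": [50, 45, 55, 40],
    "Severe": [75, 80, 70, 85],
    "Extreme": [100, 95, 100, 90],
}


def rank_encoding(labels):
    """Baseline: map each label to its position (0, 1, 2, ...)."""
    return {label: i for i, label in enumerate(labels)}


def perceived_encoding(resp):
    """Map each label to the mean value that respondents assigned to it."""
    return {label: mean(vals) for label, vals in resp.items()}


def triangular_set(vals):
    """Summarise responses as a triangular fuzzy set (low, peak, high)."""
    return (min(vals), median(vals), max(vals))


def membership(x, tri):
    """Degree (0..1) to which the value x belongs to a triangular fuzzy set."""
    low, peak, high = tri
    if x == peak:
        return 1.0
    if x <= low or x >= high:
        return 0.0
    if x < peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)


if __name__ == "__main__":
    print(rank_encoding(list(responses)))
    print(perceived_encoding(responses))
    moderate = triangular_set(responses["Moderate"])
    print(moderate, membership(43.75, moderate))
```

In this sketch the rank encoding spaces the labels evenly, whereas the perceived encoding lets the gaps between labels follow the respondents' answers. Whether such an encoding helps a given model is an empirical question, as the abstract notes.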