Journal: Chemometrics and Intelligent Laboratory Systems

Investigating the need for preprocessing of near-infrared spectroscopic data as a function of sample size



Abstract

Preprocessing of near-infrared (NIR) spectra is an essential part of multivariate calibration. Its main aim is to remove artefacts introduced during measurement in order to improve prediction performance or interpretation. However, preprocessing can have undesired side effects. Additionally, calibration algorithms can learn to deal with artefacts by themselves when enough samples are available. This may change the effect that preprocessing has on prediction performance as the calibration dataset grows. In this paper we investigate the interaction between the size of the calibration data and preprocessing in NIR calibrations for several datasets. Results show that extending the calibration data with more samples improves prediction performance, regardless of the preprocessing strategy. Although prediction performance almost always benefits from preprocessing, extending the calibration data can reduce the effect of preprocessing on prediction performance. This means the optimal preprocessing strategy may change as a function of the number of samples. It is demonstrated that using a Design of Experiments (DoE) approach to determine the optimal preprocessing strategy leads to equal or better prediction performance for all calibration set sizes compared with not preprocessing at all. Preprocessing is most valuable for small calibration sets, but it can become obsolete or even harmful as the calibration set grows. Therefore, we recommend always evaluating the effect of a preprocessing strategy before making or updating calibration models.
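The kind of evaluation the abstract recommends can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the spectra are synthetic, the two preprocessing steps (standard normal variate and a first difference) and the ridge calibration model are stand-ins chosen for brevity, and the full-factorial on/off grid is only loosely in the spirit of the DoE approach mentioned above. All function names are hypothetical.

```python
import itertools
import numpy as np

def snv(X):
    """Standard normal variate: center and scale each spectrum (row)."""
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

def first_derivative(X):
    """Simple first difference along the wavelength axis."""
    return np.diff(X, axis=1)

def make_spectra(n, rng, p=60):
    """Synthetic spectra with additive offset and multiplicative scatter (assumed data)."""
    wl = np.linspace(0.0, 1.0, p)
    peak = np.exp(-((wl - 0.5) ** 2) / 0.01)   # analyte band
    background = 1.0 + 0.5 * wl                # sloping background
    y = rng.uniform(0.0, 1.0, n)
    pure = y[:, None] * peak + background
    scale = rng.normal(1.0, 0.1, (n, 1))       # multiplicative scatter artefact
    offset = rng.normal(0.0, 0.2, (n, 1))      # additive baseline artefact
    noise = rng.normal(0.0, 0.01, (n, p))
    return scale * pure + offset + noise, y

def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression on mean-centered data; returns a predictor."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return lambda X_new: (X_new - x_mean) @ w + y_mean

def preprocess(X, use_snv, use_deriv):
    """Apply the selected steps in a fixed order: SNV first, then derivative."""
    if use_snv:
        X = snv(X)
    if use_deriv:
        X = first_derivative(X)
    return X

def evaluate(sizes=(10, 40, 160), seed=0, n_test=200):
    """RMSEP on a fixed test set for every strategy and calibration set size."""
    rng = np.random.default_rng(seed)
    X_cal, y_cal = make_spectra(max(sizes), rng)
    X_test, y_test = make_spectra(n_test, rng)
    results = {}
    # Full factorial over the two on/off preprocessing factors.
    for use_snv, use_deriv in itertools.product([False, True], repeat=2):
        P_cal = preprocess(X_cal, use_snv, use_deriv)
        P_test = preprocess(X_test, use_snv, use_deriv)
        per_size = {}
        for n in sizes:
            predict = fit_ridge(P_cal[:n], y_cal[:n])
            per_size[n] = float(np.sqrt(np.mean((predict(P_test) - y_test) ** 2)))
        results[(use_snv, use_deriv)] = per_size
    return results

results = evaluate()
for (use_snv, use_deriv), per_size in results.items():
    print(f"SNV={use_snv!s:5} derivative={use_deriv!s:5}", per_size)
```

Comparing the rows of the printed table at each calibration size shows whether a given strategy still helps as more samples are added. On real data the same loop would be run with the actual calibration model and a proper validation scheme.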
