Investigating the need for preprocessing of near-infrared spectroscopic data as a function of sample size

Schoot Mark; Kapper Christiaan; van Kollenburg Geert H.; Postma Geert J.; van Kessel Gijs; Buydens Lutgarde M. C.; Jansen Jeroen J.

Abstract

Preprocessing of near-infrared (NIR) spectra is an essential part of multivariate calibration. It mainly aims to remove artefacts introduced during measurement in order to improve prediction performance or interpretation. However, preprocessing can have undesired side effects. Additionally, calibration algorithms can learn to deal with artefacts by themselves when enough samples are available. This may change the effect preprocessing has on prediction performance as the calibration dataset grows. In this paper we investigate the interaction between the size of the calibration data and preprocessing in NIR calibrations for several datasets. Results show that extending the calibration data with more samples improves prediction performance, regardless of the preprocessing strategy. Although prediction performance almost always benefits from preprocessing, extending the calibration data can reduce the effect of preprocessing on prediction performance. This means the optimal preprocessing strategy may change as a function of the number of samples. It is demonstrated that using a Design of Experiments (DoE) approach to determine the optimal preprocessing strategy leads to equal or better prediction performance for all calibration set sizes compared to not preprocessing at all. Preprocessing is most valuable for small calibration sets, but as the calibration set grows it can become obsolete or even harmful. Therefore, we recommend always evaluating the effect of a preprocessing strategy before making or updating calibration models.
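
The recommendation to evaluate preprocessing at each calibration set size can be illustrated with a small sketch. This is not the authors' DoE procedure: it assumes standard normal variate (SNV) as one example preprocessing step, a one-component PLS1 model as the calibration algorithm, and fully synthetic spectra. All function names (`snv`, `simulate`, `fit_pls1`, `learning_curve`) and all simulation parameters are hypothetical choices made for illustration only.

```python
import math
import random
import statistics


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and std."""
    mean = sum(spectrum) / len(spectrum)
    std = statistics.pstdev(spectrum)
    return [(v - mean) / std for v in spectrum]


def simulate(n_samples, rng, n_wavelengths=30):
    """Hypothetical spectra: analyte peak plus sloping baseline, distorted by scatter."""
    X, y = [], []
    for _ in range(n_samples):
        conc = rng.uniform(0.0, 1.0)
        gain = rng.uniform(0.8, 1.2)      # multiplicative scatter (assumed artefact)
        offset = rng.uniform(-0.2, 0.2)   # additive baseline shift (assumed artefact)
        row = []
        for j in range(n_wavelengths):
            peak = conc * math.exp(-((j - 15) / 4.0) ** 2)
            baseline = 1.0 + 0.02 * j
            row.append(gain * (peak + baseline) + offset + rng.gauss(0.0, 0.005))
        X.append(row)
        y.append(conc)
    return X, y


def fit_pls1(X, y):
    """One-component PLS1 on mean-centred data; returns (x_mean, y_mean, coefficients)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]  # weights ~ X'y
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]   # scores
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)  # y-loading
    return x_mean, y_mean, [q * v for v in w]


def predict(model, spectrum):
    x_mean, y_mean, coef = model
    return y_mean + sum((x - m) * c for x, m, c in zip(spectrum, x_mean, coef))


def rmsep(model, X, y):
    """Root mean squared error of prediction on a held-out set."""
    errors = [predict(model, row) - target for row, target in zip(X, y)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def learning_curve(sizes, seed=0):
    """RMSEP for each (preprocessing, calibration size) pair on a fixed test set."""
    rng = random.Random(seed)
    X_cal, y_cal = simulate(max(sizes), rng)
    X_test, y_test = simulate(200, rng)
    strategies = {"none": lambda s: s, "snv": snv}
    results = {}
    for name, prep in strategies.items():
        P_cal = [prep(s) for s in X_cal]
        P_test = [prep(s) for s in X_test]
        for n in sizes:
            model = fit_pls1(P_cal[:n], y_cal[:n])
            results[(name, n)] = rmsep(model, P_test, y_test)
    return results


if __name__ == "__main__":
    for (name, n), err in sorted(learning_curve([10, 40, 160]).items()):
        print(f"{name:>4}  n={n:<4d} RMSEP={err:.4f}")
```

Running the script prints one RMSEP value per preprocessing option and calibration size, which is the kind of comparison the abstract recommends repeating whenever a calibration model is built or updated. Which option comes out ahead depends entirely on the simulated artefacts and says nothing about the datasets in the paper.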


Bibliographic details

Source
Chemometrics and Intelligent Laboratory Systems | 2020, Issue 1 | 8 pages
Authors
Schoot Mark; Kapper Christiaan; van Kollenburg Geert H.; Postma Geert J.; van Kessel Gijs; Buydens Lutgarde M. C.; Jansen Jeroen J.;
Author affiliations

Nutricontrol NCB Laan 52 NL-5462 GE Veghel Netherlands;

Nutricontrol NCB Laan 52 NL-5462 GE Veghel Netherlands;

Radboud Univ Nijmegen Inst Mol & Mat POB 9010 NL-6500 GL Nijmegen Netherlands;

Radboud Univ Nijmegen Inst Mol & Mat POB 9010 NL-6500 GL Nijmegen Netherlands;

Agrifirm Innovat Ctr BV Agrifirm Landgoedlaan 20 NL-7325 AW Apeldoorn Netherlands;

Radboud Univ Nijmegen Inst Mol & Mat POB 9010 NL-6500 GL Nijmegen Netherlands;

Radboud Univ Nijmegen Inst Mol & Mat POB 9010 NL-6500 GL Nijmegen Netherlands;

Indexing information
Original format: PDF
Language of text: English
Chinese Library Classification: Metrology;
Keywords
Calibration modelling; Preprocessing; Design of experiments; NIR; Spectroscopy; Model maintenance;


Similar literature

1. Investigating the need for preprocessing of near-infrared spectroscopic data as a function of sample size [J]. Schoot Mark, Kapper Christiaan, van Kollenburg Geert H., Chemometrics and Intelligent Laboratory Systems. 2020, Issue 1

2. Towards automated preprocessing of bulk data in digital forensic investigations using hash functions [J]. Harald Baier, Information Technology. 2015, Issue 6

3. Multi-Objective Genetic Algorithm-Based Sample Selection for Partial Least Squares Model Building with Applications to Near-Infrared Spectroscopic Data [J]. HIDEYUKI SHINZAWA, BOYAN LI, TAKEHIRO NAKAGAWA, Applied Spectroscopy: Society for Applied Spectroscopy. 2006, Issue 6

4. Built-in hyperspectral camera for smartphone in visible, near-infrared and middle-infrared lights region (first report): Trial products of beans-size Fourier-spectroscopic line-imager and feasibility experimental results of middle infrared spectroscopic i [C]. Ichiro ISHIMARU, Natsumi KAWASHIMA, Satsuki HOSONO, Next-Generation Spectroscopic Technologies IX. 2016

5. Spectroscopic, electrochemical and structural investigations of cofacial and facially functionalized porphyrin compounds synthesized via metal-mediated [2+2+2] cycloaddition methodology. [D]. Fletcher, James Terrence. 2001

6. A comprehensive evaluation of collapsing methods using simulated and real data: excellent annotation of functionality and large sample sizes required [O]. Carmen Dering, Inke R. König, Laura B. Ramsey, 2014

7. Near-infrared K-band Spectroscopic Investigation of Seyfert 2 Nuclei in the CfA and 12 Micron Samples [O]. Imanishi, M, Alonso-Herrero, A 2004

8. Empirical Keying of Biographical Data: Cross-Validity as a Function of Scaling Procedures and Sample Size. (Reannouncement with New Availability Information) [R]. Devlin, S. E., Abrahams, N. M., Edwards, J. E. 1992
