Journal: Parasitology

Discrimination of fish populations using parasites: Random Forests on a 'predictable' host-parasite system.

Abstract

We address the effect of spatial scale and temporal variation on model generality when forming predictive models for fish assignment using a new data mining approach, Random Forests (RF), to variable biological markers (parasite community data). Models were implemented for a fish host-parasite system sampled along the Mediterranean and Atlantic coasts of Spain and were validated using independent datasets. We considered 2 basic classification problems in evaluating the importance of variations in parasite infracommunities for assignment of individual fish to their populations of origin: multiclass (2-5 population models, using 2 seasonal replicates from each of the populations) and 2-class task (using 4 seasonal replicates from 1 Atlantic and 1 Mediterranean population each). The main results are that (i) RF are well suited for multiclass population assignment using parasite communities in non-migratory fish; (ii) RF provide an efficient means for model cross-validation on the baseline data and this allows sample size limitations in parasite tag studies to be tackled effectively; (iii) the performance of RF is dependent on the complexity and spatial extent/configuration of the problem; and (iv) the development of predictive models is strongly influenced by seasonal change and this stresses the importance of both temporal replication and model validation in parasite tagging studies.
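Result (ii) refers to the out-of-bag (OOB) estimate that bagged tree ensembles provide: each tree is trained on a bootstrap sample of the baseline fish, so the fish left out of that sample can serve as a built-in test set. The sketch below illustrates the idea only and is not the authors' implementation. It uses single-split trees (stumps) with random feature subsets as a much-simplified stand-in for a full Random Forest, and the parasite counts, taxa, and population ranges are invented synthetic values. All function names are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_stump(rows, labels, features):
    """Return (feature, threshold, left_label, right_label) with the lowest
    weighted Gini impurity, or None if no valid split exists."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    return None if best is None else best[1:]

def fit_forest(rows, labels, n_trees=200, n_features=2, seed=0):
    """Bagging: each stump sees a bootstrap sample and a random feature subset.
    The in-bag indices are stored so OOB fish can be identified later."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feats = rng.sample(range(len(rows[0])), n_features)
        stump = best_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        if stump is not None:
            forest.append((stump, set(idx)))
    return forest

def predict_one(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label

def predict(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_one(s, row) for s, _ in forest)
    return votes.most_common(1)[0][0]

def oob_accuracy(forest, rows, labels):
    """Each fish is scored only by the stumps that did not train on it."""
    correct = counted = 0
    for i, (row, y) in enumerate(zip(rows, labels)):
        votes = Counter(predict_one(s, row) for s, in_bag in forest if i not in in_bag)
        if votes:
            counted += 1
            correct += int(votes.most_common(1)[0][0] == y)
    return correct / counted

def simulate_fish(rng, population, n):
    """Synthetic counts for 4 hypothetical parasite taxa (invented ranges);
    taxa 0 and 1 differ between populations, taxa 2 and 3 are noise."""
    ranges = {"Atlantic": [(5, 15), (0, 3), (0, 5), (0, 5)],
              "Mediterranean": [(0, 4), (4, 10), (0, 5), (0, 5)]}
    return [[rng.randint(lo, hi) for lo, hi in ranges[population]] for _ in range(n)]

rng = random.Random(42)
populations = ["Atlantic", "Mediterranean"]

# Baseline sample: 40 fish per population.
baseline_rows, baseline_labels = [], []
for pop in populations:
    baseline_rows += simulate_fish(rng, pop, 40)
    baseline_labels += [pop] * 40

forest = fit_forest(baseline_rows, baseline_labels)
oob = oob_accuracy(forest, baseline_rows, baseline_labels)

# Independent validation sample, drawn separately from the baseline.
val_rows, val_labels = [], []
for pop in populations:
    val_rows += simulate_fish(rng, pop, 20)
    val_labels += [pop] * 20
val_acc = sum(predict(forest, r) == y for r, y in zip(val_rows, val_labels)) / len(val_rows)

print(f"OOB accuracy: {oob:.2f}, independent validation accuracy: {val_acc:.2f}")
```

Because the synthetic validation fish come from the same distribution as the baseline, the two accuracies agree closely here. Result (iv) of the abstract is a warning that real samples taken in a different season may not behave this way, which is why the OOB estimate complements rather than replaces validation on independent data.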
