Journal: IFAC PapersOnLine

CIRAL: a hybrid active learning framework for plankton taxa labeling

Abstract

Given the complex structure of planktonic species and the immense amount of data captured by autonomous underwater vehicles (AUVs), plankton taxa labeling places a large burden on domain experts. At the same time, the most prominent machine learning (ML) methods for classification rely heavily on large labeled datasets to train neural network classifiers that perform their tasks accurately. Active Learning (AL) is an ML paradigm that reduces this manual effort with algorithms that support the construction of training datasets, enlarging the sets while minimizing human involvement. To build the training set, AL methods apply heuristics to select a subset of images, i.e., samples, from the entire data. Selected samples that capture the common statistical patterns of the feature space are likely to include the information needed for training and learning. In addition, the algorithm should prioritize samples that are likely to belong to multiple classes, i.e., samples that lie close to inter-class boundaries and might confuse the model. Many current AL approaches fail to incorporate both types of samples: those that represent the statistical pattern and those about which the particular machine learning model is uncertain. In this paper, we extend our framework, which addresses these challenges, with an augmentation module that increases the robustness of the model and ensures its adaptability to the planktonic domain. We compare the framework with existing hybrid AL techniques and test an adaptation of our extended framework on the planktonic domain. The empirical results of the experiments conducted in this paper confirm that the new extended framework achieves higher accuracy.
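
The two sample types described in the abstract (representative samples and samples the model is uncertain about) can be illustrated with a generic hybrid selection score. The sketch below is not the CIRAL algorithm, whose details the abstract does not give. It assumes a margin-based uncertainty measure, a simple distance-based density as the representativeness measure, and a hypothetical weight `alpha` that mixes the two. All function names are illustrative.

```python
import math


def margin_uncertainty(probs):
    """Return 1 - (top1 - top2); high when the two best classes are close."""
    top = sorted(probs, reverse=True)
    return 1.0 - (top[0] - top[1])


def density(features, i):
    """Mean similarity of sample i to the rest of the pool.

    Similarity is 1 / (1 + Euclidean distance), so typical samples
    that sit in a crowded region score higher than outliers.
    """
    others = [f for j, f in enumerate(features) if j != i]
    return sum(1.0 / (1.0 + math.dist(features[i], f)) for f in others) / len(others)


def hybrid_select(features, probs, budget, alpha=0.5):
    """Pick `budget` pool indices by a weighted mix of both criteria.

    alpha = 1.0 uses uncertainty only; alpha = 0.0 uses density only.
    Ties are broken by the lower index.
    """
    scored = []
    for i in range(len(features)):
        score = alpha * margin_uncertainty(probs[i]) + (1.0 - alpha) * density(features, i)
        scored.append((score, i))
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [i for _, i in scored[:budget]]


if __name__ == "__main__":
    # Toy pool: three nearby samples and one outlier, with made-up
    # two-class probabilities standing in for a classifier's output.
    features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
    probs = [[0.9, 0.1], [0.55, 0.45], [0.8, 0.2], [0.5, 0.5]]
    print(hybrid_select(features, probs, budget=2))
```

With `alpha` near 1 the selection favors samples near class boundaries, and with `alpha` near 0 it favors samples typical of the pool. A hybrid method sits between these extremes, and how the two criteria are actually combined is a design choice that differs between published approaches.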