International Conference on Artificial Neural Networks

DeepMimic: Mentor-Student Unlabeled Data Based Training



Abstract

In this paper, we present a deep neural network (DNN) training approach called the "DeepMimic" training method. Enormous amounts of data are available nowadays for training, yet only a tiny portion of these data is manually labeled, whereas almost all of them are unlabeled. The presented training approach exploits the unlabeled data to the fullest, in a highly simplified manner, in order to achieve remarkable classification results. Our DeepMimic method uses a small portion of labeled data and a large amount of unlabeled data for the training process, as expected in a real-world scenario. It consists of a mentor model and a student model. Employing a mentor model trained on a small portion of the labeled data, and then feeding it only unlabeled data, we show how to obtain a (simplified) student model that reaches the same accuracy and loss as the mentor model on the same test set, without using any of the original data labels in the training of the student model. Our experiments demonstrate that even on challenging classification tasks the student network architecture can be simplified significantly with only a minor effect on performance; that is, we need not even know the original network architecture of the mentor. In addition, the time required for training the student model to reach the mentor's performance level is shorter, owing to the simplified architecture and the larger amount of available data. The proposed method highlights the disadvantages of regular supervised training and demonstrates the benefits of a less traditional training approach.
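To make the mentor-student procedure concrete, below is a minimal PyTorch sketch of the idea the abstract describes: a mentor is trained on a small labeled subset, and a smaller student is then trained only on the mentor's soft predictions over unlabeled inputs, never on ground-truth labels. The toy architectures, the KL-divergence mimic loss, and all hyperparameters here are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of mentor-student training on unlabeled data.
# All architectures, losses, and hyperparameters are assumptions for illustration.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def make_model(hidden: int) -> nn.Module:
    # Toy classifier; the mentor may be wider/deeper than the student,
    # and the student need not share the mentor's architecture at all.
    return nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, hidden),
                         nn.ReLU(), nn.Linear(hidden, 10))

def train_mentor(mentor: nn.Module, labeled_loader: DataLoader, epochs: int = 5) -> None:
    # Standard supervised training on the small labeled subset.
    opt = torch.optim.Adam(mentor.parameters())
    ce = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in labeled_loader:
            opt.zero_grad()
            ce(mentor(x), y).backward()
            opt.step()

def train_student(student: nn.Module, mentor: nn.Module,
                  unlabeled_loader: DataLoader, epochs: int = 5) -> None:
    # The student never sees ground-truth labels: its targets are the
    # mentor's soft predictions on unlabeled inputs.
    opt = torch.optim.Adam(student.parameters())
    kl = nn.KLDivLoss(reduction="batchmean")
    mentor.eval()
    for _ in range(epochs):
        for (x,) in unlabeled_loader:
            with torch.no_grad():
                target = torch.softmax(mentor(x), dim=1)
            opt.zero_grad()
            loss = kl(torch.log_softmax(student(x), dim=1), target)
            loss.backward()
            opt.step()

if __name__ == "__main__":
    # Random stand-in data; replace with a real dataset such as CIFAR-10.
    x_lab, y_lab = torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,))
    x_unlab = torch.randn(4096, 1, 28, 28)
    mentor, student = make_model(hidden=512), make_model(hidden=64)
    train_mentor(mentor, DataLoader(TensorDataset(x_lab, y_lab), batch_size=32))
    train_student(student, mentor, DataLoader(TensorDataset(x_unlab), batch_size=64))
```

Note the division of data: only the mentor ever touches labels, and the student's training set can be far larger than the labeled one, which matches the real-world scenario the abstract assumes.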
