Published in: IEEE International Conference on Computer Vision Workshops

Diabetes60 — Inferring Bread Units From Food Images Using Fully Convolutional Neural Networks

Abstract

In this paper, we propose a challenging new computer vision task: inferring Bread Units (BUs) from food images. Assessing the nutritional information and nutrient volume of a meal is an important task for diabetes patients. At present, diabetes patients learn to assess BUs on a scale of one to ten by studying the correspondence between BUs and meals in textbooks. We introduce a large-scale data set of around 9k different RGB-D images of 60 Western dishes, acquired using a Microsoft Kinect v2 sensor. We recruited 20 diabetes patients to give expert assessments of the BU value of each dish based on several images. For this task, we set a challenging baseline using state-of-the-art CNNs and evaluated it against the performance of human annotators. We present a CNN architecture that infers depth from RGB-only food images for use in BU regression, so that the pipeline can operate on RGB data only, and we compare its performance to that obtained with RGB-D input data. We show that our depth maps inferred from RGB images can replace RGB-D input data at high significance for the BU regression task. In its best configuration, our proposed method achieves an RMSE of 1.53 BUs using RGB and inferred depth. Considering that the variability among the raters themselves is RMSE = 0.89, we can show that our baseline method with depth prediction can extract reasonable nutritional information from RGB image data only.
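
The abstract reports both model error and inter-rater variability as a root-mean-square error (RMSE) in BUs. As a reading aid, the sketch below shows how an RMSE between predicted and reference BU values could be computed. The function name `rmse` and the toy values are illustrative assumptions and are not taken from the paper or its data set; the abstract does not say how the reference values or the inter-rater figure were aggregated.

```python
import math

def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences of BU values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical toy values for four dishes (NOT from the paper's data set).
reference_bu = [2.0, 4.0, 6.0, 3.0]
predicted_bu = [3.0, 4.0, 4.0, 3.0]

print(f"RMSE: {rmse(predicted_bu, reference_bu):.2f} BUs")
```

Because the errors are squared before averaging, one dish that is off by 2 BUs contributes four times as much as a dish that is off by 1 BU, so RMSE penalizes large misestimates more heavily than a mean absolute error would.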
