Conference: International Conference on Computational Collective Intelligence

UIT-ViIC: A Dataset for the First Evaluation on Vietnamese Image Captioning

Abstract

Image Captioning (IC), the task of automatically generating captions for images, has in recent years attracted attention from researchers in several fields of computer science, including computer vision, natural language processing, and machine learning. This paper contributes to Image Captioning research by extending datasets to a different language: Vietnamese. Until now, no Image Captioning dataset has existed for the Vietnamese language, so this is a fundamental first step toward developing Vietnamese Image Captioning. We first built a dataset of manually written captions for images from the Microsoft COCO dataset that relate to sports played with balls; we call this dataset UIT-ViIC (University of Information Technology - Vietnamese Image Captions). UIT-ViIC consists of 19,250 Vietnamese captions for 3,850 images, an average of five captions per image. We then evaluated our dataset on deep neural network models and compared it with an English dataset and two Vietnamese datasets built by different methods. UIT-ViIC is published on our lab website for research purposes.
