3rd Workshop on Representation Learning for NLP, 2018

Learning Distributional Token Representations from Visual Features

Abstract

In this study, we compare token representations constructed from visual features (i.e., pixels) with standard lookup-based embeddings. Our goal is to gain insight into the challenges of encoding a text representation from low-level features, e.g. from characters or pixels. We focus on Chinese, which, as a logographic language, has properties that make a representation via visual features challenging and interesting. To train and evaluate different models for the token representation, we chose the task of character-based neural machine translation (NMT) from Chinese to English. We found that a token representation computed only from visual features can achieve results competitive with lookup embeddings. However, we also show different strengths and weaknesses in the models' performance on a part-of-speech tagging task and a semantic similarity task. In summary, we show that it is possible to achieve a text representation from pixels alone. We hope that this is a useful stepping stone for future studies that rely exclusively on visual input, or that aim to exploit visual features of written language.
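
To make the contrast in the abstract concrete, the following minimal sketch (in PyTorch, and not the authors' implementation) shows the two options side by side: a standard lookup embedding indexed by character id, and a token vector computed by a small convolutional encoder from a rendered glyph image of the character. The names GlyphEncoder, glyph_size, and emb_dim, as well as the network shape, are illustrative assumptions; the paper's actual encoder and image resolution may differ.

# Minimal sketch (not the authors' code): lookup embedding vs. a token
# representation computed from the character's pixels.
import torch
import torch.nn as nn

class GlyphEncoder(nn.Module):
    """Maps a rendered character image (1 x H x W) to a dense token vector."""
    def __init__(self, emb_dim=256, glyph_size=24):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                               # glyph_size / 2
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                               # glyph_size / 4
        )
        feat = 64 * (glyph_size // 4) ** 2
        self.proj = nn.Linear(feat, emb_dim)

    def forward(self, glyphs):                             # (batch, 1, H, W)
        h = self.conv(glyphs)
        return self.proj(h.flatten(1))                     # (batch, emb_dim)

vocab_size, emb_dim, glyph_size = 8000, 256, 24

# Standard lookup-based embedding: one trainable vector per character id.
lookup = nn.Embedding(vocab_size, emb_dim)

# Visual-feature embedding: the vector is computed from the glyph's pixels.
visual = GlyphEncoder(emb_dim, glyph_size)

ids = torch.randint(0, vocab_size, (4,))                   # 4 character ids
imgs = torch.rand(4, 1, glyph_size, glyph_size)            # 4 rendered glyphs
print(lookup(ids).shape, visual(imgs).shape)               # both (4, 256)

Either module could feed an NMT encoder in place of the other. By construction, the visual encoder conditions only on pixels, so characters that share visual components can share structure in their vectors, whereas a lookup table treats every character id independently; this is the trade-off the abstract examines.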