Vision transformers for dense prediction: A survey

Zuo Shuangquan; Xiao Yun; Chang Xiaojun; Wang Xuanhong


Abstract

Transformers have demonstrated impressive expressiveness and transfer capability in computer vision fields. Dense prediction is a fundamental problem in computer vision that is more challenging to solve than general image-level prediction tasks. The inherent properties of transformers enable them to process feature representations with stable and relatively high resolution, which precisely satisfies the demands of dense prediction tasks for finer-grained and more globally coherent predictions. Furthermore, compared to convolutional networks, transformer methods require minimal inductive bias and permit long-range information interaction. These strengths have contributed to exciting advancements in dense prediction tasks that apply transformer networks. This survey aims to provide a comprehensive overview of transformer models with a specific focus on dense prediction. In this survey, we provide a well-rounded view of state-of-the-art transformer-based approaches, explicitly emphasizing pixel-level prediction tasks. We generally consider transformer variants from the network architecture perspective. We further propose a novel taxonomy to organize these models according to their constructions. Subsequently, we examine various specific optimization strategies to tackle certain bottleneck problems in dense prediction tasks. We explore the commonalities and differences among these works and provide multiple horizontal comparisons from the experimental point of view. Finally, we summarize several stubborn problems that continue to impact visual transformers and outline some possible development directions. (C) 2022 Elsevier B.V. All rights reserved.

Bibliographic details

Source
Knowledge-based systems | 2022, Issue 11 | 109552.-109552.23 | 23 pages
Authors
Zuo Shuangquan; Xiao Yun; Chang Xiaojun; Wang Xuanhong
Author affiliations
Northwest Univ; Univ Technol Sydney; Xian Univ Posts & Telecommunicat
Original format: PDF
Language: English
Keywords
Deep learning; Transformer; Dense prediction; Computer vision;
