[2201.10271v1] Convolutional Xformers for Vision

Vision transformers (ViTs) have found only limited practical use in processing images, in spite of their state-of-the-art accuracy on certain benchmarks. The reasons for their limited use include their need for larger training datasets and more computational resources compared to convolutional neural networks (CNNs), owing to the quadratic complexity of their self-attention mechanism. We propose a linear attention-convolution hybrid architecture -- Convolutional X-formers for Vision (CXV) -- to overcome these limitations. We replace the quadratic attention with linear attention mechanisms, such as Performer, Nyströmformer, and Linear Transformer, to reduce GPU usage. Inductive prior for image data is provided by convolutional sub-layers, thereby eliminating the need for the class token and positional embeddings used by ViTs. We also propose a new training method where we use two different optimizers during different phases of training and show that it improves the top-1 image classi
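
To make the "linear attention" idea in the abstract concrete, here is a minimal pure-Python sketch of kernelized attention in the style of the Linear Transformer, one of the mechanisms the abstract names. It is not the CXV authors' implementation: the `elu(x) + 1` feature map is the one commonly associated with the Linear Transformer, and the function names, the single-head setup, and the list-of-lists layout are assumptions made for illustration. Convolutional sub-layers, normalization, and residual connections are left out.

```python
import math


def feature_map(x):
    # elu(x) + 1, a strictly positive feature map (assumed here for illustration)
    return x + 1.0 if x > 0 else math.exp(x)


def linear_attention(Q, K, V):
    """Kernel attention in O(n * d * d_v) time.

    Q, K are n x d lists of floats and V is an n x d_v list of floats.
    """
    d, d_v = len(K[0]), len(V[0])
    phi_K = [[feature_map(x) for x in row] for row in K]

    # Summaries shared by every query:
    #   S = phi(K)^T V   (d x d_v)   and   z = sum_j phi(k_j)   (length d)
    S = [[0.0] * d_v for _ in range(d)]
    z = [0.0] * d
    for k_row, v_row in zip(phi_K, V):
        for a in range(d):
            z[a] += k_row[a]
            for b in range(d_v):
                S[a][b] += k_row[a] * v_row[b]

    out = []
    for q_row in Q:
        phi_q = [feature_map(x) for x in q_row]
        norm = sum(phi_q[a] * z[a] for a in range(d))
        out.append(
            [sum(phi_q[a] * S[a][b] for a in range(d)) / norm for b in range(d_v)]
        )
    return out


def quadratic_reference(Q, K, V):
    """Same kernel attention, computed through the explicit n x n weights."""
    phi_K = [[feature_map(x) for x in row] for row in K]
    out = []
    for q_row in Q:
        phi_q = [feature_map(x) for x in q_row]
        weights = [sum(a * b for a, b in zip(phi_q, k_row)) for k_row in phi_K]
        total = sum(weights)
        out.append(
            [
                sum(weights[j] * V[j][b] for j in range(len(V))) / total
                for b in range(len(V[0]))
            ]
        )
    return out
```

Both functions return the same values up to floating-point error; the difference is that `linear_attention` never builds the n x n weight matrix, which is where the memory and compute savings over quadratic attention come from.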

1 mentions: @Maxwell_110

Referring Tweets

@Maxwell_110 Conv X-formers for Vision (CXV) 📝 t.co/mvEdHyEdiy CXV is a hybrid attention + conv model. By adopting linear attention and optimizing the placement of the conv, layer norm, and residual components (Fig. 1), it reports high accuracy at low computational cost. Note that the optimizer is switched from AdamW to SGD as training progresses. t.co/8GCbc9lKtA
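
The optimizer switch mentioned in the tweet (AdamW first, SGD later) can be illustrated with a toy, pure-Python sketch. Everything specific here is an assumption: the source does not say when or how the switch is triggered, so the fixed `switch_epoch`, the learning rates, and the scalar objective `(w - target)^2` are hypothetical placeholders. In a deep learning framework the same pattern would amount to replacing the optimizer object partway through the training loop.

```python
import math


def adamw_step(w, grad, state, lr=0.1, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update on a scalar parameter (decoupled weight decay)."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    m_hat = state["m"] / (1 - beta1 ** state["t"])  # bias-corrected first moment
    v_hat = state["v"] / (1 - beta2 ** state["t"])  # bias-corrected second moment
    w = w - lr * weight_decay * w                   # decay applied directly to w
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)


def sgd_step(w, grad, lr=0.05):
    """One plain SGD update on a scalar parameter."""
    return w - lr * grad


def train(epochs=60, switch_epoch=30, target=3.0):
    """Minimize (w - target)^2, using AdamW before switch_epoch and SGD after.

    switch_epoch is a hypothetical fixed threshold, not taken from the paper.
    """
    w = 0.0
    state = {"t": 0, "m": 0.0, "v": 0.0}
    history = []
    for epoch in range(epochs):
        grad = 2.0 * (w - target)  # gradient of the toy objective
        if epoch < switch_epoch:
            w = adamw_step(w, grad, state)
            history.append("adamw")
        else:
            w = sgd_step(w, grad)
            history.append("sgd")
    return w, history
```

Calling `train()` returns the final parameter and a per-epoch record of which optimizer was used, which makes the phase boundary easy to inspect.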

Related Entries

Kaggle APTOS 2019 @ U-Tokyo Med - Speaker Deck
0 users, 1 mentions 2020/12/07 12:52
[2107.08357] As Easy as 1, 2, 3: Behavioural Testing of NMT Systems for Numerical Translation
0 users, 1 mentions 2021/08/09 22:37
GitHub - tom-andersson/icenet-paper: Code associated with the paper 'Seasonal Arctic sea ice forecas...
0 users, 1 mentions 2021/09/04 01:37
[2104.13369] Explaining in Style: Training a GAN to explain a classifier in StyleSpace
0 users, 1 mentions 2022/01/25 22:37
GitHub - agnesdeng/mixgb: mixgb: multiple imputation through XGBoost
0 users, 1 mentions 2022/08/04 22:37
GitHub - dime-worldbank/googletraffic: R package to query Google Maps traffic data and transform int...
0 users, 1 mentions 2023/01/11 22:37
