[2201.10271v1] Convolutional Xformers for Vision

Vision transformers (ViTs) have found only limited practical use in processing images, in spite of their state-of-the-art accuracy on certain benchmarks. The reasons for their limited use include their need for larger training datasets and more computational resources compared to convolutional neural networks (CNNs), owing to the quadratic complexity of their self-attention mechanism. We propose a linear attention-convolution hybrid architecture -- Convolutional X-formers for Vision (CXV) -- to overcome these limitations. We replace the quadratic attention with linear attention mechanisms, such as Performer, Nyströmformer, and Linear Transformer, to reduce its GPU usage. An inductive prior for image data is provided by convolutional sub-layers, thereby eliminating the need for the class token and positional embeddings used by ViTs. We also propose a new training method where we use two different optimizers during different phases of training and show that it improves the top-1 image classi
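To illustrate the quadratic-versus-linear distinction the abstract draws, here is a minimal NumPy sketch of kernelized linear attention in the style of the Linear Transformer, assuming the commonly used feature map phi(x) = elu(x) + 1. It is not the CXV implementation: the function names, shapes, and single-head layout are assumptions made for this example. The point is that the product phi(K)^T V is computed first, so no n x n attention matrix is ever formed.

```python
import numpy as np


def elu_plus_one(x):
    # Feature map phi(x) = elu(x) + 1: x + 1 for x > 0, exp(x) otherwise.
    # Strictly positive, so the normalizer below never reaches zero.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def linear_attention(q, k, v):
    """Single-head linear attention. q, k: (n, d); v: (n, d_v)."""
    phi_q, phi_k = elu_plus_one(q), elu_plus_one(k)
    kv = phi_k.T @ v           # (d, d_v): summary of keys and values
    k_sum = phi_k.sum(axis=0)  # (d,): used for row normalization
    numer = phi_q @ kv         # (n, d_v)
    denom = phi_q @ k_sum      # (n,)
    return numer / denom[:, None]


def quadratic_reference(q, k, v):
    """Same computation, but materializing the full (n, n) weight matrix."""
    phi_q, phi_k = elu_plus_one(q), elu_plus_one(k)
    scores = phi_q @ phi_k.T
    weights = scores / scores.sum(axis=1, keepdims=True)
    return weights @ v
```

Both functions return the same values up to floating-point error; the linear form costs on the order of n * d * d_v operations instead of n * n * d, which is where the memory and compute savings on long token sequences come from. Performer and Nyströmformer use different approximations and are not covered by this sketch.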

1 mention: @Maxwell_110

Referring Tweets

@Maxwell_110 Conv X-formers for Vision (CXV) 📝 t.co/mvEdHyEdiy CXV is a hybrid attention + conv model. It reports high accuracy at low computational cost simply by adopting linear attention and optimizing the placement of the Conv, Layer Norm, and Residual components (Fig. 1). Note that the optimizer is switched from AdamW to SGD as training progresses. t.co/8GCbc9lKtA
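The AdamW-to-SGD switch mentioned in the tweet can be sketched on a toy problem. The example below is only an illustration under stated assumptions: it fits a linear least-squares model in NumPy, and the switch point (`switch_step`), learning rates, and other hyperparameters are hypothetical values chosen for the demo, not the criterion or settings used in the paper.

```python
import numpy as np


def loss_and_grad(w, x, y):
    # Mean squared error of a linear model and its gradient w.r.t. w.
    err = x @ w - y
    return float(np.mean(err ** 2)), 2.0 * x.T @ err / len(y)


def train_two_phase(x, y, steps=400, switch_step=200):
    """Phase 1 uses AdamW updates, phase 2 uses SGD with momentum.

    switch_step is a hypothetical fixed switch point for illustration.
    """
    w = np.zeros(x.shape[1])
    m = np.zeros_like(w)    # AdamW first-moment estimate
    v = np.zeros_like(w)    # AdamW second-moment estimate
    buf = np.zeros_like(w)  # SGD momentum buffer
    history = []
    for t in range(1, steps + 1):
        loss, g = loss_and_grad(w, x, y)
        history.append(loss)
        if t <= switch_step:
            # AdamW: Adam step plus decoupled weight decay.
            lr, b1, b2, eps, wd = 1e-2, 0.9, 0.999, 1e-8, 1e-2
            m = b1 * m + (1 - b1) * g
            v = b2 * v + (1 - b2) * g * g
            m_hat = m / (1 - b1 ** t)
            v_hat = v / (1 - b2 ** t)
            w = w - lr * wd * w - lr * m_hat / (np.sqrt(v_hat) + eps)
        else:
            # SGD with (heavy-ball) momentum.
            lr, mu = 1e-2, 0.9
            buf = mu * buf + g
            w = w - lr * buf
    return w, history
```

Calling `train_two_phase(x, y)` on synthetic data returns the fitted weights and the per-step loss history, which makes it easy to inspect how the loss behaves around the switch. In a deep-learning framework the same idea would amount to constructing a second optimizer over the same parameters partway through training.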
