[DeepLearning Training] Transformer Basics and Applications -- Part 3: Applying Transformers to Images

Video type: General
Published: November 28, 2024, 17:00
Length: 30:20
Views: 436
Likes: 17
Comments: -
Engagement rate: 3.9%
Data checked: December 5, 2024, 13:31

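Video Overview

This video is Part 3 of the series "[DeepLearning Training] Transformer Basics and Applications". It explains how Transformers are applied to images, and which tasks have become possible by combining them with natural language.
The slides are available on SlideShare (https://www.slideshare.net/slideshow/...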

[References]
・Deep Residual Learning for Image Recognition
https://arxiv.org/abs/1512.03385
・An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
https://arxiv.org/abs/2010.11929
・On the Relationship between Self-Attention and Convolutional Layers
https://arxiv.org/abs/1911.03584
・Image Style Transfer Using Convolutional Neural Networks
https://ieeexplore.ieee.org/document/...
・Are Convolutional Neural Networks or Transformers more like human vision?
https://arxiv.org/abs/2105.07197
・How Do Vision Transformers Work?
https://arxiv.org/abs/2202.06709
・Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
https://arxiv.org/abs/1610.02391
・Quantifying Attention Flow in Transformers
https://arxiv.org/abs/2005.00928
・Transformer Interpretability Beyond Attention Visualization
https://arxiv.org/abs/2012.09838
・End-to-End Object Detection with Transformers
https://arxiv.org/abs/2005.12872
・SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
https://arxiv.org/abs/2105.15203
・Training data-efficient image transformers & distillation through attention
https://arxiv.org/abs/2012.12877
・Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
https://arxiv.org/abs/2103.14030
・Masked Autoencoders Are Scalable Vision Learners
https://arxiv.org/abs/2111.06377
・Emerging Properties in Self-Supervised Vision Transformers
https://arxiv.org/abs/2104.14294
・Scaling Laws for Neural Language Models
https://arxiv.org/abs/2001.08361
・Learning Transferable Visual Models From Natural Language Supervision
https://arxiv.org/abs/2103.00020
・Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
https://arxiv.org/abs/2403.03206
・Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
https://arxiv.org/abs/2402.17177
・SSII2024 Technology Map
https://confit.atlas.jp/guide/event/s...

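We have launched a video channel (/ nnabla) that shares information related to Neural Network Libraries (https://nnabla.org/, https://github.com/sony/nnabla/), the open-source deep learning framework provided by Sony. In addition to tutorials and tips for Neural Network Libraries, we will publish cutting-edge deep learning content such as lectures and introductions to recent papers. Please subscribe and support the channel!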

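Neural Network Console (https://dl.sony.com/), the intuitive GUI-based deep learning development environment also provided by Sony, runs a very popular YouTube channel (/ @neuralnetworkconsole) that likewise offers many deep learning technical courses and tool tutorials. Please subscribe to and support that channel as well.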

nnabla Deep Learning Channel
