This paper shows that masked autoencoders (MAE) are scalable self-supervised learners for computer vision. Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels. It is based on two core designs. First, we develop an asymmetric encoder-decoder architecture, with an encoder that operates only on the visible subset of patches (without mask tokens), along with a lightweight decoder that reconstructs the original image from the latent representation and mask tokens. Second, we find that masking a high proportion of the input image, e.g., 75%, yields a nontrivial and meaningful self-supervisory task. Coupling these two designs enables us to train large models efficiently and effectively: we accelerate training (by 3x or more) and improve accuracy. Our scalable approach allows for learning high-capacity models that generalize well: e.g., a vanilla ViT-Huge model achieves the best accuracy (87.8%) among methods that use only ImageNet-1K data. Transfer performance in downstream tasks outperforms supervised pre-training and shows promising scaling behavior.

An autoencoder is an artificial neural network used to learn efficient data encodings in an unsupervised manner. The aim of an autoencoder is to learn a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network to ignore signal "noise". Alongside the reduction side, a reconstruction side is learned, where the autoencoder tries to generate, from the reduced encoding, a representation as close as possible to its original input — hence the name. Several variants of the basic model exist, aimed at forcing the learned representations of the input to have useful properties. Autoencoders have been applied effectively to many problems, from face recognition to acquiring the semantics of words.
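The encode-then-reconstruct training loop can be made concrete with a toy one-dimensional linear autoencoder. This is a hypothetical minimal sketch, not taken from any paper; the weight names, learning rate, and data are all illustrative:

```python
import random

# Toy linear autoencoder: encode x -> h = w_e * x, decode h -> x_hat = w_d * h.
# Trained by plain SGD on the reconstruction error; all values are illustrative.
random.seed(0)
data = [random.uniform(-1.0, 1.0) for _ in range(200)]

w_e, w_d = 0.1, 0.1   # encoder / decoder weights
lr = 0.05
for epoch in range(200):
    for x in data:
        h = w_e * x            # encoding (the "latent")
        x_hat = w_d * h        # reconstruction
        err = x_hat - x
        # Gradients of 0.5 * err^2 with respect to the two weights
        w_d -= lr * err * h
        w_e -= lr * err * w_d * x

mse = sum((w_d * w_e * x - x) ** 2 for x in data) / len(data)
print(round(mse, 4))  # reconstruction MSE close to 0 once w_e * w_d ≈ 1
```

Because both the encoder and decoder here are single scalars, minimizing the reconstruction error simply drives their product toward 1; real autoencoders use a bottleneck smaller than the input, which is what forces a genuinely compressed representation.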

In recent years, deep learning has produced a succession of architectures of ever-growing capability and capacity. Supported by continually upgraded hardware, today's models can easily digest millions of images, and are beginning to reach toward hundreds of millions of labeled images.

In natural language processing, this appetite for data has been successfully addressed by self-supervised pre-training. The solutions, based on autoregressive language modeling in GPT and masked autoencoding in BERT, are conceptually simple: they remove a portion of the data and learn to predict the removed content. These methods can be used to train generalizable NLP models containing hundreds of billions of parameters.

Masked autoencoders, a more general form of denoising autoencoders, are applicable to computer vision as well. Indeed, closely related research in vision preceded BERT, and interest in the idea surged after BERT's success. Nevertheless, autoencoding methods in vision still lag behind NLP. Kaiming He and colleagues ask: what causes this difference?

They attempt to answer this question from the following perspectives:

1. Architectural differences. In computer vision, convolutional networks were the dominant architecture over the past decade. With the introduction of Vision Transformers (ViT), however, this architectural gap has narrowed and should no longer be an obstacle.

2. Differences in information density. Language is a highly semantic, human-generated signal in which information is densely packed. When a model is trained to predict only a few missing words per sentence, the task appears to induce sophisticated language understanding. Vision is different: images are natural signals with heavy spatial redundancy. A missing patch, for example, can be recovered from neighboring patches without much high-level understanding of parts, objects, or scenes.

To overcome this difference and encourage learning useful features, the researchers show that a simple strategy works remarkably well in computer vision: masking a very high proportion of random patches. This strategy largely eliminates redundancy and creates a challenging self-supervisory task that requires holistic understanding beyond low-level image statistics. Figures 2-4 show qualitative results of this reconstruction task.
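The high-ratio random masking strategy amounts to little more than shuffling patch indices and keeping a small subset. A minimal stdlib sketch (the function name and defaults are our own illustration, not taken from the MAE code):

```python
import random

def random_masking(num_patches, mask_ratio=0.75, seed=0):
    """Shuffle patch indices and keep only the first (1 - mask_ratio) fraction,
    in the spirit of MAE-style random masking (illustrative sketch)."""
    rng = random.Random(seed)
    ids = list(range(num_patches))
    rng.shuffle(ids)
    num_keep = int(num_patches * (1 - mask_ratio))
    keep_ids = sorted(ids[:num_keep])     # visible patches fed to the encoder
    masked_ids = sorted(ids[num_keep:])   # patches the decoder must reconstruct
    return keep_ids, masked_ids

# A 224x224 image cut into 16x16 patches gives a 14x14 = 196-patch grid
keep, masked = random_masking(196)
print(len(keep), len(masked))  # 49 147 — the encoder sees only 25% of patches
```

Because the kept subset is sampled uniformly at random, neighboring patches are unlikely to survive together, which is what prevents the task from being solved by simple interpolation from adjacent patches.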

3. The autoencoder's decoder, which maps the latent representation back to the input, plays a different role in reconstructing text versus images. In vision, the decoder reconstructs pixels, so its output is of a lower semantic level than common recognition targets. This is the opposite of language, where the decoder predicts missing words that carry rich semantic information. While the decoder in BERT can be trivial (an MLP), He and colleagues found that for images, the decoder design plays a key role in determining the semantic level of the learned latent representations.

Based on this analysis, the researchers propose a simple, effective, and scalable masked autoencoder (MAE) for visual representation learning. The MAE masks random patches from the input image and reconstructs the missing patches in pixel space. It has an asymmetric encoder-decoder design: the encoder operates only on the visible subset of patches (without mask tokens), while a lightweight decoder reconstructs the input from the latent representation and mask tokens (Figure 1).

In this asymmetric encoder-decoder, shifting the mask tokens to the small decoder yields a large reduction in computation. Under this design, a very high masking ratio (e.g., 75%) achieves a win-win: it optimizes accuracy while allowing the encoder to process only a small portion (e.g., 25%) of the patches. This can cut overall pre-training time to a third or less, and likewise reduces memory consumption, making it easy to scale MAE to large models.
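The asymmetric design hinges on how the decoder input is assembled: encoder outputs for the few visible patches are scattered back to their original positions, and every masked position is filled with a shared mask token. A toy sketch (in the real MAE the mask token is a learned vector and positional embeddings are added; the string stand-ins here are purely illustrative):

```python
def decoder_input(latents, keep_ids, num_patches, mask_token="[M]"):
    """Scatter encoder outputs for visible patches back to their original
    positions and fill masked positions with a shared mask token
    (illustrative sketch of the MAE decoder-input assembly)."""
    full = [mask_token] * num_patches
    for latent, idx in zip(latents, keep_ids):
        full[idx] = latent
    return full

# Toy usage: 8 patches, 75% masked -> the encoder processed only 2 of them
keep_ids = [3, 6]
latents = ["z3", "z6"]   # stand-ins for the encoded visible patches
seq = decoder_input(latents, keep_ids, num_patches=8)
print(seq)  # ['[M]', '[M]', '[M]', 'z3', '[M]', '[M]', 'z6', '[M]']
```

The expensive encoder thus runs on 2 tokens instead of 8; only the lightweight decoder ever sees the full-length sequence, which is where the 3x-or-more training speedup comes from.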

MAE can learn very high-capacity models that generalize well. With MAE pre-training, the researchers can train data-hungry models such as ViT-Large/-Huge on ImageNet-1K with improved generalization. For example, a vanilla ViT-Huge model, fine-tuned on ImageNet-1K, achieves 87.8% accuracy — better than all previous results using only ImageNet-1K data.


A total of 1,663 papers were accepted to CVPR 2021, including the following:

Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction

Bohan Wu, Suraj Nair, Roberto Martin-Martin, Li Fei-Fei, Chelsea Finn

[pdf] [supp]

[bibtex]

Over-the-Air Adversarial Flickering Attacks Against Video Recognition Networks

Roi Pony, Itay Naeh, Shie Mannor

[pdf] [supp] [arXiv]

[bibtex]

Person30K: A Dual-Meta Generalization Network for Person Re-Identification

Yan Bai, Jile Jiao, Wang Ce, Jun Liu, Yihang Lou, Xuetao Feng, Ling-Yu Duan

[pdf]

[bibtex]

Privacy Preserving Localization and Mapping From Uncalibrated Cameras

Marcel Geppert, Viktor Larsson, Pablo Speciale, Johannes L. Schonberger, Marc Pollefeys

[pdf] [supp]

[bibtex]


A recent tutorial on self-supervised learning from UIUC, covering:

  • Data prediction
  • Colorization
  • Transformation prediction
  • Context prediction, jigsaw puzzle solving, rotation prediction
  • Deep clustering and instance prediction
  • Contrastive learning
  • PIRL, MoCo, SimCLR, SwAV
  • Self-supervision
  • Audio, video, language


Neural language generation (NLG) — using neural network models to generate coherent text — is one of the most promising approaches to automated text creation. In recent years, neural text generation has undergone a paradigm shift driven by deep contextual language modeling (e.g., LSTMs, GPT, GPT-2) and transfer learning (e.g., ELMo, BERT). While these tools have greatly improved the state of NLG, state-of-the-art NLG models still face many challenges on low-resource tasks: generated text lacks diversity, described situations violate common-sense rules, factual information is difficult to incorporate, and reliable evaluation metrics are hard to design. In this tutorial, we give an overview of the current state of the art in neural architectures and how they shape recent research directions in text generation. We discuss how and why these models succeed or fail at generating coherent text, and provide insights on several applications.

Contents:

  • Introduction
  • Neural network modeling
  • Training and encoding
  • Benchmarks and evaluation
  • Building neural generation models

In a common machine learning problem, a model estimated on a training dataset is used to predict future outcome values from observed features. Many learning algorithms have been proposed and shown to be successful when the test data and training data come from the same distribution. However, the best-performing models for a given training distribution typically exploit subtle statistical relationships among features, which makes them more prone to prediction errors when applied to test data whose distribution differs from that of the training data. Developing learning models that transfer across data stably and robustly is critical for both academic research and practical applications.

Causal inference — the process of drawing conclusions about causal relationships from the conditions under which effects occur — is a powerful statistical modeling tool for explanatory and stable learning. This tutorial focuses on causal inference and stable learning, aiming to mine causal knowledge from observational data to improve the interpretability and stability of machine learning algorithms. First, it introduces causal inference and presents some recent data-driven methods for estimating causal effects from observational data, especially in high-dimensional settings. To bridge the gap between causal inference and machine learning, it then defines stability and robustness for learning algorithms, and introduces some recent stable learning algorithms that improve the stability and interpretability of prediction. Finally, it discusses applications and future directions of stable learning, and provides benchmarks for stable learning.

http://kdd2020tutorial.thumedialab.com/


Topic: GANs in computer vision: Introduction to generative learning

Summary: In this survey series, we focus on the large body of GANs for computer vision applications. Specifically, we gradually build on the ideas and principles that led to the evolution of generative adversarial networks (GANs), encountering different tasks such as conditional image generation, 3D object generation, and video synthesis.

Contents:

  • Adversarial learning
  • GAN (generative adversarial networks)
  • Conditional generative adversarial networks
  • Unsupervised representation learning with deep convolutional generative adversarial networks (DCGAN)
  • InfoGAN: representation learning by information-maximizing generative adversarial networks

In general, data generation methods appear across a wide variety of modern deep learning applications, from computer vision to natural language processing. At this point, generated data can be nearly indistinguishable from real data to the naked eye. Generative learning falls broadly into two main categories: (a) variational autoencoders (VAE) and (b) generative adversarial networks (GAN).


Self-supervised learning is a new paradigm between unsupervised and supervised learning that aims to reduce the challenging demand for large amounts of annotated data. It provides proxy supervision signals for feature learning by defining annotation-free pretext tasks. jason718 has curated a collection of the latest papers on self-supervised learning — well worth a look!

Link: https://github.com/jason718/awesome-self-supervised-learning

A curated list of awesome Self-Supervised Learning resources. Inspired by awesome-deep-vision, awesome-adversarial-machine-learning, awesome-deep-learning-papers, and awesome-architecture-search.

Why Self-Supervised?

Self-Supervised Learning has become an exciting direction in AI community.

  • Jitendra Malik: "Supervision is the opium of the AI researcher"
  • Alyosha Efros: "The AI revolution will not be supervised"
  • Yann LeCun: "self-supervised learning is the cake, supervised learning is the icing on the cake, reinforcement learning is the cherry on the cake"

Contributing

We Need You!

Please help contribute to this list by contacting me or adding a pull request.

Markdown format:

- Paper Name. [[pdf]](link) [[code]](link) - Author 1, Author 2, and Author 3. *Conference Year*

Table of Contents

Computer Vision

Survey

  • Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey.[pdf]
    • Longlong Jing and Yingli Tian.

Image Representation Learning

Benchmark code

FAIR Self-Supervision Benchmark [repo]: various benchmark (and legacy) tasks for evaluating the quality of visual representations learned by various self-supervision approaches.

2015

  • Unsupervised Visual Representation Learning by Context Prediction.[pdf][code]

    • Doersch, Carl and Gupta, Abhinav and Efros, Alexei A.ICCV 2015
  • Unsupervised Learning of Visual Representations using Videos.[pdf][code]

    • Wang, Xiaolong and Gupta, Abhinav.ICCV 2015
  • Learning to See by Moving.[pdf][code]

    • Agrawal, Pulkit and Carreira, Joao and Malik, Jitendra.ICCV 2015
  • Learning image representations tied to ego-motion.[pdf][code]

    • Jayaraman, Dinesh and Grauman, Kristen.ICCV 2015

2016

  • Joint Unsupervised Learning of Deep Representations and Image Clusters.[pdf][code-torch][code-caffe]

    • Jianwei Yang, Devi Parikh, Dhruv Batra.CVPR 2016
  • Unsupervised Deep Embedding for Clustering Analysis.[pdf][code]

    • Junyuan Xie, Ross Girshick, and Ali Farhadi.ICML 2016
  • Slow and steady feature analysis: higher order temporal coherence in video.[pdf]

    • Jayaraman, Dinesh and Grauman, Kristen.CVPR 2016
  • Context Encoders: Feature Learning by Inpainting.[pdf][code]

    • Pathak, Deepak and Krahenbuhl, Philipp and Donahue, Jeff and Darrell, Trevor and Efros, Alexei A.CVPR 2016
  • Colorful Image Colorization.[pdf][code]

    • Zhang, Richard and Isola, Phillip and Efros, Alexei A.ECCV 2016
  • Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles.[pdf][code]

    • Noroozi, Mehdi and Favaro, Paolo.ECCV 2016
  • Ambient Sound Provides Supervision for Visual Learning.[pdf][code]

    • Owens, Andrew and Wu, Jiajun and McDermott, Josh and Freeman, William and Torralba, Antonio.ECCV 2016
  • Learning Representations for Automatic Colorization.[pdf][code]

    • Larsson, Gustav and Maire, Michael and Shakhnarovich, Gregory.ECCV 2016
  • Unsupervised Visual Representation Learning by Graph-based Consistent Constraints.[pdf][code]

    • Li, Dong and Hung, Wei-Chih and Huang, Jia-Bin and Wang, Shengjin and Ahuja, Narendra and Yang, Ming-Hsuan.ECCV 2016

2017

  • Adversarial Feature Learning.[pdf][code]

    • Donahue, Jeff and Krahenbuhl, Philipp and Darrell, Trevor.ICLR 2017
  • Self-supervised learning of visual features through embedding images into text topic spaces.[pdf][code]

    • L. Gomez* and Y. Patel* and M. Rusiñol and D. Karatzas and C.V. Jawahar.CVPR 2017
  • Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction.[pdf][code]

    • Zhang, Richard and Isola, Phillip and Efros, Alexei A.CVPR 2017
  • Learning Features by Watching Objects Move.[pdf][code]

    • Pathak, Deepak and Girshick, Ross and Dollar, Piotr and Darrell, Trevor and Hariharan, Bharath.CVPR 2017
  • Colorization as a Proxy Task for Visual Understanding.[pdf][code]

    • Larsson, Gustav and Maire, Michael and Shakhnarovich, Gregory.CVPR 2017
  • DeepPermNet: Visual Permutation Learning.[pdf][code]

    • Cruz, Rodrigo Santa and Fernando, Basura and Cherian, Anoop and Gould, Stephen.CVPR 2017
  • Unsupervised Learning by Predicting Noise.[pdf][code]

    • Bojanowski, Piotr and Joulin, Armand.ICML 2017
  • Multi-task Self-Supervised Visual Learning.[pdf]

    • Doersch, Carl and Zisserman, Andrew.ICCV 2017
  • Representation Learning by Learning to Count.[pdf]

    • Noroozi, Mehdi and Pirsiavash, Hamed and Favaro, Paolo.ICCV 2017
  • Transitive Invariance for Self-supervised Visual Representation Learning.[pdf]

    • Wang, Xiaolong and He, Kaiming and Gupta, Abhinav.ICCV 2017
  • Look, Listen and Learn.[pdf]

    • Relja, Arandjelovic and Zisserman, Andrew.ICCV 2017
  • Unsupervised Representation Learning by Sorting Sequences.[pdf][code]

    • Hsin-Ying Lee, Jia-Bin Huang, Maneesh Kumar Singh, and Ming-Hsuan Yang.ICCV 2017

2018

  • Unsupervised Feature Learning via Non-parametric Instance Discrimination.[pdf][code]

    • Zhirong Wu, Yuanjun Xiong and X Yu Stella and Dahua Lin.CVPR 2018
  • Learning Image Representations by Completing Damaged Jigsaw Puzzles.[pdf]

    • Kim, Dahun and Cho, Donghyeon and Yoo, Donggeun and Kweon, In So.WACV 2018
  • Unsupervised Representation Learning by Predicting Image Rotations.[pdf][code]

    • Spyros Gidaris and Praveer Singh and Nikos Komodakis.ICLR 2018
  • Learning Latent Representations in Neural Networks for Clustering through Pseudo Supervision and Graph-based Activity Regularization.[pdf][code]

    • Ozsel Kilinc and Ismail Uysal.ICLR 2018
  • Improvements to context based self-supervised learning.[pdf]

    • Terrell Mundhenk and Daniel Ho and Barry Chen.CVPR 2018
  • Self-Supervised Feature Learning by Learning to Spot Artifacts.[pdf][code]

    • Simon Jenni and Universität Bern and Paolo Favaro.CVPR 2018
  • Boosting Self-Supervised Learning via Knowledge Transfer.[pdf]

    • Mehdi Noroozi and Ananth Vinjimoor and Paolo Favaro and Hamed Pirsiavash.CVPR 2018
  • Cross-domain Self-supervised Multi-task Feature Learning Using Synthetic Imagery.[pdf][code]

    • Zhongzheng Ren and Yong Jae Lee.CVPR 2018
  • ShapeCodes: Self-Supervised Feature Learning by Lifting Views to Viewgrids.[pdf]

    • Dinesh Jayaraman*, UC Berkeley; Ruohan Gao, University of Texas at Austin; Kristen Grauman.ECCV 2018
  • Deep Clustering for Unsupervised Learning of Visual Features[pdf]

    • Mathilde Caron, Piotr Bojanowski, Armand Joulin, Matthijs Douze.ECCV 2018
  • Cross Pixel Optical-Flow Similarity for Self-Supervised Learning.[pdf]

    • Aravindh Mahendran, James Thewlis, Andrea Vedaldi.ACCV 2018

2019

  • Representation Learning with Contrastive Predictive Coding.[pdf]

    • Aaron van den Oord, Yazhe Li, Oriol Vinyals.
  • Self-Supervised Learning via Conditional Motion Propagation.[pdf][code]

    • Xiaohang Zhan, Xingang Pan, Ziwei Liu, Dahua Lin, and Chen Change Loy.CVPR 2019
  • Self-Supervised Representation Learning by Rotation Feature Decoupling.[pdf][code]

    • Zeyu Feng; Chang Xu; Dacheng Tao.CVPR 2019
  • Revisiting Self-Supervised Visual Representation Learning.[pdf][code]

    • Alexander Kolesnikov; Xiaohua Zhai; Lucas Beyer. CVPR 2019
  • AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations rather than Data.[pdf][code]

    • Liheng Zhang, Guo-Jun Qi, Liqiang Wang, Jiebo Luo.CVPR 2019
  • Unsupervised Deep Learning by Neighbourhood Discovery.[pdf].[code].

    • Jiabo Huang, Qi Dong, Shaogang Gong, Xiatian Zhu.ICML 2019
  • Contrastive Multiview Coding.[pdf][code]

    • Yonglong Tian and Dilip Krishnan and Phillip Isola.
  • Large Scale Adversarial Representation Learning.[pdf]

    • Jeff Donahue, Karen Simonyan.
  • Learning Representations by Maximizing Mutual Information Across Views.[pdf][code]

    • Philip Bachman, R Devon Hjelm, William Buchwalter
  • Selfie: Self-supervised Pretraining for Image Embedding.[pdf]

    • Trieu H. Trinh, Minh-Thang Luong, Quoc V. Le
  • Data-Efficient Image Recognition with Contrastive Predictive Coding[pdf]

    • Olivier J. Hénaff, Ali Razavi, Carl Doersch, S. M. Ali Eslami, Aaron van den Oord
  • Using Self-Supervised Learning Can Improve Model Robustness and Uncertainty[pdf][code]

    • Dan Hendrycks, Mantas Mazeika, Saurav Kadavath, Dawn Song.NeurIPS 2019
  • Boosting Few-Shot Visual Learning with Self-Supervision[pdf]

    • Spyros Gidaris, Andrei Bursuc, Nikos Komodakis, Patrick Pérez, and Matthieu Cord.ICCV 2019
  • Self-Supervised Generalisation with Meta Auxiliary Learning[pdf][code]

    • Shikun Liu, Andrew J. Davison, Edward Johns.NeurIPS 2019
  • Wasserstein Dependency Measure for Representation Learning[pdf][code]

    • Sherjil Ozair, Corey Lynch, Yoshua Bengio, Aaron van den Oord, Sergey Levine, Pierre Sermanet.NeurIPS 2019
  • Scaling and Benchmarking Self-Supervised Visual Representation Learning[pdf][code]

    • Priya Goyal, Dhruv Mahajan, Abhinav Gupta, Ishan Misra.ICCV 2019

2020

  • A critical analysis of self-supervision, or what we can learn from a single image[pdf][code]

    • Yuki M. Asano, Christian Rupprecht, Andrea Vedaldi.ICLR 2020
  • On Mutual Information Maximization for Representation Learning[pdf][code]

    • Michael Tschannen, Josip Djolonga, Paul K. Rubenstein, Sylvain Gelly, Mario Lucic.ICLR 2020
  • Understanding the Limitations of Variational Mutual Information Estimators[pdf][code]

    • Jiaming Song, Stefano Ermon.ICLR 2020
  • Automatic Shortcut Removal for Self-Supervised Representation Learning[pdf]

    • Matthias Minderer, Olivier Bachem, Neil Houlsby, Michael Tschannen
  • Momentum Contrast for Unsupervised Visual Representation Learning[pdf]

    • Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, Ross Girshick.FAIR
  • A Simple Framework for Contrastive Learning of Visual Representations[pdf]

    • Ting Chen, Simon Kornblith, Mohammad Norouzi, Geoffrey Hinton
  • ClusterFit: Improving Generalization of Visual Representations[pdf]

    • Xueting Yan*, Ishan Misra*, Abhinav Gupta, Deepti Ghadiyaram**, Dhruv Mahajan**.CVPR 2020
  • Self-Supervised Learning of Pretext-Invariant Representations[pdf]

    • Ishan Misra, Laurens van der Maaten.CVPR 2020

Video Representation Learning

  • Unsupervised Learning of Video Representations using LSTMs.[pdf][code]

    • Srivastava, Nitish and Mansimov, Elman and Salakhudinov, Ruslan.ICML 2015
  • Shuffle and Learn: Unsupervised Learning using Temporal Order Verification.[pdf][code]

    • Ishan Misra, C. Lawrence Zitnick and Martial Hebert.ECCV 2016
  • LSTM Self-Supervision for Detailed Behavior Analysis[pdf]

    • Biagio Brattoli*, Uta Büchler*, Anna-Sophia Wahl, Martin E. Schwab, and Björn Ommer.CVPR 2017
  • Self-Supervised Video Representation Learning With Odd-One-Out Networks.[pdf]

    • Basura Fernando and Hakan Bilen and Efstratios Gavves and Stephen Gould.CVPR 2017
  • Unsupervised Learning of Long-Term Motion Dynamics for Videos.[pdf]

    • Luo, Zelun and Peng, Boya and Huang, De-An and Alahi, Alexandre and Fei-Fei, Li.CVPR 2017
  • Geometry Guided Convolutional Neural Networks for Self-Supervised Video Representation Learning.[pdf]

    • Chuang Gan and Boqing Gong and Kun Liu and Hao Su and Leonidas J. Guibas.CVPR 2018
  • Improving Spatiotemporal Self-Supervision by Deep Reinforcement Learning.[pdf]

    • Biagio Brattoli*, Uta Büchler*, and Björn Ommer.ECCV 2018
  • Self-supervised learning of a facial attribute embedding from video.[pdf]

    • Wiles, O., Koepke, A.S., Zisserman, A.BMVC 2018
  • Self-Supervised Video Representation Learning with Space-Time Cubic Puzzles.[pdf]

    • Kim, Dahun and Cho, Donghyeon and Yoo, Donggeun and Kweon, In So.AAAI 2019
  • Self-Supervised Spatio-Temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics.[pdf]

    • Jiangliu Wang; Jianbo Jiao; Linchao Bao; Shengfeng He; Yunhui Liu; Wei Liu. CVPR 2019
  • DynamoNet: Dynamic Action and Motion Network.[pdf]

    • Ali Diba; Vivek Sharma, Luc Van Gool, Rainer Stiefelhagen.ICCV 2019
  • Learning Correspondence from the Cycle-consistency of Time.[pdf][code]

    • Xiaolong Wang*, Allan Jabri* and Alexei A. Efros.CVPR 2019
  • Joint-task Self-supervised Learning for Temporal Correspondence.[pdf][code]

    • Xueting Li*, Sifei Liu*, Shalini De Mello, Xiaolong Wang, Jan Kautz, and Ming-Hsuan Yang.NIPS 2019

Geometry

  • Self-supervised Learning of Motion Capture.[pdf][code][web]

    • Tung, Hsiao-Yu and Tung, Hsiao-Wei and Yumer, Ersin and Fragkiadaki, Katerina.NIPS 2017
  • Unsupervised Learning of Depth and Ego-Motion from Video.[pdf][code][web]

    • Zhou, Tinghui and Brown, Matthew and Snavely, Noah and Lowe, David G.CVPR 2017
  • Active Stereo Net: End-to-End Self-Supervised Learning for Active Stereo Systems.[project]

    • Yinda Zhang*, Sean Fanello, Sameh Khamis, Christoph Rhemann, Julien Valentin, Adarsh Kowdle, Vladimir Tankovich, Shahram Izadi, Thomas Funkhouser.ECCV 2018
  • Self-Supervised Relative Depth Learning for Urban Scene Understanding.[pdf][project]

    • Huaizu Jiang*, Erik Learned-Miller, Gustav Larsson, Michael Maire, Greg Shakhnarovich.ECCV 2018
  • Geometry-Aware Learning of Maps for Camera Localization.[pdf][code]

    • Samarth Brahmbhatt, Jinwei Gu, Kihwan Kim, James Hays, and Jan Kautz. CVPR 2018
  • Self-supervised Learning of Geometrically Stable Features Through Probabilistic Introspection.[pdf][web]

    • David Novotny, Samuel Albanie, Diane Larlus, Andrea Vedaldi. CVPR 2018
  • Self-Supervised Learning of 3D Human Pose Using Multi-View Geometry.[pdf]

    • Muhammed Kocabas; Salih Karagoz; Emre Akbas. CVPR 2019
  • SelFlow: Self-Supervised Learning of Optical Flow.[pdf]

    • Jiangliu Wang; Jianbo Jiao; Linchao Bao; Shengfeng He; Yunhui Liu; Wei Liu. CVPR 2019
  • Unsupervised Learning of Landmarks by Descriptor Vector Exchange.[pdf][code][web]

    • James Thewlis, Samuel Albanie, Hakan Bilen, Andrea Vedaldi. ICCV 2019

Audio

  • Audio-Visual Scene Analysis with Self-Supervised Multisensory Features.[pdf][code]

    • Andrew Owens, Alexei A. Efros.ECCV 2018
  • Objects that Sound.[pdf]

    • R. Arandjelović, A. Zisserman.ECCV 2018
  • Learning to Separate Object Sounds by Watching Unlabeled Video.[pdf][project]

    • Ruohan Gao, Rogerio Feris, Kristen Grauman.ECCV 2018
  • The Sound of Pixels.[pdf][project]

    • Zhao, Hang and Gan, Chuang and Rouditchenko, Andrew and Vondrick, Carl and McDermott, Josh and Torralba, Antonio.ECCV 2018
  • Learnable PINs: Cross-Modal Embeddings for Person Identity.[pdf][web]

    • Arsha Nagrani, Samuel Albanie, Andrew Zisserman. ECCV 2018
  • Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization.[pdf]

    • Bruno Korbar (Dartmouth College), Du Tran, Lorenzo Torresani.NIPS 2018
  • Self-Supervised Generation of Spatial Audio for 360° Video.[pdf]

    • Pedro Morgado, Nuno Nvasconcelos, Timothy Langlois, Oliver Wang.NIPS 2018
  • TriCycle: Audio Representation Learning from Sensor Network Data Using Self-Supervision[pdf]

    • Mark Cartwright, Jason Cramer, Justin Salamon, Juan Pablo Bello.WASPAA 2019

Others

  • Self-learning Scene-specific Pedestrian Detectors using a Progressive Latent Model.[pdf]
    • Qixiang Ye, Tianliang Zhang, Qiang Qiu, Baochang Zhang, Jie Chen, Guillermo Sapiro.CVPR 2017
  • Free Supervision from Video Games.[pdf][project+code]
    • Philipp Krähenbühl.CVPR 2018
  • Fighting Fake News: Image Splice Detection via Learned Self-Consistency[pdf][code]
    • Minyoung Huh*, Andrew Liu*, Andrew Owens, Alexei A. Efros.ECCV 2018
  • Self-supervised Tracking by Colorization (Tracking Emerges by Colorizing Videos).[pdf]
    • Carl Vondrick*, Abhinav Shrivastava, Alireza Fathi, Sergio Guadarrama, Kevin Murphy.ECCV 2018
  • High-Fidelity Image Generation With Fewer Labels.[pdf]
    • Mario Lucic*, Michael Tschannen*, Marvin Ritter*, Xiaohua Zhai, Olivier Bachem, Sylvain Gelly.
  • Self-supervised Fitting of Articulated Meshes to Point Clouds.
    • Chun-Liang Li, Tomas Simon, Jason Saragih, Barnabás Póczos and Yaser Sheikh.CVPR 2019
  • SCOPS: Self-Supervised Co-Part Segmentation.
    • Wei-Chih Hung, Varun Jampani, Sifei Liu, Pavlo Molchanov, Ming-Hsuan Yang, and Jan Kautz.CVPR 2019
  • Self-Supervised GANs via Auxiliary Rotation Loss.
    • Ting Chen; Xiaohua Zhai; Marvin Ritter; Mario Lucic; Neil Houlsby.CVPR 2019
  • Self-Supervised Adaptation of High-Fidelity Face Models for Monocular Performance Tracking.
    • Jae Shin Yoon; Takaaki Shiratori; Shoou-I Yu; Hyun Soo Park.CVPR 2019
  • Multi-Task Self-Supervised Object Detection via Recycling of Bounding Box Annotations.
    • Wonhee Lee; Joonil Na; Gunhee Kim.CVPR 2019
  • Self-Supervised Convolutional Subspace Clustering Network.
    • Junjian Zhang; Chun-Guang Li; Chong You; Xianbiao Qi; Honggang Zhang; Jun Guo; Zhouchen Lin.CVPR 2019
  • Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation.
    • Xin Wang; Qiuyuan Huang; Asli Celikyilmaz; Jianfeng Gao; Dinghan Shen; Yuan-Fang Wang; William Yang Wang; Lei Zhang.CVPR 2019
  • Unsupervised 3D Pose Estimation With Geometric Self-Supervision.
    • Ching-Hang Chen; Ambrish Tyagi; Amit Agrawal; Dylan Drover; Rohith MV; Stefan Stojanov; James M. Rehg.CVPR 2019
  • Learning to Generate Grounded Image Captions without Localization Supervision.[pdf]
    • Chih-Yao Ma; Yannis Kalantidis; Ghassan AlRegib; Peter Vajda; Marcus Rohrbach; Zsolt Kira.
  • VideoBERT: A Joint Model for Video and Language Representation Learning[pdf]
    • Chen Sun, Austin Myers, Carl Vondrick, Kevin Murphy, Cordelia Schmid.ICCV 2019
  • S4L: Self-Supervised Semi-Supervised Learning[pdf]
    • Xiaohua Zhai, Avital Oliver, Alexander Kolesnikov, Lucas Beyer
  • Countering Noisy Labels By Learning From Auxiliary Clean Labels[pdf]
    • Tsung Wei Tsai, Chongxuan Li, Jun Zhu

Machine Learning

  • Self-taught Learning: Transfer Learning from Unlabeled Data.[pdf]

    • Raina, Rajat and Battle, Alexis and Lee, Honglak and Packer, Benjamin and Ng, Andrew Y.ICML 2007
  • Representation Learning: A Review and New Perspectives.[pdf]

    • Bengio, Yoshua and Courville, Aaron and Vincent, Pascal.TPAMI 2013.

Reinforcement Learning

  • Curiosity-driven Exploration by Self-supervised Prediction.[pdf][code]

    • Deepak Pathak, Pulkit Agrawal, Alexei A. Efros, and Trevor Darrell.ICML 2017
  • Large-Scale Study of Curiosity-Driven Learning.[pdf]

    • Yuri Burda*, Harri Edwards*, Deepak Pathak*, Amos Storkey, Trevor Darrell and Alexei A. Efros
  • Playing hard exploration games by watching YouTube.[pdf]

    • Yusuf Aytar, Tobias Pfaff, David Budden, Tom Le Paine, Ziyu Wang, Nando de Freitas.NIPS 2018
  • Unsupervised State Representation Learning in Atari.[pdf][code]

    • Ankesh Anand, Evan Racah, Sherjil Ozair, Yoshua Bengio, Marc-Alexandre Côté, R Devon Hjelm.NeurIPS 2019

Robotics

2006

  • Improving Robot Navigation Through Self-Supervised Online Learning[pdf]

    • Boris Sofman, Ellie Lin, J. Andrew Bagnell, Nicolas Vandapel, and Anthony Stentz
  • Reverse Optical Flow for Self-Supervised Adaptive Autonomous Robot Navigation[pdf]

    • A. Lookingbill, D. Lieb, J. Rogers and J. Curry

2009

  • Learning Long-Range Vision for Autonomous Off-Road Driving[pdf]
    • Raia Hadsell, Pierre Sermanet, Jan Ben, Ayse Erkan, Marco Scoffier, Koray Kavukcuoglu, Urs Muller, Yann LeCun

2012

  • Self-supervised terrain classification for planetary surface exploration rovers[pdf]
    • Christopher A. Brooks, Karl Iagnemma

2014

  • Terrain Traversability Analysis Using Multi-Sensor Data Correlation by a Mobile Robot[pdf]
    • Mohammed Abdessamad Bekhti, Yuichi Kobayashi and Kazuki Matsumura

2015

  • Online self-supervised learning for dynamic object segmentation[pdf]

    • Vitor Guizilini and Fabio Ramos, The International Journal of Robotics Research
  • Self-Supervised Online Learning of Basic Object Push Affordances[pdf]

    • Barry Ridge, Ales Leonardis, Ales Ude, Miha Denisa, and Danijel Skocaj
  • Self-supervised learning of grasp dependent tool affordances on the iCub Humanoid robot[pdf]

    • Tanis Mar, Vadim Tikhanoff, Giorgio Metta, and Lorenzo Natale

2016

  • Persistent self-supervised learning principle: from stereo to monocular vision for obstacle avoidance[pdf]

    • Kevin van Hecke, Guido de Croon, Laurens van der Maaten, Daniel Hennes, and Dario Izzo
  • The Curious Robot: Learning Visual Representations via Physical Interactions.[pdf]

    • Lerrel Pinto and Dhiraj Gandhi and Yuanfeng Han and Yong-Lae Park and Abhinav Gupta.ECCV 2016
  • Learning to Poke by Poking: Experiential Learning of Intuitive Physics.[pdf]

    • Agrawal, Pulkit and Nair, Ashvin V and Abbeel, Pieter and Malik, Jitendra and Levine, Sergey.NIPS 2016
  • Supersizing Self-supervision: Learning to Grasp from 50K Tries and 700 Robot Hours.[pdf]

    • Pinto, Lerrel and Gupta, Abhinav.ICRA 2016

2017

  • Supervision via Competition: Robot Adversaries for Learning Tasks.[pdf]

    • Pinto, Lerrel and Davidson, James and Gupta, Abhinav.ICRA 2017
  • Multi-view Self-supervised Deep Learning for 6D Pose Estimation in the Amazon Picking Challenge.[pdf][Project]

    • Andy Zeng, Kuan-Ting Yu, Shuran Song, Daniel Suo, Ed Walker Jr., Alberto Rodriguez, Jianxiong Xiao.ICRA 2017
  • Combining Self-Supervised Learning and Imitation for Vision-Based Rope Manipulation.[pdf][Project]

    • Ashvin Nair*, Dian Chen*, Pulkit Agrawal*, Phillip Isola, Pieter Abbeel, Jitendra Malik, Sergey Levine.ICRA 2017
  • Learning to Fly by Crashing[pdf]

    • Dhiraj Gandhi, Lerrel Pinto, Abhinav Gupta.IROS 2017
  • Self-supervised learning as an enabling technology for future space exploration robots: ISS experiments on monocular distance learning[pdf]

    • K. van Hecke, G. C. de Croon, D. Hennes, T. P. Setterfield, A. Saenz- Otero, and D. Izzo
  • Unsupervised Perceptual Rewards for Imitation Learning.[pdf][project]

    • Sermanet, Pierre and Xu, Kelvin and Levine, Sergey.RSS 2017
  • Self-Supervised Visual Planning with Temporal Skip Connections.[pdf]

    • Frederik Ebert, Chelsea Finn, Alex X. Lee, Sergey Levine.CoRL2017

2018

  • CASSL: Curriculum Accelerated Self-Supervised Learning.[pdf]

    • Adithyavairavan Murali, Lerrel Pinto, Dhiraj Gandhi, Abhinav Gupta.ICRA 2018
  • Time-Contrastive Networks: Self-Supervised Learning from Video.[pdf][Project]

    • Pierre Sermanet and Corey Lynch and Yevgen Chebotar and Jasmine Hsu and Eric Jang and Stefan Schaal and Sergey Levine.ICRA 2018
  • Self-Supervised Deep Reinforcement Learning with Generalized Computation Graphs for Robot Navigation.[pdf]

    • Gregory Kahn, Adam Villaflor, Bosen Ding, Pieter Abbeel, Sergey Levine.ICRA 2018
  • Learning Actionable Representations from Visual Observations.[pdf][Project]

    • Dwibedi, Debidatta and Tompson, Jonathan and Lynch, Corey and Sermanet, Pierre.IROS 2018
  • Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning.[pdf][Project]

    • Andy Zeng, Shuran Song, Stefan Welker, Johnny Lee, Alberto Rodriguez, Thomas Funkhouser.IROS 2018
  • Visual Reinforcement Learning with Imagined Goals.[pdf][Project]

    • Ashvin Nair*, Vitchyr Pong*, Murtaza Dalal, Shikhar Bahl, Steven Lin, Sergey Levine.NeurIPS 2018
  • Grasp2Vec: Learning Object Representations from Self-Supervised Grasping.[pdf][Project]

    • Eric Jang*, Coline Devin*, Vincent Vanhoucke, Sergey Levine.CoRL 2018
  • Robustness via Retrying: Closed-Loop Robotic Manipulation with Self-Supervised Learning.[pdf][Project]

    • Frederik Ebert, Sudeep Dasari, Alex X. Lee, Sergey Levine, Chelsea Finn.CoRL 2018

2019

  • Learning Long-Range Perception Using Self-Supervision from Short-Range Sensors and Odometry.[pdf]

    • Mirko Nava, Jerome Guzzi, R. Omar Chavez-Garcia, Luca M. Gambardella, Alessandro Giusti.Robotics and Automation Letters
  • Learning Latent Plans from Play.[pdf][Project]

    • Corey Lynch, Mohi Khansari, Ted Xiao, Vikash Kumar, Jonathan Tompson, Sergey Levine, Pierre Sermanet

2020

  • Adversarial Skill Networks: Unsupervised Robot Skill Learning from Video.[pdf][Project]
    • Oier Mees, Markus Merklinger, Gabriel Kalweit, Wolfram Burgard.ICRA 2020

NLP

  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.[pdf][link]

    • Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova.NAACL 2019 Best Long Paper
  • Self-Supervised Dialogue Learning[pdf]

    • Jiawei Wu, Xin Wang, William Yang Wang.ACL 2019
  • Self-Supervised Learning for Contextualized Extractive Summarization[pdf]

    • Hong Wang, Xin Wang, Wenhan Xiong, Mo Yu, Xiaoxiao Guo, Shiyu Chang, William Yang Wang.ACL 2019
  • A Mutual Information Maximization Perspective of Language Representation Learning[pdf]

    • Lingpeng Kong, Cyprien de Masson d'Autume, Lei Yu, Wang Ling, Zihang Dai, Dani Yogatama.ICLR 2020
  • VL-BERT: Pre-training of Generic Visual-Linguistic Representations[pdf][code]

    • Weijie Su, Xizhou Zhu, Yue Cao, Bin Li, Lewei Lu, Furu Wei, Jifeng Dai.ICLR 2020

ASR

  • Learning Robust and Multilingual Speech Representations[pdf]

    • Kazuya Kawakami, Luyu Wang, Chris Dyer, Phil Blunsom, Aaron van den Oord
  • Unsupervised pretraining transfers well across languages[pdf][code]

    • Morgane Riviere, Armand Joulin, Pierre-Emmanuel Mazare, Emmanuel Dupoux
  • wav2vec: Unsupervised Pre-Training for Speech Recognition[pdf][code]

    • Steffen Schneider, Alexei Baevski, Ronan Collobert, Michael Auli.INTERSPEECH 2019
  • vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations[pdf]

    • Alexei Baevski, Steffen Schneider, Michael Auli.ICLR 2020
  • Effectiveness of self-supervised pre-training for speech recognition[pdf]

    • Alexei Baevski, Michael Auli, Abdelrahman Mohamed
  • Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning[pdf]

    • Alexander H. Liu, Tao Tu, Hung-yi Lee, Lin-shan Lee
  • Self-Training for End-to-End Speech Recognition[pdf]

    • Jacob Kahn, Ann Lee, Awni Hannun.ICASSP 2020
  • Generative Pre-Training for Speech with Autoregressive Predictive Coding[pdf][code]

    • Yu-An Chung, James Glass.ICASSP 2020

Talks

  • The power of Self-Learning Systems. Demis Hassabis (DeepMind).[link]
  • Supersizing Self-Supervision: Learning Perception and Action without Human Supervision. Abhinav Gupta (CMU).[link]
  • Self-supervision, Meta-supervision, Curiosity: Making Computers Study Harder. Alyosha Efros (UCB)[link]
  • Unsupervised Visual Learning Tutorial.CVPR 2018[part 1][part 2]
  • Self-Supervised Learning. Andrew Zisserman (Oxford & Deepmind).[pdf]
  • Graph Embeddings, Content Understanding, & Self-Supervised Learning. Yann LeCun. (NYU & FAIR)[pdf][video]
  • Self-supervised learning: could machines learn like humans? Yann LeCun @EPFL.[video]
  • Week 9 (b): CS294-158 Deep Unsupervised Learning(Spring 2019). Alyosha Efros @UC Berkeley.[video]

Thesis

  • Supervision Beyond Manual Annotations for Learning Visual Representations. Carl Doersch.[pdf].
  • Image Synthesis for Self-Supervised Visual Representation Learning. Richard Zhang.[pdf].
  • Visual Learning beyond Direct Supervision. Tinghui Zhou.[pdf].
  • Visual Learning with Minimal Human Supervision. Ishan Misra.[pdf].

Blog

  • Self-Supervised Representation Learning. Lilian Weng.[link].
  • The Illustrated Self-Supervised Learning. Amit Chaudhary.[link]

License

To the extent possible under law, Zhongzheng Ren has waived all copyright and related or neighboring rights to this work.


Contents

  • German BERT deepset - Open Sourcing German BERT
  • CamemBERT
  • Flau-BERT [1912.05372] FlauBERT: Unsupervised Language Model Pre-training for French
  • AlBERTo
  • RobBERT [2001.06286] RobBERT: a Dutch RoBERTa-based Language Model
  • RuBERT
  • BETO
  • BERTje [1912.09582] BERTje: A Dutch BERT Model
  • Portuguese BERT
  • German BERT & Italian BERT

Title

NLP Transfer Learning In 3 Steps

Summary

BERT (Devlin et al., 2018) is perhaps the most popular approach to NLP transfer learning. Hugging Face's implementation provides many nice features and abstracts the details away behind a beautiful API. PyTorch Lightning is a lightweight framework (really more a refactoring of PyTorch code) that lets anyone using PyTorch — students, researchers, and production teams — scale deep learning code easily while keeping it reproducible. It also provides 42+ advanced research features via trainer flags. Lightning adds no abstraction on top of PyTorch, which means it plays well with other great packages like Hugging Face! In this tutorial, we use their BERT implementation to perform a fine-tuning task in Lightning. We will do transfer learning for NLP in three steps: import BERT from the huggingface library; create a LightningModule that fine-tunes using features extracted by BERT; and train the BertMNLIFinetuner with the Lightning Trainer.

Author

William Falcon, PhD student in artificial intelligence (NYU, Facebook AI Research). He has recently focused on research into pre-trained models for natural language, where major breakthroughs have been made. He advocates that machine learning should be oriented toward practice and real problems, aiming to solve current issues, and that AI must be commercially driven to develop over the long term.


  • Stabilizing Transformers for Reinforcement Learning. E Parisotto, H. F Song, J W. Rae, R Pascanu, C Gulcehre, S M. Jayakumar, M Jaderberg, R L Kaufman, A Clark, S Noury, M M. Botvinick, N Heess, R Hadsell [DeepMind] (2019)
