In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of deep neural networks most commonly applied to analyzing visual imagery. Because of their shared-weight architecture and translation-invariance properties, they are also known as shift-invariant or space-invariant artificial neural networks (SIANN). They have applications in image and video recognition, recommender systems, image classification, medical image analysis, natural language processing, and financial time series.
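The two defining properties, weight sharing and translation invariance, can be seen directly in a hand-rolled convolution. A minimal NumPy sketch (the image, kernel, and helper function are illustrative, not from any library):

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Slide one shared kernel over the image (no padding, stride 1)."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

img = np.zeros((8, 8))
img[2, 2] = 1.0                         # a single bright pixel
kernel = np.arange(9.0).reshape(3, 3)   # one kernel, reused everywhere

shifted = np.roll(img, shift=1, axis=1)  # move the pixel one step right
a = conv2d_valid(img, kernel)
b = conv2d_valid(shifted, kernel)

# The response pattern simply shifts with the input.
assert np.allclose(a[:, :-1], b[:, 1:])
```

Shifting the input just shifts the response map: the same kernel weights are applied at every location, which is exactly the weight sharing the definition above refers to.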

Knowledge Compilation

Convolutional Neural Networks (CNN) from Beginner to Expert: Notes from Someone Who Has Been There

Getting Started

Deep learning is an empirical science whose experimental progress has far outpaced its theory, so this guide interleaves theory with hands-on practice.

A Quick Overview

First, browse the articles under Zhuanzhi's deep learning topic.

For convolutional neural networks specifically, the following articles cover the basic concepts:

An intuitive explanation of how convolutional neural networks work: https://www.zhihu.com/question/39022858

For the technically minded, understanding convolutional neural networks (CNN) in one article: http://dataunion.org/11692.html

Deep learning pioneer Yann LeCun explains convolutional neural networks: https://www.leiphone.com/news/201608/zaB48AcZ1AFm1TaP.html

CNN notes: an accessible explanation of convolutional neural networks: https://www.2cto.com/kf/201607/522441.html

Once the basic concepts are in place, you still need an intuition for what a CNN actually does, and visualizing deep networks is an excellent way to build one.

Chinese notes on Visualizing and Understanding Convolutional Networks: http://www.gageet.com/2014/10235.php

The English original, for those interested: https://arxiv.org/abs/1311.2901

Basic Practice

Before any concrete coding, try TensorFlow's Playground at http://playground.tensorflow.org/ (guide: http://f.dataguru.cn/article-9324-1.html)

After that you can experiment on your own machine. First of all, a GPU is essential:

Installing CUDA: http://blog.csdn.net/u010480194/article/details/54287335

Installing cuDNN: http://blog.csdn.net/lucifer_zzq/article/details/76675239

Next, choose a framework that suits you:

What is the hottest deep learning framework right now? https://www.zhihu.com/question/52517062?answer_deleted_redirect=true

In depth: a comparison of mainstream deep learning frameworks. Which one fits you best? http://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650719118&idx=2&sn=fad8b7cad70cc6a227f88ae07a89db66#rd

There is also a GitHub project devoted to comparing frameworks, and it is updated fairly often: https://github.com/hunkim/DeepLearningStars

If you suffer from choice paralysis, two frameworks are (unofficially) recommended: TensorFlow and PyTorch. TensorFlow excels at visualization and production integration, while PyTorch leaves implementation unconstrained and is very comfortable to use.

TensorFlow website: http://www.tensorflow.org/

PyTorch website: http://pytorch.org/

Following the official installation instructions step by step should go without major problems. If you do hit one, search for a solution on the magical https://stackoverflow.com/; an answer can almost always be found.

You also need to get familiar with one essential tool, GitHub (https://github.com/). It is convenient both for managing your own code and for drawing on other people's. For a tutorial, see this answer: https://www.zhihu.com/question/20070065

If you would rather skip that, an IDE can help with project management, for example PyCharm: http://www.jetbrains.com/pycharm/ (tutorial: http://blog.csdn.net/u013088062/article/details/50349833)

An interactive visualization tool is also very important; the recommended one is the superb Jupyter Notebook: http://python.jobbole.com/87527/?repeat=w3tc

With all this preparation done, you can start on the introductory tutorials. The official ones are in fact excellent, but if reading all-English material is a struggle, the tutorials below may help.

TensorFlow

How do I get started with TensorFlow? https://www.zhihu.com/question/49909565

An introduction to TensorFlow: http://hacker.duanshishi.com/?p=1639

Google's official tutorial is actually quite complete; if you prefer Chinese, see this translation: http://wiki.jikexueyuan.com/project/tensorflow-zh/

PyTorch

PyTorch deep learning: a 60-minute blitz (translation): https://zhuanlan.zhihu.com/p/25572330

How should a beginner get started with PyTorch? https://www.zhihu.com/question/55720139

Super simple! A PyTorch tutorial, part 1: Tensors: http://www.jianshu.com/p/5ae644748f21
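The core loop these tutorials build up to fits in a few lines. A minimal sketch of one training run (assuming PyTorch is installed; the model, fake data, and hyperparameters here are made up for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A tiny end-to-end loop: model -> loss -> backward -> update.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(16, 4)            # a fake batch of 16 samples
y = torch.randint(0, 2, (16,))    # fake class labels

for _ in range(20):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()               # autograd fills .grad for every parameter
    opt.step()

print(loss.item())
```

Every real training script is a variation on this skeleton: swap in a CNN for the model, a DataLoader for the fake batch, and a GPU transfer where needed.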

If you are not yet comfortable with Python, start with these two tutorials. Python 2: http://www.runoob.com/python/python-tutorial.html, Python 3: http://www.runoob.com/python3/python3-tutorial.html

If you are only dabbling and do not want to spend much time on a framework, give Keras a try:

A Keras tutorial: http://www.360doc.com/content/17/0624/12/1489589_666148811.shtml

Advanced Study

With the introduction above behind you, you should now have a basic concept of convolutional neural networks and a basic grasp of how to implement one. Advancing further again proceeds on two fronts.

Deeper Theory

Start with the backpropagation algorithm. You can get by without it as a beginner, since the common frameworks all provide automatic differentiation, but to progress you must understand it thoroughly. Tutorial: http://blog.csdn.net/u014313009/article/details/51039334
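A good way to convince yourself that a backward pass is correct is to check it against finite differences. A self-contained NumPy sketch for a one-layer tanh network (the shapes and the squared-error loss are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))          # batch of 5 inputs
W = rng.normal(size=(3, 2))          # the weight matrix to check
t = rng.normal(size=(5, 2))          # regression targets

def loss(W):
    h = np.tanh(x @ W)               # forward pass
    return 0.5 * np.mean((h - t) ** 2)

# Backward pass: chain rule through the mean-squared error and tanh.
h = np.tanh(x @ W)
dh = (h - t) / h.size                # d loss / d h
dz = dh * (1 - h ** 2)               # through tanh: d tanh(z)/dz = 1 - tanh(z)^2
dW = x.T @ dz                        # d loss / d W

# Numerical check with central differences.
num = np.zeros_like(W)
eps = 1e-6
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        Wp, Wm = W.copy(), W.copy()
        Wp[i, j] += eps
        Wm[i, j] -= eps
        num[i, j] = (loss(Wp) - loss(Wm)) / (2 * eps)

assert np.allclose(dW, num, atol=1e-6)
```

This gradient-check idiom is exactly what autograd automates, and it is still the standard debugging tool when you implement a custom layer by hand.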

Next, get familiar with a few classic CNN models.

The foundational model: AlexNet

Paper: ImageNet Classification with Deep Convolutional Neural Networks: http://ml.informatik.uni-freiburg.de/former/_media/teaching/ws1314/dl/talk_simon_group2.pdf

Explainer: http://blog.csdn.net/u014088052/article/details/50898842

Code: TensorFlow: https://github.com/kratzert/finetune_alexnet_with_tensorflow; PyTorch: https://github.com/aaron-xichen/pytorch-playground

The model that defined an era: ResNet

Paper: Deep Residual Learning for Image Recognition: https://arxiv.org/abs/1512.03385

Explainer: http://blog.csdn.net/wspba/article/details/56019373

Code: TensorFlow: https://github.com/ry/tensorflow-resnet; PyTorch: https://github.com/isht7/pytorch-deeplab-resnet
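The core idea is small enough to sketch before reading the full repositories. A basic residual block in PyTorch (a simplified sketch of the paper's design, keeping channel count and spatial size fixed so the identity shortcut needs no projection):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic block: output = ReLU(F(x) + x), so the layers learn a residual."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)   # the identity shortcut

block = ResidualBlock(16)
x = torch.randn(2, 16, 32, 32)
print(block(x).shape)  # torch.Size([2, 16, 32, 32])
```

Because the block only has to learn a correction on top of the identity, gradients flow through the shortcut unimpeded, which is what makes hundred-layer networks trainable.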

Recently very effective: DenseNet

Paper: Densely Connected Convolutional Networks: https://arxiv.org/pdf/1608.06993.pdf

Explainer: http://blog.csdn.net/u014380165/article/details/75142664

Code: original: https://github.com/liuzhuang13/DenseNet; TensorFlow: https://github.com/YixuanLi/densenet-tensorflow; PyTorch: https://github.com/bamos/densenet.pytorch
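Dense connectivity simply means each layer consumes the concatenation of every earlier feature map. A stripped-down sketch (omitting the paper's bottleneck and transition layers):

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Each layer sees the concatenation of all earlier feature maps."""
    def __init__(self, in_channels, growth_rate, num_layers):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Conv2d(in_channels + i * growth_rate, growth_rate, 3, padding=1)
            for i in range(num_layers)
        )

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            new = torch.relu(layer(torch.cat(features, dim=1)))
            features.append(new)     # later layers will see this map too
        return torch.cat(features, dim=1)

block = DenseBlock(in_channels=8, growth_rate=4, num_layers=3)
x = torch.randn(1, 8, 16, 16)
print(block(x).shape)  # 8 + 3 * 4 = 20 output channels
```

Each layer adds only `growth_rate` channels, so the block stays narrow even though every layer has direct access to all preceding features.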

The recommendation is to read the explainer first and then the source code: this deepens your understanding of the model, and you also pick up new framework tricks from other people's code.

Of course, we should not stop at the surface. A very well-known book, Deep Learning, is recommended here; the Chinese translation is at https://github.com/exacity/deeplearningbook-chinese

More fundamental theoretical work is, for now, still largely missing.

Deeper Practice

If you have the patience, Stanford's recently launched course is worth following: https://www.bilibili.com/video/av9156347/

In actual practice there are a great many points to learn. Before starting, it is best to look at tuning techniques:

What tricks are there for tuning deep learning models? https://www.zhihu.com/question/25097993

There used to be a tuning bible, Neural Networks: Tricks of the Trade, but it is too dated to recommend now.

Once-common modules such as dropout and LRN are used less and less these days, so we will not dwell on them. As for regularization, BatchNorm is recommended (https://www.zhihu.com/question/38102762): the idea is simple and it works well.
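What BatchNorm does is easy to verify directly: in training mode it standardizes each channel over the batch. A small PyTorch check (the shapes and the deliberately bad scaling are illustrative):

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(3)                   # training mode by default
x = torch.randn(32, 3, 8, 8) * 5 + 10    # badly scaled activations

y = bn(x)

# Per-channel statistics come out roughly zero-mean, unit-variance.
print(y.mean(dim=(0, 2, 3)))   # each entry close to 0
print(y.std(dim=(0, 2, 3)))    # each entry close to 1
```

At inference time (`bn.eval()`) the layer switches to running averages collected during training, which is why forgetting `model.eval()` is such a common bug.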

Even though training is usually very stable once BatchNorm is in place, gradient clipping is still worth learning: http://blog.csdn.net/zyf19930610/article/details/71743291
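In PyTorch, clipping by global norm is a single call. A minimal sketch (the toy model and the exaggerated inputs are made up purely to force large gradients):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Linear(10, 1)
# Hugely scaled inputs produce hugely scaled gradients.
loss = model(torch.randn(4, 10) * 100).pow(2).mean()
loss.backward()

# Rescale all gradients so their combined L2 norm is at most 1.0;
# the call returns the norm measured before clipping.
total = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
clipped = torch.sqrt(sum(p.grad.norm() ** 2 for p in model.parameters()))
print(float(total), float(clipped))
```

Because the whole gradient vector is rescaled by one factor, its direction is preserved; only exploding magnitudes are tamed.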

The activation function is another important point, although in convolutional networks you can more or less mindlessly use ReLU (http://www.cnblogs.com/neopenx/p/4453161.html): it is cheap to compute, and ReLU + BatchNorm is close to a universal default. Some tasks do require case-by-case analysis, though; GANs, for example, are not well served by such a blunt choice of activation.
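That universal default is usually packaged as a Conv-BN-ReLU unit. A sketch of such a building block (the layer sizes are arbitrary):

```python
import torch
import torch.nn as nn

def conv_bn_relu(cin, cout):
    """The workhorse unit of most modern CNNs."""
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1, bias=False),  # BN makes the bias redundant
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
    )

net = nn.Sequential(conv_bn_relu(3, 16), conv_bn_relu(16, 32))
x = torch.randn(2, 3, 24, 24)
out = net(x)
print(out.shape)           # torch.Size([2, 32, 24, 24])
assert (out >= 0).all()    # ReLU output is non-negative
```

The convolution's bias is dropped because BatchNorm's per-channel shift immediately absorbs it; this tiny detail appears in essentially every ResNet- and DenseNet-style codebase.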

With the structure essentially settled, what remains is optimization. There are many optimization algorithms; the most common are SGD and Adam.

An overview of optimization algorithms: http://www.mamicode.com/info-detail-1931210.html

A good algorithm can converge faster or reach a better result, but in most experiments SGD and Adam are already sufficient.
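Both are drop-in choices in PyTorch, and on an easy convex toy problem either one converges; running such a sanity check before comparing optimizers on real models is cheap (the toy objective below is made up):

```python
import torch

# Minimize f(w) = ||w - 3||^2 with each optimizer from the same start.
def run(opt_cls, **kw):
    w = torch.zeros(5, requires_grad=True)
    opt = opt_cls([w], **kw)
    for _ in range(100):
        opt.zero_grad()
        loss = ((w - 3.0) ** 2).sum()
        loss.backward()
        opt.step()
    return loss.item()

sgd_loss = run(torch.optim.SGD, lr=0.1)
adam_loss = run(torch.optim.Adam, lr=0.1)
print(sgd_loss, adam_loss)   # both end up near zero on this convex problem
```

On real, non-convex losses the two behave quite differently: Adam's per-parameter step sizes make it forgiving of poorly tuned learning rates, while well-tuned SGD (with momentum) often generalizes slightly better.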

The experience of the masters is also worth reading: Yoshua Bengio and other leading figures share 26 deep learning lessons: http://www.csdn.net/article/2015-09-16/2825716

Focused Research

Once all of the above is mastered, it is time for concrete research projects. You can look for papers that interest you on this GitHub list: https://github.com/terryum/awesome-deep-learning-papers. Some excellent papers related to convolutional neural networks are listed below.

Understanding / Generalization / Transfer

Distilling the knowledge in a neural network (2015), G. Hinton et al.: http://arxiv.org/pdf/1503.02531

Deep neural networks are easily fooled: High confidence predictions for unrecognizable images (2015), A. Nguyen et al.: http://arxiv.org/pdf/1412.1897

How transferable are features in deep neural networks? (2014), J. Yosinski et al.: http://papers.nips.cc/paper/5347-how-transferable-are-features-in-deep-neural-networks.pdf

CNN features off-the-Shelf: An astounding baseline for recognition (2014), A. Razavian et al.: http://www.cv-foundation.org//openaccess/content_cvpr_workshops_2014/W15/papers/Razavian_CNN_Features_Off-the-Shelf_2014_CVPR_paper.pdf

Learning and transferring mid-Level image representations using convolutional neural networks (2014), M. Oquab et al.: http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Oquab_Learning_and_Transferring_2014_CVPR_paper.pdf

Visualizing and understanding convolutional networks (2014), M. Zeiler and R. Fergus: http://arxiv.org/pdf/1311.2901

Decaf: A deep convolutional activation feature for generic visual recognition (2014), J. Donahue et al.: http://arxiv.org/pdf/1310.1531

Optimization / Training Techniques

Training very deep networks (2015), R. Srivastava et al.: http://papers.nips.cc/paper/5850-training-very-deep-networks.pdf

Batch normalization: Accelerating deep network training by reducing internal covariate shift (2015), S. Ioffe and C. Szegedy: http://arxiv.org/pdf/1502.03167

Delving deep into rectifiers: Surpassing human-level performance on imagenet classification (2015), K. He et al.: http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/He_Delving_Deep_into_ICCV_2015_paper.pdf

Dropout: A simple way to prevent neural networks from overfitting (2014), N. Srivastava et al.: http://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf

Adam: A method for stochastic optimization (2014), D. Kingma and J. Ba: http://arxiv.org/pdf/1412.6980

Improving neural networks by preventing co-adaptation of feature detectors (2012), G. Hinton et al.: http://arxiv.org/pdf/1207.0580.pdf

Random search for hyper-parameter optimization (2012), J. Bergstra and Y. Bengio: http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a

Convolutional Neural Network Models

Rethinking the inception architecture for computer vision (2016), C. Szegedy et al.: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Szegedy_Rethinking_the_Inception_CVPR_2016_paper.pdf

Inception-v4, inception-resnet and the impact of residual connections on learning (2016), C. Szegedy et al.: http://arxiv.org/pdf/1602.07261

Identity Mappings in Deep Residual Networks (2016), K. He et al.: https://arxiv.org/pdf/1603.05027v2.pdf

Deep residual learning for image recognition (2016), K. He et al.: http://arxiv.org/pdf/1512.03385

Spatial transformer network (2015), M. Jaderberg et al.: http://papers.nips.cc/paper/5854-spatial-transformer-networks.pdf

Going deeper with convolutions (2015), C. Szegedy et al.: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf

Very deep convolutional networks for large-scale image recognition (2014), K. Simonyan and A. Zisserman: http://arxiv.org/pdf/1409.1556

Return of the devil in the details: delving deep into convolutional nets (2014), K. Chatfield et al.: http://arxiv.org/pdf/1405.3531

OverFeat: Integrated recognition, localization and detection using convolutional networks (2013), P. Sermanet et al.: http://arxiv.org/pdf/1312.6229

Maxout networks (2013), I. Goodfellow et al.: http://arxiv.org/pdf/1302.4389v4

Network in network (2013), M. Lin et al.: http://arxiv.org/pdf/1312.4400

ImageNet classification with deep convolutional neural networks (2012), A. Krizhevsky et al.: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

Image: Segmentation / Object Detection

You only look once: Unified, real-time object detection (2016), J. Redmon et al.: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Redmon_You_Only_Look_CVPR_2016_paper.pdf

Fully convolutional networks for semantic segmentation (2015), J. Long et al.: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Long_Fully_Convolutional_Networks_2015_CVPR_paper.pdf

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (2015), S. Ren et al.: http://papers.nips.cc/paper/5638-faster-r-cnn-towards-real-time-object-detection-with-region-proposal-networks.pdf

Fast R-CNN (2015), R. Girshick: http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Girshick_Fast_R-CNN_ICCV_2015_paper.pdf

Rich feature hierarchies for accurate object detection and semantic segmentation (2014), R. Girshick et al.: http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf

Spatial pyramid pooling in deep convolutional networks for visual recognition (2014), K. He et al.: http://arxiv.org/pdf/1406.4729

Semantic image segmentation with deep convolutional nets and fully connected CRFs (2014), L. Chen et al.: https://arxiv.org/pdf/1412.7062

Learning hierarchical features for scene labeling (2013), C. Farabet et al.: https://hal-enpc.archives-ouvertes.fr/docs/00/74/20/77/PDF/farabet-pami-13.pdf

Image / Video / Etc

Image Super-Resolution Using Deep Convolutional Networks (2016), C. Dong et al.: https://arxiv.org/pdf/1501.00092v3.pdf

A neural algorithm of artistic style (2015), L. Gatys et al.: https://arxiv.org/pdf/1508.06576

Deep visual-semantic alignments for generating image descriptions (2015), A. Karpathy and L. Fei-Fei: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Karpathy_Deep_Visual-Semantic_Alignments_2015_CVPR_paper.pdf

Show, attend and tell: Neural image caption generation with visual attention (2015), K. Xu et al.: http://arxiv.org/pdf/1502.03044

Show and tell: A neural image caption generator (2015), O. Vinyals et al.: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Vinyals_Show_and_Tell_2015_CVPR_paper.pdf

Long-term recurrent convolutional networks for visual recognition and description (2015), J. Donahue et al.: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Donahue_Long-Term_Recurrent_Convolutional_2015_CVPR_paper.pdf

VQA: Visual question answering (2015), S. Antol et al.: http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Antol_VQA_Visual_Question_ICCV_2015_paper.pdf

DeepFace: Closing the gap to human-level performance in face verification (2014), Y. Taigman et al.: http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Taigman_DeepFace_Closing_the_2014_CVPR_paper.pdf

Large-scale video classification with convolutional neural networks (2014), A. Karpathy et al.: http://vision.stanford.edu/pdf/karpathy14.pdf

Two-stream convolutional networks for action recognition in videos (2014), K. Simonyan et al.: http://papers.nips.cc/paper/5353-two-stream-convolutional-networks-for-action-recognition-in-videos.pdf

3D convolutional neural networks for human action recognition (2013), S. Ji et al.: http://machinelearning.wustl.edu/mlpapers/paper_files/icml2010_JiXYY10.pdf

For more, see Zhuanzhi's other deep-learning article at //www.webtourguide.com/topic/2001228999615594/awesome, which covers many further specialized areas and their papers; they are not repeated here.

VIP Content

Convolutional neural networks (CNNs) and other deep networks have enabled unprecedented breakthroughs in a variety of computer vision tasks, from image classification to object detection, semantic segmentation, image captioning, visual question answering, and visual dialog. Despite their superb performance, these models lack decomposability into individually intuitive components, which makes them hard to interpret. Consequently, when today's intelligent systems fail, they often fail embarrassingly, without warning or explanation, leaving the user staring at an incoherent output and wondering why the system did what it did. To build trust in intelligent systems and integrate them meaningfully into our daily lives, we must build "transparent" models that can explain why they predict what they predict.

This tutorial introduces participants to aspects of computer vision models that go beyond performance. Ramprasaath R. Selvaraju will focus on explainable-AI methods and on how understanding the decision process can help correct various model behaviors. Sara Hooker will discuss the trustworthiness and societal impact of vision models. Bolei Zhou will focus on dissecting the interactive aspects of vision models and the implications for visual editing applications. Aleksander Madry will focus on the robustness of vision models. The tutorial thus unifies several perspectives that matter just as much as test-set performance for vision models.


Latest Papers

This paper investigates unsupervised learning of Full-Waveform Inversion (FWI), which has been widely used in geophysics to estimate subsurface velocity maps from seismic data. The problem is mathematically formulated by a second-order partial differential equation (PDE), but is hard to solve. Moreover, acquiring velocity maps is extremely expensive, making it impractical to scale up a supervised approach that trains the mapping from seismic data to velocity maps with convolutional neural networks (CNNs). We address these difficulties by integrating PDE and CNN in a loop, thus shifting the paradigm to unsupervised learning that only requires seismic data. In particular, we use finite differences to approximate the forward modeling of the PDE as a differentiable operator (from velocity map to seismic data) and model its inversion by a CNN (from seismic data to velocity map). Hence, we transform the supervised inversion task into an unsupervised seismic-data reconstruction task. We also introduce a new large-scale dataset, OpenFWI, to establish a more challenging benchmark for the community. Experimental results show that our model (using seismic data alone) yields accuracy comparable to the supervised counterpart (using both seismic data and velocity maps). Furthermore, it outperforms the supervised model when more seismic data are involved.
