In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of deep neural network most commonly applied to analyzing visual imagery. Because of their shared-weight architecture and translation-invariance properties, they are also known as shift-invariant or space-invariant artificial neural networks (SIANN). They have applications in image and video recognition, recommender systems, image classification, medical image analysis, natural language processing, and financial time series.

Knowledge Compendium

Convolutional Neural Networks (CNN) from Beginner to Mastery: A Practitioner's Summary

Getting Started

Deep learning is an empirical discipline: experimental progress has far outpaced theoretical research, so this guide interleaves theory with hands-on practice.

A Quick Overview

First, browse the articles under the Zhuanzhi deep learning topic.

For convolutional neural networks specifically, the following articles cover the basic concepts:

An intuitive explanation of how convolutional neural networks work: https://www.zhihu.com/question/39022858

A technical primer: understanding CNNs in one article: http://dataunion.org/11692.html

Deep learning pioneer Yann LeCun explains convolutional neural networks: https://www.leiphone.com/news/201608/zaB48AcZ1AFm1TaP.html

CNN notes: a plain-language guide to convolutional neural networks: https://www.2cto.com/kf/201607/522441.html

Once the basic concepts are in place, you also need an intuitive feel for what a CNN does, and visualization is an excellent way to get it:

Chinese notes on Visualizing and Understanding Convolutional Networks: http://www.gageet.com/2014/10235.php

The original English paper, for those interested: https://arxiv.org/abs/1311.2901

Basic Practice

Before writing any real code, try the TensorFlow Playground at http://playground.tensorflow.org/; a walkthrough is available at http://f.dataguru.cn/article-9324-1.html.

After that you can experiment on your own machine. First of all, a GPU is essential:

Install CUDA: http://blog.csdn.net/u010480194/article/details/54287335

Install cuDNN: http://blog.csdn.net/lucifer_zzq/article/details/76675239

The next step is to pick a framework that suits you:

What is the most popular deep learning framework right now? https://www.zhihu.com/question/52517062?answer_deleted_redirect=true

In depth: a comparison of mainstream deep learning frameworks, to see which fits you best: http://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650719118&idx=2&sn=fad8b7cad70cc6a227f88ae07a89db66#rd

There is also a GitHub project dedicated to comparing frameworks, and it is updated fairly often: https://github.com/hunkim/DeepLearningStars

If you find it hard to choose, two (admittedly subjective) recommendations are TensorFlow and PyTorch: TensorFlow has excellent visualization and production tooling, while PyTorch gives you a lot of freedom and is very pleasant to write.

TensorFlow official site: http://www.tensorflow.org/

PyTorch official site: http://pytorch.org/

Following the official installation instructions step by step usually goes smoothly; if you do hit a problem, searching https://stackoverflow.com/ will almost always turn up a solution.

You should also get comfortable with GitHub (https://github.com/); it makes managing your own code and studying other people's code much easier. For a tutorial, see this answer: https://www.zhihu.com/question/20070065

If you would rather skip that for now, an IDE can handle much of it for you, e.g. PyCharm: http://www.jetbrains.com/pycharm/, tutorial: http://blog.csdn.net/u013088062/article/details/50349833

An interactive, visual working environment is also very useful; Jupyter Notebook is highly recommended: http://python.jobbole.com/87527/?repeat=w3tc

With all of that prepared, you can start the introductory tutorials. The official ones are actually very good, but if reading them entirely in English is a hurdle, the tutorials below are worth a look.

TensorFlow

How do I get started with TensorFlow? https://www.zhihu.com/question/49909565

A TensorFlow introduction: http://hacker.duanshishi.com/?p=1639

Google's official tutorial is actually quite complete; if you prefer Chinese, there is a translation: http://wiki.jikexueyuan.com/project/tensorflow-zh/
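As a quick sanity check that an install works, here is a minimal snippet assuming the TensorFlow 2.x eager API; note that the tutorials linked above were written for the older 1.x Session-based API, so their code looks different:

```python
# Minimal TensorFlow sanity check: build two constant tensors and multiply them.
# Assumes TensorFlow 2.x, where operations run eagerly without a Session.
import tensorflow as tf

a = tf.constant([[1.0, 2.0],
                 [3.0, 4.0]])
b = tf.constant([[1.0],
                 [1.0]])

print(tf.matmul(a, b))  # -> [[3.], [7.]]
```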

PyTorch

PyTorch deep learning: a 60-minute blitz (translation): https://zhuanlan.zhihu.com/p/25572330

How should a beginner get started with PyTorch? https://www.zhihu.com/question/55720139

Super simple! PyTorch tutorial, part 1: Tensors: http://www.jianshu.com/p/5ae644748f21
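In the same spirit as those tutorials, here is a tiny first-contact sketch of PyTorch tensors and autograd; the values are arbitrary:

```python
# Create a tensor, build a scalar function of it, and let autograd compute the gradient.
import torch

x = torch.ones(2, 2, requires_grad=True)  # track operations on x
y = (x * 3).sum()                         # a scalar function of x
y.backward()                              # backpropagate through the recorded graph
print(x.grad)                             # dy/dx is 3 for every element
```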

If you are not yet comfortable with Python, start with these tutorials. Python 2: http://www.runoob.com/python/python-tutorial.html; Python 3: http://www.runoob.com/python3/python3-tutorial.html

If you are only dabbling and do not want to spend much time on a framework, give Keras a try (a short sketch follows the link below):

Keras introductory tutorial: http://www.360doc.com/content/17/0624/12/1489589_666148811.shtml
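For a sense of how compact Keras is, here is a rough sketch of a small CNN built with the Sequential API, assuming the standalone keras package (tf.keras exposes the same interface); the layer sizes and the 28x28 grayscale input are illustrative, MNIST-style choices:

```python
# A tiny CNN classifier assembled layer by layer with Keras's Sequential API.
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),  # feature extraction
    MaxPooling2D(pool_size=(2, 2)),                                  # spatial downsampling
    Flatten(),                                                       # flatten to a vector
    Dense(10, activation="softmax"),                                 # 10-way classification
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```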

Advanced Learning

After the introduction above, you should have a basic grasp of what a convolutional neural network is and a basic idea of how to implement one. Advanced study likewise proceeds along two tracks.

Deeper Theory

Start with the backpropagation algorithm. You can get by without it at first, since the common frameworks all provide automatic differentiation, but to go further you really have to understand it. Tutorial: http://blog.csdn.net/u014313009/article/details/51039334 (a small worked sketch follows below).
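To make the idea concrete, here is a NumPy-only sketch of backpropagation on a toy two-layer network; the data, layer sizes, and learning rate are all made up for illustration:

```python
# Backpropagation by hand on a tiny two-layer ReLU network, trained with
# gradient descent on a made-up regression problem.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 3))             # 16 toy samples, 3 features
y = rng.normal(size=(16, 1))             # toy regression targets

W1 = rng.normal(scale=0.1, size=(3, 4))  # first-layer weights
W2 = rng.normal(scale=0.1, size=(4, 1))  # second-layer weights
lr = 0.1

for step in range(100):
    # forward pass
    h = np.maximum(X @ W1, 0.0)          # ReLU hidden layer
    pred = h @ W2
    loss = np.mean((pred - y) ** 2)      # mean squared error

    # backward pass: chain rule, layer by layer
    d_pred = 2.0 * (pred - y) / len(X)   # dL/dpred
    d_W2 = h.T @ d_pred                  # dL/dW2
    d_h = d_pred @ W2.T                  # dL/dh
    d_h[h <= 0] = 0.0                    # gradient through the ReLU
    d_W1 = X.T @ d_h                     # dL/dW1

    # gradient-descent update
    W1 -= lr * d_W1
    W2 -= lr * d_W2
```

Frameworks do exactly this for you through automatic differentiation, but writing it out once makes the later material much easier to follow.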

Next, get familiar with a few classic CNN models.

The foundational model: AlexNet

Paper: ImageNet Classification with Deep Convolutional Neural Networks http://ml.informatik.uni-freiburg.de/former/_media/teaching/ws1314/dl/talk_simon_group2.pdf

Explanation: http://blog.csdn.net/u014088052/article/details/50898842

Code: TensorFlow https://github.com/kratzert/finetune_alexnet_with_tensorflow, PyTorch https://github.com/aaron-xichen/pytorch-playground
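If you just want to poke at a working AlexNet rather than re-implement it, torchvision ships a reference version; a minimal sketch (pretrained=True reflects the older torchvision argument name):

```python
# Load torchvision's reference AlexNet with ImageNet weights and run a dummy batch.
import torch
import torchvision

model = torchvision.models.alexnet(pretrained=True)
model.eval()

# torchvision's AlexNet expects 3x224x224 inputs (normally normalized to ImageNet stats).
x = torch.randn(1, 3, 224, 224)          # a random stand-in for a real image batch
with torch.no_grad():
    logits = model(x)                    # shape (1, 1000): one score per ImageNet class
print(logits.shape)
```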

The model that defined an era: ResNet

Paper: Deep Residual Learning for Image Recognition https://arxiv.org/abs/1512.03385

Explanation: http://blog.csdn.net/wspba/article/details/56019373

Code: TensorFlow https://github.com/ry/tensorflow-resnet, PyTorch https://github.com/isht7/pytorch-deeplab-resnet
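The core idea is the residual (skip) connection: a block learns F(x) and outputs F(x) + x. A minimal PyTorch sketch, with illustrative layer sizes rather than the paper's exact configuration:

```python
# A basic residual block: two 3x3 conv layers whose output is added back to the input.
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                              # the shortcut path
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)          # add the skip connection, then activate

block = BasicResidualBlock(64)
y = block(torch.randn(2, 64, 32, 32))             # output keeps the input's shape
```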

A recent strong performer: DenseNet

Paper: Densely Connected Convolutional Networks https://arxiv.org/pdf/1608.06993.pdf

Explanation: http://blog.csdn.net/u014380165/article/details/75142664

Code: original https://github.com/liuzhuang13/DenseNet, TensorFlow https://github.com/YixuanLi/densenet-tensorflow, PyTorch https://github.com/bamos/densenet.pytorch
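The defining pattern is dense connectivity: every layer receives the concatenation of all earlier feature maps. A rough PyTorch sketch with an arbitrary growth rate and depth:

```python
# A toy dense block: each layer takes the concatenation of all previous outputs.
import torch
import torch.nn as nn

class TinyDenseBlock(nn.Module):
    def __init__(self, in_channels, growth_rate=12, num_layers=3):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(in_channels + i * growth_rate),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels + i * growth_rate, growth_rate,
                          kernel_size=3, padding=1, bias=False),
            ))

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))  # concatenate everything so far
            features.append(out)
        return torch.cat(features, dim=1)

block = TinyDenseBlock(16)
y = block(torch.randn(1, 16, 32, 32))                # channels: 16 + 3 * 12 = 52
```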

It is best to read the explanation first and then the source code: this deepens your understanding of the model, and other people's code is also a good place to pick up framework tricks.

Of course, you should not stop at the surface. The well-known book Deep Learning is recommended; the Chinese translation is here: https://github.com/exacity/deeplearningbook-chinese

More fundamental theory for these models is, for now, still largely missing.

Deeper Practice

If you have the patience, work through the recently launched Stanford course: https://www.bilibili.com/video/av9156347/

In actual practice there are a great many details to learn. Before diving in, it helps to read up on tuning tips:

What tricks are there for tuning deep learning models? https://www.zhihu.com/question/25097993

There is an old tuning classic, Neural Networks: Tricks of the Trade, but it is dated and not recommended here.

Once-common modules such as dropout and LRN are used less and less these days, so they are not covered here. For regularization, BatchNorm is recommended (https://www.zhihu.com/question/38102762): the idea is simple and it works well.

Training is already fairly stable once you have BatchNorm, but gradient clipping is still worth learning: http://blog.csdn.net/zyf19930610/article/details/71743291 (a short sketch follows below).
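In PyTorch this is one extra line between the backward pass and the optimizer step; the model, data, and the max_norm=1.0 threshold below are placeholders for illustration:

```python
# One training step with gradient clipping: rescale gradients so their global
# norm is at most max_norm before applying the update.
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                      # stand-in for a real CNN
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()

x, y = torch.randn(8, 10), torch.randn(8, 1)  # dummy batch
loss = criterion(model(x), y)

optimizer.zero_grad()
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```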

Activation functions are another important point, but in convolutional networks you can almost always just use ReLU (http://www.cnblogs.com/neopenx/p/4453161.html): it is cheap to compute, and ReLU + BatchNorm is close to a universal recipe. Specific tasks still call for specific choices, though; GANs, for example, are not well served by such a blunt default.
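That recipe is usually written as a Conv-BatchNorm-ReLU block; a minimal PyTorch sketch with arbitrary channel counts:

```python
# The standard Conv + BatchNorm + ReLU building block.
import torch
import torch.nn as nn

conv_bn_relu = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1, bias=False),  # bias is redundant before BN
    nn.BatchNorm2d(32),     # normalize each channel over the batch, then rescale
    nn.ReLU(inplace=True),
)

out = conv_bn_relu(torch.randn(4, 3, 64, 64))  # -> shape (4, 32, 64, 64)
```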

With the architecture largely settled, the next topic is optimization. There are many optimization algorithms; the most common are SGD and Adam.

An overview of optimization algorithms: http://www.mamicode.com/info-detail-1931210.html

A better optimizer can converge faster or reach a better result, but for most experiments SGD and Adam are enough.
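Both are built into PyTorch; a short sketch, where the learning rates are typical starting points rather than tuned values:

```python
# The two most common optimizer choices for CNN training.
import torch
import torch.nn as nn

model = nn.Conv2d(3, 16, kernel_size=3)  # stand-in for a full CNN

# SGD with momentum: simple, and often generalizes well with a tuned schedule.
sgd = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)

# Adam: adaptive per-parameter step sizes, usually converges quickly out of the box.
adam = torch.optim.Adam(model.parameters(), lr=1e-3)
```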

The advice of experienced researchers is also worth reading: 26 deep learning lessons from Yoshua Bengio and others: http://www.csdn.net/article/2015-09-16/2825716

Focused Research

Once you have worked through all of the above, the next step is a concrete research topic. You can browse this GitHub repository for papers that interest you: https://github.com/terryum/awesome-deep-learning-papers. Some excellent papers related to convolutional neural networks are listed below.

Understanding / Generalization / Transfer

Distilling the knowledge in a neural network (2015), G. Hinton et al. http://arxiv.org/pdf/1503.02531

Deep neural networks are easily fooled: High confidence predictions for unrecognizable images (2015), A. Nguyen et al. http://arxiv.org/pdf/1412.1897

How transferable are features in deep neural networks? (2014), J. Yosinski et al. http://papers.nips.cc/paper/5347-how-transferable-are-features-in-deep-neural-networks.pdf

CNN features off-the-Shelf: An astounding baseline for recognition (2014), A. Razavian et al. http://www.cv-foundation.org//openaccess/content_cvpr_workshops_2014/W15/papers/Razavian_CNN_Features_Off-the-Shelf_2014_CVPR_paper.pdf

Learning and transferring mid-Level image representations using convolutional neural networks (2014), M. Oquab et al. http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Oquab_Learning_and_Transferring_2014_CVPR_paper.pdf

Visualizing and understanding convolutional networks (2014), M. Zeiler and R. Fergus http://arxiv.org/pdf/1311.2901

Decaf: A deep convolutional activation feature for generic visual recognition (2014), J. Donahue et al. http://arxiv.org/pdf/1310.1531

Optimization / Training Techniques

Training very deep networks (2015), R. Srivastava et al. http://papers.nips.cc/paper/5850-training-very-deep-networks.pdf

Batch normalization: Accelerating deep network training by reducing internal covariate shift (2015), S. Ioffe and C. Szegedy http://arxiv.org/pdf/1502.03167

Delving deep into rectifiers: Surpassing human-level performance on imagenet classification (2015), K. He et al. http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/He_Delving_Deep_into_ICCV_2015_paper.pdf

Dropout: A simple way to prevent neural networks from overfitting (2014), N. Srivastava et al. http://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf

Adam: A method for stochastic optimization (2014), D. Kingma and J. Ba http://arxiv.org/pdf/1412.6980

Improving neural networks by preventing co-adaptation of feature detectors (2012), G. Hinton et al. http://arxiv.org/pdf/1207.0580.pdf

Random search for hyper-parameter optimization (2012), J. Bergstra and Y. Bengio http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a

Convolutional Neural Network Models

Rethinking the inception architecture for computer vision (2016), C. Szegedy et al. http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Szegedy_Rethinking_the_Inception_CVPR_2016_paper.pdf

Inception-v4, inception-resnet and the impact of residual connections on learning (2016), C. Szegedy et al. http://arxiv.org/pdf/1602.07261

Identity Mappings in Deep Residual Networks (2016), K. He et al. https://arxiv.org/pdf/1603.05027v2.pdf

Deep residual learning for image recognition (2016), K. He et al. http://arxiv.org/pdf/1512.03385

Spatial transformer networks (2015), M. Jaderberg et al. http://papers.nips.cc/paper/5854-spatial-transformer-networks.pdf

Going deeper with convolutions (2015), C. Szegedy et al. http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf

Very deep convolutional networks for large-scale image recognition (2014), K. Simonyan and A. Zisserman http://arxiv.org/pdf/1409.1556

Return of the devil in the details: delving deep into convolutional nets (2014), K. Chatfield et al. http://arxiv.org/pdf/1405.3531

OverFeat: Integrated recognition, localization and detection using convolutional networks (2013), P. Sermanet et al. http://arxiv.org/pdf/1312.6229

Maxout networks (2013), I. Goodfellow et al. http://arxiv.org/pdf/1302.4389v4

Network in network (2013), M. Lin et al. http://arxiv.org/pdf/1312.4400

ImageNet classification with deep convolutional neural networks (2012), A. Krizhevsky et al. http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

Image: Segmentation / Object Detection

You only look once: Unified, real-time object detection (2016), J. Redmon et al. http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Redmon_You_Only_Look_CVPR_2016_paper.pdf

Fully convolutional networks for semantic segmentation (2015), J. Long et al. http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Long_Fully_Convolutional_Networks_2015_CVPR_paper.pdf

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (2015), S. Ren et al. http://papers.nips.cc/paper/5638-faster-r-cnn-towards-real-time-object-detection-with-region-proposal-networks.pdf

Fast R-CNN (2015), R. Girshick http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Girshick_Fast_R-CNN_ICCV_2015_paper.pdf

Rich feature hierarchies for accurate object detection and semantic segmentation (2014), R. Girshick et al. http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf

Spatial pyramid pooling in deep convolutional networks for visual recognition (2014), K. He et al. http://arxiv.org/pdf/1406.4729

Semantic image segmentation with deep convolutional nets and fully connected CRFs, L. Chen et al. https://arxiv.org/pdf/1412.7062

Learning hierarchical features for scene labeling (2013), C. Farabet et al. https://hal-enpc.archives-ouvertes.fr/docs/00/74/20/77/PDF/farabet-pami-13.pdf

Image / Video / Etc

Image Super-Resolution Using Deep Convolutional Networks (2016), C. Dong et al. https://arxiv.org/pdf/1501.00092v3.pdf

A neural algorithm of artistic style (2015), L. Gatys et al. https://arxiv.org/pdf/1508.06576

Deep visual-semantic alignments for generating image descriptions (2015), A. Karpathy and L. Fei-Fei http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Karpathy_Deep_Visual-Semantic_Alignments_2015_CVPR_paper.pdf

Show, attend and tell: Neural image caption generation with visual attention (2015), K. Xu et al. http://arxiv.org/pdf/1502.03044

Show and tell: A neural image caption generator (2015), O. Vinyals et al. http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Vinyals_Show_and_Tell_2015_CVPR_paper.pdf

Long-term recurrent convolutional networks for visual recognition and description (2015), J. Donahue et al. http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Donahue_Long-Term_Recurrent_Convolutional_2015_CVPR_paper.pdf

VQA: Visual question answering (2015), S. Antol et al. http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Antol_VQA_Visual_Question_ICCV_2015_paper.pdf

DeepFace: Closing the gap to human-level performance in face verification (2014), Y. Taigman et al. http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Taigman_DeepFace_Closing_the_2014_CVPR_paper.pdf

Large-scale video classification with convolutional neural networks (2014), A. Karpathy et al. http://vision.stanford.edu/pdf/karpathy14.pdf

Two-stream convolutional networks for action recognition in videos (2014), K. Simonyan et al. http://papers.nips.cc/paper/5353-two-stream-convolutional-networks-for-action-recognition-in-videos.pdf

3D convolutional neural networks for human action recognition (2013), S. Ji et al. http://machinelearning.wustl.edu/mlpapers/paper_files/icml2010_JiXYY10.pdf

For more, see another Zhuanzhi article on deep learning: //www.webtourguide.com/topic/2001228999615594/awesome. It covers many more specialized subareas and their key papers, so they are not repeated here.
