In deep learning, a convolutional neural network (CNN or ConvNet) is a class of deep neural network most commonly applied to analyzing visual imagery. Because of their shared-weight architecture and translation-invariance properties, they are also known as shift-invariant or space-invariant artificial neural networks (SIANN). They have applications in image and video recognition, recommender systems, image classification, medical image analysis, natural language processing, and financial time series.

Knowledge Collection

Convolutional Neural Networks (CNN) from Beginner to Expert: Notes from Someone Who Has Been Through It

Getting Started

Deep learning is an empirical science: experimental progress has far outpaced theoretical research, so this guide is organized to combine theory with practice.

A Rough Overview

First, you can browse the related articles under the Zhuanzhi deep learning topic.

For convolutional neural networks, the following articles cover the basic concepts:

An intuitive explanation of how convolutional neural networks work: https://www.zhihu.com/question/39022858

Technical: understanding convolutional neural networks (CNN) in one article: http://dataunion.org/11692.html

Deep learning pioneer Yann LeCun explains convolutional neural networks: https://www.leiphone.com/news/201608/zaB48AcZ1AFm1TaP.html

CNN notes: an accessible introduction to convolutional neural networks: https://www.2cto.com/kf/201607/522441.html

Once the basic concepts are clear, you still need an intuitive feel for CNNs, and visualizing deep networks is an excellent way to get one.

Chinese notes on Visualizing and Understanding Convolutional Networks: http://www.gageet.com/2014/10235.php

The original English paper, for those interested: https://arxiv.org/abs/1311.2901

Basic Practice

Before diving into hands-on work, try the TensorFlow Playground at http://playground.tensorflow.org/; a walkthrough is available at http://f.dataguru.cn/article-9324-1.html

After that you can experiment on your own machine. First of all, using a GPU is essential:

Install CUDA: http://blog.csdn.net/u010480194/article/details/54287335

Install cuDNN: http://blog.csdn.net/lucifer_zzq/article/details/76675239
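Once the drivers and libraries are installed, it is worth confirming that the frameworks can actually see the GPU. A minimal sanity-check sketch, assuming PyTorch and/or TensorFlow are already installed (keep only the part for the framework you use):

```python
# Check that CUDA/cuDNN are visible to the framework.
import torch
print(torch.cuda.is_available())               # True if PyTorch can see the GPU
print(torch.backends.cudnn.version())          # cuDNN version PyTorch was built with

import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))  # non-empty list if TensorFlow sees the GPU
```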

Next, pick a framework that suits you.

What is the most popular deep learning framework right now? https://www.zhihu.com/question/52517062?answer_deleted_redirect=true

In depth | A comparison of mainstream deep learning frameworks: which one fits you best? http://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650719118&idx=2&sn=fad8b7cad70cc6a227f88ae07a89db66#rd

There is also a GitHub project dedicated to comparing frameworks, and it is updated fairly often: https://github.com/hunkim/DeepLearningStars

If you have trouble choosing, here are two (admittedly subjective) recommendations: TensorFlow and PyTorch. TensorFlow does visualization and production integration very well, while PyTorch leaves you more freedom in how you implement things and is very comfortable to use.

TensorFlow official site: http://www.tensorflow.org/

PyTorch official site: http://pytorch.org/

Following the official installation instructions step by step rarely causes major problems; if you do hit one, search for a solution on the wonderful https://stackoverflow.com/, where you can almost always find an answer.

You also need to get familiar with an important tool, GitHub (https://github.com/), which is convenient both for managing your own code and for learning from other people's; for a tutorial, see this answer: https://www.zhihu.com/question/20070065

If you would rather skip that, an IDE can help manage things for you, for example PyCharm (http://www.jetbrains.com/pycharm/); tutorial: http://blog.csdn.net/u013088062/article/details/50349833

An interactive, visual working environment is also very important; the highly recommended tool here is Jupyter Notebook: http://python.jobbole.com/87527/?repeat=w3tc

With all of this preparation done, you can start the introductory tutorials. The official tutorials are actually very good, but if reading them entirely in English is a struggle, the following are worth a look.

TensorFlow

How should I get started with TensorFlow? https://www.zhihu.com/question/49909565

Getting started with TensorFlow: http://hacker.duanshishi.com/?p=1639

Google's official tutorials are actually quite complete; if you prefer not to read English, there is a Chinese translation: http://wiki.jikexueyuan.com/project/tensorflow-zh/

PyTorch

Deep Learning with PyTorch: A 60 Minute Blitz (translation): https://zhuanlan.zhihu.com/p/25572330

How should a beginner get started with PyTorch? https://www.zhihu.com/question/55720139

Super simple! PyTorch introductory tutorial (1): Tensors: http://www.jianshu.com/p/5ae644748f21
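To tie the PyTorch links above together, here is a minimal sketch of a small CNN in the spirit of the 60-minute blitz. The layer sizes, input resolution, and class count are illustrative assumptions, not taken from any of the linked tutorials:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCNN(nn.Module):
    """A toy CNN for 32x32 RGB images (CIFAR-10-sized inputs)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)            # halves the spatial resolution
        self.fc = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))      # 32x32 -> 16x16
        x = self.pool(F.relu(self.conv2(x)))      # 16x16 -> 8x8
        x = x.flatten(1)                          # keep the batch dimension
        return self.fc(x)

model = TinyCNN()
out = model(torch.randn(4, 3, 32, 32))            # dummy batch of 4 images
print(out.shape)                                  # torch.Size([4, 10])
```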

If you are not yet comfortable with Python, these two tutorials may help. Python 2: http://www.runoob.com/python/python-tutorial.html, Python 3: http://www.runoob.com/python3/python3-tutorial.html

If you are only dabbling and do not want to spend much time on a framework, try Keras.

Keras introductory tutorial: http://www.360doc.com/content/17/0624/12/1489589_666148811.shtml
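For comparison, a sketch of what a similar toy model looks like in Keras; the layers and hyperparameters are illustrative assumptions rather than anything taken from the linked tutorial:

```python
import tensorflow as tf

# A small Keras CNN for 32x32 RGB inputs with 10 classes (assumed sizes).
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu',
                           input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()    # prints the layer-by-layer parameter counts
```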

Going Further

With the introduction above behind you, you should now have a basic idea of what a convolutional neural network is and how to implement one. Advanced study likewise proceeds along two tracks: theory and practice.

Deeper Theory

First comes the backpropagation algorithm. You can skip it when starting out, since the common frameworks all provide automatic differentiation, but to go further you must understand it thoroughly. Tutorial: http://blog.csdn.net/u014313009/article/details/51039334
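The frameworks hide backpropagation behind automatic differentiation; here is a minimal sketch of what that looks like in PyTorch, using a made-up scalar function purely for illustration:

```python
import torch

# Tiny computation graph: y = (w*x + b)^2, differentiated with respect to w and b.
x = torch.tensor(3.0)
w = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)

y = (w * x + b) ** 2
y.backward()        # backpropagation: the chain rule applied automatically

print(w.grad)       # dy/dw = 2*(w*x + b)*x = 2*7*3 = 42
print(b.grad)       # dy/db = 2*(w*x + b)   = 14
```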

Next, get familiar with a few classic CNN models.

The foundational model: AlexNet

Paper: ImageNet Classification with Deep Convolutional Neural Networks http://ml.informatik.uni-freiburg.de/former/_media/teaching/ws1314/dl/talk_simon_group2.pdf

Explanation: http://blog.csdn.net/u014088052/article/details/50898842

Code: TensorFlow https://github.com/kratzert/finetune_alexnet_with_tensorflow, PyTorch https://github.com/aaron-xichen/pytorch-playground
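Besides the repositories above, torchvision ships a reference AlexNet. A quick loading sketch (the `pretrained=True` flag assumes an older torchvision API; newer releases use a `weights=` argument instead):

```python
import torch
import torchvision.models as models

# Load the torchvision reference AlexNet (downloads ImageNet weights on first use).
alexnet = models.alexnet(pretrained=True)
alexnet.eval()

# Forward a dummy ImageNet-sized batch: 1 image, 3 channels, 224x224 pixels.
with torch.no_grad():
    logits = alexnet(torch.randn(1, 3, 224, 224))
print(logits.shape)   # torch.Size([1, 1000]), one score per ImageNet class
```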

The model that defined an era: ResNet

Paper: Deep Residual Learning for Image Recognition https://arxiv.org/abs/1512.03385

Explanation: http://blog.csdn.net/wspba/article/details/56019373

Code: TensorFlow https://github.com/ry/tensorflow-resnet, PyTorch https://github.com/isht7/pytorch-deeplab-resnet
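The heart of ResNet is the identity shortcut: each block learns a residual F(x) and outputs F(x) + x. A simplified sketch of the basic block (the real model also handles strides and channel changes with a projection shortcut, omitted here):

```python
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """Simplified ResNet basic block: out = ReLU(F(x) + x)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                              # the shortcut path
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)          # add, then apply the final ReLU

block = BasicResidualBlock(64)
print(block(torch.randn(1, 64, 56, 56)).shape)    # torch.Size([1, 64, 56, 56])
```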

A recent favorite: DenseNet

Paper: Densely Connected Convolutional Networks https://arxiv.org/pdf/1608.06993.pdf

Explanation: http://blog.csdn.net/u014380165/article/details/75142664

Code: original https://github.com/liuzhuang13/DenseNet, TensorFlow https://github.com/YixuanLi/densenet-tensorflow, PyTorch https://github.com/bamos/densenet.pytorch
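DenseNet's defining idea is that every layer receives the concatenation of all earlier feature maps. A toy sketch of that connectivity pattern (the growth rate and layer count are arbitrary choices for illustration):

```python
import torch
import torch.nn as nn

class TinyDenseBlock(nn.Module):
    """Each layer consumes the concatenation of all earlier outputs (DenseNet-style)."""
    def __init__(self, in_channels, growth_rate=12, num_layers=3):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, growth_rate, 3, padding=1, bias=False),
            ))
            channels += growth_rate                    # inputs grow by the growth rate

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))    # concatenate all prior feature maps
            features.append(out)
        return torch.cat(features, dim=1)

block = TinyDenseBlock(16)
print(block(torch.randn(1, 16, 32, 32)).shape)         # torch.Size([1, 52, 32, 32])
```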

It is best to read the explanation first and then the source code: the code deepens your understanding of the model, and you also pick up new framework tricks from other people's implementations.

Of course, we should not stop at the surface. A very well-known book, Deep Learning, is highly recommended; here is the link to the Chinese edition: https://github.com/exacity/deeplearningbook-chinese

More fundamental theoretical research is, for now, still largely missing.

Deeper Practice

If you have the patience, you can work through Stanford's new course: https://www.bilibili.com/video/av9156347/

In practice there are a great many details to learn. Before diving in, it is best to look at tuning tips first.

What are some tricks for tuning deep learning models? https://www.zhihu.com/question/25097993

There used to be a tuning bible, Neural Networks: Tricks of the Trade, but it is too dated to recommend now.

Modules that used to be standard, such as dropout and LRN, are used less and less these days, so they are not covered here. For regularization-style tricks, BatchNorm is the recommendation (https://www.zhihu.com/question/38102762): the idea is simple and it works well.

Although training is usually already quite stable once BatchNorm is in place, it is still worth learning gradient clipping: http://blog.csdn.net/zyf19930610/article/details/71743291
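In PyTorch, clipping is a single call placed between the backward pass and the optimizer step. A self-contained sketch with a placeholder model and an assumed max norm of 1.0:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                          # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 10), torch.randn(8, 1)      # dummy batch

loss = nn.functional.mse_loss(model(x), y)
loss.backward()
# Rescale gradients so their global L2 norm is at most 1.0 before the update.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
optimizer.zero_grad()
```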

Activation functions are another important point, though in convolutional networks you can basically use ReLU (http://www.cnblogs.com/neopenx/p/4453161.html) without much thought: it is cheap to compute, and ReLU + BatchNorm is something of a cure-all. That said, specific tasks still call for case-by-case analysis; GANs, for example, are not well served by such a blunt choice of activation.
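The Conv-BatchNorm-ReLU trio mentioned above is the standard building block in modern CNNs; a minimal sketch with arbitrary channel sizes:

```python
import torch
import torch.nn as nn

# The ubiquitous Conv -> BatchNorm -> ReLU unit; the conv bias is redundant before BatchNorm.
conv_bn_relu = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm2d(32),      # normalizes each channel over the batch, then scales and shifts
    nn.ReLU(inplace=True),
)
print(conv_bn_relu(torch.randn(4, 3, 64, 64)).shape)   # torch.Size([4, 32, 64, 64])
```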

With the architecture basically settled, the next step is optimization. There are many optimization algorithms; the most common are SGD and Adam.

An overview of the optimization algorithms: http://www.mamicode.com/info-detail-1931210.html

A good optimizer can converge faster or reach a better result, but for most experiments SGD and Adam are sufficient.
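Switching between the two in PyTorch is a one-line change; a sketch with placeholder hyperparameters (the model and learning rates are assumptions for illustration):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)     # placeholder model

# SGD with momentum: simple and often generalizes well, but the learning rate needs tuning.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)

# Adam: adaptive per-parameter step sizes, usually works with little tuning.
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```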

The experience of the masters is also worth reading: 26 deep learning lessons from Yoshua Bengio and other leading researchers: http://www.csdn.net/article/2015-09-16/2825716

Specialized Research

Once you have worked through all of the above, the next step is concrete research. You can browse this GitHub repository for papers that interest you: https://github.com/terryum/awesome-deep-learning-papers. Some excellent papers related to convolutional neural networks are listed below.

Understanding / Generalization / Transfer

Distilling the knowledge in a neural network (2015), G. Hinton et al. http://arxiv.org/pdf/1503.02531

Deep neural networks are easily fooled: High confidence predictions for unrecognizable images (2015), A. Nguyen et al. http://arxiv.org/pdf/1412.1897

How transferable are features in deep neural networks? (2014), J. Yosinski et al. http://papers.nips.cc/paper/5347-how-transferable-are-features-in-deep-neural-networks.pdf

CNN features off-the-Shelf: An astounding baseline for recognition (2014), A. Razavian et al. http://www.cv-foundation.org//openaccess/content_cvpr_workshops_2014/W15/papers/Razavian_CNN_Features_Off-the-Shelf_2014_CVPR_paper.pdf

Learning and transferring mid-Level image representations using convolutional neural networks (2014), M. Oquab et al. http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Oquab_Learning_and_Transferring_2014_CVPR_paper.pdf

Visualizing and understanding convolutional networks (2014), M. Zeiler and R. Fergus http://arxiv.org/pdf/1311.2901

Decaf: A deep convolutional activation feature for generic visual recognition (2014), J. Donahue et al. http://arxiv.org/pdf/1310.1531

Optimization / Training Techniques

Training very deep networks (2015), R. Srivastava et al. http://papers.nips.cc/paper/5850-training-very-deep-networks.pdf

Batch normalization: Accelerating deep network training by reducing internal covariate shift (2015), S. Ioffe and C. Szegedy http://arxiv.org/pdf/1502.03167

Delving deep into rectifiers: Surpassing human-level performance on imagenet classification (2015), K. He et al. http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/He_Delving_Deep_into_ICCV_2015_paper.pdf

Dropout: A simple way to prevent neural networks from overfitting (2014), N. Srivastava et al. http://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf

Adam: A method for stochastic optimization (2014), D. Kingma and J. Ba http://arxiv.org/pdf/1412.6980

Improving neural networks by preventing co-adaptation of feature detectors (2012), G. Hinton et al. http://arxiv.org/pdf/1207.0580.pdf

Random search for hyper-parameter optimization (2012), J. Bergstra and Y. Bengio http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a

Convolutional Neural Network Models

Rethinking the inception architecture for computer vision (2016), C. Szegedy et al. http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Szegedy_Rethinking_the_Inception_CVPR_2016_paper.pdf

Inception-v4, inception-resnet and the impact of residual connections on learning (2016), C. Szegedy et al. http://arxiv.org/pdf/1602.07261

Identity Mappings in Deep Residual Networks (2016), K. He et al. https://arxiv.org/pdf/1603.05027v2.pdf

Deep residual learning for image recognition (2016), K. He et al. http://arxiv.org/pdf/1512.03385

Spatial transformer network (2015), M. Jaderberg et al. http://papers.nips.cc/paper/5854-spatial-transformer-networks.pdf

Going deeper with convolutions (2015), C. Szegedy et al. http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf

Very deep convolutional networks for large-scale image recognition (2014), K. Simonyan and A. Zisserman http://arxiv.org/pdf/1409.1556

Return of the devil in the details: delving deep into convolutional nets (2014), K. Chatfield et al. http://arxiv.org/pdf/1405.3531

OverFeat: Integrated recognition, localization and detection using convolutional networks (2013), P. Sermanet et al. http://arxiv.org/pdf/1312.6229

Maxout networks (2013), I. Goodfellow et al. http://arxiv.org/pdf/1302.4389v4

Network in network (2013), M. Lin et al. http://arxiv.org/pdf/1312.4400

ImageNet classification with deep convolutional neural networks (2012), A. Krizhevsky et al. http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

Image: Segmentation / Object Detection

You only look once: Unified, real-time object detection (2016), J. Redmon et al. http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Redmon_You_Only_Look_CVPR_2016_paper.pdf

Fully convolutional networks for semantic segmentation (2015), J. Long et al. http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Long_Fully_Convolutional_Networks_2015_CVPR_paper.pdf

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (2015), S. Ren et al. http://papers.nips.cc/paper/5638-faster-r-cnn-towards-real-time-object-detection-with-region-proposal-networks.pdf

Fast R-CNN (2015), R. Girshick http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Girshick_Fast_R-CNN_ICCV_2015_paper.pdf

Rich feature hierarchies for accurate object detection and semantic segmentation (2014), R. Girshick et al. http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf

Spatial pyramid pooling in deep convolutional networks for visual recognition (2014), K. He et al. http://arxiv.org/pdf/1406.4729

Semantic image segmentation with deep convolutional nets and fully connected CRFs, L. Chen et al. https://arxiv.org/pdf/1412.7062

Learning hierarchical features for scene labeling (2013), C. Farabet et al. https://hal-enpc.archives-ouvertes.fr/docs/00/74/20/77/PDF/farabet-pami-13.pdf

Image / Video / Etc

Image Super-Resolution Using Deep Convolutional Networks (2016), C. Dong et al. https://arxiv.org/pdf/1501.00092v3.pdf

A neural algorithm of artistic style (2015), L. Gatys et al. https://arxiv.org/pdf/1508.06576

Deep visual-semantic alignments for generating image descriptions (2015), A. Karpathy and L. Fei-Fei http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Karpathy_Deep_Visual-Semantic_Alignments_2015_CVPR_paper.pdf

Show, attend and tell: Neural image caption generation with visual attention (2015), K. Xu et al. http://arxiv.org/pdf/1502.03044

Show and tell: A neural image caption generator (2015), O. Vinyals et al. http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Vinyals_Show_and_Tell_2015_CVPR_paper.pdf

Long-term recurrent convolutional networks for visual recognition and description (2015), J. Donahue et al. http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Donahue_Long-Term_Recurrent_Convolutional_2015_CVPR_paper.pdf

VQA: Visual question answering (2015), S. Antol et al. http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Antol_VQA_Visual_Question_ICCV_2015_paper.pdf

DeepFace: Closing the gap to human-level performance in face verification (2014), Y. Taigman et al. http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Taigman_DeepFace_Closing_the_2014_CVPR_paper.pdf

Large-scale video classification with convolutional neural networks (2014), A. Karpathy et al. http://vision.stanford.edu/pdf/karpathy14.pdf

Two-stream convolutional networks for action recognition in videos (2014), K. Simonyan et al. http://papers.nips.cc/paper/5353-two-stream-convolutional-networks-for-action-recognition-in-videos.pdf

3D convolutional neural networks for human action recognition (2013), S. Ji et al. http://machinelearning.wustl.edu/mlpapers/paper_files/icml2010_JiXYY10.pdf

For more, see Zhuanzhi's other deep-learning article at //www.webtourguide.com/topic/2001228999615594/awesome, which covers many more specialized areas and related papers; they are not repeated here.

VIP Content

Abstract: Convolutional neural networks (CNNs) achieve strong performance in image processing, speech recognition, natural language processing, and other fields. Large-scale neural network models are usually constrained by compute and storage resources, and sparse neural networks have emerged to relieve these demands effectively. Although existing domain-specific accelerators can process sparse networks efficiently, they obtain high energy efficiency by tightly coupling the algorithm to the hardware, sacrificing architectural flexibility. A coarse-grained dataflow architecture can support different neural network applications through flexible instruction scheduling. On such an architecture, the regular computation pattern of dense convolution allows different channels to share the same set of instructions. In sparse networks, however, weight sparsity means some of those instructions are invalid because they operate on zero values, and the existing execution scheme cannot skip them automatically, producing wasted computation. In addition, when executing irregular sparse networks, the existing instruction mapping method causes load imbalance across the compute array. These problems limit the performance of sparse networks. Keeping the premise that different channels share one instruction set, this work adds an instruction control unit that uses the data and instruction characteristics of sparse networks to detect and skip zero-related instructions in the weight data, and applies a load-balanced instruction mapping algorithm to resolve the uneven instruction execution in sparse networks. Experiments show that, compared with the dense network, the sparse network achieves an average 1.55x speedup and 63.77% lower energy consumption; it is also 2.39x (AlexNet) and 2.28x (VGG16) faster than sparse networks on GPU (cuSparse), and 1.14x (AlexNet) and 1.23x (VGG16) faster than Cambricon-X.

https://crad.ict.ac.cn/CN/10.7544/issn1000-1239.2021.20200112


Latest Papers

Agriculture is an essential industry in both the society and the economy of a country. However, pests and diseases cause a great reduction in agricultural production, while there is not sufficient guidance for farmers to avoid this disaster. To address this problem, we apply CNNs to plant disease recognition by building a classification model. On a dataset of 3,642 images of apple leaves, we use a pre-trained image classification model, ResNet34, based on a convolutional neural network (CNN), with the Fastai framework in order to save training time. Overall, the classification accuracy is 93.765%.
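For context, here is a rough sketch of the kind of transfer-learning pipeline the abstract describes, assuming a recent fastai version and an image-folder dataset layout; the path, split, and epoch count are assumptions for illustration, not details taken from the paper:

```python
from fastai.vision.all import *

# Hypothetical layout: data/apple_leaves/<class_name>/<image>.jpg
path = Path('data/apple_leaves')
dls = ImageDataLoaders.from_folder(path, valid_pct=0.2, seed=42,
                                   item_tfms=Resize(224))

# Fine-tune an ImageNet-pretrained ResNet34, as the abstract describes.
learn = vision_learner(dls, resnet34, metrics=accuracy)
learn.fine_tune(5)            # epoch count chosen arbitrarily for the sketch

learn.show_results()          # visually inspect a few predictions
```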
