A deep neural network (DNN) is a deep-learning framework: a neural network with at least one hidden layer. Like a shallow neural network, a DNN can model complex nonlinear systems, but the additional layers provide higher levels of abstraction and thereby increase the model's capacity.
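As a concrete illustration (not from the source), here is a minimal sketch in PyTorch of such a network: a multilayer perceptron with two hidden ReLU layers. The widths and dimensions are arbitrary choices; each additional hidden layer composes another nonlinear transformation, which is the extra abstraction referred to above.

```python
import torch
import torch.nn as nn

# A minimal deep neural network: an MLP with two hidden ReLU layers.
# The widths (64, 32) and dimensions are illustrative choices.
class SimpleDNN(nn.Module):
    def __init__(self, in_dim=10, hidden=(64, 32), out_dim=1):
        super().__init__()
        layers, prev = [], in_dim
        for h in hidden:
            layers += [nn.Linear(prev, h), nn.ReLU()]
            prev = h
        layers.append(nn.Linear(prev, out_dim))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

model = SimpleDNN()
x = torch.randn(4, 10)     # a batch of 4 inputs
print(model(x).shape)      # torch.Size([4, 1])
```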

Title: On the Number of Linear Regions of Convolutional Neural Networks

Abstract:

One fundamental problem in deep learning is understanding the outstanding performance of deep neural networks (NNs) in practice. One explanation for their superiority is that NNs can realize a large class of complicated functions, i.e., they have powerful expressivity. The expressivity of a ReLU NN can be quantified by the maximal number of linear regions into which it can partition its input space. Since 2013, various results have been obtained on the number of linear regions of fully-connected ReLU NNs. However, to the best of our knowledge, there are no explicit results on the number of linear regions of convolutional neural networks (CNNs), due to the lack of suitable mathematical tools. This paper provides several mathematical results needed to study the linear regions of CNNs, and uses them to derive the maximal and average numbers of linear regions of one-layer ReLU CNNs. Furthermore, we obtain upper and lower bounds on the number of linear regions of multi-layer ReLU CNNs. Some asymptotic results are also derived. Our results show that deeper CNNs have more expressivity than shallower ones, and that, per parameter, CNNs have more expressivity than fully-connected NNs. To the best of our knowledge, this paper is the first study of the number of linear regions of CNNs. Finally, various potential directions for future work are given.
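To make the notion of counting linear regions concrete, the sketch below (an illustration, not the paper's method) estimates a lower bound on the number of linear regions of a small random ReLU network by counting distinct ReLU activation patterns over a grid of 2-D inputs. Within a fixed activation pattern the network computes a single affine map, so each distinct pattern witnesses at least one linear region. All sizes and weights are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny ReLU network on 2-D inputs: two hidden layers of width 8.
# Weights are random; all sizes are illustrative choices.
W1, b1 = rng.normal(size=(8, 2)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 8)), rng.normal(size=8)

def activation_pattern(x):
    """Return the on/off pattern of every ReLU unit at input x."""
    h1 = W1 @ x + b1
    h2 = W2 @ np.maximum(h1, 0) + b2
    return tuple((h1 > 0).astype(int)) + tuple((h2 > 0).astype(int))

# Sample a grid over [-2, 2]^2 and count distinct patterns; each
# pattern fixes one affine map, so this lower-bounds the region count.
grid = np.linspace(-2, 2, 200)
patterns = {activation_pattern(np.array([x, y])) for x in grid for y in grid}
print("distinct activation patterns found:", len(patterns))
```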

Latest Papers

High-quality AI solutions require joint optimization of AI algorithms and their hardware implementations. In this work, we are the first to propose a fully simultaneous, efficient differentiable DNN architecture and implementation co-search (EDD) methodology. We formulate the co-search problem by fusing DNN search variables and hardware implementation variables into one solution space, and maximize both algorithm accuracy and hardware implementation quality. The formulation is differentiable with respect to the fused variables, so that a gradient-descent algorithm can be applied to greatly reduce the search time. The formulation is also applicable to various devices with different objectives. In the experiments, we demonstrate the effectiveness of our EDD methodology by searching for three representative DNNs, targeting low-latency GPU implementation and FPGA implementations with both recursive and pipelined architectures. Each model produced by EDD achieves accuracy similar to the best existing DNN models found by neural architecture search (NAS) methods on ImageNet, but with superior performance, obtained within 12 GPU-hour searches. Our DNN targeting GPU is 1.40x faster than the state-of-the-art solution reported in Proxyless, and our DNN targeting FPGA delivers 1.45x higher throughput than the state-of-the-art solution reported in DNNBuilder.
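The paper's precise EDD formulation is in the full text; as a generic illustration of how such a co-search can be made differentiable, the sketch below relaxes a discrete choice among candidate operations into softmax weights and adds an expected-latency term to the loss, so gradient descent updates architecture and weight variables together. All operation choices, latency numbers, and the penalty coefficient are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Candidate operations for one searchable layer (illustrative choices).
ops = nn.ModuleList([
    nn.Conv2d(16, 16, 3, padding=1),
    nn.Conv2d(16, 16, 5, padding=2),
    nn.Identity(),
])
# Hypothetical per-op latency estimates (ms) for the target device.
latency = torch.tensor([1.0, 2.5, 0.1])

# Architecture variables: one logit per candidate op, relaxed via softmax
# so both the task loss and the latency penalty are differentiable in them.
alpha = nn.Parameter(torch.zeros(len(ops)))
opt = torch.optim.Adam(list(ops.parameters()) + [alpha], lr=1e-3)

x = torch.randn(8, 16, 32, 32)       # dummy batch
target = torch.randn(8, 16, 32, 32)  # dummy regression target

for step in range(100):
    w = F.softmax(alpha, dim=0)
    # Mixed output: softmax-weighted sum over candidate ops.
    y = sum(wi * op(x) for wi, op in zip(w, ops))
    task_loss = F.mse_loss(y, target)
    expected_latency = (w * latency).sum()   # differentiable HW cost
    loss = task_loss + 0.1 * expected_latency
    opt.zero_grad()
    loss.backward()
    opt.step()

print("op weights after search:", F.softmax(alpha, dim=0).detach())
```

After training, the op with the largest weight would be selected as the discrete architecture choice; the same pattern extends to fusing further hardware variables (e.g., quantization or parallelism) into the loss.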
