Model complexity is a fundamental problem in deep learning. In this paper we conduct a systematic overview of the latest studies on model complexity in deep learning. Model complexity of deep learning can be categorized into expressive capacity and effective model complexity. We review the existing studies on those two categories along four important factors, including model framework, model size, optimization process and data complexity. We also discuss the applications of deep learning model complexity including understanding model generalization capability, model optimization, and model selection and design. We conclude by proposing several interesting future directions.

22
0
下載
關閉預覽

相關內容

ACM/IEEE第23屆模型驅動工程語言和係統國際會議,是模型驅動軟件和係統工程的首要會議係列,由ACM-SIGSOFT和IEEE-TCSE支持組織。自1998年以來,模型涵蓋了建模的各個方麵,從語言和方法到工具和應用程序。模特的參加者來自不同的背景,包括研究人員、學者、工程師和工業專業人士。MODELS 2019是一個論壇,參與者可以圍繞建模和模型驅動的軟件和係統交流前沿研究成果和創新實踐經驗。今年的版本將為建模社區提供進一步推進建模基礎的機會,並在網絡物理係統、嵌入式係統、社會技術係統、雲計算、大數據、機器學習、安全、開源等新興領域提出建模的創新應用以及可持續性。 官網鏈接:http://www.modelsconference.org/

Graphs are widely used as a popular representation of the network structure of connected data. Graph data can be found in a broad spectrum of application domains such as social systems, ecosystems, biological networks, knowledge graphs, and information systems. With the continuous penetration of artificial intelligence technologies, graph learning (i.e., machine learning on graphs) is gaining attention from both researchers and practitioners. Graph learning proves effective for many tasks, such as classification, link prediction, and matching. Generally, graph learning methods extract relevant features of graphs by taking advantage of machine learning algorithms. In this survey, we present a comprehensive overview on the state-of-the-art of graph learning. Special attention is paid to four categories of existing graph learning methods, including graph signal processing, matrix factorization, random walk, and deep learning. Major models and algorithms under these categories are reviewed respectively. We examine graph learning applications in areas such as text, images, science, knowledge graphs, and combinatorial optimization. In addition, we discuss several promising research directions in this field.

0
33
0
下載
預覽

In humans, Attention is a core property of all perceptual and cognitive operations. Given our limited ability to process competing sources, attention mechanisms select, modulate, and focus on the information most relevant to behavior. For decades, concepts and functions of attention have been studied in philosophy, psychology, neuroscience, and computing. For the last six years, this property has been widely explored in deep neural networks. Currently, the state-of-the-art in Deep Learning is represented by neural attention models in several application domains. This survey provides a comprehensive overview and analysis of developments in neural attention models. We systematically reviewed hundreds of architectures in the area, identifying and discussing those in which attention has shown a significant impact. We also developed and made public an automated methodology to facilitate the development of reviews in the area. By critically analyzing 650 works, we describe the primary uses of attention in convolutional, recurrent networks and generative models, identifying common subgroups of uses and applications. Furthermore, we describe the impact of attention in different application domains and their impact on neural networks' interpretability. Finally, we list possible trends and opportunities for further research, hoping that this review will provide a succinct overview of the main attentional models in the area and guide researchers in developing future approaches that will drive further improvements.

0
47
1
下載
預覽

Training machine learning models in a meaningful order, from the easy samples to the hard ones, using curriculum learning can provide performance improvements over the standard training approach based on random data shuffling, without any additional computational costs. Curriculum learning strategies have been successfully employed in all areas of machine learning, in a wide range of tasks. However, the necessity of finding a way to rank the samples from easy to hard, as well as the right pacing function for introducing more difficult data can limit the usage of the curriculum approaches. In this survey, we show how these limits have been tackled in the literature, and we present different curriculum learning instantiations for various tasks in machine learning. We construct a multi-perspective taxonomy of curriculum learning approaches by hand, considering various classification criteria. We further build a hierarchical tree of curriculum learning methods using an agglomerative clustering algorithm, linking the discovered clusters with our taxonomy. At the end, we provide some interesting directions for future work.

0
16
1
下載
預覽

Deep learning is usually described as an experiment-driven field under continuous criticizes of lacking theoretical foundations. This problem has been partially fixed by a large volume of literature which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized in six groups: (1) complexity and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamic systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drives the trajectories of the dynamic systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns in ethics and security and their relationships with generalizability.

0
40
1
下載
預覽

Deep neural networks can achieve great successes when presented with large data sets and sufficient computational resources. However, their ability to learn new concepts quickly is quite limited. Meta-learning is one approach to address this issue, by enabling the network to learn how to learn. The exciting field of Deep Meta-Learning advances at great speed, but lacks a unified, insightful overview of current techniques. This work presents just that. After providing the reader with a theoretical foundation, we investigate and summarize key methods, which are categorized into i) metric-, ii) model-, and iii) optimization-based techniques. In addition, we identify the main open challenges, such as performance evaluations on heterogeneous benchmarks, and reduction of the computational costs of meta-learning.

0
5
0
下載
預覽

This paper surveys the field of transfer learning in the problem setting of Reinforcement Learning (RL). RL has been the key solution to sequential decision-making problems. Along with the fast advance of RL in various domains. including robotics and game-playing, transfer learning arises as an important technique to assist RL by leveraging and transferring external expertise to boost the learning process. In this survey, we review the central issues of transfer learning in the RL domain, providing a systematic categorization of its state-of-the-art techniques. We analyze their goals, methodologies, applications, and the RL frameworks under which these transfer learning techniques would be approachable. We discuss the relationship between transfer learning and other relevant topics from an RL perspective and also explore the potential challenges as well as future development directions for transfer learning in RL.

0
16
1
下載
預覽

A comprehensive artificial intelligence system needs to not only perceive the environment with different `senses' (e.g., seeing and hearing) but also infer the world's conditional (or even causal) relations and corresponding uncertainty. The past decade has seen major advances in many perception tasks such as visual object recognition and speech recognition using deep learning models. For higher-level inference, however, probabilistic graphical models with their Bayesian nature are still more powerful and flexible. In recent years, Bayesian deep learning has emerged as a unified probabilistic framework to tightly integrate deep learning and Bayesian models. In this general framework, the perception of text or images using deep learning can boost the performance of higher-level inference and in turn, the feedback from the inference process is able to enhance the perception of text or images. This survey provides a comprehensive introduction to Bayesian deep learning and reviews its recent applications on recommender systems, topic models, control, etc. Besides, we also discuss the relationship and differences between Bayesian deep learning and other related topics such as Bayesian treatment of neural networks.

0
47
0
下載
預覽

Transfer learning aims at improving the performance of target learners on target domains by transferring the knowledge contained in different but related source domains. In this way, the dependence on a large number of target domain data can be reduced for constructing target learners. Due to the wide application prospects, transfer learning has become a popular and promising area in machine learning. Although there are already some valuable and impressive surveys on transfer learning, these surveys introduce approaches in a relatively isolated way and lack the recent advances in transfer learning. As the rapid expansion of the transfer learning area, it is both necessary and challenging to comprehensively review the relevant studies. This survey attempts to connect and systematize the existing transfer learning researches, as well as to summarize and interpret the mechanisms and the strategies in a comprehensive way, which may help readers have a better understanding of the current research status and ideas. Different from previous surveys, this survey paper reviews over forty representative transfer learning approaches from the perspectives of data and model. The applications of transfer learning are also briefly introduced. In order to show the performance of different transfer learning models, twenty representative transfer learning models are used for experiments. The models are performed on three different datasets, i.e., Amazon Reviews, Reuters-21578, and Office-31. And the experimental results demonstrate the importance of selecting appropriate transfer learning models for different applications in practice.

0
91
2
下載
預覽

Deep learning has been shown successful in a number of domains, ranging from acoustics, images to natural language processing. However, applying deep learning to the ubiquitous graph data is non-trivial because of the unique characteristics of graphs. Recently, a significant amount of research efforts have been devoted to this area, greatly advancing graph analyzing techniques. In this survey, we comprehensively review different kinds of deep learning methods applied to graphs. We divide existing methods into three main categories: semi-supervised methods including Graph Neural Networks and Graph Convolutional Networks, unsupervised methods including Graph Autoencoders, and recent advancements including Graph Recurrent Neural Networks and Graph Reinforcement Learning. We then provide a comprehensive overview of these methods in a systematic manner following their history of developments. We also analyze the differences of these methods and how to composite different architectures. Finally, we briefly outline their applications and discuss potential future directions.

0
40
1
下載
預覽

This paper reviews recent studies in understanding neural-network representations and learning neural networks with interpretable/disentangled middle-layer representations. Although deep neural networks have exhibited superior performance in various tasks, the interpretability is always the Achilles' heel of deep neural networks. At present, deep neural networks obtain high discrimination power at the cost of low interpretability of their black-box representations. We believe that high model interpretability may help people to break several bottlenecks of deep learning, e.g., learning from very few annotations, learning via human-computer communications at the semantic level, and semantically debugging network representations. We focus on convolutional neural networks (CNNs), and we revisit the visualization of CNN representations, methods of diagnosing representations of pre-trained CNNs, approaches for disentangling pre-trained CNN representations, learning of CNNs with disentangled representations, and middle-to-end learning based on model interpretability. Finally, we discuss prospective trends in explainable artificial intelligence.

0
12
0
下載
預覽
小貼士
相關論文
Feng Xia,Ke Sun,Shuo Yu,Abdul Aziz,Liangtian Wan,Shirui Pan,Huan Liu
33+閱讀 · 2021年5月3日
Alana de Santana Correia,Esther Luna Colombini
47+閱讀 · 2021年3月31日
Petru Soviany,Radu Tudor Ionescu,Paolo Rota,Nicu Sebe
16+閱讀 · 2021年1月25日
Recent advances in deep learning theory
Fengxiang He,Dacheng Tao
40+閱讀 · 2020年12月20日
Mike Huisman,Jan N. van Rijn,Aske Plaat
5+閱讀 · 2020年10月7日
Transfer Learning in Deep Reinforcement Learning: A Survey
Zhuangdi Zhu,Kaixiang Lin,Jiayu Zhou
16+閱讀 · 2020年9月16日
A Survey on Bayesian Deep Learning
Hao Wang,Dit-Yan Yeung
47+閱讀 · 2020年7月2日
A Comprehensive Survey on Transfer Learning
Fuzhen Zhuang,Zhiyuan Qi,Keyu Duan,Dongbo Xi,Yongchun Zhu,Hengshu Zhu,Hui Xiong,Qing He
91+閱讀 · 2019年11月7日
Ziwei Zhang,Peng Cui,Wenwu Zhu
40+閱讀 · 2018年12月11日
Quanshi Zhang,Song-Chun Zhu
12+閱讀 · 2018年2月7日
相關資訊
多任務學習(Multi-task Learning)方法總結
極市平台
5+閱讀 · 2020年4月26日
局部學習的特征選擇:Local-Learning-Based Feature Selection
我愛讀PAMI
10+閱讀 · 2019年9月20日
Hierarchically Structured Meta-learning
CreateAMind
13+閱讀 · 2019年5月22日
小樣本學習(Few-shot Learning)綜述
PaperWeekly
118+閱讀 · 2019年4月1日
Unsupervised Learning via Meta-Learning
CreateAMind
32+閱讀 · 2019年1月3日
Disentangled的假設的探討
CreateAMind
8+閱讀 · 2018年12月10日
Hierarchical Imitation - Reinforcement Learning
CreateAMind
16+閱讀 · 2018年5月25日
Reinforcement Learning: An Introduction 2018第二版 500頁
CreateAMind
9+閱讀 · 2018年4月27日
Hierarchical Disentangled Representations
CreateAMind
3+閱讀 · 2018年4月15日
【論文】圖上的表示學習綜述
機器學習研究會
9+閱讀 · 2017年9月24日
Top
微信掃碼谘詢專知VIP會員
Top