In 1954, Alston S. Householder published Principles of Numerical Analysis, one of the first modern treatments of matrix decomposition, favoring the (block) LU decomposition: the factorization of a matrix into the product of lower and upper triangular matrices. Matrix decomposition has since become a core technology in machine learning, largely due to the development of the backpropagation algorithm for fitting neural networks. The sole aim of this survey is to give a self-contained introduction to the concepts and mathematical tools of numerical linear algebra and matrix analysis, in order to seamlessly introduce matrix decomposition techniques and their applications in subsequent sections. We cannot, however, cover all the useful and interesting results concerning matrix decomposition within the scope of this discussion; for example, we omit the separate analysis of Euclidean spaces, Hermitian spaces, Hilbert spaces, and results in the complex domain. We refer the reader to the linear algebra literature for a more detailed treatment of these topics.
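
To make the opening example concrete, here is a minimal sketch of the Doolittle LU factorization mentioned above, in plain NumPy; it omits the pivoting that any production routine (e.g., scipy.linalg.lu) would include.

```python
import numpy as np

def lu_decompose(A):
    """Doolittle LU factorization without pivoting: A = L @ U.

    Illustrative only: real implementations pivot to avoid dividing
    by small diagonal entries.
    """
    n = A.shape[0]
    L = np.eye(n)
    U = A.astype(float).copy()
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]      # elimination multiplier
            U[i, k:] -= L[i, k] * U[k, k:]   # zero out column k below the pivot
    return L, U

A = np.array([[4., 3.], [6., 3.]])
L, U = lu_decompose(A)
assert np.allclose(L @ U, A)   # L is unit lower triangular, U upper triangular
```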

With the rapid development of science and technology, classical linear algebra can no longer meet the needs of modern science and engineering; matrix theory and methods have become indispensable tools in modern technical fields. Disciplines such as numerical analysis, optimization theory, differential equations, probability and statistics, control theory, mechanics, electronics, and networks are all closely connected to matrix theory, and matrix methods also have important applications in economics and management, finance, insurance, and the social sciences. The rapid development of computers and computational techniques has opened up even broader prospects for the application of matrix theory. Learning and mastering the basic theory and methods of matrices is therefore essential for graduate students in engineering, and engineering schools across the country have generally made "Matrix Theory" a required graduate course.

We investigate the problem of approximating the matrix function $f(A)$ by $r(A)$, with $f$ a Markov function, $r$ a rational interpolant of $f$, and $A$ a symmetric Toeplitz matrix. As a first step, we obtain a new upper bound for the relative interpolation error $1-r/f$ on the spectral interval of $A$. By minimizing this upper bound over all interpolation points, we obtain a new, simple, and sharp a priori bound for the relative interpolation error. We then consider three different approaches to representing and computing the rational interpolant $r$. Theoretical and numerical evidence is given that each of these methods achieves high precision for a scalar argument, even in the presence of finite precision arithmetic. We finally investigate the problem of efficiently evaluating $r(A)$, where it turns out that the relative error for a matrix argument is only small if we use a partial fraction decomposition for $r$ following Antoulas and Mayo. An important role is played by a new stopping criterion which ensures that the degree of $r$ leading to a small error is found automatically, even in the presence of finite precision arithmetic.
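
As an illustration of the partial fraction point, the sketch below fits a rational approximant $r(z)=\sum_i w_i/(z-p_i)$ to the Markov function $f(z)=z^{-1/2}$ on the spectral interval of a symmetric positive definite Toeplitz-plus-shift matrix, then evaluates $r(A)b$ through shifted linear solves. The log-spaced poles and least squares fit are our own illustrative choices, not the interpolation scheme of the paper.

```python
import numpy as np

# Markov function f(z) = z^{-1/2} on the spectral interval of an SPD matrix.
f = lambda z: 1.0 / np.sqrt(z)

rng = np.random.default_rng(0)
n = 200
# Symmetric positive definite Toeplitz test matrix plus a shift.
c = np.exp(-np.arange(n) / 10.0)
A = np.array([[c[abs(i - j)] for j in range(n)] for i in range(n)]) + 2 * np.eye(n)

evals = np.linalg.eigvalsh(A)
lmin, lmax = evals[0], evals[-1]

# Fix negative poles (a hypothetical, log-spaced choice) and fit the weights
# of the partial fraction form r(z) = sum_i w_i / (z - p_i) on [lmin, lmax].
k = 8
poles = -np.logspace(-2, 2, k)
zs = np.linspace(lmin, lmax, 500)
C = 1.0 / (zs[:, None] - poles[None, :])           # Cauchy basis matrix
w, *_ = np.linalg.lstsq(C, f(zs), rcond=None)

# Evaluate r(A) b through k shifted linear solves, the numerically robust
# route the abstract advocates (following Antoulas and Mayo).
b = rng.standard_normal(n)
rAb = sum(wi * np.linalg.solve(A - pi * np.eye(n), b) for wi, pi in zip(w, poles))

# Reference: f(A) b via eigendecomposition.
lam, Q = np.linalg.eigh(A)
fAb = Q @ (f(lam) * (Q.T @ b))
print("relative error:", np.linalg.norm(rAb - fAb) / np.linalg.norm(fAb))
```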

Motivated by applications in single-cell biology and metagenomics, we consider matrix reordering based on the noisy disordered matrix model. We first establish the fundamental statistical limit for the matrix reordering problem in a decision-theoretic framework and show that a constrained least squares estimator is rate-optimal. Given the computational hardness of the optimal procedure, we analyze a popular polynomial-time algorithm, spectral seriation, and show that it is suboptimal. We then propose a novel polynomial-time adaptive sorting algorithm with a guaranteed performance improvement. The superiority of the adaptive sorting algorithm over existing methods is demonstrated in simulation studies and in the analysis of two real single-cell RNA sequencing datasets.
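
For reference, the spectral seriation baseline that the abstract shows to be suboptimal is easy to state: sort items by the entries of the Fiedler vector of a graph Laplacian built from the noisy similarity matrix. A minimal sketch on a toy disordered matrix (the construction of the toy data is ours):

```python
import numpy as np

def spectral_seriation(S):
    """Sort items by the Fiedler vector of the Laplacian of the similarity
    matrix S (classical spectral seriation; the abstract shows this baseline
    is suboptimal and improves on it with adaptive sorting)."""
    L = np.diag(S.sum(axis=1)) - S          # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)
    return np.argsort(vecs[:, 1])           # order by the Fiedler vector

# Toy disordered matrix: a smooth similarity pattern observed in noise,
# with rows and columns shuffled by a latent permutation.
rng = np.random.default_rng(1)
n = 60
t = np.linspace(0, 1, n)
signal = np.exp(-5 * np.abs(t[:, None] - t[None, :]))   # large for nearby items
perm = rng.permutation(n)
noisy = signal[np.ix_(perm, perm)] + 0.1 * rng.standard_normal((n, n))
noisy = np.maximum((noisy + noisy.T) / 2, 0)            # symmetrize, keep nonnegative

order = spectral_seriation(noisy)
print(perm[order])   # should be monotone (up to reversal), i.e. undo the shuffle
```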

Motivated by applications to the theory of rank-metric codes, we study the problem of estimating the number of common complements of a family of subspaces over a finite field in terms of the cardinality of the family and its intersection structure. We derive upper and lower bounds for this number, along with their asymptotic versions as the field size tends to infinity. We then use these bounds to describe the general behaviour of common complements with respect to sparseness and density, showing that the decisive property is whether or not the number of spaces to be complemented is negligible with respect to the field size. By specializing our results to matrix spaces, we obtain upper and lower bounds for the number of MRD codes in the rank metric. In particular, we answer an open question in coding theory, proving that MRD codes are sparse for all parameter sets as the field size grows, with only very few exceptions. We also investigate the density of MRD codes as their number of columns tends to infinity, obtaining a new asymptotic bound. Using properties of the Euler function from number theory, we then show that our bound improves on known results for most parameter sets. We conclude the paper by establishing general structural properties of the density function of rank-metric codes.
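
The counting problem is easy to sanity-check by brute force in a tiny case. The sketch below verifies, over $\mathbb{F}_2^4$, the classical fact that a $k$-dimensional subspace of an $n$-dimensional space over $\mathbb{F}_q$ has exactly $q^{k(n-k)}$ complements (here $2^{2\cdot 2}=16$); the bit-pattern encoding of vectors is ours.

```python
from itertools import combinations

# Vectors of F_2^4 are encoded as the integers 0..15, with XOR as addition.
def span2(v1, v2):
    """Span of two linearly independent vectors over F_2."""
    return frozenset({0, v1, v2, v1 ^ v2})

U = span2(0b0001, 0b0010)   # a fixed 2-dimensional subspace

# All 2-dimensional subspaces: every unordered pair of distinct nonzero
# vectors spans one, and each subspace arises from 3 such pairs.
subspaces = {span2(v1, v2) for v1, v2 in combinations(range(1, 16), 2)}

# W is a complement of U iff dim W = n - dim U and W meets U only in 0.
complements = [W for W in subspaces if W & U == {0}]
print(len(subspaces), len(complements))   # 35 subspaces, 2^{2*2} = 16 complements
```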

Low rank matrix recovery problems, including matrix completion and matrix sensing, appear in a broad range of applications. In this work we present GNMR -- an extremely simple iterative algorithm for low rank matrix recovery, based on a Gauss-Newton linearization. On the theoretical front, we derive recovery guarantees for GNMR in both the matrix sensing and matrix completion settings. Some of these results improve upon the best currently known for other methods. A key property of GNMR is that it implicitly keeps the factor matrices approximately balanced throughout its iterations. On the empirical front, we show that for matrix completion with uniform sampling, GNMR performs better than several popular methods, especially when given very few observations close to the information limit.
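
To convey the Gauss-Newton linearization at the heart of the method, the sketch below sets $UV^{\top}\approx U_{\mathrm{new}}V^{\top}+UV_{\mathrm{new}}^{\top}-UV^{\top}$ and solves the resulting least squares problem on the observed entries with a minimal-norm solver (which is what keeps the factors balanced). This is a dense toy version of the idea, not the authors' optimized implementation.

```python
import numpy as np

def gnmr_step(U, V, X, mask):
    """One Gauss-Newton step for matrix completion: linearize U V^T around
    (U, V) and solve the resulting least squares problem over the observed
    entries. Dense and illustrative only."""
    n1, r = U.shape
    n2 = V.shape[0]
    rows, cols = np.nonzero(mask)
    m = rows.size
    # Unknowns: vec(U_new) followed by vec(V_new).
    J = np.zeros((m, (n1 + n2) * r))
    rhs = np.empty(m)
    for t, (i, j) in enumerate(zip(rows, cols)):
        J[t, i * r:(i + 1) * r] = V[j]               # coefficient of U_new[i]
        J[t, (n1 + j) * r:(n1 + j + 1) * r] = U[i]   # coefficient of V_new[j]
        rhs[t] = X[i, j] + U[i] @ V[j]               # linearization constant
    sol, *_ = np.linalg.lstsq(J, rhs, rcond=None)    # minimal-norm solution
    return sol[:n1 * r].reshape(n1, r), sol[n1 * r:].reshape(n2, r)

# Toy noiseless completion problem; the error should shrink across iterations
# in this easy, well-sampled instance.
rng = np.random.default_rng(2)
n1, n2, r = 30, 25, 2
Xtrue = rng.standard_normal((n1, r)) @ rng.standard_normal((r, n2))
mask = rng.random((n1, n2)) < 0.5
U, V = rng.standard_normal((n1, r)), rng.standard_normal((n2, r))
for _ in range(30):
    U, V = gnmr_step(U, V, Xtrue * mask, mask)
print(np.linalg.norm(U @ V.T - Xtrue) / np.linalg.norm(Xtrue))
```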

We present a new finite-sample analysis of M-estimators of location in $\mathbb{R}^d$ using the tool of the influence function. In particular, we show that the deviations of an M-estimator can be controlled through its influence function (or its score function); we then use concentration inequalities for M-estimators to investigate the robust estimation of the mean in high dimension under adversarial corruption, for both bounded and unbounded score functions. For a sample of size $n$ and covariance matrix $\Sigma$, we attain the minimax rate $\sqrt{\mathrm{Tr}(\Sigma)/n}+\sqrt{\|\Sigma\|_{op}\log(1/\delta)/n}$ with probability larger than $1-\delta$ in a heavy-tailed setting. A major advantage of our approach over recently proposed alternatives is that our estimator is tractable and fast to compute even in very high dimension, with a complexity of $O(nd\log(\mathrm{Tr}(\Sigma)))$, where $n$ is the sample size and $\Sigma$ is the covariance matrix of the inliers. In practice, the code we make available for this article proves to be very fast.
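
For intuition about bounded score functions, here is a generic iteratively reweighted Huber-type location estimator: observations far from the current center have their influence clipped at radius $c$. This illustrates the mechanism, not the paper's specific estimator or its $O(nd\log(\mathrm{Tr}(\Sigma)))$ implementation.

```python
import numpy as np

def huber_mean(X, c, iters=50):
    """Multivariate Huber-type M-estimator of location via iteratively
    reweighted averaging: each point's influence is clipped at radius c."""
    theta = np.median(X, axis=0)              # robust starting point
    for _ in range(iters):
        d = np.linalg.norm(X - theta, axis=1)
        w = np.minimum(1.0, c / np.maximum(d, 1e-12))   # bounded influence
        theta = (w[:, None] * X).sum(axis=0) / w.sum()
    return theta

rng = np.random.default_rng(3)
n, dim = 1000, 50
X = rng.standard_normal((n, dim))             # inliers centered at 0
X[:50] += 100.0                               # adversarially corrupted 5%
print(np.linalg.norm(X.mean(axis=0)))         # empirical mean is pulled away
print(np.linalg.norm(huber_mean(X, c=np.sqrt(dim))))  # stays near the truth
```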

Multiplying matrices is among the most fundamental and compute-intensive operations in machine learning. Consequently, there has been significant work on efficiently approximating matrix multiplies. We introduce a learning-based algorithm for this task that greatly outperforms existing methods. Experiments using hundreds of matrices from diverse domains show that it often runs $100\times$ faster than exact matrix products and $10\times$ faster than current approximate methods. In the common case that one matrix is known ahead of time, our method also has the interesting property that it requires zero multiply-adds. These results suggest that a mixture of hashing, averaging, and byte shuffling (the core operations of our method) could be a more promising building block for machine learning than the sparsified, factorized, and/or scalar quantized matrix products that have recently been the focus of substantial research and hardware investment.
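
The flavor of "hash, average, look up" can be conveyed with a product-quantization style sketch: learn prototypes for blocks of $A$'s columns, precompute each prototype's product with the known matrix $B$, and replace the multiply with table lookups and sums. The nearest-centroid encoding below stands in for the learned hash of the actual method, and the error on unstructured Gaussian data is only moderate; the approach shines when $A$ has structure to learn.

```python
import numpy as np

def kmeans(X, k, iters=25, seed=0):
    """Tiny k-means used to learn the prototypes below."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = ((X[:, None, :] - C[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (labels == j).any():
                C[j] = X[labels == j].mean(0)
    return C

rng = np.random.default_rng(4)
n, d, m, n_sub, k = 2000, 64, 32, 8, 16
A, B = rng.standard_normal((n, d)), rng.standard_normal((d, m))
sub = d // n_sub                                   # dimensions per subspace

codebooks, codes = [], []
for s in range(n_sub):                             # "training": prototypes per subspace
    As = A[:, s * sub:(s + 1) * sub]
    C = kmeans(As, k)
    codebooks.append(C)
    codes.append(((As[:, None, :] - C[None]) ** 2).sum(-1).argmin(1))

# B is known ahead of time, so precompute prototype-times-B lookup tables;
# the approximate product then needs only table lookups and sums.
tables = [C @ B[s * sub:(s + 1) * sub] for s, C in enumerate(codebooks)]
approx = sum(tables[s][codes[s]] for s in range(n_sub))

exact = A @ B
print(np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```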

This paper is an attempt to explain all the matrix calculus you need in order to understand the training of deep neural networks. We assume no math knowledge beyond what you learned in calculus 1, and provide links to help you refresh the necessary math where needed. Note that you do not need to understand this material before you start learning to train and use deep learning in practice; rather, this material is for those who are already familiar with the basics of neural networks and wish to deepen their understanding of the underlying math. Don't worry if you get stuck at some point along the way: just go back, reread the previous section, and try writing down and working through some examples. If you're still stuck, we're happy to answer your questions in the Theory category at forums.fast.ai. Note: there is a reference section at the end of the paper summarizing all the key matrix calculus rules and terminology discussed here. See related articles at http://explained.ai
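
In the spirit of the paper's worked examples, here is one of the basic rules checked numerically: the Jacobian of $y = Wx$ with respect to $x$ is $W$ itself, confirmed by central finite differences.

```python
import numpy as np

rng = np.random.default_rng(5)
W = rng.standard_normal((3, 4))
x = rng.standard_normal(4)

eps = 1e-6
jac = np.empty((3, 4))
for j in range(4):
    dx = np.zeros(4)
    dx[j] = eps
    # Central difference approximation of column j of the Jacobian of W @ x.
    jac[:, j] = (W @ (x + dx) - W @ (x - dx)) / (2 * eps)

assert np.allclose(jac, W, atol=1e-8)   # d(Wx)/dx = W
```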

Interest point descriptors have fueled progress on almost every problem in computer vision. Recent advances in deep neural networks have enabled task-specific learned descriptors that outperform hand-crafted descriptors on many problems. We demonstrate that commonly used metric learning approaches do not optimally leverage the feature hierarchies learned in a Convolutional Neural Network (CNN), especially when applied to the task of geometric feature matching. While a metric loss applied to the deepest layer of a CNN is often expected to yield ideal features irrespective of the task, in fact the growing receptive field and striding effects cause shallower features to be better at high-precision matching tasks. We leverage this insight, together with explicit supervision at multiple levels of the feature hierarchy for better regularization, to learn more effective descriptors in the context of geometric matching tasks. Further, we propose to use activation maps at different layers of a CNN as an effective and principled replacement for the multi-resolution image pyramids often used for matching tasks. We propose concrete CNN architectures employing these ideas, and evaluate them on multiple datasets for 2D and 3D geometric matching as well as optical flow, demonstrating state-of-the-art results and generalization across datasets.
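
The observation about feature hierarchies is easy to see on a stock network. The sketch below (assuming PyTorch and torchvision; the paper proposes purpose-built architectures, so the layer choice here is illustrative) taps the activation maps of a ResNet-18 at several depths via forward hooks, exposing the built-in multi-resolution pyramid the abstract refers to.

```python
import torch
import torchvision.models as models

net = models.resnet18(weights=None).eval()
taps = {"layer1": None, "layer2": None, "layer3": None, "layer4": None}

def make_hook(name):
    def hook(module, inputs, output):
        taps[name] = output            # capture the feature map at this depth
    return hook

for name in taps:
    getattr(net, name).register_forward_hook(make_hook(name))

with torch.no_grad():
    net(torch.randn(1, 3, 224, 224))

# Shallow maps keep high spatial resolution (good for precise matching);
# deep maps are coarse but semantically stronger.
for name, feat in taps.items():
    print(name, tuple(feat.shape))
# layer1 (1, 64, 56, 56) ... layer4 (1, 512, 7, 7)
```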

This paper describes a suite of algorithms for constructing low-rank approximations of an input matrix from a random linear image of the matrix, called a sketch. These methods can preserve structural properties of the input matrix, such as positive-semidefiniteness, and they can produce approximations with a user-specified rank. The algorithms are simple, accurate, numerically stable, and provably correct. Moreover, each method is accompanied by an informative error bound that allows users to select parameters a priori to achieve a given approximation quality. These claims are supported by numerical experiments with real and synthetic data.
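
A minimal sketch of the sketch-then-factorize idea, in the basic randomized-range-finder form (the paper's single-view algorithms avoid revisiting $A$ after sketching; this simple variant does not):

```python
import numpy as np

def sketch_lowrank(A, r, oversample=10, seed=0):
    """Rank-r approximation of A from a random linear sketch Y = A @ Omega,
    in the basic Halko-Martinsson-Tropp style; illustrative, not the paper's
    exact single-view algorithms."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], r + oversample))
    Y = A @ Omega                      # the sketch: a random image of A
    Q, _ = np.linalg.qr(Y)             # orthonormal basis for the sketch range
    B = Q.T @ A                        # project A onto that range
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ U[:, :r], s[:r], Vt[:r]

rng = np.random.default_rng(6)
A = rng.standard_normal((500, 40)) @ rng.standard_normal((40, 300))  # rank 40
A += 1e-3 * rng.standard_normal(A.shape)                             # small noise
U, s, Vt = sketch_lowrank(A, r=40)
approx = U @ np.diag(s) @ Vt
print(np.linalg.norm(A - approx) / np.linalg.norm(A))
```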

Since the invention of word2vec, the skip-gram model has significantly advanced research on network embedding, as seen in the recent emergence of the DeepWalk, LINE, PTE, and node2vec approaches. In this work, we show that all of the aforementioned models with negative sampling can be unified into a matrix factorization framework with closed forms. Our analysis and proofs reveal that: (1) DeepWalk empirically produces a low-rank transformation of a network's normalized Laplacian matrix; (2) LINE, in theory, is a special case of DeepWalk when the size of vertices' context is set to one; (3) as an extension of LINE, PTE can be viewed as the joint factorization of multiple networks' Laplacians; (4) node2vec is factorizing a matrix related to the stationary distribution and transition probability tensor of a 2nd-order random walk. We further provide the theoretical connections between skip-gram based network embedding algorithms and the theory of graph Laplacians. Finally, we present the NetMF method, as well as its approximation algorithm, for computing network embeddings. Our method offers significant improvements over DeepWalk and LINE on conventional network mining tasks. This work lays the theoretical foundation for skip-gram based network embedding methods, leading to a better understanding of latent network representation learning.
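
The closed-form factorization view is directly implementable on small graphs. The sketch below forms the DeepWalk matrix $M = \frac{\mathrm{vol}(G)}{bT}\left(\sum_{r=1}^{T}P^{r}\right)D^{-1}$ with $P=D^{-1}A$, takes the truncated logarithm $\log\max(M,1)$, and factorizes it by SVD; this is our reading of the dense NetMF procedure (the paper's approximation algorithm is what scales to large graphs).

```python
import numpy as np

def netmf_embed(A, dim, T=10, b=1.0):
    """Dense NetMF sketch: factorize log max(M, 1) with a truncated SVD.
    Assumes an undirected graph with no isolated nodes; small graphs only."""
    vol = A.sum()
    d = A.sum(axis=1)
    P = A / d[:, None]                          # random-walk transition matrix
    S = np.zeros_like(P)
    Pr = np.eye(len(A))
    for _ in range(T):                          # sum of P^r for r = 1..T
        Pr = Pr @ P
        S += Pr
    M = (vol / (b * T)) * S / d[None, :]        # M = vol/(bT) * (sum_r P^r) D^{-1}
    logM = np.log(np.maximum(M, 1.0))           # truncated logarithm
    U, s, _ = np.linalg.svd(logM)
    return U[:, :dim] * np.sqrt(s[:dim])        # embedding = U_d sqrt(Sigma_d)

# Toy graph: two dense blocks with sparse cross links.
rng = np.random.default_rng(7)
n = 40
A = (rng.random((n, n)) < 0.05).astype(float)
A[:20, :20] = rng.random((20, 20)) < 0.4
A[20:, 20:] = rng.random((20, 20)) < 0.4
A = np.triu(A, 1); A = A + A.T                  # simple undirected graph
emb = netmf_embed(A, dim=2)
print(emb[:3], emb[-3:])                        # the two blocks separate in 2D
```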
