# Auto-Differentiating Linear Algebra

@article{Seeger2017AutoDifferentiatingLA,
  title   = {Auto-Differentiating Linear Algebra},
  author  = {Matthias W. Seeger and Asmus Hetzel and Zhenwen Dai and Neil D. Lawrence},
  journal = {ArXiv},
  year    = {2017},
  volume  = {abs/1710.08717}
}

Development systems for deep learning, such as Theano, Torch, TensorFlow, or MXNet, are easy-to-use tools for creating complex neural network models. Since gradient computations are automatically baked in, and execution is mapped to high performance hardware, these models can be trained end-to-end on large amounts of data. However, it is currently not easy to implement many basic machine learning primitives in these systems (such as Gaussian processes, least squares estimation, principal…
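As an illustration of the kind of primitive the paper argues for, the reverse-mode rule for the Cholesky factorization can be written in a few lines of NumPy and checked against finite differences. This is a minimal sketch, not the paper's implementation; it uses the standard level-3 expression $\bar{A} = L^{-\top}\,\Phi(L^\top \bar{L})\,L^{-1}$, where $\Phi$ takes the lower triangle and halves the diagonal.

```python
import numpy as np

def phi(X):
    """Lower triangle of X with the diagonal halved."""
    return np.tril(X) - 0.5 * np.diag(np.diag(X))

def cholesky_backward(L, L_bar):
    """Reverse-mode rule for L = chol(A): given dF/dL, return dF/dA.

    Valid for symmetric perturbations dA; uses triangular solves
    instead of forming explicit inverses.
    """
    P = phi(L.T @ L_bar)
    X = np.linalg.solve(L.T, P)          # L^{-T} P
    return np.linalg.solve(L.T, X.T).T   # (L^{-T} P) L^{-1}

# Check against central finite differences for f(A) = sum(chol(A)).
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = B @ B.T + 4 * np.eye(4)              # symmetric positive definite
L = np.linalg.cholesky(A)
L_bar = np.tril(np.ones((4, 4)))         # gradient of sum(L) w.r.t. L
A_bar = cholesky_backward(L, L_bar)

dA = rng.standard_normal((4, 4))
dA = dA + dA.T                           # symmetric perturbation
h = 1e-6
f = lambda M: np.sum(np.linalg.cholesky(M))
fd = (f(A + h * dA) - f(A - h * dA)) / (2 * h)   # finite-difference df
an = np.sum(A_bar * dA)                           # analytic df = <A_bar, dA>
```

The two directional derivatives `fd` and `an` agree to roughly the accuracy of the finite-difference scheme, which is the kind of unit test such operators typically ship with.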

#### 19 Citations

Differentiable Programming Tensor Networks

- Computer Science, Physics
- Physical Review X
- 2019

This work presents essential techniques for differentiating through tensor network contractions, including stable AD for tensor decomposition and efficient backpropagation through fixed-point iterations, removing laborious human effort in deriving and implementing analytical gradients for tensor network programs.

A Simple and Efficient Tensor Calculus for Machine Learning

- Computer Science
- ArXiv
- 2020

It is shown that Ricci notation is not necessary for an efficient tensor calculus; an equally efficient method is developed for the simpler Einstein notation, and it turns out that turning to Einstein notation enables further improvements that lead to even better efficiency.

Scalable Hyperparameter Transfer Learning

- Computer Science
- NeurIPS
- 2018

This work proposes a multi-task adaptive Bayesian linear regression model for transfer learning in BO, whose complexity is linear in the function evaluations: one Bayesian linear regression model is associated with each black-box function optimization problem (or task), while transfer learning is achieved by coupling the models through a shared deep neural net.

Computing Higher Order Derivatives of Matrix and Tensor Expressions

- Computer Science
- NeurIPS
- 2018

This work presents an algorithmic framework for computing matrix and tensor derivatives that extends seamlessly to higher-order derivatives and shows a speedup of between one and four orders of magnitude over state-of-the-art frameworks when evaluating higher-order derivatives.

Banded Matrix Operators for Gaussian Markov Models in the Automatic Differentiation Era

- Computer Science, Mathematics
- AISTATS
- 2019

The aim of the paper is to make modern inference methods available for Gaussian models with banded precision by equipping an automatic differentiation framework, such as TensorFlow or PyTorch, with linear algebra operators dedicated to banded matrices.
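For a sense of why dedicated banded operators pay off, the (here tridiagonal) precision matrix of a Gaussian Markov model can be solved in O(n) via a banded Cholesky routine instead of a dense O(n³) solve. A minimal sketch using SciPy's `solveh_banded` (the storage layout is SciPy's, not the TensorFlow/PyTorch operators the paper builds):

```python
import numpy as np
from scipy.linalg import solveh_banded

n = 6
# Tridiagonal precision matrix Q: 2 on the diagonal, -1 off-diagonal
# (a random-walk / GMRF-style precision, symmetric positive definite).
Q = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)

# Lower banded storage: row 0 = main diagonal, row 1 = subdiagonal
# (last entry of the subdiagonal row is unused).
ab = np.zeros((2, n))
ab[0] = 2.0
ab[1, :-1] = -1.0

x_banded = solveh_banded(ab, b, lower=True)  # O(n) banded Cholesky solve
x_dense = np.linalg.solve(Q, b)              # O(n^3) dense solve
```

Both solves return the same vector; only the banded path scales to the long time series for which GMRF precisions are used.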

Use and implementation of autodifferentiation in tensor network methods with complex scalars

- Computer Science, Physics
- 2019

The feasibility of implementing autodifferentiation in standard tensor network toolkits is discussed, and the current status is summarised for cases where the underlying scalars are complex rather than real and the final result is a real-valued scalar.

Provably Correct Automatic Subdifferentiation for Qualified Programs

- Computer Science, Mathematics
- NeurIPS
- 2018

The main result shows that, under certain restrictions on the library of non-smooth functions, provably correct generalized sub-derivatives can be computed at a computational cost that is within a (dimension-free) factor of $6$ of the cost of computing the scalar function itself.

Deep Factors for Forecasting

- Computer Science, Mathematics
- ICML
- 2019

A hybrid model that incorporates the benefits of both classical and deep neural networks is proposed, which is data-driven and scalable via a latent, global, deep component, and handles uncertainty through a local classical model.

QR and LQ Decomposition Matrix Backpropagation Algorithms for Square, Wide, and Deep Matrices and Their Software Implementation

- Mathematics, Computer Science
- ArXiv
- 2020

This article presents matrix backpropagation algorithms for the QR decomposition of matrices $A_{m,n}$ that are either square ($m = n$), wide ($m < n$), or deep ($m > n$), with rank $k = \min(m, n)$. Furthermore, we derive…

Differentiate Everything with a Reversible Programming Language

- Computer Science
- ArXiv
- 2020

A reversible eDSL, NiLang, is developed in Julia that can differentiate a general program while remaining compatible with Julia's ecosystem, demonstrating that a source-to-source AD framework can achieve state-of-the-art performance.

#### References

Showing 1–10 of 28 references

Deep Kernel Learning

- Computer Science, Mathematics
- AISTATS
- 2016

We introduce scalable deep kernels, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernel methods. Specifically, we transform the inputs…

Adam: A Method for Stochastic Optimization

- Computer Science, Mathematics
- ICLR
- 2015

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
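The update rule summarized above fits in a few lines; the following is a plain-Python sketch of one Adam step (scalar case, default hyperparameters as in the paper, the toy objective is an arbitrary choice):

```python
import math

def adam_step(theta, g, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: adaptive estimates of first and second moments
    of the gradient, with bias correction for the first t steps."""
    m = b1 * m + (1 - b1) * g           # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * g * g       # second-moment estimate
    m_hat = m / (1 - b1 ** t)           # bias correction
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Minimise f(x) = (x - 3)^2; the gradient is 2 * (x - 3).
x, m, v = 0.0, 0.0, 0.0
for t in range(1, 501):
    g = 2 * (x - 3)
    x, m, v = adam_step(x, g, m, v, t)
```

After a few hundred steps the iterate approaches the minimizer at 3; note that near the optimum the effective step is roughly `lr`, which is why learning-rate decay is used in practice.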

Deep Generative Stochastic Networks Trainable by Backprop

- Mathematics, Computer Science
- ICML
- 2014

Theorems that generalize recent work on the probabilistic interpretation of denoising autoencoders are provided, obtaining along the way an interesting justification for dependency networks and generalized pseudolikelihood.

Generative Moment Matching Networks

- Computer Science, Mathematics
- ICML
- 2015

This work formulates a method that generates an independent sample via a single feedforward pass through a multilayer perceptron, as in the recently proposed generative adversarial networks, using MMD to learn to generate codes that can then be decoded to produce samples.
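The MMD training criterion mentioned here has a simple sample-based estimator; a minimal NumPy sketch of the (biased) squared-MMD estimate with an RBF kernel (the bandwidth `gamma` and toy distributions are arbitrary choices, not from the paper):

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian RBF kernel matrix k(x, y) = exp(-gamma * ||x - y||^2)."""
    d2 = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2 * X @ Y.T)
    return np.exp(-gamma * d2)

def mmd2_biased(X, Y, gamma=1.0):
    """Biased estimate of squared MMD between sample sets X and Y."""
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2 * rbf_kernel(X, Y, gamma).mean())

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))        # samples from N(0, I)
Y = rng.standard_normal((200, 2)) + 3.0  # samples from a shifted Gaussian
```

`mmd2_biased(X, X)` is zero by construction, while the shifted sample set gives a clearly positive value — the signal a generative moment matching network minimizes during training.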

Auto-Encoding Variational Bayes

- Mathematics, Computer Science
- ICLR
- 2014

A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.

Variational Auto-encoded Deep Gaussian Processes

- Computer Science, Mathematics
- ICLR
- 2016

A new formulation of the variational lower bound is derived that allows most of the computation to be distributed in a way that enables handling datasets of the size of mainstream deep learning tasks.

Machine learning - a probabilistic perspective

- Computer Science
- Adaptive computation and machine learning series
- 2012

This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

Gaussian Processes for Machine Learning

- Computer Science
- Adaptive computation and machine learning
- 2009

The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and deals with the supervised learning problem for both regression and classification.
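The regression setting treated in the book reduces to a handful of Cholesky-based linear algebra steps, which is exactly the pattern the surveyed frameworks need to differentiate through. A minimal NumPy sketch of the predictive mean following the standard Cholesky recipe (Algorithm 2.1 of the book; the 1-D toy data and hyperparameters are arbitrary choices):

```python
import numpy as np

def rbf(A, B, ell=1.0):
    """RBF kernel for 1-D inputs: k(a, b) = exp(-(a - b)^2 / (2 ell^2))."""
    d2 = (A[:, None] - B[None, :]) ** 2
    return np.exp(-0.5 * d2 / ell**2)

# Toy 1-D training data.
X = np.linspace(0, 5, 8)
y = np.sin(X)
noise = 1e-6                             # observation noise variance

# Cholesky of the noisy kernel matrix, then two triangular solves
# give alpha = (K + sigma^2 I)^{-1} y.
K = rbf(X, X) + noise * np.eye(len(X))
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))

# Predictive mean at test points: k(X*, X) @ alpha.
X_star = np.array([0.0, 2.5])
mean = rbf(X_star, X) @ alpha
```

With negligible noise the posterior mean interpolates the training targets, and between them it closely tracks the smooth underlying function.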

Stochastic Back-propagation and Variational Inference in Deep Latent Gaussian Models

- Mathematics, Computer Science
- ArXiv
- 2014

We marry ideas from deep neural networks and approximate Bayesian inference to derive a generalised class of deep, directed generative models, endowed with a new algorithm for scalable inference and…

Generative Adversarial Networks

- Mathematics, Computer Science
- ArXiv
- 2014

We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a…