You are here: windowsontheory.org

theorydish.blog
6.2 parsecs away

The chain rule is a fundamental result in calculus. Roughly speaking, it states that if a variable $c$ is a differentiable function of intermediate variables $b_1,\ldots,b_n$, and each intermediate variable $b_i$ is itself a differentiable function of $a$, then we can compute the derivative $\frac{\mathrm{d} c}{\mathrm{d} a}$ as...
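For reference, the standard multivariable chain rule the excerpt is building toward (a completion of the truncated sentence, not necessarily the linked post's wording) is

$$\frac{\mathrm{d} c}{\mathrm{d} a} \;=\; \sum_{i=1}^{n} \frac{\partial c}{\partial b_i}\,\frac{\mathrm{d} b_i}{\mathrm{d} a}.$$
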
marcospereira.me
6.3 parsecs away

In this post we summarize the math behind deep learning and implement a simple network that achieves 85% accuracy classifying digits from the MNIST dataset.
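As a rough illustration of what "a simple network" for this task can look like (a minimal sketch, not the linked post's implementation; synthetic data stands in for MNIST so the snippet is self-contained):

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
kx, ky, k1, k2 = jax.random.split(key, 4)
X = jax.random.normal(kx, (256, 784))        # stand-in for flattened 28x28 images
y = jax.random.randint(ky, (256,), 0, 10)    # stand-in for digit labels 0..9

params = {
    "W1": 0.01 * jax.random.normal(k1, (784, 128)), "b1": jnp.zeros(128),
    "W2": 0.01 * jax.random.normal(k2, (128, 10)),  "b2": jnp.zeros(10),
}

def loss(params, X, y):
    # One hidden layer followed by softmax cross-entropy.
    h = jnp.tanh(X @ params["W1"] + params["b1"])
    logits = h @ params["W2"] + params["b2"]
    logp = jax.nn.log_softmax(logits)
    return -jnp.mean(logp[jnp.arange(y.shape[0]), y])

@jax.jit
def step(params, X, y):
    # Plain full-batch gradient descent.
    grads = jax.grad(loss)(params, X, y)
    return jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, params, grads)

for _ in range(200):
    params = step(params, X, y)
print("training loss:", loss(params, X, y))
```
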
iclr-blogposts.github.io
8.4 parsecs away

The product between the Hessian of a function and a vector, the Hessian-vector product (HVP), is a fundamental quantity to study the variation of a function. It is ubiquitous in traditional optimization and machine learning. However, the computation of HVPs is often considered prohibitive in the context of deep learning, driving practitioners to use proxy quantities to evaluate the loss geometry. Standard automatic differentiation theory predicts that the computational complexity of an HVP is of the same order of magnitude as the complexity of computing a gradient. The goal of this blog post is to provide a practical counterpart to this theoretical result, showing that modern automatic differentiation frameworks, JAX and PyTorch, allow for efficient computation of these HVPs in standard deep learning cost functions.
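In JAX, one common way to get an HVP without materializing the Hessian is forward-over-reverse differentiation: push a tangent vector through the gradient function. A minimal sketch (the toy loss `f` below is a placeholder, not from the linked post):

```python
import jax
import jax.numpy as jnp

def f(x):
    # Toy scalar "loss" standing in for a deep-learning cost function.
    return jnp.sum(jnp.tanh(x) ** 2)

def hvp(f, x, v):
    # Forward-over-reverse: differentiate grad(f) along the direction v.
    # Cost is a small constant multiple of one gradient evaluation.
    return jax.jvp(jax.grad(f), (x,), (v,))[1]

x = jnp.arange(4.0)
v = jnp.ones_like(x)
print(hvp(f, x, v))   # equals H(x) @ v, without forming the Hessian H
```
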
finnstats.com
65.6 parsecs away

Best Books For Deep Learning. We've compiled a list of the top deep learning books for you. Check it out now.