Explore: select a destination

You are here: thenumb.at

jhui.github.io (3.4 parsecs away)

[AI summary] The linked text discusses mathematical and computational concepts relevant to deep learning, including poor conditioning in matrices, underflow and overflow in the softmax function, Jacobian and Hessian matrices, learning-rate selection using Taylor series, Newton's method, saddle points, constrained optimization with Lagrange multipliers, and KKT conditions. These concepts are crucial for understanding numerical stability, optimization algorithms, and solving constrained problems in machine learning.
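One item in that summary, underflow and overflow in softmax, is easy to illustrate. Below is a minimal sketch, not taken from the linked text; the function names are illustrative, and the shift-by-max trick shown is the standard fix:

import numpy as np

def softmax_naive(x):
    # np.exp overflows for large inputs: exp(1000) -> inf, and
    # inf / inf then yields nan after the division
    e = np.exp(x)
    return e / e.sum()

def softmax_stable(x):
    # subtracting max(x) leaves the result unchanged, since
    # exp(x - c) / sum(exp(x - c)) == exp(x) / sum(exp(x)),
    # but keeps every exponent <= 0 so exp never overflows
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

x = np.array([1000.0, 1001.0, 1002.0])
print(softmax_naive(x))   # [nan nan nan], with an overflow warning
print(softmax_stable(x))  # [0.09003057 0.24472847 0.66524096]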
iclr-blogposts.github.io (2.4 parsecs away)

The product between the Hessian of a function and a vector, the Hessian-vector product (HVP), is a fundamental quantity for studying the variation of a function. It is ubiquitous in traditional optimization and machine learning. However, the computation of HVPs is often considered prohibitive in the context of deep learning, driving practitioners to use proxy quantities to evaluate the loss geometry. Standard automatic differentiation theory predicts that the computational complexity of an HVP is of the same order of magnitude as the complexity of computing a gradient. The goal of the blog post is to provide a practical counterpart to this theoretical result, showing that modern automatic differentiation frameworks, JAX and PyTorch, allow for efficient computation of HVPs.
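The efficiency result the post refers to can be sketched in a few lines of JAX. Forward-over-reverse differentiation (jax.jvp applied to jax.grad) yields an HVP at roughly the cost of a couple of gradient evaluations, without ever materializing the Hessian. The loss function here is an arbitrary stand-in, not an example from the linked post:

import jax
import jax.numpy as jnp

def f(w):
    # arbitrary scalar-valued loss, for illustration only
    return jnp.sum(jnp.tanh(w) ** 2)

def hvp(f, w, v):
    # forward-over-reverse: differentiate the gradient along direction v;
    # costs O(1) gradient evaluations and never builds the full Hessian
    return jax.jvp(jax.grad(f), (w,), (v,))[1]

w = jnp.arange(3.0)
v = jnp.ones(3)
print(hvp(f, w, v))
# agrees with the explicit (and much more expensive) Hessian product:
print(jax.hessian(f)(w) @ v)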
jingnanshi.com (1.3 parsecs away)

A tutorial on automatic differentiation.
liorsinai.github.io (5.3 parsecs away)

A series on automatic differentiation in Julia. Part 1 provides an overview and defines explicit chain rules.
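The "explicit chain rules" idea can be hinted at with dual numbers: each primitive carries a hand-written rule for propagating derivatives. The series itself uses Julia; this is a minimal Python sketch of the same idea, with hypothetical names:

from dataclasses import dataclass
import math

@dataclass
class Dual:
    val: float  # value of the expression
    dot: float  # derivative with respect to the input

def sin(x: Dual) -> Dual:
    # explicit chain rule: d/dt sin(x(t)) = cos(x) * x'
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

def mul(a: Dual, b: Dual) -> Dual:
    # explicit product rule
    return Dual(a.val * b.val, a.val * b.dot + a.dot * b.val)

x = Dual(2.0, 1.0)    # seed: dx/dx = 1
y = sin(mul(x, x))    # y = sin(x^2)
print(y.dot)          # 2x * cos(x^2) at x = 2, i.e. 4*cos(4) ~ -2.6146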