windowsontheory.org
(Updated and expanded 12/17/2021) I am teaching deep learning this week in Harvard's CS 182 (Artificial Intelligence) course. As I'm preparing the back-propagation lecture, Preetum Nakkiran told me about Andrej Karpathy's awesome micrograd package which implements automatic differentiation for scalar variables in very few lines of code. I couldn't resist using this to show how...

jingnanshi.com
Tutorial on automatic differentiation

iclr-blogposts.github.io
The product between the Hessian of a function and a vector, the Hessian-vector product (HVP), is a fundamental quantity to study the variation of a function. It is ubiquitous in traditional optimization and machine learning. However, the computation of HVPs is often considered prohibitive in the context of deep learning, driving practitioners to use proxy quantities to evaluate the loss geometry. Standard automatic differentiation theory predicts that the computational complexity of an HVP is of the same order of magnitude as the complexity of computing a gradient. The goal of this blog post is to provide a practical counterpart to this theoretical result, showing that modern automatic differentiation frameworks, JAX and PyTorch, allow for efficient computat...

sefiks.com
The Heaviside step function is one of the most common activation functions in neural networks. The function produces binary output, which is why it is also called the binary step function. This makes it well suited to binary classification studies.
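
As a rough illustration of the binary output described above, here is a minimal Python sketch of a step activation inside a single perceptron. The names `heaviside`, `perceptron`, `AND_WEIGHTS`, and `AND_BIAS` are hypothetical, and the choice to return 1 at exactly zero is an assumption; other conventions use 0 or 0.5 there.

```python
def heaviside(x: float) -> int:
    """Binary step: 1 for non-negative input, 0 otherwise.

    Returning 1 at x == 0 is an assumed convention, not the only one.
    """
    return 1 if x >= 0 else 0


def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs passed through the step activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return heaviside(z)


# Hypothetical weights that make the perceptron behave like a logical AND.
AND_WEIGHTS = (1.0, 1.0)
AND_BIAS = -1.5

for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pair, perceptron(pair, AND_WEIGHTS, AND_BIAS))
```

Because the output is always 0 or 1, the unit acts directly as a two-class decision rule. The step is flat everywhere except at the jump, so gradient-based training typically swaps in a smooth activation instead.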