You are here: sriku.org

marcospereira.me
In this post we summarize the math behind deep learning and implement a simple network that achieves 85% accuracy classifying digits from the MNIST dataset.

zserge.com
An introduction to neural networks and deep learning for those who skipped math class but want to follow the trend.

robotchinwag.com
Deriving the gradients for the backward pass for matrix multiplication using tensor calculus.

francisbach.com
[AI summary] This text discusses the scaling laws of optimization in machine learning, focusing on asymptotic expansions for both strongly convex and non-strongly convex cases. It covers the derivation of performance bounds using techniques like Laplace's method and the behavior of random minimizers. The text also explains the 'weird' behavior observed in certain plots, where non-strongly convex bounds become tight under specific conditions. The analysis connects theoretical results to practical considerations in optimization algorithms.