Explore >> Select a destination


You are here

newvick.com
| | programmathically.com
4.4 parsecs away

Travel
| | Sharing is caringTweetIn this post, we develop an understanding of why gradients can vanish or explode when training deep neural networks. Furthermore, we look at some strategies for avoiding exploding and vanishing gradients. The vanishing gradient problem describes a situation encountered in the training of neural networks where the gradients used to update the weights []
| | dennybritz.com
4.8 parsecs away

Travel
| | This the thirdpart of the Recurrent Neural Network Tutorial.
| | windowsontheory.org
2.0 parsecs away

Travel
| | (Updated and expanded 12/17/2021) I am teaching deep learning this week in Harvard's CS 182 (Artificial Intelligence) course. As I'm preparing the back-propagation lecture, Preetum Nakkiran told me about Andrej Karpathy's awesome micrograd package which implements automatic differentiation for scalar variables in very few lines of code. I couldn't resist using this to show how...
| | somosviajeros.com
9.5 parsecs away

Travel
| Guía de Taiwan con mis post de este viaje a una isla en medio de un conflicto geopolítico con China y con gente maravillosa