360digitmg.com
programmathically.com
In this post, we develop an understanding of why gradients can vanish or explode when training deep neural networks. Furthermore, we look at some strategies for avoiding exploding and vanishing gradients. The vanishing gradient problem describes a situation encountered in the training of neural networks where the gradients used to update the weights shrink towards zero as they are propagated back through the layers, so the earliest layers learn very slowly.
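The mechanism is easy to see in isolation: each backward step through a sigmoid layer multiplies the gradient by the transposed weight matrix and by the sigmoid derivative (at most 0.25), so the gradient norm tends to collapse with depth. A minimal NumPy sketch of that effect (my own illustration, not code from the post):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_layers, width = 20, 50
weights = [rng.normal(size=(width, width)) / np.sqrt(width) for _ in range(n_layers)]

# Forward pass through a deep stack of sigmoid layers, keeping each output.
activations = [rng.normal(size=width)]
for W in weights:
    activations.append(sigmoid(W @ activations[-1]))

# Backward pass: each step multiplies the gradient by W^T and by the
# sigmoid derivative a * (1 - a), which is at most 0.25, so the norm shrinks.
grad = np.ones(width)
for W, a in zip(reversed(weights), reversed(activations[1:])):
    grad = W.T @ (grad * a * (1.0 - a))
    print(f"gradient norm: {np.linalg.norm(grad):.3e}")
```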
michael-lewis.com
This is a short summary of some of the terminology used in machine learning, with an emphasis on neural networks. I've put it together primarily to help my own understanding, phrasing it largely in non-mathematical terms. As such it may be of use to others who come from more of a programming than a mathematical background.
www.analyticsvidhya.com
Improve your deep learning model performance by understanding 4 key challenges and the tricks to overcome them.
tomhume.org
I don't remember how I came across it, but this is one of the most exciting papers I've read recently. The authors train a neural network that tries to identify the next digit in a sequence of MNIST samples, presented in digit order. The interesting part is that when they include a proxy for energy usage in the loss function (i.e. train it to be more energy-efficient), the resulting network seems to exhibit the characteristics of predictive coding: some units seem to be responsible for predictions, others for encoding prediction error.
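I haven't reproduced the paper's loss here, but the general trick of adding an energy proxy usually comes down to penalising activation magnitude alongside the task loss. A hypothetical sketch of that idea (the names and penalty weight are mine, not the authors'):

```python
import numpy as np

def energy_regularised_loss(logits, targets, hidden_activations, energy_weight=1e-3):
    """Cross-entropy for the next-digit prediction task plus a crude
    energy proxy: the mean absolute hidden activation. The penalty form
    and names here are illustrative, not taken from the paper."""
    # Softmax cross-entropy over the batch.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    task_loss = -log_probs[np.arange(len(targets)), targets].mean()

    # Energy proxy: more "firing" (larger activations) costs more.
    energy = np.mean([np.abs(a).mean() for a in hidden_activations])
    return task_loss + energy_weight * energy
```

Turning up energy_weight trades task accuracy against cheaper activity; it's that pressure which, per the write-up above, seems to produce the predictive-coding-like split between prediction units and error units.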