Explore >> Select a destination


You are here

www.lesswrong.com
| | programmathically.com
2.8 parsecs away

| | In this post, we develop an understanding of why gradients can vanish or explode when training deep neural networks. Furthermore, we look at some strategies for avoiding exploding and vanishing gradients. The vanishing gradient problem describes a situation encountered in the training of neural networks where the gradients used to update the weights [...]
| | www.paepper.com
3.1 parsecs away

| | [AI summary] This article explains how to train a simple neural network using Numpy in Python without relying on frameworks like TensorFlow or PyTorch, focusing on the implementation of ReLU activation, weight initialization, and gradient descent for optimization.
| | adl1995.github.io
0.8 parsecs away

| | [AI summary] The article explains various activation functions used in neural networks, their properties, and applications, including binary step, tanh, ReLU, and softmax functions.
| | blog.fastforwardlabs.com
13.2 parsecs away

| This article is available as a notebook on GitHub. Please refer to that notebook for a more detailed discussion and code fixes and updates. Despite all the recent excitement around deep learning, neural networks have a reputation among non-specialists as complicated to build and difficult to interpret. And while interpretability remains an issue, there are now high-level neural network libraries that enable developers to quickly build neural network models without worrying about the numerical details of floating-point operations and linear algebra.