Explore >> Select a destination


You are here

thedarkside.frantzmiccoli.com
| | teddykoker.com
8.2 parsecs away

Travel
| | Gradient-descent-based optimizers have long been used as the optimization algorithm of choice for deep learning models. Over the years, various modifications to the basic mini-batch gradient descent have been proposed, such as adding momentum or Nesterovs Accelerated Gradient (Sutskever et al., 2013), as well as the popular Adam optimizer (Kingma & Ba, 2014). The paper Learning to Learn by Gradient Descent by Gradient Descent (Andrychowicz et al., 2016) demonstrates how the optimizer itself can be replac...
| | stribny.name
9.6 parsecs away

Travel
| | Fields in Artificial Intelligence and what libraries to use to address them.
| | 360digitmg.com
12.0 parsecs away

Travel
| | The nonlinear pattern will not be captured by the mere existence of hidden layers. The activation function that will be employed must not be linear.
| | aimatters.wordpress.com
50.2 parsecs away

Travel
| Note: Here's the Python source code for this project in a Jupyter notebook on GitHub I've written before about the benefits of reinventing the wheel and this is one of those occasions where it was definitely worth the effort. Sometimes, there is just no substitute for trying to implement an algorithm to really understand what's...