parametricity.com

jdh.hamkins.org
I'd like to share a simple proof I've discovered recently of a surprising fact: there is a universal algorithm, capable of computing any given function! Wait, what? What on earth do I ...

ianwrightsite.wordpress.com
Riemann's Zeta function is an infinite sublation of Hegelian integers.

blog.sigfpe.com

www.paepper.com
New blog series: Deep Learning Papers visualized

This is the first post of a new series I am starting, in which I explain the content of a paper in a visual, picture-based way. To me, this helps tremendously in grasping the ideas and remembering them, and I hope it will do the same for many of you.

Today's paper: Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour by Goyal et al.

The first paper I've chosen is well known when it comes to training deep learning models on multiple GPUs. Here is the link to the paper by Goyal et al. on arXiv.

The basic idea of the paper is this: in deep learning research today, you are using more and more data and more complex models. As complexity and data size grow, the computational requirements grow tremendously as well, so it typically takes much longer to train a model to convergence. But a long training time means a long feedback loop, which is frustrating: you come up with many other ideas in the meantime, yet because training takes so long, you cannot try them all out. So what can you do?
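The excerpt stops at this question, but the paper's well-known answer is data-parallel training with a very large minibatch spread over many GPUs, made to work by a linear learning-rate scaling rule combined with a gradual warmup. As a rough, minimal sketch of that recipe (the function names below are illustrative, not from the post; the reference values of base learning rate 0.1 at batch size 256 and a 5-epoch warmup follow the ResNet-50/ImageNet settings reported by Goyal et al.):

```python
# Minimal sketch (illustrative names) of the large-minibatch recipe from
# Goyal et al.: the linear learning-rate scaling rule plus gradual warmup.
# Reference values: base LR 0.1 at batch size 256, warmup over 5 epochs,
# as in the paper's ResNet-50 / ImageNet setup.

def scaled_lr(batch_size, base_lr=0.1, base_batch=256):
    """Linear scaling rule: if the minibatch grows by a factor k,
    grow the learning rate by the same factor k."""
    return base_lr * batch_size / base_batch

def warmup_lr(epoch, target_lr, warmup_epochs=5, start_lr=0.1):
    """Gradual warmup: ramp the learning rate linearly from start_lr
    to target_lr over the first warmup_epochs, then hold target_lr."""
    if epoch >= warmup_epochs:
        return target_lr
    return start_lr + (target_lr - start_lr) * epoch / warmup_epochs

# Example: 256 GPUs x 32 images each = a minibatch of 8192 images,
# which the rule maps to a learning rate of 0.1 * 8192 / 256 = 3.2.
target = scaled_lr(8192)
for epoch in range(7):
    print(f"epoch {epoch}: lr = {warmup_lr(epoch, target):.2f}")
```

The warmup matters because jumping straight to such a large learning rate tends to destabilize training in the first epochs; ramping up gradually avoids that while still reaching the scaled rate.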