You are here: fa.bianp.net

jingnanshi.com
Tutorial on automatic differentiation

teddykoker.com
A few posts back I wrote about a common parameter optimization method known as Gradient Ascent. In this post we will see how a similar method can be used to create a model that can classify data. This time, instead of using gradient ascent to maximize a reward function, we will use gradient descent to minimize a cost function. Let's start by importing all the libraries we need:

antoinevastel.com
In this article we will see how to build a recommender system for movies in Python by exploiting the sparsity of the data.

www.paepper.com
Today's paper: Rethinking 'Batch' in BatchNorm by Wu & Johnson. BatchNorm is a critical building block in modern convolutional neural networks. Its unique property of operating on "batches" instead of individual samples introduces significantly different behaviors from most other operations in deep learning. As a result, it leads to many hidden caveats that can negatively impact model's performance in subtle ways. This is a quote from the paper's abstract; the emphasis is mine and marks what caught my attention. Let's explore these subtle ways in which BatchNorm can negatively impact your model's performance! The paper by Wu & Johnson can be found on arXiv.