You are here: jaketae.github.io

teddykoker.com

Google AI recently released a paper, Rethinking Attention with Performers (Choromanski et al., 2020), which introduces the Performer, a Transformer architecture that estimates the full-rank attention mechanism using orthogonal random features to approximate the softmax kernel, with linear space and time complexity. In this post we will investigate how this works and why it is useful for the machine learning community.
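
As a rough sketch of the core trick (a minimal NumPy illustration, not the paper's FAVOR+ implementation: it uses plain i.i.d. Gaussian features rather than the orthogonal variant, and the feature count `m` is a free parameter chosen here for illustration):

```python
import numpy as np

def positive_random_features(x, W):
    # phi(x) = exp(W x - |x|^2 / 2) / sqrt(m), chosen so that
    # E[phi(q) . phi(k)] = exp(q . k), i.e. the softmax kernel.
    m = W.shape[0]
    return np.exp(x @ W.T - np.sum(x**2, axis=-1, keepdims=True) / 2) / np.sqrt(m)

def performer_attention(Q, K, V, m=256, seed=0):
    # Q, K: (n, d); V: (n, d_v). Runs in O(n m d) time instead of the
    # O(n^2) cost of exact softmax attention.
    n, d = Q.shape
    W = np.random.default_rng(seed).standard_normal((m, d))
    # Fold the usual 1/sqrt(d) temperature into the inputs.
    q = positive_random_features(Q / d**0.25, W)
    k = positive_random_features(K / d**0.25, W)
    kv = k.T @ V                   # (m, d_v): summarize keys and values once
    numer = q @ kv                 # (n, d_v)
    denom = q @ k.sum(axis=0)      # (n,): approximate softmax normalizer
    return numer / denom[:, None]

rng = np.random.default_rng(1)
Q, K, V = rng.standard_normal((3, 512, 64))
out = performer_attention(Q, K, V)  # (512, 64); the 512 x 512 attention matrix is never formed
```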

randorithms.com

The Taylor series is a widely-used method to approximate a function, with many applications. Given a function \(y = f(x)\), we can express \(f(x)\) in terms ...
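
For reference (this is the standard definition, not quoted from the post), the expansion of \(f\) about a point \(a\) is:

\[
f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}\,(x - a)^n = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots
\]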

sigmoidprime.com

An exploration of Transformer-XL, a modified Transformer optimized for longer context length.

wtfleming.github.io

[AI summary] This post discusses achieving 99.1% accuracy in binary image classification of cats and dogs using an ensemble of ResNet models with PyTorch.
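
As a generic illustration of the ensembling step (a hypothetical sketch; the post's actual architectures, weights, and training code are not reproduced here), one common approach averages per-class probabilities across independently trained models:

```python
import torch
from torchvision import models

# Hypothetical two-member ensemble; a real setup would load fine-tuned
# weights and replace the final layer with a 2-class head for cats vs. dogs.
ensemble = [models.resnet18(weights=None), models.resnet34(weights=None)]
for model in ensemble:
    model.eval()  # fix batch-norm statistics for inference

def ensemble_predict(x):
    # x: batch of images, shape (batch, 3, H, W).
    # Average softmax probabilities across ensemble members.
    with torch.no_grad():
        probs = [model(x).softmax(dim=1) for model in ensemble]
    return torch.stack(probs).mean(dim=0)  # (batch, num_classes)
```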