Explore >> Select a destination

You are here: jaketae.github.io

teddykoker.com (4.1 parsecs away)
Google AI recently released a paper, Rethinking Attention with Performers (Choromanski et al., 2020), which introduces the Performer, a Transformer architecture that estimates full-rank attention using orthogonal random features to approximate the softmax kernel, with linear space and time complexity. In this post we will investigate how this works and how it is useful for the machine learning community.
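
As a rough illustration of the idea that teaser describes, here is a minimal NumPy sketch of softmax attention approximated with random features. For brevity it draws i.i.d. Gaussian features rather than the orthogonal features the Performer actually uses, and all names (`approx_attention`, `num_features`, the scaling choices) are illustrative assumptions, not code from the paper or the linked post.

```python
import numpy as np

def softmax_kernel_features(x, projection):
    """Positive random features phi(x) with E[phi(q) @ phi(k)] ~= exp(q @ k).

    phi(x) = exp(w @ x - ||x||^2 / 2) / sqrt(m) for each Gaussian row w.
    """
    m = projection.shape[0]
    sq_norm = np.sum(x ** 2, axis=-1, keepdims=True) / 2.0
    return np.exp(x @ projection.T - sq_norm) / np.sqrt(m)

def approx_attention(q, k, v, num_features=256, seed=0):
    """Linear space/time approximation of softmax attention."""
    d = q.shape[-1]
    rng = np.random.default_rng(seed)
    # NOTE: the Performer uses *orthogonal* random features; plain i.i.d.
    # Gaussian rows keep this sketch short at the cost of higher variance.
    w = rng.standard_normal((num_features, d))
    # Scale q and k by d^(-1/4) so the kernel targets exp(q @ k / sqrt(d)).
    q_prime = softmax_kernel_features(q / d ** 0.25, w)  # (n, m)
    k_prime = softmax_kernel_features(k / d ** 0.25, w)  # (n, m)
    # Associativity: q' @ (k'^T @ v) never forms the n x n attention matrix.
    out = q_prime @ (k_prime.T @ v)
    normalizer = q_prime @ k_prime.sum(axis=0)
    return out / normalizer[:, None]

def exact_attention(q, k, v):
    """Standard O(n^2) softmax attention, for comparison."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return (weights / weights.sum(axis=-1, keepdims=True)) @ v

rng = np.random.default_rng(1)
q, k, v = (rng.standard_normal((8, 16)) for _ in range(3))
print(np.abs(approx_attention(q, k, v, num_features=4096)
             - exact_attention(q, k, v)).max())
```

Increasing `num_features` shrinks the approximation error, while the cost stays linear in sequence length rather than quadratic.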

randorithms.com (5.0 parsecs away)
The Taylor series is a widely-used method to approximate a function, with many applications. Given a function \(y = f(x)\), we can express \(f(x)\) in terms ...
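
The snippet is cut off, but as a small self-contained example of the idea it introduces, here is a truncated Taylor series of \(e^x\) around 0; the function name and term counts are my own, not from the linked post.

```python
import math

def taylor_exp(x, terms=10):
    """Truncated Taylor series of e^x around 0: sum of x^k / k! for k < terms."""
    return sum(x ** k / math.factorial(k) for k in range(terms))

# The error at x = 1 shrinks rapidly as more terms are kept.
for n in (2, 5, 10):
    approx = taylor_exp(1.0, terms=n)
    print(f"{n:2d} terms: {approx:.6f} (error {abs(approx - math.e):.2e})")
```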

sigmoidprime.com (5.0 parsecs away)
An exploration of Transformer-XL, a modified Transformer optimized for longer context lengths.

wtfleming.github.io (16.6 parsecs away)
[AI summary] This post discusses achieving 99.1% accuracy in binary image classification of cats and dogs using an ensemble of ResNet models with PyTorch.
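
To make the summarized technique concrete, here is a hedged PyTorch sketch of ensemble prediction by averaging class probabilities across ResNet members. The specific architectures, the label order, and the `ensemble_predict` helper are assumptions for illustration, not details from the linked post.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18, resnet34

# Two hypothetical ensemble members (untrained here); the post's actual
# models and weights are unknown.
models = [resnet18(num_classes=2), resnet34(num_classes=2)]

@torch.no_grad()
def ensemble_predict(models, images):
    """Average each member's class probabilities, then take the argmax.

    images: (batch, 3, H, W) tensor, normalized however the members expect.
    """
    for m in models:
        m.eval()
    probs = torch.stack([F.softmax(m(images), dim=1) for m in models])
    return probs.mean(dim=0).argmax(dim=1)  # 0 = cat, 1 = dog (order assumed)

preds = ensemble_predict(models, torch.randn(4, 3, 224, 224))
print(preds)
```

Averaging probabilities tends to smooth out individual models' mistakes, which is one common way such an ensemble can beat its best single member.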