Outer Web | Explore

You are here		jaketae.github.io Linear Attention Computation in Nyströmformer - Jake Tae
\|	\|	teddykoker.com Performers: The Kernel Trick, Random Fourier Features, and Attention \| Teddy Koker	4.1 parsecs away
\|	\|	Google AI recently released a paper, Rethinking Attention with Performers (Choromanski et al., 2020), which introduces Performer, a Transformer architecture that estimates the full-rank attention mechanism using orthogonal random features to approximate the softmax kernel with linear space and time complexity. In this post we will investigate how this works, and how it is useful for the machine learning community.	4.1 parsecs away
\|	\|	randorithms.com How to Find the Taylor Series of an Inverse Function - Randorithms	5.0 parsecs away
\|	\|	The Taylor series is a widely-used method to approximate a function, with many applications. Given a function \(y = f(x)\), we can express \(f(x)\) in terms ...	5.0 parsecs away
\|	\|	sigmoidprime.com Transformer-XL: A Memory-Augmented Transformer	5.0 parsecs away
\|	\|	An exploration of Transformer-XL, a modified Transformer optimized for longer context length.	5.0 parsecs away
\|	\|	wtfleming.github.io Cats vs Dogs - Part 3 - 99.1% Accuracy - Binary Image Classification with PyTorch and an Ensemble of ResNet Models	16.6 parsecs away
\|	\|	[AI summary] This post discusses achieving 99.1% accuracy in binary image classification of cats and dogs using an ensemble of ResNet models with PyTorch.	16.6 parsecs away