| Site | Excerpt |
| --- | --- |
| jaykmody.com | You are here |
| marcospereira.me | In this post we summarize the math behind deep learning and implement a simple network that achieves 85% accuracy classifying digits from the MNIST dataset. |
| sigmoidprime.com | An exploration of Transformer-XL, a modified Transformer optimized for longer context lengths. |
| swethatanamala.github.io | In this paper, the authors propose a new, simple network architecture, the Transformer, based solely on attention mechanisms, removing convolutions and recurrences entirely. The Transformer is the first transduction model relying entirely... |
| www.swyx.io | That one time we tried to emulate our brains with computer chips |