harvardnlp.github.io
nlp.seas.harvard.edu
The Annotated Transformer

sigmoidprime.com
An exploration of Transformer-XL, a modified Transformer optimized for a longer context length.

blog.eleuther.ai
Rotary Positional Embedding (RoPE) is a new type of position encoding that unifies absolute and relative approaches. We put it to the test. (A minimal sketch of the rotation idea follows this list.)

igorstechnoclub.com
This week I learned something that finally made "transfer learning" click. I had always heard that you can hit strong accuracy fast by reusing a pretrain...
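
As a companion to the RoPE entry above: the sketch below is not from the linked EleutherAI post. It assumes the standard RoPE formulation from the RoFormer paper (pairwise 2-D rotations with frequency base 10000) and uses plain numpy to keep it self-contained. It demonstrates the "unifies absolute and relative" claim: each vector is rotated by its absolute position, yet the resulting dot product depends only on the relative offset.

```python
import numpy as np

def rope(x: np.ndarray, pos: int, base: float = 10000.0) -> np.ndarray:
    """Rotate each (even, odd) dimension pair of x by pos * theta_i,
    where theta_i = base ** (-2i / d) -- the standard RoPE frequencies."""
    d = x.shape[-1]
    theta = base ** (-np.arange(0, d, 2) / d)    # one frequency per pair
    angles = pos * theta
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x_even * cos - x_odd * sin  # 2-D rotation, applied
    out[..., 1::2] = x_even * sin + x_odd * cos  # to each pair
    return out

# q and k are each rotated by their *absolute* position, yet the dot
# product (the attention score) depends only on the *relative* offset.
rng = np.random.default_rng(0)
q, k = rng.standard_normal(64), rng.standard_normal(64)
score_near = rope(q, 5) @ rope(k, 2)     # positions 5 and 2, offset 3
score_far = rope(q, 105) @ rope(k, 102)  # same offset, shifted by 100
assert np.isclose(score_near, score_far)
```

The assertion holds exactly (up to floating point) because a dot product of two rotated pairs depends only on the angle difference, which here is (m - n) * theta_i.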