Outer Web | Explore

Click any link that interests you, or select a planet on the right, to continue exploring the Outer Web.

You are here: sebastianraschka.com Improving LoRA: Implementing Weight-Decomposed Low-Rank Adaptation (DoRA) from Scratch

sigmoidprime.com Transformer-XL: A Memory-Augmented Transformer (3.4 parsecs away)
An exploration of Transformer-XL, a modified Transformer optimized for longer context length.

jaketae.github.io LoRA - Jake Tae (2.3 parsecs away)
I recently completed another summer internship at Meta (formerly Facebook). I was surprised to learn that one of the intern friends I met was an avid reader of my blog. Encouraged by the positive feedback from my intern friends, I decided to write another post before the end of summer. This post is dedicated to the mandem: Yassir, Amal, Ryan, Elvis, and Sam.

vickiboykis.com GGUF, the long way around | Vicki Boykis (3.1 parsecs away)
What are ML artifacts?

harvardnlp.github.io The Annotated Transformer (4.5 parsecs away)
[AI summary] The provided code is a comprehensive implementation of the Transformer model, including data loading, model architecture, training, and visualization. It also includes functions for decoding and visualizing attention mechanisms across different layers of the model. The code is structured to support both training and inference, with examples provided for running the model and visualizing attention patterns.