Explore >> Select a destination


You are here

blog.eleuther.ai
| | sebastianraschka.com
2.8 parsecs away

| | I'm Sebastian: a machine learning & AI researcher, programmer, and author. As a Staff Research Engineer at Lightning AI, I focus on the intersection of AI research, software development, and large language models (LLMs).
| | sigmoidprime.com
1.4 parsecs away

| | An exploration of Transformer-XL, a modified Transformer optimized for longer context length.
| | peterbloem.nl
2.7 parsecs away

| | [AI summary] The text provides an in-depth overview of the Transformer architecture, its evolution, and its applications. It begins by introducing the Transformer as a foundational model for sequence modeling, highlighting its ability to handle long-range dependencies through self-attention mechanisms. The text then explores various extensions and improvements, such as the introduction of positional encodings, the development of models like Transformer-XL and Sparse Transformers to address the quadratic complexity of attention, and the use of techniques like gradient checkpointing and half-precision training to scale up model size. It also discusses the generality of the Transformer, its potential in multi-modal learning, and its future implications across d...
| | natureofcode.com
14.0 parsecs away

| I began with inanimate objects living in a world of forces, and I gave them desires, autonomy, and the ability to take action according to a system of