Explore >> Select a destination


You are here

jaykmody.com
marcospereira.me
6.3 parsecs away

In this post we summarize the math behind deep learning and implement a simple network that achieves 85% accuracy classifying digits from the MNIST dataset.
sigmoidprime.com
8.2 parsecs away

An exploration of Transformer-XL, a modified Transformer optimized for longer context length.
swethatanamala.github.io
7.7 parsecs away

In this paper, the authors propose a new, simple network architecture, the Transformer, based solely on attention mechanisms, removing convolutions and recurrences entirely. The Transformer is the first transduction model relying entirely...
www.swyx.io
50.1 parsecs away

That one time we tried to emulate our brains with computer chips