blog.briankitano.com (You are here)
sigmoidprime.com: An exploration of Transformer-XL, a modified Transformer optimized for longer context length.
nlp.seas.harvard.edu: The Annotated Transformer
comsci.blog: In this tutorial, we will implement transformers step by step and understand their implementation. There are other great tutorials on implementing transformers, but they usually dive into the complex parts too early: they start with additions such as masks and multi-head attention, which is hard to follow intuitively before the core of the transformer has been built.
thedarkside.frantzmiccoli.com: The deep learning community relies on powerful libraries that enable more than I could dream of in terms of mathematical capabilities. Back in the day, I worked on an artificial neural network project where we implemented the derivatives by hand wherever we needed them. Seeing those projects made me want to toy around with their capabilities for other models, not necessarily artificial neural...