www.oranlooney.com
jaketae.github.io

In this post, we will take a look at the Nyström approximation, a technique I came across in Nyströmformer: A Nyström-based Algorithm for Approximating Self-Attention by Xiong et al. This is yet another interesting paper that seeks to make the self-attention algorithm more efficient, reducing it to linear runtime. While there are many intricacies to the Nyström method, the goal of this post is to provide a high-level intuition of how the method can be used to approximate large matrices, and how it was used in the aforementioned paper.
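As a rough illustration of the idea (a sketch, not code from the post): for a symmetric positive semi-definite matrix, the Nyström method reconstructs the full matrix from a small set of sampled "landmark" columns, as K ≈ C W⁺ Cᵀ, where C holds the sampled columns and W their intersection block. All function and variable names below are illustrative:

```python
import numpy as np

def nystrom_approximation(K, num_landmarks, rng=None):
    """Approximate a PSD matrix K (n x n) as C @ pinv(W) @ C.T."""
    rng = np.random.default_rng(rng)
    n = K.shape[0]
    idx = rng.choice(n, size=num_landmarks, replace=False)
    C = K[:, idx]              # n x m block: all rows, sampled landmark columns
    W = K[np.ix_(idx, idx)]    # m x m block: sampled rows and columns
    return C @ np.linalg.pinv(W) @ C.T

# Usage: approximate a low-rank PSD Gram matrix from 50 landmarks.
X = np.random.default_rng(0).normal(size=(500, 20))
K = X @ X.T                    # 500 x 500, but only rank 20
K_hat = nystrom_approximation(K, num_landmarks=50, rng=1)
print(np.linalg.norm(K - K_hat) / np.linalg.norm(K))  # near zero
```

The approximation is good when the sampled landmarks capture the range of the matrix; the Nyströmformer paper builds on this landmark idea to approximate the softmax attention matrix.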
francisbach.com
michael-lewis.com

This is a short summary of some of the terminology used in machine learning, with an emphasis on neural networks. I've put it together primarily to aid my own understanding, phrasing it largely in non-mathematical terms. As such, it may be of use to others who come from more of a programming background than a mathematical one.
sausheong.github.io

I have written a lot of computer programs in my career, most of the time to solve various problems or perform some tasks (or sometimes just for fun). For the most part, other than bugs, as long as I tell the computer what to do very clearly (in whichever programming language I use), it will obediently follow my instructions. This is because computer programs are really good at executing algorithms - instructions that follow defined steps and patterns that are precise and often repetitious.