Explore >> Select a destination


You are here

teddykoker.com

jaketae.github.io
4.1 parsecs away

In this post, we will take a look at Nyström approximation, a technique I came across in Nyströmformer: A Nyström-based Algorithm for Approximating Self-Attention by Xiong et al. This is yet another interesting paper that seeks to bring the runtime of self-attention down to linear. While there are many intricacies to the Nyström method, the goal of this post is to provide a high-level intuition for how the method can be used to approximate large matrices, and how it was used in the aforementioned paper.
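
For a concrete sense of the technique the excerpt describes, here is a minimal sketch of the classic Nyström low-rank approximation of a symmetric PSD matrix: sample m landmark columns C and the corresponding m×m landmark block W, then reconstruct K ≈ C W⁺ Cᵀ. The function name, the RBF kernel, and the uniform random landmark choice are illustrative assumptions, not details taken from the post.

```python
import numpy as np

def nystrom_approximation(K, landmark_idx):
    """Rank-m Nystrom approximation of a symmetric PSD matrix K.

    K_approx = C @ pinv(W) @ C.T, where C holds the sampled columns
    and W is the block of K at the landmark rows/columns.
    """
    C = K[:, landmark_idx]                     # n x m sampled columns
    W = K[np.ix_(landmark_idx, landmark_idx)]  # m x m landmark block
    return C @ np.linalg.pinv(W) @ C.T         # n x n, rank <= m

# Toy example: approximate an RBF kernel matrix from 20 random landmarks.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-sq_dists)                          # 500 x 500 PSD kernel matrix
landmarks = rng.choice(500, size=20, replace=False)
K_hat = nystrom_approximation(K, landmarks)
print(np.linalg.norm(K - K_hat) / np.linalg.norm(K))  # relative Frobenius error
```

With a fast-decaying kernel spectrum, a small number of landmarks already yields a small relative error, which is the intuition the Nyströmformer paper exploits for attention matrices.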
windowsontheory.org
3.9 parsecs away

Previous post: ML theory with bad drawings. Next post: What do neural networks learn and when do they learn it; see also all seminar posts and the course webpage. Lecture video (starts at slide 2 since I hit the record button 30 seconds too late - sorry!) - slides (pdf) - slides (PowerPoint with ink and animation)...
francisbach.com
3.5 parsecs away

Travel
[AI summary] The blog post discusses the spectral properties of kernel matrices, focusing on the analysis of eigenvalues and their estimation using tools like the matrix Bernstein inequality. It also covers the estimation of the number of integer vectors with a given L1 norm and the relationship between these counts and combinatorial structures. The post includes a detailed derivation of bounds on the difference between true and estimated eigenvalues, highlighting the role of the degrees of freedom and the impact of regularization in kernel methods. It also touches on the importance of spectral analysis in machine learning and its applications across domains.
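
As a pointer to the quantity the summary highlights, the degrees of freedom of a kernel matrix under ridge regularization is standardly defined as below. This is the textbook definition from the kernel-methods literature, not a formula quoted from the post, and conventions differ on whether the factor n appears next to the regularization parameter.

```latex
% Degrees of freedom of a kernel matrix K with regularization \lambda,
% written in terms of the eigenvalues \lambda_i(K) of K.
\mathrm{df}(\lambda)
  = \operatorname{tr}\!\bigl( K (K + n\lambda I)^{-1} \bigr)
  = \sum_{i=1}^{n} \frac{\lambda_i(K)}{\lambda_i(K) + n\lambda}
```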
futurism.com
15.6 parsecs away

This post was originally written by Manan Shah as a response to a question on Quora.