Outer Web | Explore

Explore >> Select a destination

You are here		cgad.ski When Gradient Descent Is a Kernel Method
\|	\|	bdtechtalks.com A simple guide to gradient descent in machine learning - TechTalks	1.8 parsecs away Travel
\|	\|	Gradient descent is the main technique for training machine learning and deep learning models. Read all about it.	1.8 parsecs away Travel
\|	\|	windowsontheory.org A blitz through classical statistical learning theory - Windows On Theory	2.5 parsecs away Travel
\|	\|	Previous post: ML theory with bad drawings Next post: What do neural networks learn and when do they learn it, see also all seminar posts and course webpage. Lecture video (starts in slide 2 since I hit record button 30 seconds too late - sorry!) - slides (pdf) - slides (Powerpoint with ink and animation)...	2.5 parsecs away Travel
\|	\|	francisbach.com Unraveling spectral properties of kernel matrices - I - Machine Learning Research Blog	2.6 parsecs away Travel
\|	\|	[AI summary] The blog post discusses the spectral properties of kernel matrices, focusing on the analysis of eigenvalues and their estimation using tools like the matrix Bernstein inequality. It also covers the estimation of the number of integer vectors with a given L1 norm and the relationship between these counts and combinatorial structures. The post includes a detailed derivation of bounds for the difference between true and estimated eigenvalues, highlighting the role of the degrees of freedom and the impact of regularization in kernel methods. Additionally, it touches on the importance of spectral analysis in machine learning and its applications in various domains.	2.6 parsecs away Travel
\|	\|	programmathically.com Understanding The Exploding and Vanishing Gradients Problem - Programmathically	17.9 parsecs away Travel
\|		Sharing is caringTweetIn this post, we develop an understanding of why gradients can vanish or explode when training deep neural networks. Furthermore, we look at some strategies for avoiding exploding and vanishing gradients. The vanishing gradient problem describes a situation encountered in the training of neural networks where the gradients used to update the weights []	17.9 parsecs away Travel