You are here: teddykoker.com

sigmoidprime.com
An exploration of Transformer-XL, a modified Transformer optimized for longer context length.

francisbach.com
[AI summary] This text discusses the scaling laws of optimization in machine learning, focusing on asymptotic expansions for both strongly convex and non-strongly convex cases. It covers the derivation of performance bounds using techniques like Laplace's method and the behavior of random minimizers. The text also explains the 'weird' behavior observed in certain plots, where non-strongly convex bounds become tight under specific conditions. The analysis connects theoretical results to practical considerations in optimization algorithms.

blog.eleuther.ai
Rotary Positional Embedding (RoPE) is a new type of position encoding that unifies absolute and relative approaches. We put it to the test.

initialcommit.com
Learn Machine Learning (ML) and get ahead of the pack in the software industry. Machine Learning is a subfield of Artificial Intelligence (AI) that is heavily used in modern software systems. Machine Learning algorithms can improve software (such as a robot) and its ability to solve problems by gaining experience and knowledge.