www.paepper.com
sebastianraschka.com

I'm an LLM Research Engineer with over a decade of experience in artificial intelligence. My work bridges academia and industry, with roles including senior staff at an AI company and statistics professor. My expertise lies in LLM research and the development of high-performance AI systems, with a deep focus on practical, code-driven implementations.
erikbern.com

I made a New Year's resolution: every plot I make during 2018 will contain uncertainty estimates. Nine months in and I have learned a lot, so I put together a summary of some of the most useful methods.
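As a minimal sketch of the kind of uncertainty estimate discussed there, the snippet below draws percentile-bootstrap confidence intervals around group means; the data, the group names, and the bootstrap_ci helper are made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Hypothetical observations for two groups (illustrative data only).
groups = {"A": rng.normal(1.0, 0.5, 40), "B": rng.normal(1.3, 0.7, 35)}

def bootstrap_ci(x, n_boot=10_000, alpha=0.05):
    """Percentile bootstrap confidence interval for the mean of x."""
    means = np.array([rng.choice(x, size=len(x), replace=True).mean()
                      for _ in range(n_boot)])
    return np.quantile(means, [alpha / 2, 1 - alpha / 2])

labels, centers, errs = [], [], []
for name, x in groups.items():
    lo, hi = bootstrap_ci(x)
    labels.append(name)
    centers.append(x.mean())
    errs.append([x.mean() - lo, hi - x.mean()])  # asymmetric error bars

plt.errorbar(labels, centers, yerr=np.array(errs).T, fmt="o", capsize=4)
plt.ylabel("mean with 95% bootstrap CI")
plt.show()
```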
possiblywrong.wordpress.com

Introduction Let's play a game: I will repeatedly flip a fair coin, showing you the result of each flip, until you say to stop, at which point you win an amount equal to the fraction of observed flips that were heads. What is your strategy for deciding when to stop? This weekend 6/28 is "Two-Pi...
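The game is easy to experiment with numerically. Below is a rough Monte Carlo sketch of one naive stopping rule (stop as soon as the running fraction of heads exceeds 1/2, otherwise stop at a fixed cap); the threshold and cap are illustrative choices, not the post's analysis.

```python
import random

def play_once(max_flips=1000, threshold=0.5):
    """Play the coin-flip stopping game with a naive rule:
    stop as soon as the fraction of heads exceeds `threshold`,
    otherwise stop after `max_flips`. Returns the payout."""
    heads = 0
    for n in range(1, max_flips + 1):
        heads += random.random() < 0.5  # fair coin: heads with prob 1/2
        frac = heads / n
        if frac > threshold:
            return frac
    return frac

n_games = 100_000
avg = sum(play_once() for _ in range(n_games)) / n_games
print(f"average payout over {n_games} games: {avg:.4f}")
```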
iclr-blogposts.github.io

The product between the Hessian of a function and a vector, the Hessian-vector product (HVP), is a fundamental quantity to study the variation of a function. It is ubiquitous in traditional optimization and machine learning. However, the computation of HVPs is often considered prohibitive in the context of deep learning, driving practitioners to use proxy quantities to evaluate the loss geometry. Standard automatic differentiation theory predicts that the computational complexity of an HVP is of the same order of magnitude as the complexity of computing a gradient. The goal of this blog post is to provide a practical counterpart to this theoretical result, showing that modern automatic differentiation frameworks, JAX and PyTorch, allow for efficient computat...
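As a small illustration of the "HVP at roughly gradient cost" idea, here is a sketch of the standard forward-over-reverse construction in JAX, with a toy loss function chosen purely for illustration; PyTorch exposes analogous functional transforms.

```python
import jax
import jax.numpy as jnp

# Toy scalar-valued function of a parameter vector (illustrative only).
def loss(w):
    return jnp.sum(jnp.tanh(w) ** 2)

def hvp(f, w, v):
    """Hessian-vector product via forward-over-reverse autodiff:
    differentiate the gradient of f along the direction v,
    without ever materializing the full Hessian."""
    return jax.jvp(jax.grad(f), (w,), (v,))[1]

w = jnp.arange(1.0, 6.0)   # parameters
v = jnp.ones_like(w)       # direction vector

print(hvp(loss, w, v))
# Sanity check against the explicitly materialized Hessian.
print(jax.hessian(loss)(w) @ v)
```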