jingnanshi.com
iclr-blogposts.github.io
The product between the Hessian of a function and a vector, the Hessian-vector product (HVP), is a fundamental quantity to study the variation of a function. It is ubiquitous in traditional optimization and machine learning. However, the computation of HVPs is often considered prohibitive in the context of deep learning, driving practitioners to use proxy quantities to evaluate the loss geometry. Standard automatic differentiation theory predicts that the computational complexity of an HVP is of the same order of magnitude as the complexity of computing a gradient. The goal of this blog post is to provide a practical counterpart to this theoretical result, showing that modern automatic differentiation frameworks, JAX and PyTorch, allow for efficient computat...
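The result the post refers to, that an HVP costs about as much as a gradient, follows from differentiating the gradient along a direction instead of building the full Hessian. A minimal sketch in JAX (my own illustration, not the post's code; `f` is an arbitrary test function):

```python
import jax
import jax.numpy as jnp

def f(x):
    # arbitrary smooth test function
    return jnp.sum(jnp.sin(x) ** 2)

def hvp(f, x, v):
    # forward-over-reverse: push the tangent v through the gradient map,
    # giving H(x) @ v without ever materializing the Hessian
    return jax.jvp(jax.grad(f), (x,), (v,))[1]

x = jnp.ones(3)
v = jnp.array([1.0, 0.0, 0.0])
print(hvp(f, x, v))
```

For this `f` the Hessian is diagonal with entries `2 * cos(2x)`, so the result can be checked in closed form.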
liorsinai.github.io
A series on automatic differentiation in Julia. Part 1 provides an overview and defines explicit chain rules.
thenumb.at
[AI summary] This text provides a comprehensive overview of differentiable programming, focusing on its application in machine learning and image processing. It explains the fundamentals of automatic differentiation, including forward and backward passes, and demonstrates how to implement these concepts in a custom framework. The text also discusses higher-order differentiation and its implementation in frameworks like JAX and PyTorch. A practical example is given using differentiable programming to de-blur an image, showcasing how optimization techniques like gradient descent can be applied to solve real-world problems. The text emphasizes the importance of differentiable programming in enabling efficient and flexible computation for various domains, includ...
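The "custom framework" idea the summary mentions, a backward pass over a recorded computation graph, can be sketched in a few lines of plain Python (a toy scalar version under my own assumptions, not the article's implementation):

```python
class Value:
    """Minimal scalar reverse-mode autodiff node."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._grad_fn = None  # propagates this node's grad to its parents

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def grad_fn(g):
            self.grad += g
            other.grad += g
        out._grad_fn = grad_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def grad_fn(g):
            self.grad += g * other.data
            other.grad += g * self.data
        out._grad_fn = grad_fn
        return out

    def backward(self):
        # topologically order the graph, then propagate grads in reverse
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    build(p)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            if v._grad_fn is not None:
                v._grad_fn(v.grad)

x, y = Value(3.0), Value(4.0)
z = x * y + x          # dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)  # prints 5.0 3.0
```

The forward pass records parents and local gradient rules; the backward pass replays them once, which is why one gradient costs roughly one extra pass over the graph.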
aclanthology.org
[AI summary] The text provides an overview of various natural language processing (NLP) and machine learning research topics. It covers a wide range of areas including: grammatical error correction, text similarity measures, compositional distributional semantics, neural machine translation, dependency parsing, and political orientation prediction. The text also discusses the development of datasets for evaluating models, the importance of readability in reading comprehension tasks, and the use of advanced techniques such as nested attention layers and error-correcting codes to improve model performance. The key themes include the advancement of NLP models, the creation of evaluation datasets, and the exploration of new methods for text analysis and understa...