Outer Web | Explore

Explore >> Select a destination

You are here		www.lesswrong.com Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] - LessWrong
\|	\|	bair.berkeley.edu Evaluating and Testing Unintended Memorization in Neural Networks - The Berkeley Artificial Intelligence Research Blog	5.1 parsecs away Travel
\|	\|	www.scottaaronson.com Research Papers and Surveys	5.0 parsecs away Travel
\|	\|	transformer-circuits.pub Towards Monosemanticity: Decomposing Language Models With Dictionary Learning	3.9 parsecs away Travel
\|	\|	[AI summary] The text discusses the interpretability of features in a machine learning model, focusing on how features like Arabic, base64, and Hebrew are used in interpretable ways. It explores the extent to which these features explain the model's behavior, noting that features with higher activations are more interpretable. The text also addresses the limitations of current methods, such as the computational cost of simulating features and the potential for dataset correlations to influence feature interpretations. Finally, it concludes that the model's learning process creates a richer structure in its activations than the dataset alone, suggesting that feature-based interpretations provide meaningful insights into the model's behavior.	3.9 parsecs away Travel
\|	\|	blog.otoro.net Generating Large Images from Latent Vectors \| 大トロ	19.5 parsecs away Travel
\|		[AI summary] This text discusses the development of a system for generating large images from latent vectors, combining Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). It explores the use of Compositional Pattern-Producing Networks (CPPNs) to create images with specific characteristics, such as style and orientation, by manipulating latent vectors. The text also covers arithmetic on latent vectors to generate new images and the potential for creating animations by transitioning between latent states. The author suggests future research directions, including training on more complex datasets and exploring alternative training objectives beyond Maximum Likelihood.	19.5 parsecs away Travel