Outer Web | Explore

You are here		www.lesswrong.com Toy Models of Superposition - LessWrong
goodfire.ai Goodfire | AI Interpretability	2.2 parsecs away
	Goodfire is an AI research company building practical interpretability tools for safe and reliable generative models.
www.alignmentforum.org Towards Monosemanticity: Decomposing Language Models With Dictionary Learning - AI Alignment Forum	1.8 parsecs away
	Text of post based on our blog post as a linkpost for the full paper which is considerably longer and more detailed. ...
transformer-circuits.pub Towards Monosemanticity: Decomposing Language Models With Dictionary Learning	2.9 parsecs away
	[AI summary] The text discusses the interpretability of features in a machine learning model, focusing on how features like Arabic, base64, and Hebrew are used in interpretable ways. It explores the extent to which these features explain the model's behavior, noting that features with higher activations are more interpretable. The text also addresses the limitations of current methods, such as the computational cost of simulating features and the potential for dataset correlations to influence feature interpretations. Finally, it concludes that the model's learning process creates a richer structure in its activations than the dataset alone, suggesting that feature-based interpretations provide meaningful insights into the model's behavior.
www.depthfirstlearning.com Variational Inference with Normalizing Flows · Depth First Learning	12.4 parsecs away
	[AI summary] A detailed set of questions and reading materials on normalizing flows, variational inference, and generative models. The content covers the use of normalizing flows to enhance variational posteriors, the inference gap, and the implementation of models like NICE and RealNVP.