You are here: www.alignmentforum.org

www.lesswrong.com
A new paper from Google, in which they get a language model to solve some (of what to me reads as terrifyingly impressive) tasks which require quanti...

www.lesswrong.com
TL;DR: We train sparse autoencoders (SAEs) on artificial datasets of 2D points, which are arranged to fall into pre-defined, visually-recognizable...

deepmind.google
Announcing a comprehensive, open suite of sparse autoencoders for language model interpretability.

iamirmasoud.com
Amir Masoud Sefidian