You are here: www.lesswrong.com

www.alignmentforum.org
This piece gives an overview of the alignment problem and makes the case for AI alignment research. It is crafted both to be broadly accessible to th...

www.greaterwrong.com
Eric Drexler, Centre for the Governance of AI, University of Oxford. This document argues for "open agencies" - not opaque, unitary agents - as the appropriate model for applying future AI capabilities to consequential tasks that call for combining human guidance with delegation of planning and implementation to AI systems. This prospect reframes and can help to tame a wide range of classic AI safety challenges, leveraging alignment techniques in a relatively fault-tolerant context.

www.alignmentforum.org
"Discovering Language Model Behaviors with Model-Written Evaluations" is a new Anthropic paper by Ethan Perez et al. that I (Evan Hubinger) also coll...

gilkalai.wordpress.com
Two weeks ago I was invited, together with my colleague Shay Mozes, to visit the Israeli Quantum Computing Center, located near Tel Aviv University and quite close to my home. That morning my wife told me not to be disappointed if I happened to see some quantum computers there :), and I assured her...