Outer Web | Explore

/explore

Click through on any links that interest you or select the planets on the right to continue exploring the Outer Web.

You are here		www.lesswrong.com A central AI alignment problem: capabilities generalization, and the sharp left turn - LessWrong
www.greaterwrong.com Role Architectures: Applying LLMs to consequential tasks - LessWrong 2.0 viewer	4.2 parsecs away
TL;DR: Strong problem-solving systems can be built from AI systems that play diverse roles, LLMs can readily play diverse roles in role architectures, and AI systems based on role architectures can be practical, safe, and effective in undertaking complex and consequential tasks. This article explores the practicalities and challenges of aligning large language models (LLMs) to play central roles in performing tasks safely and effectively. It highlights the potential value of Open Agency and related role architectures in aligning AI for general applications while mitigating risks.
distill.pub AI Safety Needs Social Scientists	4.5 parsecs away
If we want to train AI to do what humans want, we need to study humans.
www.alignmentforum.org Critique of some recent philosophy of LLMs' minds - AI Alignment Forum	2.7 parsecs away
I structure this post as a critique of some recent papers on the philosophy of mind in application to LLMs, concretely, on whether we can say that LL...
www.alignmentforum.org Risks from Learned Optimization: Introduction - AI Alignment Forum	18.3 parsecs away
AI researchers warn that advanced machine learning systems may develop their own internal goals that don't match what we intended. This "mesa-optimiz...