Outer Web | Explore

You are here		www.lesswrong.com Empirical work that might shed light on scheming (Section 6 of "Scheming AIs") - LessWrong
\|	\|	joecarlsmith.com Can we safely automate alignment research? - Joe Carlsmith	2.8 parsecs away Travel
\|	\|	It's really important; we have a real shot; there are a lot of ways we can fail.	2.8 parsecs away Travel
\|	\|	www.alignmentforum.org Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research - AI Alignment Forum	1.8 parsecs away Travel
\|	\|	Evan et al. argue for developing "model organisms of misalignment" - AI systems deliberately designed to exhibit concerning behaviors like deception o...	1.8 parsecs away Travel
\|	\|	www.greaterwrong.com Role Architectures: Applying LLMs to consequential tasks - LessWrong 2.0 viewer	5.2 parsecs away Travel
\|	\|	TL;DR: Strong problem-solving systems can be built from AI systems that play diverse roles, LLMs can readily play diverse roles in role architectures, and AI systems based on role architectures can be practical, safe, and effective in undertaking complex and consequential tasks. This article explores the practicalities and challenges of aligning large language models (LLMs) to play central roles in performing tasks safely and effectively. It highlights the potential value of Open Agency and related role architectures in aligning AI for general applications while mitigating risks.	5.2 parsecs away Travel
\|	\|	www.lesswrong.com AI #107: The Misplaced Hype Machine - LessWrong	26.0 parsecs away Travel
\|	\|	The most hyped event of the week, by far, was the Manus Marketing Madness. Manus wasn't entirely hype, but there was very little there there in that...	26.0 parsecs away Travel