You are here

www.molecule.xyz
deepmind.google
5.3 parsecs away

New AI system designs proteins that successfully bind to target molecules, with potential for advancing drug design, disease understanding and more.

www.cureffi.org
4.6 parsecs away

20 years after the discovery of RNA interference, the first drug is approved. Great news, but our path forward for prion disease remains unchanged.

www.molecule.to
0.0 parsecs away

Ringing in 2024, longevity stalwart VitaDAO has funded Dr. Michael Torres' work to nullify a nonsense mutation that is implicated in a wide range of cancers and age-related diseases.

qwenlm.github.io
27.9 parsecs away

Introduction. Reinforcement Learning (RL) has emerged as a pivotal paradigm for scaling language models and enhancing their deep reasoning and problem-solving capabilities. To scale RL, the foremost prerequisite is maintaining stable and robust training dynamics. However, we observe that existing RL algorithms (such as GRPO) exhibit severe instability during long training runs, leading to irreversible model collapse and hindering further performance improvements with increased compute. To enable successful RL scaling, we propose the Group Sequence Policy Optimization (GSPO) algorithm.