You are here: ljvmiranda921.github.io
irisvanrooijcogsci.com
Three weeks ago, I wrote a blogpost about how ChatGPT is a "stochastic parrot" (a term coined by Bender, Gebru, McMillan-Major, & Shmitchell, 2021; see also this video for an explanation) and when used for academic (and other) writing constitutes automated plagiarism. My aim was to bring the discussion down to earth and prevent that...
www.nngroup.com
Plausible but incorrect AI responses create design challenges and user distrust. Discover evidence-based UI patterns to help users identify fabrications.
www.schneier.com
In a new paper, "Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models," researchers found that turning LLM prompts into poetry resulted in jailbreaking the models. Abstract: We present evidence that adversarial poetry functions as a universal single-turn jailbreak technique for Large Language Models (LLMs). Across 25 frontier proprietary and open-weight models, curated poetic prompts yielded high attack-success rates (ASR), with some providers exceeding 90%. Mapping prompts to MLCommons and EU CoP risk taxonomies shows that poetic attacks transfer across CBRN, manipulation, cyber-offence, and loss-of-control domains. Converting 1,200 ML-Commons harmful prompts into verse via a standardized meta-prompt produced ASRs up to...
anyscale-staging.herokuapp.com
Explore a technical comparison of leading Reinforcement Learning (RL) libraries for LLMs from Ray. This guide analyzes frameworks like TRL, Verl, and RAGEN to help developers choose the best tools for RLHF, reasoning, and agentic AI.