mark_yellow's comments | Hacker News

mark_yellow 6 months ago | on: New paper: A single character can make or break yo...

One can manipulate LLM rankings to put any model in the lead, just by changing the single character that separates the demonstration examples.

- MMLU performance varies by +/- 23% depending on the choice of delimiter across leading open model families (Llama, Qwen, and Gemma).
- Closed models, such as GPT-4o, are also brittle to the choice of delimiter.
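
To make "the single character separating demonstration examples" concrete, here is a rough sketch of how the same few-shot prompt can be assembled with different delimiters. The question/answer template, the example data, the candidate delimiters, and the `build_prompt` helper are all my own assumptions for illustration, not the paper's actual setup.

```python
# Hypothetical sketch: build the same few-shot prompt with different delimiters.
# The template and the delimiter list are illustrative assumptions.

def build_prompt(demos, query, delimiter="\n"):
    """Join the demonstrations and the final query with one delimiter string."""
    blocks = [f"Question: {q} Answer: {a}" for q, a in demos]
    blocks.append(f"Question: {query} Answer:")
    return delimiter.join(blocks)

# Made-up demonstration examples.
demos = [("2 + 2 = ?", "4"), ("Capital of France?", "Paris")]

# Candidate single-character delimiters to compare.
candidates = ["\n", " ", "#", "!", ","]

# One prompt per delimiter; everything else is held fixed.
prompts = {d: build_prompt(demos, "3 + 5 = ?", d) for d in candidates}

for d, p in prompts.items():
    print(repr(d), "->", repr(p))
```

Each prompt would then go to the model under evaluation and be scored as usual. The only thing that differs between runs is the delimiter, so any accuracy gap comes from that one character.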
