
  > You recognize that you haven't really needed strong mathematical (or coding) skills to create models for some time.
And then something like this comes along [1], where researchers failed to control for multiple comparisons: "In this particular setting, emergent abilities claims are possibly infected by a failure to control for multiple comparisons. In BIG-Bench alone, there are ≥220 tasks, ∼40 metrics per task, ∼10 model families, for a total of ∼10^6 task-metric-model family triplets, meaning probability that no task-metric-model family triplet exhibits an emergent ability by random chance might be small."

[1] https://arxiv.org/abs/2304.15004
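
To make the multiple-comparisons point concrete, here is a rough back-of-the-envelope sketch in Python. It is not from the paper: the independence assumption, the per-test false-positive rate alpha = 0.05, and the function names are mine, purely for illustration. Multiplying the quoted counts as-is gives 88,000 triplets; the quote's own total is ∼10^6, and the conclusion is the same either way.

```python
import math


def prob_no_false_positive(n_tests: int, alpha: float) -> float:
    """Chance that none of n_tests comes up 'significant' by luck alone.

    Assumes the tests are independent and each has false-positive
    rate alpha (an illustrative assumption, not a claim from [1]).
    Computed as (1 - alpha) ** n_tests via logs for numerical stability.
    """
    return math.exp(n_tests * math.log1p(-alpha))


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the family-wise error rate <= alpha."""
    return alpha / n_tests


# Counts taken from the quote, treated as exact for the arithmetic.
tasks, metrics, families = 220, 40, 10
n_triplets = tasks * metrics * families  # 88_000

alpha = 0.05  # hypothetical per-test rate
p_clean = prob_no_false_positive(n_triplets, alpha)
per_test = bonferroni_threshold(n_triplets, alpha)

print(f"triplets: {n_triplets}")
print(f"P(no chance 'emergence' anywhere): {p_clean:.3g}")
print(f"Bonferroni per-test threshold: {per_test:.3g}")
```

Under these assumptions the probability of seeing zero spurious hits is effectively nil, which is exactly why an uncorrected search over that many triplets will turn up something that looks emergent. Bonferroni is only the simplest correction; real metrics on the same task are correlated, so it would be overly conservative here.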
