
In addition to passing the bar exam[0], it showed improved performance on medical questions[1] and on economics questions that experts thought it was years away from answering[2]. Those, plus all the other things marked in green on page 6, were just the changes from 3.5 to 4: https://arxiv.org/pdf/2303.08774

4o added image analysis.

The o-series starting at o1 improves on 4o as per the margins in these charts: https://openai.com/index/learning-to-reason-with-llms/

I'll have to wait and see about o3, because only the mini model is out so far.

[0] https://law.stanford.edu/2023/04/19/gpt-4-passes-the-bar-exa...

[1] https://ai.nejm.org/doi/full/10.1056/AIdbp2300192

[2] https://www.betonit.ai/p/gpt-4-takes-a-new-midterm-and-gets



At this point just paste my comments into ChatGPT and ask it to explain to you what I mean by them. Then paste your response and ask it why it's not addressing the point made. At least use the tool for what it's good for.


So you're saying that it understands you better than I do?

I get that feeling too (in both directions), but this vague and hard-to-quantify sensation is not what I'd suggest in response to your clearly stated question:

> And what has enormously improved since ChatGPTs launch?

Which is, I think, answered by the things I listed.


It doesn’t understand me, but it could help you understand. What you listed aren’t major unexpected leaps but incremental improvements on things that already were known to be possible.

But you insist on being obstinate. ChatGPT advised me to disengage from this conversation.


This is highly misleading.

ChatGPT did not ace the bar exam -- it was basically percentile-graded against a group of people who mostly failed. Compared to real lawyers, it was in the 15th percentile on the essay portion.[0]

[0] https://law-ai.org/re-evaluating-gpt-4s-bar-exam-performance...


I said pass, not ace.

15th percentile of passes, on the weakest aspect, is still a big improvement over "not passing". That improvement is what I wish to highlight.

(The observation that the 48th percentile of passes (the lowest overall figure from your link, let alone the 15th for essays) corresponds to the 90th percentile of all exam takers suggests that perhaps too many humans are taking the exams before they're ready.)



