Hacker News

When none of the models, STOA or not, could answer any genuinely interesting question. All the models could do was regurgitate what had been expressed before, but nothing actually new was there until explicitly asked for, and even then it required filtering through so much noise that it was practically not interesting anymore, as it required all the knowledge to validate or invalidate the claims. That's when, a few years ago, I realized "Oh shit... despite all the tremendous effort and resources, it's still not that useful." Honestly this was NOT what I expected. Yet, it was an important realization.



Related but distinct: a few years later I asked an acquaintance to ask a model a question. I didn't want to bias the test, so I asked them to ask whatever they wanted. They asked "What time is it in Sri Lanka?", which I thought was a funny question. I predicted it wouldn't work because it was asked of an offline model, so I thought it wouldn't manage to get current data. Still, I didn't interfere and we watched the answer being provided. It was roughly factually correct information about Sri Lanka... but it did not give the correct time. Again, that's a rather basic question a young child would easily get right. You need the current time in a known timezone, the time difference, basic arithmetic and voila, you have the correct answer with an explanation to verify. Here it didn't work, and I was there trying to explain how a STOA open-source model, which required thousands if not millions in resources, training time, researcher salaries, etc., could not even handle that random basic question. Another "oh shit" moment, again not the one I expected, which is precisely why to me it was, and still is, interesting.
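To make the arithmetic concrete, here is a minimal sketch of the steps listed above (current time in a known timezone, plus the time difference). It assumes Sri Lanka's fixed UTC+05:30 offset with no daylight saving; the function name `time_in_sri_lanka` is just an illustrative choice, not anything the model or the test used.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Sri Lanka uses a fixed UTC+05:30 offset with no daylight saving.
SRI_LANKA = timezone(timedelta(hours=5, minutes=30), name="UTC+05:30")


def time_in_sri_lanka(moment: datetime) -> datetime:
    """Convert a timezone-aware datetime to Sri Lanka local time."""
    if moment.tzinfo is None:
        raise ValueError("need a timezone-aware datetime (a known timezone)")
    return moment.astimezone(SRI_LANKA)


# Fixed instant, so the arithmetic can be verified by hand:
# 12:00 UTC + 5 h 30 min = 17:30 in Sri Lanka.
fixed = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(time_in_sri_lanka(fixed).strftime("%H:%M"))  # 17:30

# The live answer needs the clock read at query time,
# which is exactly the input an offline model does not have.
print(time_in_sri_lanka(datetime.now(timezone.utc)).strftime("%Y-%m-%d %H:%M"))
```

The only non-trivial input is the current time itself; everything else is a fixed offset and an addition.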

"I googled 'what is my bank balance' and it couldn't even tell me. What a waste of resources."

I didn't mention resources here.

The point of the test was to ask somebody with no bias on HOW the result was produced.


"I couldn't remember the order of the words in 'state of the art' so I just spray and pray across the keyboard like usual. I can't tell the difference because I'm just a pattern matching bot"

Oops, unfortunately too late to fix. I actually misspell it often... apologies if it caused confusion!

A few years ago, as you say, this was true. Nowadays I guess you just have to bite the bullet that Erdős problems aren’t interesting.

I already commented on the Erdős problems; that is also a jagged frontier.

Curious what your interesting questions were, you should be able to find them in your chat history.

That was more than a decade ago, so unfortunately not. I should have kept those questions though. I even mentioned in a comment on HN a while ago that unanswered or wrongly answered questions are precisely what should be used as a batch test when new models are released.

Here's a good one for you: "Explain the double slit experiment which way variation"

If they say anything about leaving two straight lines, then it fails. Just tried Gemini, and it failed.

This is an extremely common misconception that has spread all throughout the internet, and so it is baked into the training data. The real answer is that there are multiple ways to do which-way double slit experiments, but Einstein's thought experiment proves it's impossible for any of them to have an interference pattern, as that would violate Heisenberg's Uncertainty Principle.

Somehow, not leaving an interference pattern became twisted into leaving a specific pattern of two lines, which then falsely implies that quantum objects lose their quantum behavior in certain circumstances. The field of quantum physics becomes so much simpler to understand once you realize that all of this is hogwash.

The best reference I can find for where this myth started is a documentary about quantum physics that tries to connect it with mysticism. On the other hand, Wikipedia actually has it correct. In its "which way" section in the double slit experiment page, it correctly says "A well-known thought experiment predicts that if particle detectors are positioned at the slits, showing through which slit a photon goes, the interference pattern will disappear".


What? What LLM were you using a decade ago? Am I misreading you?

You might not be aware of it but GenAI predates OpenAI which was founded more than 10 years ago anyway.

Of course I am aware, but how is this relevant today? How does that prove that the science is irrelevant and wasted?

Did I say that the science is irrelevant and wasted?

No. GenAI means LLMs right now. I agree it didn't in the past, but definitions change.

Are you sure you're asking the right questions?

To me they were important questions. Maybe totally uninteresting to you.

What question?

I can't recall but basic stuff like P = NP. /s

My point was precisely to challenge STOA in domains, not questions with well-known answers.


What is STOA? Do you mean SOTA?

Yes sorry I misspelled it in the whole thread.


