Can LLMs Generate Novel Research Ideas?

jedberg · on Sept 12, 2024

An LLM is like a well read college student with a nearly photographic memory that sometimes mixes things up.

It's great for bouncing ideas off of and getting feedback on them. And yeah, it might produce "novel ideas" by mixing and matching existing ideas, but LLMs will never create truly novel ideas. Not in their current form.

The paper didn't really answer the question sadly: their conclusion was just that humans rate LLM answers as more novel than human ones, but less feasible.

ryandvm · on Sept 12, 2024

Let's be real though. 99% of people 99% of the time are also not coming up with novel ideas.

LLMs seem to mostly be limited right now by the fact that they're always losing context on new conversations and their interactions don't rewire their neural nets. Hard to come up with a new idea when your brain gets reset with every conversation.

chaosist · on Sept 12, 2024

I agree, but I have tried many times to get an LLM to intersect two ideas in a way that would be novel, and the LLM cannot do this at all.

We shouldn't expect the stochastic parrot to be able to do this though and it is unfair to the stochastic parrot.

It is like expecting a real parrot to say words it has never heard before.

No one asks that of a real parrot because we don't anthropomorphize a real parrot like we do the LLM.

owenpalmer · on Sept 12, 2024

> Hard to come up with a new idea when your brain gets reset with every conversation.

Haha I love this picture!

lostmsu · on Sept 12, 2024

You are trivially incorrect.

LLMs (unless used in deterministic mode, which you shouldn't anyway) will eventually generate all ideas for the same reason as 1_000_000 monkeys on typewriters will eventually generate War and Peace. The question is only how soon in practice this will happen.

Well, I am nearly certain that 1_000_000 LLMs (or rather 1_000_000 streams of generation using a few LLMs) will do better than 1_000_000 monkeys on typewriters.

The same 1_000_000 LLMs could do better than 1_000_000 average humans, but we don't know yet.

Zambyte · on Sept 12, 2024

> And yeah, it might produce "novel ideas" by mixing and matching existing ideas, but LLMs will never create truly novel ideas. Not in their current form.

Can you give a historic example of a human creating "truly novel ideas" that is not the product of mixing and matching existing ideas?

api · on Sept 12, 2024

This argument can be applied recursively. For example: the information in the human genome is just information from the environment transferred into the genome by evolutionary learning. Ultimately you get back to some kind of first cause argument where everything goes back to God or whatever natural process created information in the universe. Either nothing new is actually new or everything new is new.

In the end it becomes moot. A novel rearrangement of existing ideas that does something new or different is creativity.

Can LLMs do that? I think they can to a limited extent, but not as well as humans. Is it something we get with scale or does it require a fundamental architectural innovation? Don't know.

F-Lexx · on Sept 12, 2024

> Can you give a historic example of a human creating "truly novel ideas" that is not the product of mixing and matching existing ideas?

The invention of PCR comes to mind:

> During a symposium held for centenarian Albert Hofmann, Hofmann said Mullis had told him that LSD had "helped him develop the polymerase chain reaction that helps amplify specific DNA sequences".

https://en.m.wikipedia.org/wiki/Kary_Mullis

regularfry · on Sept 12, 2024

https://www.iflscience.com/lsd-dna-pcr-the-strange-origins-o... would argue that it was exactly mixing and matching at play here, but that the component parts wouldn't necessarily have been called to mind without the hallucination.

PaulHoule · on Sept 12, 2024

You don't even need drugs...

https://en.wikipedia.org/wiki/August_Kekul%C3%A9#Kekul%C3%A9...

jedberg · on Sept 13, 2024

How about the Transformer models that power LLMs? That was pretty novel.

jedberg · on Sept 12, 2024

I mean, it depends on how pedantic you want to be. I'd say e=mc2 was pretty novel, but yes, it was based on existing math concepts.

But I don't think an LLM could ever come up with something like that.

ajross · on Sept 12, 2024

> I'd say e=mc2 was pretty novel

Well, the rest mass notion in special relativity is just a natural derivation. It appears once you have the "real" ideas in place. And those aren't really "existing math" at all. It was a pure physics idea at root: "the universe's laws don't change if you are in motion" (or alternate framings like "you can't tell if you're moving from inside a moving box").

Well, it turns out that if you try to construct such a theory, you end up needing some different (pre-existing) math to help describe it. But the idea isn't math at all.

Then the question is "Can LLMs propose new well-framed, evocative theories like relativity", and the answer is sort of open. In point of fact human beings, to first approximation, can't do this either!

Validark · on Sept 12, 2024

Considering the fact that we know the names of the individuals who came up with new ideas in physics, it's extremely rare. If every other college kid was in that category, then we could call it "human ability".

gmaster1440 · on Sept 12, 2024

Relevant to this discussion is the fact that if an LLM can’t come up with that, it wouldn’t be due to the inability to mix and match to form novel ideas, but something else, and that something else hasn’t been clearly articulated yet.

staticman2 · on Sept 12, 2024

I think every major idea was truly novel at some point. Even an idea like "I will write down the things that happened, so when I am dead, others will know what has happened from my writings" was novel at some point.

Which means that an LLM can probably technically come up with novel ideas, since novel ideas aren't some special category of things, it's just that LLMs are not very good at it.

bongodongobob · on Sept 12, 2024

Why? Just a gut feeling?

michaelt · on Sept 12, 2024

Perhaps it's specific to present-day LLMs and things will improve in the future - but when I ask my favourite LLM to suggest creative, novel marketing options for a coffee shop it suggests a tenth-drink-free punch card. Which is pretty much the least novel answer imaginable.

It's such a banal suggestion it makes me think there could be a tension between the requirement to be creative, and the requirement that the next token be among the most probable.
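To make that tension concrete, here is a minimal sketch of temperature-scaled sampling probabilities. The candidate ideas and their scores are made-up numbers for illustration, not output from any real model; the point is only that a low temperature concentrates nearly all probability on the most likely (and most banal) option, while a higher temperature spreads it out.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidates and scores (assumptions, not real model logits).
candidates = ["punch card", "latte art class", "coffee-scented postcards"]
logits = [5.0, 2.0, 0.5]

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, dict(zip(candidates, (round(p, 3) for p in probs))))
```

Raising the temperature makes the unusual options more likely to be picked, but it does the same for nonsense, which is presumably why defaults stay conservative.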

bongodongobob · on Sept 12, 2024

I mean if that's how you're prompting it that's your problem. What are marketing options? Do you mean campaign? Campaign to do what? Increase repeat business? Get new customers? Expand into a new market?

"Hey make me a cool like marketing thing or whatever" isn't going to work.

moralestapia · on Sept 12, 2024

What does `e=mc2` mean?

mbivert · on Sept 12, 2024

https://en.wikipedia.org/wiki/Mass%E2%80%93energy_equivalenc...

> Mass–energy equivalence states that all objects having mass, or massive objects, have a corresponding intrinsic energy, even when they are stationary. In the rest frame of an object, where by definition it is motionless and so has no momentum, the mass and energy are equal or they differ only by a constant factor, the speed of light squared (c²).

moralestapia · on Sept 12, 2024

Thanks.

Now go check the "History" section of that article; which, by the way, takes up about half of it.

mbivert · on Sept 12, 2024

Alright.

I'm trying to come up with other examples, but as the parent said, it then depends how pedantic we want to be e.g. take Poincaré's spacetime, but he'd still be working from two pre-existing ideas, "space" and "time"; the idea of combining the two is (feels?) quite novel and unexpected.

Going back some more, the notion of "space" (or that of "time") feels more primitive, less explainable in terms of other notions.

moralestapia · on Sept 13, 2024

Yes, but the whole "combine things and see if they make other things make sense" seems to me like a procedure that may be reproduced by pattern matching at a very large scale.

PaulHoule · on Sept 12, 2024

Creativity is one of the most problematic concepts in psychology. I am right now reading

https://www.amazon.com/Sounds-Bell-Jar-Psychotic-Authors/dp/...

where the authors (a psychologist and two literature critics) carefully tease apart the connection between creativity and psychosis. That connection is of course problematic, because insanity mostly gets in the way of being creative, which leads to a much more serious definition of what "being creative" really means than one usually finds. (One thing they point out is that a third-rate artist (Andy Warhol?) can become quite prominent if they are good at marketing their work.)

People who are religious will make a theological argument to the effect that "God gave you the power to create when he created you" or "You can be creative because God is inside you".

Atheists may dismiss these arguments out of hand but it's a mistake to do so because of

https://en.wikipedia.org/wiki/Ontological_argument

in the sense that "God" can be defined as "the reason why there is something instead of nothing" which could have no relation to the image of some old patriarch on a throne. If we are made "in his image" we should consider the image revealed in a microscope that reveals that we are based on cybernetic principles that apply to the individual cell as well as the whole organism and how those principles apply to the evolution of language and culture as they do to our genetic endowment.

(Insofar as God can delegate his creative ability to you, can't you further delegate it?)

A vulgar version of this is Roger Penrose's "I can solve math problems because I am a thetan" where he claims to be exempt from the problems that Gödel and Tarski and Turing warned you about but since there is nothing complete or consistent about Roger Penrose these don’t apply (he can't solve Collatz and neither can an OT VIII!)

Muddy thinkers may reject the existence or relevance of God or not explicitly believe they are "a spirit in the material world" but often think there is something uniquely human about creativity (can other animals be creative?) but I sense that the ghost of the arguments above is behind that thinking.

In this transcript I get Python to create something that was never seen before and will never be seen again

   >>> uuid4()
   UUID('21205a92-2611-4710-b120-4a94f5ccf2d9')

which is by no means interesting; real creativity involves creating something that is useful and/or expressive using certain resources and subject to some system of constraints. Insofar as some task is repeatable, creativity is involved in the creation of some process and/or system that makes the task repeatable.

As Edison put it “Genius is one percent inspiration and ninety-nine percent perspiration” so it is not so interesting that the LLM can generate novel (yuck I hate that word, "novel" is the first word I delete when I have to squash a long paper title to fit into 80 characters) research ideas, I'll be impressed when it can fill out a grant application that gets funded.

uludag · on Sept 12, 2024

I'd be curious what an LLM would be able to accomplish if it was given all pre-modern texts we have access to. Would it be able to come up with ideas we know as modern (political theories, philosophical ideas, scientific theories, etc.)?

I have a hard time believing that if we fed an LLM all prehistoric speech uttered from humans that no matter what, it would never escape the paradigms of those people. If that is the case, then would relying on LLMs just get us stuck in our own paradigms and prevent true "progress"?

onemoresoop · on Sept 12, 2024

LLMs are just large language models, they're not capable of thinking or coming up with novel ideas. They could enable people to do so by providing food for brainstorming.

janalsncm · on Sept 12, 2024

Anytime I see an argument of the form X is just Y an alarm bell goes off in my head. While it is descriptive of the components, these arguments ignore the possibility of emergent complexity. Sometimes things are more than just the sum of their parts.

The sun is just a bunch of hydrogen.

A computer is just a bunch of transistors.

Humans are just a bunch of cells.

An artificial neural network is just a bunch of matrix multiplications.

I personally think LLMs are extremely limited and overhyped. But this form of argument seems incorrect to me since it can be used to argue that LLMs cannot do things we already know they can do.

ninetyninenine · on Sept 12, 2024

Probably overhyped from a business and media perspective. From a technological perspective LLMs are the first thing we've ever built that can rocket past a Turing test.

LLMs are like mentally challenged autistic human beings. The fact that we even built such a thing is a milestone in humanity.

mkaic · on Sept 12, 2024

> "...they're not capable of thinking or coming up with novel ideas."

This claim is unfalsifiable given common definitions of "thinking" and "coming up with novel ideas".

lostmsu · on Sept 12, 2024

No, it is very much demonstrably false.

Here's a simple program that does not even do LLMs that will trivially enumerate all ideas (broken UTF8 handling omitted for brevity):

  for (var numeral = new BigInteger(0); ; numeral++)
    WriteLine(UTF8.GetString(numeral.ToByteArray()));

For any given idea in English LLMs will certainly get it faster.
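For anyone who wants to poke at the enumeration idea, here is a rough Python sketch of the same loop. It only approximates the snippet above: the byte-length calculation and the `errors="replace"` handling are my assumptions, and the name `enumerate_texts` is made up for illustration.

```python
from itertools import count, islice

def enumerate_texts():
    """Yield the UTF-8 decoding of the integers 0, 1, 2, ... read as little-endian bytes."""
    for n in count():
        length = max(1, (n.bit_length() + 7) // 8)  # at least one byte, even for 0
        raw = n.to_bytes(length, "little")
        # Invalid byte sequences become U+FFFD instead of raising.
        yield raw.decode("utf-8", errors="replace")

# Show a handful of early outputs; most are control characters or single letters.
print([repr(s) for s in islice(enumerate_texts(), 65, 70)])
```

Every finite text whose last byte is nonzero shows up eventually, but the index of even a short sentence is astronomically large, which is the "how soon in practice" question again.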

jonathaneunice · on Sept 12, 2024

But it is verifiable. We could quibble over "who judges novelty," but I bet if there were regular examples of it doing so, and there were some community agreement the ideas were indeed suitably novel, we'd pretty quickly shout "existence proof!" and be done.

mensetmanusman · on Sept 12, 2024

An LLM could come up with an infinite amount of new theories and ideas, of which >99% would be meaningless, if given enough time and energy.

Humans’ ability is in guiding the prompt (of AIs, and the mind) on what is worth knowing and which hypotheses are worth testing.

Choosing an action from the countably infinite number of actions is the general framework of free will.

ninetyninenine · on Sept 12, 2024

I think the amazing thing about the LLM is that it's not a random text generator so it's not 99% meaningless.

I give it only 50% meaningless.

mensetmanusman · on Sept 12, 2024

With nearly unlimited energy, if you set up an indefinite while-loop asking it to continue to expand and make new theories on the same token vector it would approach >99% meaninglessness.

ninetyninenine · on Sept 12, 2024

Yeah that's because the only thing changing is some random seed. It's like looking at f(0) = 1 and f(0.00001) = 1.00001 and saying that the function can't produce anything novel because the answer is always less than 2. Hint: try f(99999).

Of course it's all meaningless because most of the input and output is virtually identical. Vary the input heavily and then you will approach 50%. Of course this is assuming the token vector is truly random in terms of subject matter.

mensetmanusman · on Sept 13, 2024

It’s not just a random seed though, this would be attempting to squeeze everything out of the LLM network with all the seeds :)

ninetyninenine · on Sept 13, 2024

If you wanted to squeeze everything out of the network it's not just varying the seeds. It's varying the token vector.

mensetmanusman · on Sept 13, 2024

Agreed, I doubt the full squeeze would be 50% correct on new hypotheses though. Would be a fun PhD thesis if I went back to grad school :D

Teever · on Sept 12, 2024

I've wondered that too. If we trained an LLM on only the scientific content that predates Einstein could it come up with general relativity?

How much coaxing would it require to get it there? What would that process be like?

Can we learn from that and get an LLM trained on all scientific material pre Einstein and post to discover new physics stuff from what we learn from that process?

staticman2 · on Sept 12, 2024

>> Can we learn from that and get an LLM trained on all scientific material pre Einstein and post to discover new physics stuff from what we learn from that process?

The parts of an LLM that teach it language are not disconnected from the parts that teach it facts. Good luck teaching it language with only pre Einstein data.

Teever · on Sept 12, 2024

Why can't you just omit all modern content related to physics? There's lots of training data that doesn't mention any of Einstein's work.

staticman2 · on Sept 12, 2024

Maybe you could, but any influence these concepts had on general culture might still leak in.

Teever · on Sept 12, 2024

To me that's an interesting feature, not necessarily a bug.

How much would have to leak in to guide the LLM to the right conclusions? Is it a matter of quantity or quality? Can we successfully eliminate all leakage?

deisteve · on Sept 12, 2024

i just read this paper on arxiv and im still trying to wrap my head around the implications so the authors got over 100 nlp researchers to write down novel research ideas and then had them review ideas generated by a large language model (llm) without knowing which ones were human generated and which ones were from the llm and the results are pretty fascinating the llm generated ideas were actually judged to be more novel than the human ones p value < 0.05 but slightly weaker in terms of feasibility i mean whats the point of having a novel idea if its not feasible right but still its pretty cool that the llm can come up with stuff that humans havent thought of before

and the authors are saying that this study highlights some of the open problems in building research agents that can generate novel ideas like the llm was bad at self evaluation it couldnt tell which of its own ideas were good or bad and they also found that the llm generated ideas that were too similar to each other lacking diversity i mean thats not surprising right llms are trained on huge datasets but theyre still just pattern recognition machines they dont really understand the context or the implications of what theyre generating

but heres the thing novelty is hard to judge even for experts i mean how do you even define novelty is it just something that nobody has thought of before or is it something that challenges our current understanding of the world and the authors are proposing a follow up study where they actually have researchers execute these ideas into full projects to see if the novelty and feasibility judgements actually translate into meaningful differences in research outcomes which is a great idea i mean thats the only way we can really know if these llms are useful for accelerating scientific discovery or not

anyway im rambling on now but i just think this is a really interesting area of research and im excited to see where it goes can we really use llms to accelerate scientific discovery and what are the limitations of these models and how can we overcome them etc etc

karmakurtisaani · on Sept 12, 2024

Your output would benefit from proper punctuation. Are you using some kind of speech-to-text application? Certainly looks like it.

Interesting comment nevertheless, if not somewhat difficult to parse.

fasa99 · on Sept 13, 2024

It's a brilliant idea, but LLMs are predicated upon the massive data of the internet.

That being said, you could take a pre-2018 dataset and test it for discoveries/insights circa 2019 or 2020 and see how well it performs.
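A minimal sketch of that kind of date-cutoff split, assuming each document carries a publication date. The field names and the three example records are hypothetical placeholders, not a real dataset.

```python
from datetime import date

def split_by_cutoff(docs, cutoff):
    """Separate documents into those published before the cutoff and those on or after it."""
    train = [d for d in docs if d["published"] < cutoff]
    held_out = [d for d in docs if d["published"] >= cutoff]
    return train, held_out

# Hypothetical records, for illustration only.
docs = [
    {"title": "paper A", "published": date(2016, 5, 1)},
    {"title": "paper B", "published": date(2017, 11, 20)},
    {"title": "paper C", "published": date(2019, 3, 7)},
]

train, held_out = split_by_cutoff(docs, date(2018, 1, 1))
print([d["title"] for d in train])     # material the model may see
print([d["title"] for d in held_out])  # later ideas to compare its output against
```

The hard part is not the split but the leakage discussed upthread: later ideas can seep into text that is nominally dated before the cutoff.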

eagerpace · on Sept 12, 2024

Is this how our simulation started?

Circlecrypto2 · on Sept 12, 2024

Even if they can't, individuals are prone to inspiration from an LLM's attempt. Worth giving it a shot at least.

vunderba · on Sept 12, 2024

Strongly agree.

Back in high school, I put together a program called DreamPool that would just literally pick out a few nouns from a gigantic dictionary file and bubble them up in a little graphic of a well, and then I would sit quietly and spin those concepts around attempting to connect them together.

LLMs are like a version of this on steroids and the potential as a tool for augmentation is huge.
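For the curious, the DreamPool idea described above could be sketched roughly like this. It is a guess at the core mechanic only (no graphic of a well), and the word list, function name, and parameters are placeholders I made up rather than anything from the original program.

```python
import random

def dream_pool(words, k=3, seed=None):
    """Pick k distinct words at random to use as brainstorming prompts."""
    rng = random.Random(seed)  # a seed makes a draw repeatable
    return rng.sample(words, k)

# Placeholder noun list; a real run would load a large dictionary file instead.
nouns = ["umbrella", "fan", "coffee", "parrot", "typewriter", "well", "chess", "map"]

print(dream_pool(nouns, k=3))
```

The program only supplies the raw combinations; finding a connection worth keeping is still up to whoever is staring at them.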

influx · on Sept 12, 2024

DreamPool sounds very cool, almost like brain exercise!

kelseyfrog · on Sept 12, 2024

Even if we've essentially created a rubber duck that's slightly easier to engage with, that sounds like a win.

ModernMech · on Sept 12, 2024

This is true, I find LLMs to be the best mental lubricant ever created. Cures writer's block and gets me out of creative and technical jams all the time.

dageshi · on Sept 12, 2024

I'm outlining a fantasy story I want to write, I've found using image generation (leonardo.ai) very helpful in nailing down what the world looks like and how it'll work, I've actually changed my mind on a few things based on what the AI generated.

Plus honestly it's just genuinely fun! I've resisted moving to one of their paid packages just yet because I think I'd spend all day on it!

marmaduke · on Sept 12, 2024

Devil's advocate: writer's block and jams are symptoms, not bugs to fix.

roninorder · on Sept 12, 2024

Could you elaborate? Symptoms of what?

marmaduke · on Sept 12, 2024

Lots of things I think: lack of understanding, lack of motivation. Typical suspects. I think we look for easy solutions, like using LLMs for instance (but not exclusively), to do busywork when we shouldn’t be doing the busywork in the first place. Busywork is just an example, but in creative non-busywork it could also be the extensive, time-intensive exploration required to advance. In jazz, for instance, spending hundreds if not thousands of hours on scales and arpeggios over the instrument is a non-negotiable expense to access fluidity.

So while I think LLMs are great fun and I don’t judge anyone using them, personally wanting to use one for a difficult problem is a flag that I should allow more time to think before going ahead.

throwanem · on Sept 12, 2024

Not OP, but much writing advice with which I'm familiar, and my own experience, both suggest these are symptoms of an error made earlier and not yet recognized.

If I can't figure out where the plot could possibly go from here, go back and look for where I sent it off the rails earlier such that there's nowhere to take it now; if I can't find dialogue or action that fits, go back and find where I put the character in a situation they'd never get themselves in, or miswrote them to respond to it in a way they never would. Stuff like that, especially once it's had you stuck too long for inspiration latency still to be tenable as the cause.

marmaduke · on Sept 12, 2024

Yep, and in general I think The Matrix got it right:

when discussing Trinity and Neo's visions, the Oracle states: "We can never see past the choices we don't understand".

throwanem · on Sept 12, 2024

Or have yet to realize that we made, yeah.

roninorder · on Sept 12, 2024

Good advice, thanks!

Comma2976 · on Sept 12, 2024

Writer's traffic congestion

6510 · on Sept 12, 2024

For some reason my mind maps concepts from chess onto the real world and the other way around. The novel AI ideas in chess are usually combinations of several... well... bad ideas. People usually take on one bad idea at a time and try to make it work. If they succeed it is quite surprising because objectively it was a bad idea.

Our thinking is a lot less error prone than that of LLMs, but we have to study for years to absorb prior art that LLMs mostly receive at birth by osmosis. It won't be able to take on the truly stupid ideas but it can combine large numbers of the somewhat stupid.

Like a chess position with lots of possible moves followed by lots of possible responses.

Incipient · on Sept 13, 2024

Taking a step back. Define "novel".

I have the idea for an induced draft umbrella. Stick a fan at the top under an opening.

Is that idea novel? I haven't seen it anywhere, it's just something I came up with. But it's not entirely novel, I'm just borrowing the concept of a fan, and an umbrella.

I don't feel this is entirely out of scope for what an LLM could describe in words?

fzeindl · on Sept 12, 2024

Could rolling dice with matching phrases generate novel research ideas? (ok the metaphor is slightly off, but not a lot)

AnimalMuppet · on Sept 12, 2024

Maybe yes.

If you give it phrases from, say, the last five years' worth of published research papers, and it combines phrases at random and spits out the words, yes, in that there will be some interesting research ideas, and maybe even some that a human would not have come up with.

Unfortunately, they'll be buried in a blizzard of stuff that no human would come up with, because they are totally meaningless combinations. Finding the good ones is the issue - and it's not an issue that LLMs can currently solve.

jrsharma · on Sept 13, 2024

Looks like @sama's o1 is killing PhD science questions. https://x.com/sama/status/1834283100639297910?t=iCRehNoBofMP...

pfannkuchen · on Sept 12, 2024

Would you be able to tell if it did? There could be an obscure document in the training set that contains the idea. It seems like a very hard problem to definitively detect whether a concept came from the training set.

valine · on Sept 12, 2024

A single document doesn’t contribute much to the loss during base training. LLMs can absolutely memorize text if it’s duplicated in the training set, not so much if the document in question is obscure.

mercurialsolo · on Sept 12, 2024

The difference between novelty and hallucination is feasibility. With integrated critique and feasibility checks you can eventually map the hallucination space into novelty.

breck · on Sept 12, 2024

Yes.

LLMs have helped me generate 1 novel discovery: that every top 10 programming language has a single creator (https://pldb.io/blog/aSingleCreator.html).

They also helped me generate this map yesterday (https://pldb.io/blog/whereInnovation.html), which is the most comprehensive map of programming language creation and software innovation ever created.

smokel · on Sept 12, 2024

How do you define single creator? The person who typed the first keystroke?

It seems pretty silly to claim that Algol 60 has thirteen creators. Sure, Wikipedia lists that many names in some table, but that is not really a significant statistic, now is it? Java 22 probably has hundreds of people who contributed to it.

Fun exercise though :)

breck · on Sept 12, 2024

> How do you define single creator?

The creators define how many creators there were.

It's all open source and anyone can fix any mistakes by updating one line (for example, here's the creators entry for Python: https://github.com/breck7/pldb/blob/b8ae74253733e4aa0fb57d26...).

We generally don't have any disputes but there's the occasional error and always open to pull requests to fix those.

But you do bring up a good point about contributors/maintainers. Adding that data is definitely on the priority list, but might be a 2025/2026 thing.

svieira · on Sept 12, 2024

Apache is credited with things it didn't initiate, but accepted after they were given up by another company (e.g. Cassandra).

nullc · on Sept 12, 2024

shuf -n 3 /usr/dict/words will also sometimes generate novel research ideas.

browningstreet · on Sept 12, 2024

Harmony Korine, the poet (done at SXSW 2010 with GordonandtheWhale.com)

https://youtu.be/PqvAmlnJTfU?si=C8M7IGnJ_iYJE-8_

bdhcuidbebe · on Sept 12, 2024

[flagged]

throw310822 · on Sept 12, 2024

There once was a user online,

Who thought AI verse wasn't fine.

"These LLMs can't reason!"

He cried with displeasure,

"Their rhymes are just data, not mine!"

-

He typed with furious might,

Certain robots can't versify right.

But with each posted screed,

The irony grew indeed -

His arguments lacked the insight.

-

For while he debated with zeal,

AIs composed with appeal.

They reasoned through meter,

Made metaphors sweeter,

And rhymed with poetic ideal.

amelius · on Sept 12, 2024

No.

Perhaps we can use LLMs to invalidate patents. Want to check a patent? Just download the LLM model from before the issue date, then ask the LLM to produce the work. If it succeeds, you have invalidated the patent because the work was not novel.

bee_rider · on Sept 12, 2024

Huh.

Regardless of folks' opinions on the actual plan, does this idea imply an interesting question?

In some sense all of the ideas in a book already exist. Before I read the book, they haven’t been put through a process of being interpreted by my eyeballs and ingested into my brain. But, they do already exist.

Do the ideas in an LLM latent space (or whatever) already exist? They haven’t been read or interpreted yet, but the sentient cognition that goes into creating the idea has already happened.

What does it mean for an idea to exist anyway?

ninetyninenine · on Sept 12, 2024

https://openai.com/index/learning-to-reason-with-llms/

amelius · on Sept 12, 2024

Conversely, you may ask: do we really want to grant patents for ideas that can be generated simply by asking an LLM?