Eventually we will get AI-based title analysers that simulate the ranking conditions to predict and optimise where your title will land in the front-page listing order.
I absolutely want to read some of these. There needs to be a way for people to post and rank the best-matching articles that actually exist. I dream of the day that generative AI is repurposed to act as a more effective search engine.
Unfortunately it doesn't work anymore, but the titles it generated for the papers were all plausible but also ridiculous CS paper titles, e.g. "Rooter: A Methodology for the Typical Unification of Access Points and Redundancy"
I got "Install the NuGet package manager on a Mac" which I'm still not sure is ridiculously infeasible or the kind of hack somebody might actually manage to pull off. Definitely HN-worthy if they manage it!
Is there a dataset of HN titles? This made me want to fiddle with this, but step one is to get the data, and I don't want to crawl HN if the data has already been collected.
There's an API[0], but it's frustratingly limited in capabilities (albeit not rate-limited). You'll have to iterate over all post IDs, download each post as JSON, and get the titles that way.
There's also a Google dataset but I don't know the URL for it or if it's up to date.
What's missing precisely? Seems to be good enough for every use I could think of.
One time I even downloaded every single item from it with a threaded fetcher (16 threads, I think), iterating from 1 up to the latest ID, and it was done in something like two hours.
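That bulk-fetch approach can be sketched roughly like this, assuming Node 18+ for the built-in fetch. The endpoint URL is the real Firebase one, but the function names, the injected fetcher, and the 16-wide batch default are illustrative assumptions, not anyone's actual script:

```javascript
// Fetch a range of item IDs, a batch at a time. `fetchOne(id)` is
// injected so the batching logic is independent of the transport.
async function fetchRange(first, last, fetchOne, batchSize = 16) {
  const items = []
  for (let id = first; id <= last; id += batchSize) {
    const hi = Math.min(id + batchSize - 1, last)
    const batch = []
    for (let i = id; i <= hi; i++) batch.push(fetchOne(i))
    items.push(...(await Promise.all(batch)))
  }
  // Deleted or missing items come back null / flagged; filtering has
  // to happen client-side because the API can't do it server-side.
  return items.filter((it) => it && !it.deleted)
}

// A real per-item fetcher against the actual HN Firebase endpoint:
const api = "https://hacker-news.firebaseio.com/v0"
const fetchItem = async (id) => (await fetch(`${api}/item/${id}.json`)).json()
```

You'd call it as `fetchRange(1, maxId, fetchItem)`; at 16 concurrent requests the couple-of-hours figure above seems about right for tens of millions of items.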
No ability to directly download threads with a single request, for one, or query it like a database to sort or filter results, exclude unwanted fields, etc.
Those things should be trivial to achieve with most general-purpose languages, as the API is so simple. No need for pagination or other things: just request items by ID recursively and you get the full thing, then after that filter/select whatever you want.
Pseudo-code to show how simple it would be:
const api = "https://hacker-news.firebaseio.com/v0"
async function get_thread(id) {
  let item = await (await fetch(`${api}/item/${id}.json`)).json()
  // the API calls the child-comment ID list "kids"
  if (item && item.kids) {
    item.fetched_children = await Promise.all(item.kids.map(get_thread))
  }
  return item
}
(untested, but you don't really need more than that, besides checking if the item was deleted)
Yes, and I've done it. But you still wind up having to make a separate request for each item, which makes building threads incredibly slow. It's also a waste of time if you're filtering anything out, because you still have to make the request and download the item just to discard it.
Which is why it would be preferable for the API itself to support these features.
It averages around $0.0005 per request according to the footer. Could this end up costing the author quite a bit due to HN traffic? Also, what's stopping bad actors from writing a script to continuously fetch the page?
I wonder if some sort of caching might help lower costs.
There's intentionally no caching, every batch is warm from the AI oven.
HN usually drives around 10k visits, so organically it's going to be well within my Saturday night budget. If someone decides to hammer it, well, the OpenAI account has a hard limit of $20/month. It will live until it's killed I guess.
Nothing. And in fact that's exactly what happened to me. Some fella from HN spawned something like 45 simultaneous wgets in a loop to cause maximum financial damage. All of a sudden we saw Firebase's cost graph go vertical.
It happened after I mentioned “just be kind, please! Theoretically this could cost a lot of money.”
So there’s at least one person who will do exactly this just for fun.
Firebase customer support was super cool about it, but it still knocked us off the paid tier.
On mobile, after clicking the link and getting distracted for a second, I could not tell that I was not on HN! I even tried to click multiple links, until I reached the bottom of the page and remembered!
Right. The only reason I didn't get confused for long is because I use a userstyle to render HN in dark mode, and this one displayed "plain". But my immediate reaction wasn't "hey, that's fake HN", but "hey, why is this HN tab rendering in default style?!".
Would love to see it go a step further and do the same thing for comment threads as well. Think I could possibly lose hours on a site like that rather than minutes. This was a lot of fun as well though.
"Ask HN: What's your strategy for winning the lottery?"
"Computer Science at MIT Does Not Exist."
Still others I would use as writing prompts for a blog. It would be about things an ML model produced because it thought this was what you would think would be popular. (I defy any ML model to craft a more painful sentence.)
"The first hour of a movie is often worth the whole movie"
"Transhumanist manifesto: explore my visions for the future"
"Fill up on technology. Technology freaks out. Tell your kids to build a crisis framework"
Super easy: I just took 10k titles/comments/points from the Algolia API, formatted them as JSON Lines like the following with jq, and fed them to the very well built and documented openai CLI.
{"prompt": "A plausible Hacker News title:", "completion": " The Feynman Lectures on Physics (1964) (280 points, 62 comments) END"}
The space at the beginning of the completion is for tokenizing, and the END token is for use as a stop token in the generations.
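For anyone not fluent in jq, the same formatting step can be sketched in JavaScript. The prompt/completion shape and the END stop token are taken from the line above; the input field names (title, points, num_comments) match what Algolia's HN search API returns for stories, but the function name is made up:

```javascript
// Turn one Algolia story object into one JSONL fine-tuning line.
function toJsonlLine(story) {
  return JSON.stringify({
    prompt: "A plausible Hacker News title:",
    // Leading space helps tokenization; END serves as the stop token.
    completion: ` ${story.title} (${story.points} points, ${story.num_comments} comments) END`,
  })
}
```

Mapping this over the hits and joining with newlines gives a file the openai CLI can take directly.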
Show HN: Daily XLSX to-do list with attached spreadsheet as material*
Banning JavaScript from web pages is bad for the user
Has Google become too social?
SQLite development from scratch from scratch
Armor-piercing lasers are not shooting lasers but missiles
We made a public blockchain off-chain
Apple sued for pricing user data against provider who did not provide refunds
100% Embarrassing Haskell Builds
Heroku Compose is not fit for purpose
Pain Enhancer
The distance between reality and fantasy is grows ever smaller.