Hacker News | toxik's comments

Oh lord, are the LLMs already replacing LLMs?

$10k is well outside my budget for frivolous computer purchases.

It would be plenty in-budget if the software part of local AI was a bit more full-featured than it is at present. I want stuff like SSD offload for cold expert weights and/or for saved/cached KV-context, dynamic context sizing, NPU use for prefill, distributed inference over the network, etc. etc. to all be things that just work for most users, without them having to set anything up in an overly error-prone way. The system should not just explode when someone tries to run something slightly larger; it should undergo graceful degradation and let them figure out where the reasonable limits are.

But it's well within the budget of a small company that wants to run a model locally. There are plenty of reasons to run one locally even if it's not state of the art, such as for privacy, being able to do unlimited local experiments, or refining it to solve niche problems.

Yeah, but if you really, really wanted to and/or your livelihood depended on it, you probably could afford it.

99.97% of HN users are nodding… :)

There are so many good local uses for these models that I fully expect a standard workstation 10 years from now to start at 128GB of RAM and include at least one dedicated inference device.

Or, if you believe much of the HN crowd, we're in an AI bubble: in 10 years inference will be dirt cheap, because when all of this crashes we'll have all that data-center hardware sitting around, and it won't make any sense to run monster workstations at home. (I work on a 128GB M4 but don't run inference; just too many Electron apps running at the same time...) :)

> I work on a 128GB M4 but don't run inference; just too many Electron apps running at the same time.

This is somewhat depressing: needing a couple thousand bucks' worth of RAM just to run your chat app, code/text editor, API doc tool, forum app, and note-taking app all at the same time...


Crucial (Micron) sold 128GB of DDR5-5600 in SODIMM form for $280 a year ago. It would be slower than the same amount on an M4 Mac, but still, I object to characterizing either as “a couple thousand bucks worth”.

Inference will be dirt cheap for things like coding but you'll want much more compute for architectural planning, personal assistants with persistent real time "thinking / memory", as well as real time multimedia. I could put 10 M4s to work right now and it won't be enough for what I've been cooking.

That's kind of a specific percentage. What numbers did you use to get there?

Just have to reclassify it as non-frivolous, then. $10k's not a lot for something as important as a car, if you live somewhere one is required. Housing will typically cost you more than $10k. I probably spend close to $10k on food every 1.5 years.

So if you just huff enough of the AI Kool aid, you too can own a Mac Studio. Or an M5 MacBook. Or a dual 3090 rig.


Worth noting that even in clinical settings, ethanol has largely been replaced by fomepizole (Antizol) as the preferred antidote. It's more predictable, easier to dose, and doesn't come with the side effect of making a critically ill patient drunk. Ethanol is the field expedient, not the standard of care.

Great post. Also, we celebrate "midsummer" on the summer solstice in Sweden and other countries. I see the author noted that.

Pandas and so on exist for the same reason Django's ORM and SQLAlchemy do: people do not want to string-interpolate to talk to their database. SQL is great for DBAs and absolutely sucks for programmers. Microsoft was really onto something with LINQ, in my opinion.


You can reduce in parallel; that was the whole point of MapReduce. For example, the sum a+b+c+d+e+f+g+h can be found by first computing a+b, c+d, e+f, and g+h; then (a+b)+(c+d) and (e+f)+(g+h); then the final result (a+b+c+d)+(e+f+g+h). That's just three parallel steps to compute seven sums.


No, you cannot, not in general. Your example is correct only if the operation is associative, and it isn't always: floating-point addition, for instance, is not. Hence the need for higher abstractions, where you model the commutativity and associativity of certain operations.


"I have never woodworked a day in my life, with Claude Carpenter I don't have to touch the work at all and can just vaguely ask for things and pray that it does something useful."


If you're inexperienced you have no bookcase at all, going from that to a rickety bookcase is an enormous improvement.

(Perhaps this is why some devs dislike it; perhaps they place a very high value on the quality of their work.)


I mean, Claude Carpenter sounds pretty sick


Until it builds a stairway which leads to an attic in such a way that the access is under the shallowest part of the roof and unusable.

I've tried using the 3D generation stuff a bit, but it never worked out.

Still amazed that folks such as:

https://www.reddit.com/r/openscad/comments/1adcw41/i_am_comp...

manage to get anything usable in 3D at all. But making an STL is a far cry from making a useful architectural structure.


Pedantry: 18:16 is the same as 9:8 since it's a ratio.


Not just for functional programmers: prints and other I/O operations absolutely are side effects, which doesn't run counter to the point being made. Put a print inside an assert, and NDEBUG takes that behavior away.


You're right of course. I was thinking specifically of printing log/debug statements in the assert(..), but that usually only happens if the assert(..) fails and exits, and in that case the "no side effects" rule no longer matters.


And in a bold face font:

> You've always needed an account to operate your Joule Sous Vide with the Joule app. This is not a new requirement.

Absolute comedy.

