Hacker News | zkmon's comments

The biggest rule-break was done, not by the agent or infra company, but by the person who gave such elevated authorization (API key) to an autonomous bot.

Isn’t the biggest rule to have working backups with 3-2-1 strategy?

That's not what happened.

If an API key with full permissions was put in a place where the agent could access it, that is the biggest problem.

That somebody made a key that can delete prod when they don't need to delete prod is the underlying problem.

And underlying that still is that the staging environments were on the same account as prod.


You’re very defensive in these comments - are you the author?

Yesterday was a realization point for me. I gave a simple extraction task to Claude Code with a local LLM and it "whirred" and "purred" for 10 minutes. Then I submitted the same data and prompt directly to the model via the llama_cpp chat UI, and the model single-shotted it in under a minute. So obviously something is wrong with the coding agent, or with the way it talks to the LLM.

Now I'm looking for an extremely simple open-source coding agent. Nanocoder doesn't seem to install on my Mac, and it brings node_modules bloat, so no. Opencode seems not quite open source. For now, I'm doing the work of the coding agent myself and using the llama_cpp web UI. Chugging along fine.


https://pi.dev/ seems popular. What's not open source about opencode? The repo has an MIT license.

Been LOVING Pi so far!

+1 for pi. I used Claude and opencode, but pi is the first agent tool that made me excited about the whole thing.

Some people believe only copyleft licenses are open source. They're right on principle, wrong in (legal) practice.

They're not even right on principle: https://www.gnu.org/licenses/license-list.html

Even the FSF recognizes that non-copyleft licenses still follow the Freedoms, and therefore are still Free Software.


Maybe it's just my feeling. It asks to update/upgrade continuously.

It's completely open source, but is under heavy continual development (likely a lot of AI coding).

On launch, it checks for updates and autoupdates.


It doesn't auto update. Maybe you have an extension?


Probably a silly idea, but I'll throw it into the mix - have your current AI build one for you. You can have exactly the coding agent you want, especially if you're looking for "extremely simple".

I got annoyed enough with Anthropic's weird behavior this week to actually try this, and got something workable up and running in a few days. My case was unique: there's no Claude Code for BeOS, or for my older/ancient Macs, so it was easier to bootstrap and stitch something together if I really wanted an agentic coding tool on those platforms. You'll learn a lot about how models actually work in the process, and about how much crazy, ridiculous band-aid patching is happening in Claude Code. Though you might also come to appreciate some of the difficulties that the agent/harness has to solve. (And to be clear, I'm still using CC when I'm on a platform that supports it.)

As for the llama_cpp vs Claude Code delays, I've run into that too. My theory is that the API is prioritized over Claude Code subscription traffic. The API certainly feels way faster. But you're also paying significantly more.


Just in case it didn't occur to you already, you can just build whatever coding agent you want. They're pretty simple

Swival is not bloated and was specifically made for local agents: https://swival.dev

pi.dev as well

I use both Cursor and Claude Code, and yes, the latter is noticeably slower with the same model at the same settings.

However, it's hard to justify Cursor's cost. My bill was $1,500/mo at one point, which is what encouraged me to give CC a try.


You'd figure by now we would have something between a TUI and an IDE.

You can run CC with local models, it's pretty straightforward. I've done this with vLLM + a thin shim to change the endpoint syntax.
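The "thin shim to change the endpoint syntax" mentioned above could take several forms; a minimal sketch of the core payload translation might look like this. The Anthropic-side field names follow its public Messages API; the model name and defaults here are hypothetical, and a real shim would also need an HTTP proxy layer and response translation.

```python
# Sketch: translate an Anthropic Messages API request body into the
# OpenAI-style /v1/chat/completions schema that vLLM serves.

def anthropic_to_openai(body: dict) -> dict:
    """Map an Anthropic-style request onto an OpenAI chat-completions request."""
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # the OpenAI schema expects it as the first message.
    if "system" in body:
        messages.append({"role": "system", "content": body["system"]})
    for msg in body.get("messages", []):
        content = msg["content"]
        # Anthropic content may be a list of typed blocks; flatten the text blocks.
        if isinstance(content, list):
            content = "".join(
                block.get("text", "")
                for block in content
                if block.get("type") == "text"
            )
        messages.append({"role": msg["role"], "content": content})
    return {
        "model": body.get("model", "local-model"),  # hypothetical default
        "messages": messages,
        "max_tokens": body.get("max_tokens", 1024),
        "temperature": body.get("temperature", 1.0),
    }
```

A proxy built around a function like this only has to rewrite the request and response bodies; vLLM's OpenAI-compatible server handles the rest.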

what model you used with llama_cpp?

Qwen3.6-35B quant-4 gguf

They released a 1.6T pro base model on Hugging Face. First time I'm seeing a "T" model here.

Kimi K2.5 and K2.6 are both >1T

Bots would win over all anti-spam, anti-slop measures. All blog posts and comments everywhere would be filled with spam and slop. That's when humanity turns its head away from screens, back towards other humans nearby, and starts talking to each other, while the ocean of slop and spam keeps bubbling, infested with bots.

On llama server, the Q4_K_M is giving about 91K of context on 24GB, which works out to about 70MB per 1K tokens of context (KV cache). I could have gone for Q5, which would probably leave about 30K tokens of space. I think this is pretty impressive.
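The per-token KV cache cost behind that "MB per 1K context" figure can be estimated from the architecture. The numbers below are hypothetical (48 layers, 8 KV heads with GQA, head dim 128, fp16 cache), not the actual model's; a figure as low as ~70MB/1K usually implies a quantized KV cache or fewer layers/heads.

```python
def kv_cache_bytes_per_token(n_layers: int, n_kv_heads: int,
                             head_dim: int, bytes_per_elem: float) -> float:
    """Per-token KV cache cost: a K and a V vector stored at every layer."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem

# Hypothetical architecture: 48 layers, 8 KV heads, head_dim 128, fp16 cache.
per_token = kv_cache_bytes_per_token(48, 8, 128, 2)
per_1k_mib = per_token * 1000 / 2**20  # MiB per 1K tokens of context
```

Plugging in your model's real layer count, KV head count, and cache dtype gives the MB-per-1K figure, and from there how much context fits in the VRAM left after the weights.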

I have been getting good results with IQ4_NL and TurboQuant at 4bits on 24gb (3090). It easily fits 256k with that setup, but it starts slowing down quite a bit after 80-100k. Quality in my testing is also still good:

- Coding task test: https://github.com/sleepyeldrazi/llm_programming_tests/ - Design task test: https://github.com/sleepyeldrazi/llm-design-showcase

Coding was against minimax-m2.7 and glm-5, and the design against other small models


The raw math: file size is hard-linked to parameter count and quant type. Intelligence is loosely linked to parameter count. Parameter count dictates the hardware requirement. What's left for the labs is compressing more intelligence into a lower parameter count, packing in more specialized intelligence, or buying up more hardware. Those are the only three directions all models/labs are heading.
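The file-size half of that math is simple enough to sketch. The 35B parameter count and ~4.5 effective bits per weight below are illustrative (roughly what a Q4_K_M-style quant averages once mixed-precision tensors are counted), not measurements of any particular file.

```python
def weight_file_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight-file size: parameters times bits, converted to bytes."""
    return n_params * bits_per_weight / 8

# Hypothetical 35B model at ~4.5 effective bits/weight:
size_gib = weight_file_bytes(35e9, 4.5) / 2**30
```

So a 35B model at ~4.5 bits lands in the high-teens of GiB before any KV cache, which is why parameter count dictates the hardware floor.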

It's not an isolated phenomenon. If we look at a larger scale of, say, 100 years, a lot of things are rapidly disappearing. It's actually some sort of extinction that is underway, but you don't feel it on a smaller timescale. Similar to how Romans wouldn't have been aware of the fall of the Roman Empire while it was happening, because it was too slow to notice.

Can you give some examples of things rapidly disappearing?

I think OP is talking about Shifting Baseline Syndrome[0].

> A shifting baseline (also known as a sliding baseline) is a type of change to how a system is measured, usually against previous reference points (baselines), which themselves may represent significant changes from an even earlier state of the system that fails to be considered or remembered.

[0]: Wikipedia: https://en.wikipedia.org/wiki/Shifting_baseline

[1]: Earth.org article that reads nicer: https://earth.org/shifting-baseline-syndrome/


The average person's ability to acquire food from nature (farm, hunt, gather) and cook it for themselves.

The average person's ability to make and fix their own tools, and to build and fix their shelter.

Free range childhood.

Average person getting dirt under their fingernails.

Being in sync with sunlight cycle.

Stargazing at night.

These are off the top of my head (I'm not op).


Learnt how to balance and ride a cycle on my own when I was a kid, and I used to 'run away from home' for 12 hours with other kids. I learnt how to swim after drowning, twice. We ate whatever was around: green almonds from trees, grapes from vineyards. Raw corn was painful, though; I would not advise trying it.

We tried to hunt with arrows and sometimes used gunpowder in a primitive muzzle-loading rifle, but we sucked at it. We got chased by dogs in farms we raided, and chased by an armed man who claimed we caused his wife's miscarriage while we were playing football in the street. Another armed man chased us brandishing his gun after we attacked him with stones, after we caught him staring at our neighbor's daughter while she was on the balcony.

This was from 1969 to 1973, before we moved to an apartment building and all that ended. Now I joke to my family that I wish the police would call me, just once, for something my son has done, but no luck with that :). Here are some photos; I wonder if you recognize that dude on my shirt: https://imgur.com/a/JCFMgap

The dude on your shirt is Mork from the show Mork & Mindy, played by Robin Williams.

Thank you, I was bullied all the time because of it.

I don’t think in 1926 more than 50% of global 15-20 year olds could:

“acquire food from nature (farm, hunt, gather) and cook it for themselves.”

Or

“make and fix their own tools. Build and fix their shelter.”

(Culturally, those tasks often specialize by vocation, gender etc.)


They absolutely could. A quarter of Americans’ primary job was agriculture in the 1920s. While job specialization was certainly a thing that didn’t mean people outsourced all of these tasks the way people do today.

That's different from solo gathering, cooking, and repairing, which is the artificially inflated bar being set here. I know people from multiple parts of the world who grew up on working family farms; specialization was very real (especially gender-based).

I'm saying that the number of people doing these things is shrinking relative to the past. Seems to me like you are the one making up and moving bars.

You said 50%, you said 15-20, you are speaking in absolute terms.

I'm pointing at trends.

Do you deny the trends?


I find that you're basically arguing with a wall at this level of threading, so I would just give it up.

I don't know about "globally" but I would be surprised if 50% of American males couldn't do this in 1926. These were skills taught to most young males and the country was far more rural. It was far less universal among females though some did learn these skills.

While unstructured, this kind of standard life knowledge was intentionally and systematically passed down to most boys in every community I lived in, even when I was in elementary school. It was expected that you knew how to do these things, and men would go out of their way to teach you if you didn't.

Kids did fishing, trapping, hunting, building out campsites, etc. for fun when I was growing up, and it was generally encouraged. Learned helplessness wasn't really a thing.

This started to die out decades ago. Most zoomers I know didn't have anything like this experience.


Being able to see stars in the night sky


Yes.

I'm guessing 3.5-27B would beat 3.6-35B. MoE is a bad idea, because for the same VRAM, 27B would leave a lot more room, and the quality of work depends directly on context size, not just the "B" number.
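The VRAM tradeoff being claimed here can be sketched with some hypothetical numbers: weight footprints of 15 GiB and 19 GiB for the smaller and larger model, a 24 GiB card, and the ~70 MiB per 1K tokens of KV cache mentioned elsewhere in the thread. None of these are measured values.

```python
def max_context_tokens(vram_gib: float, weights_gib: float,
                       mib_per_1k_tokens: float) -> int:
    """Tokens of KV cache that fit in the VRAM left over after the weights."""
    leftover_mib = (vram_gib - weights_gib) * 1024
    return int(leftover_mib / mib_per_1k_tokens * 1000)

# Hypothetical: 24 GiB card, ~70 MiB of KV cache per 1K tokens.
small = max_context_tokens(24, 15, 70)  # 27B-class weight footprint
large = max_context_tokens(24, 19, 70)  # 35B-class weight footprint
```

Under these assumptions the smaller model leaves room for well over 100K tokens of context versus roughly 70K for the larger one, which is the headroom argument in a nutshell; whether the extra context beats the extra parameters is the empirical question.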

MoE is not a bad idea for local inference if you have fast storage to offload to, and this is quickly becoming feasible with PCIe 5.0 interconnect.

MoE is excellent for unified-memory inference hardware like the DGX Spark, Mac Studio, etc. The large memory size means you can have quite a few B's, and the smaller active experts keep those tokens flowing fast.

Give more ammo to bad actors and sell the ammo to defenders, charge both for tokens. Why isn't this business model banned already?
