Hacker News | matchagaucho's comments

Some redundancy also helps: keeping a running todo list at the tip of the context guards against compaction or truncation.

Distilled mini/nano models need regular reminders about their objectives.

As documented by Manus https://manus.im/blog/Context-Engineering-for-AI-Agents-Less...
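The re-injection idea above can be sketched in a few lines. This is a minimal illustration, not Manus's actual implementation; the prompt-building helper, budget, and todo format are all assumptions:

```python
# Sketch: re-append the running todo list to the tail of the prompt on
# every turn, so the objectives survive when older history is compacted.
def build_prompt(history: list[str], todo: list[str], max_chars: int = 4000) -> str:
    """Keep the todo list at the context tip; truncate oldest history first."""
    reminder = "Current objectives:\n" + "\n".join(f"- {t}" for t in todo)
    kept = list(history)
    # Drop oldest entries until the prompt fits the budget.
    while kept and len("\n".join(kept)) + len(reminder) > max_chars:
        kept.pop(0)
    # The reminder goes last, where it is least likely to be truncated.
    return "\n".join(kept + [reminder])

prompt = build_prompt(
    history=["user: refactor the parser", "assistant: done, tests pass"],
    todo=["update docs", "run the full test suite"],
)
assert prompt.endswith("- run the full test suite")
```

The point is the ordering: the objectives are rewritten at the end of every turn rather than trusted to survive in the middle of a long transcript.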


Keyboard response feels 10x slower in ChatGPT Projects (possibly for reasons other than React state).


Not to mention LLMs love XML.

The markup includes self-describing metadata and constantly reminds the GPT model of explicit context.
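A minimal sketch of what that looks like in practice. The tag names here are illustrative, not a required schema:

```python
# Sketch: wrap prompt sections in self-describing XML tags so the model
# can distinguish instructions from untrusted document content.
from xml.sax.saxutils import escape

def xml_prompt(instructions: str, document: str) -> str:
    return (
        f"<instructions>{escape(instructions)}</instructions>\n"
        f"<document>{escape(document)}</document>"
    )

p = xml_prompt("Summarize the document in one sentence.", "Q3 revenue rose 12%...")
assert p.startswith("<instructions>")
```

Escaping the payload also keeps document text from accidentally closing a tag and leaking into the instruction section.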


Agents can propose refactoring just as readily as humans.

If coding agents already read AGENTS.md before making changes, they can also maintain a TECHNICAL_DEBT.md checklist.

Keep the loop intact: AGENTS.md ensures technical debt remains in context whenever changes are planned.
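A hypothetical AGENTS.md excerpt showing how the loop might be wired up (the file names and steps are illustrative, not a standard):

```markdown
<!-- Hypothetical AGENTS.md excerpt: keep technical debt in the loop -->
## Before making changes
1. Read TECHNICAL_DEBT.md and note any items touching the files you plan to edit.
2. If your change resolves an item, check it off; if it introduces new debt, append an item.
```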


For me, it’s about preserving optionality.

If I can run resume {session_id} within 30 days of a file’s latest change, there’s a strong chance I’ll continue evolving that story thread—or at least I’ve removed the friction if I choose to.


It seems unlikely that a file untouched for 30 days, in an environment with a lot of "agents" cranking away on things, will be particularly meaningful to revisit with the context from 30 days ago, versus starting with fresh context that includes everything changed and learned since then.


The OpenAI PR implies that Anthropic had a "usage-policy" clause with no actual enforcement.

Whereas OpenAI won their contract on the ability to operationally enforce the red lines with their cloud-only deployment model.


These articles are largely based on a false equivalence of LLM=moat.

That's not the case. OpenAI is advancing on many fronts: Codex, vector stores, embeddings, the Responses API, containers, batch processing, text-to-speech, image generation... the list goes on.


Launch an internal hackathon. Everyone must use the latest Gemini coding models. Vote for the top 5 Chat/Productivity tools.

Eventually the culture will come around to: a) building new sh-- quickly with AI, and b) building a new productivity stack.


Results from a one-shot approach quickly converge on the default “none found” outcome when reasoning isn’t grounded in a paper corpus via proper RAG tooling.


Can you provide more context to your statement? Are you talking about models in general? Or specific recent models? I'm assuming "one-shot approach" is how you classify the parent comment's question (and subsequent refined versions of it).


Large models in general. A semantic query for "fake articles", without examples, is a wildcard search.

A commercial RAG solution would use Query Expansion (QE) and examples to find nearest neighbors.
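A toy sketch of the idea, using a hypothetical expansion table and bag-of-words cosine similarity; a real system would expand the query with an LLM or thesaurus and search dense embeddings:

```python
# Sketch: Query Expansion (QE) before nearest-neighbor retrieval.
# A bare semantic query like "fake articles" matches almost nothing;
# expanding it with related terms gives the similarity search traction.
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def expand(query: str) -> str:
    # Hypothetical expansion table; real QE would be learned or LLM-driven.
    synonyms = {"fake": ["fabricated", "retracted", "fraudulent"]}
    terms = query.split()
    for t in list(terms):
        terms += synonyms.get(t, [])
    return " ".join(terms)

corpus = [
    "retracted paper with fabricated data",
    "survey of transformer architectures",
]
q = Counter(expand("fake articles").split())
ranked = sorted(corpus, key=lambda d: cosine(q, Counter(d.split())), reverse=True)
assert ranked[0] == "retracted paper with fabricated data"
```

Without the expansion step, neither document shares a token with the raw query, and the ranking is arbitrary.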


"If I had six hours to chop down a tree, I’d spend the first four sharpening the axe."

I still believe there's a mise en place step before doing the thing, when quality counts.


If some task has a known step-by-step pattern, then doing it step by step makes perfect sense. That is doing the thing. Taking the known shortest/best path.

Doing the thing is going to involve both direct steps, and indirect steps necessary to do the direct steps.

Not doing the thing involves doing things other than the shortest/safest/most effective path to getting the thing done.


Sure, but a lot of the things on the list marked as “not doing the thing” are actually important early steps to doing the thing.

