A lot of unit conversions are just built into Kagi (and Google!). "Searching" for "10nzd in usd" gives me a price, "10kg in lb" gives me a converted unit, etc.
Thanks, I've been tooling away in my spare time on my own version of this -- both to get a deeper understanding of agents (everyone suggests writing your own) and to help learn Rust. I'd like to retain `pi`'s configurability though: the ability to self-mutate and generate new tools is incredibly useful, particularly because I don't think any of these things should have access to arbitrary code execution through `bash` (of course, if they have access to, say, `edit` and `cargo run` they still have arbitrary code exec, but...). So I tend to generate tools on the fly when I encounter something the no-bash agent needs to do.
I actually thought about this issue, but while Pi can have this script-like environment thanks to the fact that it's based on an interpreted language (TypeScript), Rust has its own limitations as a compiled language.
I decided to allow for customization in a different way:
1. The prompt library (~/.config/hypernova/prompts/) acts as a simpler alternative to Skills, with the built-in prompts that should replace superpowers + Claude's frontend-design
2. Compile-time features; things that might make the agent more bloated can be disabled when you decide to compile zerostack
3. Clean code; the code is short and easy to read, so you can just point zerostack at its own source code to build a custom fork if your needs can't be satisfied otherwise. Good features could also be adopted by the main version.
4. Permission mode; as you can see in the README, there was a lot of concern around the permission model, and I landed on a 4-mode system that goes from "Restrictive" (no commands) to "YOLO" (whatever the agent wants to do), plus custom regex patterns for allow/ask/deny permission on 'bash' calls. In your case, you just need to run `zerostack -R` to force all tools to ask for permission.
(Also, there is a work-in-progress feature for programmable agents, but that's yet to be announced)
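To make the allow/ask/deny idea from point 4 concrete, here is a rough sketch of how regex rules on `bash` calls could be evaluated. The pattern lists, the `decide` function and the "unmatched means ask" fallback are all assumptions for illustration; this is not zerostack's actual config format or logic.

```python
import re

# Hypothetical rule lists; a real config would load these from a file.
DENY = [r"\brm\s+-rf\b", r"\bsudo\b"]
ASK = [r"^git\s+push\b", r"^cargo\s+publish\b"]
ALLOW = [r"^git\s+(status|diff|log)\b", r"^cargo\s+(check|test|build)\b"]


def decide(command: str) -> str:
    """Return "deny", "ask" or "allow" for a bash command string."""
    # Deny is checked first so a broad allow pattern can never override it.
    for verdict, patterns in (("deny", DENY), ("ask", ASK), ("allow", ALLOW)):
        if any(re.search(p, command) for p in patterns):
            return verdict
    # Assumed default: anything unmatched falls back to asking the user.
    return "ask"
```

Checking deny before allow matters for chained commands: `cargo test && sudo reboot` starts with an allowed prefix but still hits the `sudo` rule. Regexes over shell strings are easy to get around in general, so this is a guard rail, not a sandbox.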
I've been trying to use `Deno` underneath `Rust` so that the tools can still be written in TypeScript and thus self-mutated without the compilation step (but I can still try to do clever things with V8 Isolates or similar). It's been an ugly experiment so far; I'm vaguely thinking a simpler model would be to just define a binary "API" and run tools by exec-ing binaries.
I have to be honest and tell you that trying to load such a heavy runtime as a scripting layer is not a great idea; at the same time I can tell you that I am working on another Rust project where I also needed scripting, and after three attempts I landed on rhai (https://rhai.rs/) (https://rhai.rs/book).
You might find it nice for pretty much all use cases except for high-performance scripting (so, if you are not trying to build the entire logic in rhai, you are going to be fine).
Yeah, it's been a bit of a dead end. I didn't want the heavy runtime but felt it was worth disproving after experimenting rather than ruling out off the bat. Even before getting it running, the dependency list alone was pretty discouraging, especially given the storm of supply chain attacks these days.
Rhai looks nice, I'll take a look, thanks! And good luck with Zerostack.
I was just going to suggest rhai. It's simple enough LLMs can easily write it with a little context, and you control the entire API so you can sandbox effectively without needing to resort to hacks with a JS interpreter etc.
I agree V8 and Deno seem very heavy-handed and complex to integrate for scripting capabilities.
Have you considered Lua? It is tailor made for use cases like this. Creating an embedded host in Rust is trivial, the work lies in creating built-in functions for the script runtime so that the user scripts can do useful things to the environment.
That’s not how it works. Comptime Zig is Zig, not an embedded scripting language. You can’t run comptime code separately, it only runs as part of compiling a Zig program. Think of it like Rust macros.
Yes and no. It is just Zig, but that is the advantage. And compiling it to a special function that could work as an end seems to be a doable thing, so you would just have to have the compiler in your agent.
Possibly, I'm not really interested in learning Zig though (or learning to embed it in Rust). I'm sure that'd be a cool project for someone else to try :).
Unfamiliarity, and I believe it requires a compile step. I’m at least familiar with TypeScript and Deno so being able to embed them was an appealing idea :)
Ok, what about having tools be discoverable from the environment, similar to how $PATH works in POSIX?
There could be an env var $AGENT_TOOLS, a string of paths delimited by `:` and tools would be discovered as some specific format of file. Maybe a JSON that contains tool name, list of parameters and the command to run it.
This is essentially decoupling tools from the agent, allowing more customization and per-project environments. It does require shipping and installing more binaries, one for each tool probably.
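A minimal sketch of what that discovery could look like, assuming each tool is described by a `*.json` manifest with `name`, `parameters` and `command` keys. That manifest shape, the `discover_tools` name and the "first directory wins" rule are my assumptions, borrowed from how `$PATH` lookup behaves.

```python
import json
import os
from pathlib import Path


def discover_tools(env=None):
    """Scan each directory listed in $AGENT_TOOLS for *.json tool manifests."""
    env = os.environ if env is None else env
    tools = {}
    # os.pathsep is ":" on POSIX, matching the delimiter described above.
    for directory in env.get("AGENT_TOOLS", "").split(os.pathsep):
        if not directory or not Path(directory).is_dir():
            continue
        for manifest in sorted(Path(directory).glob("*.json")):
            try:
                spec = json.loads(manifest.read_text())
            except (OSError, ValueError):
                continue  # skip unreadable or malformed manifests
            # Assumed manifest shape: {"name": ..., "parameters": [...], "command": [...]}
            if not isinstance(spec, dict) or not {"name", "parameters", "command"} <= spec.keys():
                continue
            # Like $PATH, earlier directories shadow later ones.
            tools.setdefault(spec["name"], spec)
    return tools
```

A per-project setup would then just be `AGENT_TOOLS=./.agent/tools:$HOME/.agent/tools`, with the project directory listed first so it can override the global tools.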
This is one of the approaches I'm considering for my own, Roder.
The approach is mostly communicating over JSON-RPC, which has become the standard for MCP, so it's more approachable to agent developers.
Obviously it's very much NOT MCP; it's a low-level, event-based RPC system for registering capabilities and extending the low-level primitives of the agent itself, not the model.
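For anyone who hasn't looked at the wire format, a registration exchange over JSON-RPC 2.0 might look roughly like this. The envelope fields (`jsonrpc`, `id`, `method`, `params`, `result`, `error`) and the `-32601` code come from the JSON-RPC 2.0 spec; the method name `agent/registerCapability` and the params are invented for illustration and are not Roder's actual protocol.

```python
import json
from itertools import count

_ids = count(1)


def register_capability(name: str, events: list) -> str:
    """Extension side: build a JSON-RPC 2.0 request (method name is made up)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "agent/registerCapability",
        "params": {"name": name, "events": events},
    })


def handle(raw: str, registry: dict) -> str:
    """Agent side: record the capability and reply with a result or an error."""
    msg = json.loads(raw)
    if msg.get("method") != "agent/registerCapability":
        # -32601 is the JSON-RPC 2.0 "Method not found" error code.
        return json.dumps({"jsonrpc": "2.0", "id": msg.get("id"),
                           "error": {"code": -32601, "message": "Method not found"}})
    registry[msg["params"]["name"]] = msg["params"]["events"]
    return json.dumps({"jsonrpc": "2.0", "id": msg["id"], "result": {"registered": True}})
```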
I understand the concept, but I don't get what the advantage is over adding instructions to the prompt to use a specific bash command for a specific task, acting as a "custom tool".
The harness clamps what the agent can do. `bash` allows full code execution; a dedicated `mvn` tool might only allow `mvn compile` but not `mvn spring-boot:run`. You could probably implement this with an `allow` list attached to your `bash` tool, but by doing it this way, you can enhance the outputs or perform mandatory checks too.
For instance, Claude likes to run little Python scripts; reviewing them is tedious. Removing `bash` and adding a `python` tool would allow the harness to pre-review and grep for common harmful patterns, or run the `python` script in a `krunvm` or `muvm` to isolate it, etc. This review/isolation would be handled programmatically as it's part of the harness; leaving the agent to choose what to do as a skill means the agent can conveniently forget to enforce its own checks.
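As a sketch of the `mvn` case: the tool takes a goal rather than a command line, checks it against a fixed set, and builds the argv itself, so there is no shell string for the model to smuggle anything into. The function names, the extra `test` goal and the output trimming are my own assumptions, not any particular harness's API.

```python
import subprocess

# Only these goals are exposed; "spring-boot:run" is deliberately absent.
ALLOWED_GOALS = {"compile", "test"}


def build_mvn_command(goal: str) -> list:
    """Validate the goal and return an argv list, never a shell string."""
    if goal not in ALLOWED_GOALS:
        raise PermissionError(f"mvn goal not allowed: {goal!r}")
    return ["mvn", "--batch-mode", goal]


def mvn_tool(goal: str) -> str:
    """Hypothetical tool entry point the harness would expose to the model."""
    result = subprocess.run(build_mvn_command(goal), capture_output=True, text=True)
    # The harness owns the output, so it can trim or annotate it here.
    return result.stdout[-4000:]
```

Because the check is an exact set membership test, something like `compile; rm -rf .` is rejected as an unknown goal instead of being parsed by a shell.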
Good point. There might be a small advantage if one does not want to give bash access.
But the general answer to "how do I add custom tools like we can in pi" is "you don't". Keep it simple.
Skills are notably more complex than that. They require metadata (which the model is given and uses to determine whether or not to load the main file), are intended to be loaded via a tool call, contain extra resources (also loaded by tool calls), etc. In contrast, with this system the harness doesn't need a tool to load the stored prompts, the prompts don't need to include metadata to allow for runtime discovery, etc.
Runtime discovery is the entire point of skills. Without it, this is just a templating prompt system that the user has to remember to use… except because this one changes your system prompt, it also busts your cache and costs you extra money when you use a prompt.
Skills are already dead-simple and this prompt system doesn’t at all tackle the same problem.
"{Feature} is the whole point of {more complex technology}" is an objection that can very often be raised. That doesn't mean that giving up features in exchange for simplicity is always the wrong call. And there's also advantages to having the user drive what instructions go into the prompt instead of the harness/model.
This is tangential to the point. It’s often great to have a simpler version of a solution, even if it eschews some features. But this isn’t that. OP claims that the prompt system is an “alternative” to skills, but it isn’t. It isn’t solving the same problem that skills solve at all. It’s like saying that a bicycle is a simpler alternative to a lawnmower because they both have wheels.
Prompts are a simpler feature than skills, sure, but they’re a completely different feature.
It's an alternative in the same way e.g. plain markdown is an alternative to HTML, even though plain markdown lacks some of the features of HTML. "X is an alternative to Y" in this sense doesn't mean "X has all the same features as Y", it means "you might reasonably choose to use X instead of Y, depending on your exact use case".
I feel this a lot too these days. The only place the fun seems to remain for me is in Linux and in devices out of China, that you can hack and experiment with — like the Anbernic consoles, or the Xteink X4, or the little mp3 player DAP things, or the Sipeed NanoKVM devices, or the Supernote. The Steam Deck probably sits in this niche too, now I think about it.
Alongside the purpose they serve, all of them can be trivially broken into and re-tooled however you like — and for me at least, that’s where a lot of fun lies in computers. When it comes to mainline desktops now, everything is incredibly expensive and deflating.
Mine hasn't actually arrived yet! I'm still waiting for it in the mail. I don't think I'll tweak that much, all I really want to do is flash the published open source firmware, attach it to my tailnet and use it to control a little server I keep in the crawl space under the house (because crawling under there whenever it needs a reboot + entering disk encryption keys is distinctly unpleasant).
Ah fair enough. I thought you were talking about hacking it to do novel things. :)
I have one on a server. The server itself is accessible (it's in the rack cabinet I have in the office) but sticking a display on it is fiddly on a good day, so it's been a major QoL improvement!
The harnesses we have are almost stunningly incomplete IMHO. I've been trying `pi` recently, and quite like that it comes with a minimal set of tools by default -- and that I can easily override or replace the ones that it ships.
I've only just started working with it, but clamping `read/write/edit` to only allow editing files in the current directory, banning `bash` and mandating I write tools for the specific commands I want it to execute, has made me much happier. Running Claude inside a VM or similar to sandbox it is nuclear overkill; I've always been surprised that that's seemed like the state of the art.
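The clamping itself is only a few lines. Here is a rough sketch of the containment check, assuming the tool receives a path string from the model; `resolve_inside` and `read_tool` are names I made up, not `pi`'s API.

```python
from pathlib import Path


def resolve_inside(root: Path, requested: str) -> Path:
    """Resolve `requested` against `root` and refuse anything that escapes it."""
    root = root.resolve()
    # resolve() collapses ".." and follows symlinks before the containment check.
    target = (root / requested).resolve()
    if not target.is_relative_to(root):  # Python 3.9+
        raise PermissionError(f"path escapes working directory: {requested}")
    return target


def read_tool(requested: str, root=None) -> str:
    """Hypothetical `read` tool: only files under the working directory."""
    return resolve_inside(root or Path.cwd(), requested).read_text()
```

Absolute paths are handled by the same check: joining `root` with `/etc/passwd` yields `/etc/passwd`, which is not under `root`, so it is rejected.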
With a better harness, the model can't choose to rename things with search and replace; if it wants to rename things, it _must_ call the LSP to do it. If it's going to write code, as you suggest, the harness _forces_ linting/formatting to run.
(Reading my own comment back, I am worried that the fucking AI writing style is infecting me :()
One of the problems with tools is the permissions for them. I can either grant Claude access to this one specific python command, or free run with python to do whatever it wants, but not “you can execute the python scripts in this directory structure”.
Claude’s “api access required” approach means that I can’t even experiment with customising the harness without doubling up…
Yes, that aggravates me too. I noted recently, Claude had a phase where it would write little one-off Python scripts to aid it in analysis -- which is super useful! But when it's written ten scripts in a row, each of which I've had to review and each of which I've had to approve by hand, it gets pretty annoying. If I could bless it with "if it only uses these Python libraries, pre-approve the script", that would've made life a lot simpler, but of course, that's not possible. Sigh.
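If the harness were open to it, the "only these libraries" check could be done with the standard `ast` module before the script ever runs. A sketch, where the allow-list contents and function names are assumptions:

```python
import ast

# Hypothetical allow-list; pick whatever set you are comfortable blessing.
ALLOWED_MODULES = {"json", "csv", "re", "math", "statistics", "collections"}


def imported_modules(source: str) -> set:
    """Collect the top-level package name of every import statement."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports are recorded as "." so they always fail the check.
            found.add(node.module.split(".")[0] if node.module and node.level == 0 else ".")
    return found


def can_preapprove(source: str) -> bool:
    """True only if the script parses and imports nothing outside the allow-list."""
    try:
        return imported_modules(source) <= ALLOWED_MODULES
    except SyntaxError:
        return False
```

This only looks at import statements. Builtins such as `open`, `exec` or `__import__` need no import at all, so a real pre-approval rule would have to screen for those separately or run the script isolated anyway.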
If you want an e-ink type screen, the Supernotes (or Remarkables, or Viwoods) are all very good at this. Personally I hate trying to read things on iPads.
> A good way to think of it is that jj new is an empty git staging area. There's still a `jj commit` command that allows you to desc then jj new.
This always made me feel uncomfy using `jj`. Something that I didn't realise for a while is that `jj` automatically cleans up/garbage collects empty commits. I don't write as much code as I used to, but I still have to interact with, debug and test our product a _lot_ in order to support other engineers, so my workflow was effectively:
```
git checkout master
git fetch
git rebase # can be just git pull but I've always preferred doing this independently
# _work_/investigate
git checkout HEAD ./the-project # cleanup the things I changed while investigating
```
Running `jj new master@origin` felt odd because I was creating a commit, but... when I realised that those commits don't last, things felt better. When I then realised that if I made a change or two while investigating, that these were basically stashed for free, it actually improved my workflow. I don't often have to go back to them, but knowing that they're there has been nice!
I think calling them "commits" is doing them a disservice because they're not the same as git commits, and the differences confuse people coming from git. I'd say "jj changes are like git commits, except they're mutable, so you can freely move edits between them. They only become immutable when you push/share them with people".
It's a mouthful, but it's more accurate and may be less confusing.