Zetaphor's comments | Hacker News

Quantization is the major appeal; we can't all run full precision.
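For a back-of-the-envelope sense of why (weights only, ignoring KV cache and runtime overhead; the numbers are approximate):

    # rough bytes per parameter: fp16 = 2, 4-bit quant = 0.5
    params = 7e9  # a 7B model
    print(f"fp16: {params * 2 / 1e9:.0f} GB")    # ~14 GB
    print(f"Q4:   {params * 0.5 / 1e9:.1f} GB")  # ~3.5 GB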

Are you referring to the Codex CLI? It can be installed via npm or Homebrew, and it's fully open source.
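Going from memory of the README (so double-check the package names), the install is a one-liner either way:

    npm install -g @openai/codex
    # or
    brew install codex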

Yes, and the official docs even mention that if you're on Windows you should run the Codex CLI via WSL. Meaning it's specifically designed for Unix-like systems.

Most organizations aren't going to need the wide breadth of capabilities of the frontier models. They're risk-averse, and LLMs are non-deterministic, so use cases are typically scoped more tightly to tasks involving nuanced classification that small models can handle easily, even if it takes a little fine-tuning on your organization's data.
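As a sketch of what I mean, here's the kind of narrowly scoped classification a small model handles fine, using the transformers library (the ticket text, labels, and model choice are just illustrative):

    from transformers import pipeline

    # A small NLI model doing nuanced ticket triage; no frontier model required
    classifier = pipeline("zero-shot-classification",
                          model="facebook/bart-large-mnli")

    result = classifier(
        "My invoice total doesn't match the order confirmation.",
        candidate_labels=["billing", "shipping", "technical support"],
    )
    print(result["labels"][0])  # labels come back sorted by score

Fine-tuning that same class of model on your own labeled data is usually all the scoping an org actually needs.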

I guess I write like an LLM :P

Probably a side effect of using them so much


LM Studio is just as simple, has all the same features, and none of the performance or lock-in problems of Ollama.

If you only needed a single reason, how about kneecapping your performance by choosing Ollama?


Give LM Studio a shot! You get the same experience without all of the problems of Ollama.

LM Studio is a popular option that bundles the MLX backend

LM Studio is basically Ollama, except they give attribution. It offers all of the same features, including the ability to host a server.
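The server bit works with the stock openai client, since LM Studio exposes an OpenAI-compatible endpoint (port 1234 is the default; the model name below is a placeholder for whatever you've loaded):

    from openai import OpenAI

    # LM Studio's local server is OpenAI-compatible; api_key is unused but required
    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

    resp = client.chat.completions.create(
        model="local-model",  # placeholder; LM Studio serves the loaded model
        messages=[{"role": "user", "content": "Say hello in five words."}],
    )
    print(resp.choices[0].message.content)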

LM Studio also offers curation while giving credit to llama.cpp, plus easy search across all of Hugging Face's GGUFs.

If you don't want to have to think about it, LM Studio is probably the best choice.
