dunb's comments

Why use --fit on an M4? My understanding was that, given the unified memory, you should push all layers to the GPU with --n-gpu-layers all. Setting --flash-attn on and --no-mmap may also get you better results.

The question is moot: --fit will put everything on the GPU if it fits. --flash-attn is on by default. --no-mmap is not an inference tradeoff, and if you do turn mmap off, you need to turn on direct I/O via -dio.

What he should actually do is enable speculative decoding.
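A hypothetical invocation, assuming a recent llama.cpp build (flag names vary across versions, and the model filenames here are placeholders): -md loads a small draft model whose cheap token guesses the main model verifies in parallel, so decoding speeds up whenever most guesses are accepted.

```shell
# Sketch only: filenames are placeholders; check your build's --help for
# the exact speculative-decoding flags (-md / --model-draft).
llama-server \
  -m  Qwen-35B-A3B.gguf \
  -md Qwen-0.6B-draft.gguf \
  --n-gpu-layers all
```

The draft model needs to share the main model's tokenizer, or the speculated tokens can't be verified.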


3.6 is the release version for Qwen. This model is a mixture of experts (MoE), so while the total model size is big (35 billion parameters), each forward pass only activates a portion of the network that’s most relevant to your request (3 billion active parameters). This makes the model run faster, especially if you don’t have enough VRAM for the whole thing.

The performance/intelligence is said to be about the same as the geometric mean of the total and active parameter counts. So, this model should be equivalent to a dense model with about 10.25 billion parameters.


And even if you have enough VRAM to fit the entire thing, decoding is memory-bandwidth-bound: the time per token after the first is roughly (activated parameters) / (VRAM bandwidth), so fewer active parameters means faster generation.

If you have the vram to spare, a model with more total params but fewer activated ones can be a very worthwhile tradeoff. Of course that's a big if
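A back-of-envelope sketch of that bandwidth bound; all numbers here are illustrative assumptions (400 GB/s bandwidth, 3B active parameters at 2 bytes each), not measurements of any particular machine:

```python
# Rough upper bound on decode speed: each generated token must stream
# approximately the active parameters through memory once.
bandwidth_gb_s = 400   # assumed memory bandwidth
active_params_b = 3    # billions of active parameters (MoE)
bytes_per_param = 2    # fp16/bf16 weights

gb_read_per_token = active_params_b * bytes_per_param  # 6 GB per token
tokens_per_s = bandwidth_gb_s / gb_read_per_token
print(round(tokens_per_s, 1))  # ~66.7 tokens/s upper bound
```

A dense 35B model under the same assumptions would read ~70 GB per token, an order of magnitude slower.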


Sorry, how did you calculate the 10.25B?

> > The performance/intelligence is said to be about the same as the geometric mean of the total and active parameter counts. So, this model should be equivalent to a dense model with about 10.25 billion parameters.

> Sorry, how did you calculate the 10.25B?

The geometric mean of two numbers is the square root of their product. Square root of 105 (35*3) is ~10.25.
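The arithmetic spelled out (the parameter counts come from the comment above; the geometric-mean rule itself is a community rule of thumb, not an exact law):

```python
import math

total_params = 35   # billions, total (MoE)
active_params = 3   # billions, active per forward pass

# Geometric mean = sqrt(product): sqrt(35 * 3) = sqrt(105)
dense_equiv = math.sqrt(total_params * active_params)
print(round(dense_equiv, 2))  # 10.25
```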


What is it lacking in your eyes that makes it not true? I find fish’s vi mode more ergonomically complete for things like editing multi-line commands


Just pressing `xp` to swap two characters does not work in fish. Combining deletion with a movement also does not work (e.g. `d3w` to delete three words).


these have been fixed as of 4.4.0


Awesome! That release hasn't landed yet in my distro's repos. Thanks a lot! Fish is a great product.


Are you running with all the --fit options and it's not working correctly? You could try looking at how many layers it's attempting to offload and adjust manually from there: walk --n-gpu-layers down with a bash script until it loads.
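A minimal sketch of that walk-down loop. Here try_load is a stub standing in for an actual launch attempt; in practice you'd replace it with something like `llama-server -m model.gguf -ngl "$1" ...` and check its exit status. The starting count and step are arbitrary.

```shell
# Stub: pretend only 30 or fewer layers fit in VRAM. Swap this for a real
# load attempt that returns nonzero on out-of-memory failure.
try_load() { [ "$1" -le 30 ]; }

ngl=48
while ! try_load "$ngl"; do
  ngl=$((ngl - 2))    # step down until a load succeeds
done
echo "max layers that load: $ngl"
```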


I'm not sure if it's working right for me on Orion Version 1.0.3 (142), WebKit 624.1.2.19.2. I type my word, hit "Enter," and I still get the hint to press "Enter."

It would also probably be more effective not to change the mechanics between the starting hit and subsequent ones. Instead of the round starting automatically after you type "hit," the user should have to press "Enter" as well.


I'll look into what's going on with some of the other browsers.

To clarify, the game actually runs a quick validation when the timer runs out to check if your word is valid. If it is, the ball returns automatically—so you don't have to hit Enter or Space, but doing so early gives you a speed bonus.

As for getting rid of Enter/Space entirely, auto-submitting can be tricky with compound words (e.g., should it submit 'REGULAR' or wait for 'REGULARLY'?).
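The ambiguity can be stated precisely: auto-submit is unsafe exactly when a valid word is a proper prefix of a longer valid word. A small sketch with a hypothetical word list (the real game's dictionary is assumed, not known here):

```python
# Illustrative word list; any real dictionary would show the same pattern.
words = {"REGULAR", "REGULARLY", "CAT", "CATALOG"}

def ambiguous(word, dictionary):
    """Valid now, but some longer valid word still starts with it."""
    return word in dictionary and any(
        w != word and w.startswith(word) for w in dictionary
    )

print(ambiguous("REGULAR", words))  # True: player may want REGULARLY
print(ambiguous("CATALOG", words))  # False: nothing extends it
```

Only unambiguous words could be auto-submitted safely; everything else still needs an explicit Enter.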


Others have already told you the name of the project, but if you happen to be on Arch, I have a PKGBUILD written for PlantStudio. I haven't published it to the AUR since I don't necessarily want to be the maintainer though. Shoot me an email (in my bio) if you're interested


This isn't rsync, but you can integrate a-Shell[0] with iOS Shortcuts. You would need to make the syncing happen in the script instead of in the background though. I use a Python script to create aliases for my email this way, so I don't have to turn on wildcard addressing to my inbox.

[0] https://github.com/holzschu/a-shell


    Location: Alabama, USA
    Remote: Yes — hybrid and office roles okay
    Willing to relocate: Yes
    Technologies: Python, TensorFlow, PyTorch, scikit-learn, OpenCV, seaborn, MLflow, R, LaTeX, Linux, Git, Bash
    Résumé/CV: https://docs.google.com/document/d/1akzgKAz7gElVsIBpHNA6hdcZQucDqPC2
    Email: See résumé
I'm Ella. I'm currently a neurobiology researcher, but I want to get back to my machine learning and computer vision roots. I have experience with building deep learning models, object avoidance, and data visualization. I tend to be a quick learner, and I have good communication skills.

I would be excited to hear about positions where I can model data for hard problems. ML Engineering and Robotics would be good fits, and I've gotten a better understanding of biological systems in my current position. Reach out by email if anything here catches your eye, and I would be happy to talk. My current role is tentatively wrapping up around the holiday season this year.


I follow a similar but more terse pattern. I prepend them all with a comma, and I have yet to find any collisions. If you're using bash (and I assume POSIX sh as well), the comma character has no special meaning, so this is quite a nice use for it. I agree that it's nice to type ",<tab>" and see all my custom scripts appear.
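A tiny sketch of the pattern; the script name and directory are just examples (in practice you'd use a directory on your PATH, such as ~/.local/bin):

```shell
# Create a comma-prefixed personal script. The comma has no special
# meaning to the shell, so it works like any other filename character.
mkdir -p bin
cat > bin/,hello <<'EOF'
#!/bin/sh
echo "hello from a comma script"
EOF
chmod +x bin/,hello

./bin/,hello    # prints: hello from a comma script
```

Because no standard commands start with a comma, typing ",<tab>" completes only your own scripts.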


What's the benefit of the container over installing as a tool with uv? It seems like extra work to get it up and running with a GPU, and if you're using a Mac, the container slows down your models.


