
Prefill is around 600 t/s.

I don't remember what the 27B was. I tried a 27B with a different quantization at some point, but I settled on the 31B.


