Hacker News

It’s not reasonably possible (currently?) to get the same performance from a 7-billion-parameter model as from a 175-billion-parameter model with just an additional 6,000 lines of fine-tuning data.


It gets very similar performance, so maybe it is possible. But also, when I fine-tune GPT-3.5, its response quality is indistinguishable from the base model's.

All of this is to say: this shit is way harder than it needs to be. I'm not an ML engineer but I do know my data and how to get it. Why is it still so hard to specialize a model?
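For what it's worth, the data-preparation side at least is mechanical. A minimal sketch, assuming the chat-style JSONL fine-tuning format (one {"messages": [...]} object per line); the question/answer pairs and the system prompt here are invented placeholders, not anyone's real data:

```python
import json

# Hypothetical domain Q&A pairs standing in for "my data".
examples = [
    ("What does error E-17 mean?",
     "E-17 indicates a sensor timeout; power-cycle the unit."),
    ("How do I export a report?",
     "Open Reports, choose a date range, and click Export as CSV."),
]

def to_jsonl(pairs, system_prompt="You are a support assistant."):
    """Serialize (question, answer) pairs as chat-format JSONL,
    one training record per line."""
    lines = []
    for question, answer in pairs:
        record = {
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": question},
                {"role": "assistant", "content": answer},
            ]
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)

jsonl = to_jsonl(examples)
```

The hard part the comment is pointing at isn't this packaging step, though; it's that nobody can tell you in advance whether 6,000 such lines will move the model at all.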


Because we don’t have a clear understanding of how the things work yet.



