Tools that remove the fat seem like a good idea, but I’m highly suspicious of their effect on the LLM’s reasoning.
LLMs were trained on the typical full-fat output found everywhere on the internet, and all of a sudden they get a slightly different response that may look like nothing they have seen before.
Does that really save tokens in the long run?