Not sure it's related to function calling. GPT-4 can do function calling without the dedicated function-calling API: just inject the schema you want into the prompt with directions and ask it to return JSON. It works like >99% of the time. Same with 3.5-turbo.
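For anyone who wants to see what that looks like, here's a minimal sketch of the prompt-injection approach. Everything in it is made up for illustration (the `SCHEMA` string, `build_prompt`, `parse_reply`, and the fake reply); it doesn't call any real API, it just shows the prompt you'd send and how you might parse what comes back.

```python
import json
import re

# Hypothetical schema description; any compact, unambiguous format would do.
SCHEMA = """{
  name: string;
  age: number;
  tags: string[];
}"""


def build_prompt(schema: str, text: str) -> str:
    """Inject the schema and instructions into a plain prompt."""
    return (
        "Extract the data described by this type from the text below.\n"
        f"Type:\n{schema}\n\n"
        "Reply with a single JSON object and nothing else.\n\n"
        f"Text:\n{text}"
    )


def parse_reply(reply: str) -> dict:
    """Pull a JSON object out of a model reply.

    Models sometimes add a sentence around the JSON, so grab everything
    from the first '{' to the last '}' before parsing.
    """
    match = re.search(r"\{.*\}", reply, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in reply")
    return json.loads(match.group(0))


prompt = build_prompt(SCHEMA, "Ada is 36 and likes math and engines.")

# A made-up reply standing in for what the model might send back.
fake_reply = 'Sure, here you go: {"name": "Ada", "age": 36, "tags": ["math", "engines"]}'
data = parse_reply(fake_reply)
print(data["name"], data["age"], data["tags"])
```

In practice you'd also want to validate the parsed object against the schema and retry on failure, since the rare bad reply is exactly the case the parse step has to catch.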
The problem is that these libraries convert pydantic models into JSON Schemas and inject them into the prompt, which uses up like 80% more tokens than just describing the schema with, for example, TypeScript type syntax. See https://microsoft.github.io/TypeChat/, where they prompt with TypeScript type descriptions to get JSON data from LLMs. It's similar to what we built but with more boilerplate.
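To get a feel for the size difference, here's a rough sketch comparing the two descriptions of the same shape. The `JSON_SCHEMA` dict is a hand-written approximation of what a pydantic model might emit, not actual library output, and `rough_token_count` is a crude word-and-punctuation counter I made up as a stand-in for a real tokenizer, so treat the numbers as a ballpark only.

```python
import json
import re

# Hand-written approximation of the JSON Schema a pydantic model might produce.
JSON_SCHEMA = {
    "title": "Person",
    "type": "object",
    "properties": {
        "name": {"title": "Name", "type": "string"},
        "age": {"title": "Age", "type": "integer"},
        "tags": {"title": "Tags", "type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "age", "tags"],
}

# The same shape described with TypeScript type syntax.
TS_TYPE = """interface Person {
  name: string;
  age: number;
  tags: string[];
}"""


def rough_token_count(text: str) -> int:
    """Crude proxy: count words and punctuation marks. Not a real tokenizer."""
    return len(re.findall(r"\w+|[^\w\s]", text))


json_tokens = rough_token_count(json.dumps(JSON_SCHEMA, indent=2))
ts_tokens = rough_token_count(TS_TYPE)
print(f"JSON Schema: ~{json_tokens} pieces, TypeScript: ~{ts_tokens} pieces")
```

The exact ratio depends on the model and the tokenizer, so if you care about the real number, run your own schemas through the tokenizer your model actually uses.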