Dan found that the 2-bit quantization broke tool calling, but upgrading to 4-bit (at 4.36 tokens/seco...