The LLM community is obsessed with benchmarking model performance. Mistral released their new “flagship” model this week, and immediately focused the discussion on how it performs on “commonly used benchmarks” relative to other models:
Continue reading this post for free, courtesy of Justin.