With 7.3 billion parameters, Mistral 7B has been shown to outperform larger models such as Llama 2 13B and Llama 1 34B on various benchmarks. It is distinguished by its use of Grouped-query Attention (GQA) and Sliding Window Attention (SWA), which speed up inference and let the model handle longer sequences more efficiently.
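The two attention techniques can be sketched in a few lines. The snippet below is an illustrative toy, not Mistral's actual implementation: `sliding_window_mask` builds the causal mask under which each token attends only to itself and its `window - 1` predecessors, and `repeat_kv` captures the core idea of GQA, where a small number of key/value heads is shared across groups of query heads. All function names and tensor shapes here are assumptions chosen for clarity.

```python
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Boolean mask: position i may attend to position j iff
    j <= i (causality) and i - j < window (sliding window)."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (i - j < window)

def repeat_kv(kv: np.ndarray, n_rep: int) -> np.ndarray:
    """GQA: expand (num_kv_heads, seq_len, head_dim) key/value tensors
    so each KV head serves n_rep query heads."""
    return np.repeat(kv, n_rep, axis=0)

# Toy example: 6 positions, window of 3.
mask = sliding_window_mask(6, 3)
print(mask.astype(int))  # each row i has min(i + 1, 3) ones

# Toy GQA: 8 query heads sharing 2 KV heads (4 queries per group).
kv = np.zeros((2, 6, 4))          # (num_kv_heads, seq_len, head_dim)
print(repeat_kv(kv, 4).shape)     # (8, 6, 4)
```

In the released Mistral 7B, the sliding window spans 4,096 tokens and 32 query heads share 8 key/value heads, a 4:1 grouping, which shrinks the KV cache and speeds up decoding.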
The model has demonstrated strong performance across a range of tasks, including commonsense reasoning, world knowledge, reading comprehension, math, and code generation. In terms of real-world application, Mistral 7B can be fine-tuned for specific tasks and is versatile enough to be deployed on different platforms, including cloud services. It is released under the Apache 2.0 license, which permits broad commercial and research use.
Performance of Mistral 7B and various Llama models on a wide range of benchmarks. For all metrics, all models were re-evaluated with Mistral's evaluation pipeline for accurate comparison. Mistral 7B significantly outperforms Llama 2 13B on all metrics and is on par with Llama 1 34B (since Llama 2 34B was not released, Mistral reports results for Llama 1 34B). It is also vastly superior on code and reasoning benchmarks.
Benchmark | LLaMA 2 7B | LLaMA 2 13B | Code LLaMA 7B | Mistral 7B |
---|---|---|---|---|
MMLU | 0.44 | 0.56 | 0.37 | 0.60 |
HellaSwag | 0.77 | 0.81 | 0.63 | 0.81 |
WinoGrande | 0.70 | 0.73 | 0.62 | 0.75 |
PIQA | 0.78 | 0.81 | 0.73 | 0.83 |
Arc-e | 0.69 | 0.75 | 0.59 | 0.80 |
Arc-c | 0.43 | 0.49 | 0.35 | 0.56 |
NQ | 0.25 | 0.29 | 0.11 | 0.29 |
TriviaQA | 0.64 | 0.70 | 0.35 | 0.70 |
HumanEval | 0.12 | 0.19 | 0.31 | 0.31 |
MBPP | 0.26 | 0.35 | 0.53 | 0.48 |
MATH | 0.04 | 0.06 | 0.05 | 0.13 |
GSM8K | 0.16 | 0.34 | 0.21 | 0.52 |
Mistral AI, a Paris-based startup founded just seven months earlier by former researchers from Meta and Google, recently raised 385 million euros (about $415 million) in a substantial funding round, underscoring the enthusiasm surrounding the new form of artificial intelligence that powers online chatbots. The investment values the company at roughly $2 billion, according to two people familiar with the transaction. Notable investors in the round include the Silicon Valley venture capital firms Andreessen Horowitz and Lightspeed Venture Partners.
The Mistral support community is an active network dedicated to fostering collaboration and innovation among users of the Mistral 7B model, supported through several platforms that each offer distinct resources. The GitHub repository maintained by Mistral AI serves as the central hub, where users can contribute to the model's development, report problems, and share solutions.

Platforms such as DataCamp and Banana extend this support with detailed guides and tutorials valuable to both new and experienced users: DataCamp offers a comprehensive guide covering the use of the Mistral 7B model, while Banana provides practical insight into model deployment, backed by community support via email and Discord. The community also benefits from insights shared by over 15,000 #CubeAlumni experts, as highlighted by SiliconANGLE. This support not only improves the user experience but also contributes significantly to the ongoing development and refinement of Mistral 7B.