Meta has recently unveiled the Meta Llama 3 family of large language models (LLMs), released in 8B and 70B parameter sizes. The models generate text and code, and the instruction-tuned variants are optimized for dialogue use cases. Tuned for helpfulness and safety, they outperform many open-source chat models on common industry benchmarks.
The Llama 3 models are built on an optimized transformer architecture. The instruction-tuned variants use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align the models with human preferences for helpfulness and safety. The models are pretrained on over 15 trillion tokens from publicly available sources, and the fine-tuning data includes over 10 million human-annotated examples. Notably, no Meta user data is included in either dataset.
Meta’s commitment to responsible AI development is evident in the Llama 3 release. Meta provides developers with a Responsible Use Guide and tools such as Meta Llama Guard 2 and Code Shield to implement safety best practices; these resources help reduce residual risks while preserving a high level of helpfulness.
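As a concrete illustration, the sketch below screens a user prompt with Llama Guard 2 before it would reach a chat model. This is a minimal sketch, not official usage: it assumes the Hugging Face `transformers` library, gated access to the `meta-llama/Meta-Llama-Guard-2-8B` checkpoint on the Hub, and that the checkpoint's bundled chat template wraps the conversation in the model's safety-classification prompt (the model answers `safe`, or `unsafe` plus a hazard-category code).

```python
# Hedged sketch: screening a conversation with Meta Llama Guard 2 via
# Hugging Face transformers. Assumes gated access to the
# meta-llama/Meta-Llama-Guard-2-8B checkpoint and a CUDA-capable GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-Guard-2-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def moderate(chat: list[dict]) -> str:
    """Classify a chat; Llama Guard 2 replies 'safe', or 'unsafe' followed
    by a hazard-category code (e.g. S1) on the next line."""
    # The checkpoint ships a chat template that wraps the conversation in
    # Llama Guard 2's classification prompt.
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
    output = model.generate(input_ids=input_ids, max_new_tokens=32, pad_token_id=0)
    # Decode only the tokens generated after the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

verdict = moderate([{"role": "user", "content": "What's a good recipe for pancakes?"}])
print(verdict)  # expected output for a benign prompt: "safe"
```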
The Llama 3 models are intended for commercial and research use in English, with the instruction-tuned variants specifically designed for assistant-like chat applications. Developers are encouraged to fine-tune the models for other languages, provided they comply with the Llama 3 Community License and Acceptable Use Policy.
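For instance, a minimal chat-inference sketch with the 8B instruct variant might look like the following. It assumes the Hugging Face `transformers` library, gated access to the `meta-llama/Meta-Llama-3-8B-Instruct` checkpoint, and enough GPU memory for the weights in bfloat16; the generation settings are illustrative, not prescribed.

```python
# Hedged sketch: assistant-style chat with Meta-Llama-3-8B-Instruct through
# Hugging Face transformers. Assumes gated Hub access and a GPU large
# enough to hold the model in bfloat16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a concise, helpful assistant."},
    {"role": "user", "content": "Summarize what an instruction-tuned LLM is."},
]

# apply_chat_template inserts the Llama 3 dialogue tokens the instruct
# models were fine-tuned on; add_generation_prompt opens the assistant
# turn for the model to complete.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Llama 3 instruct models end a turn with <|eot_id|>, so stop on either
# that token or the regular end-of-sequence token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```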
In summary, the Llama 3 models represent a significant advancement in AI technology, offering powerful capabilities for natural language generation tasks while emphasizing safety and responsible use.
Highlights:
| Category | Benchmark | Llama 3 8B | Llama 2 7B | Llama 2 13B | Llama 3 70B | Llama 2 70B |
|---|---|---|---|---|---|---|
| General | MMLU (5-shot) | 66.60 | 45.70 | 53.80 | 79.50 | 69.70 |
| General | AGIEval English (3-5 shot) | 45.90 | 28.80 | 38.70 | 63.00 | 54.80 |
| General | CommonSenseQA (7-shot) | 72.60 | 57.60 | 67.60 | 83.80 | 78.70 |
| General | Winogrande (5-shot) | 76.10 | 73.30 | 75.40 | 83.10 | 81.80 |
| General | BIG-Bench Hard (3-shot, CoT) | 61.10 | 38.10 | 47.00 | 81.30 | 65.70 |
| General | ARC-Challenge (25-shot) | 78.60 | 53.70 | 67.60 | 93.00 | 85.30 |
| Knowledge reasoning | TriviaQA-Wiki (5-shot) | 78.50 | 72.10 | 79.60 | 89.70 | 87.50 |
| Reading comprehension | SQuAD (1-shot) | 76.40 | 72.20 | 72.10 | 85.60 | 82.60 |
| Reading comprehension | QuAC (1-shot, F1) | 44.40 | 39.60 | 44.90 | 51.10 | 49.40 |
| Reading comprehension | BoolQ (0-shot) | 75.70 | 65.50 | 66.90 | 79.00 | 73.10 |
| Reading comprehension | DROP (3-shot, F1) | 58.40 | 37.90 | 49.80 | 79.70 | 70.20 |
The Meta Llama 3 models were developed by a large, multidisciplinary team at Meta AI, with expertise spanning AI research, software engineering, data science, and cybersecurity, among other fields. The breadth of the contributors' list reflects the collaborative nature of the project: building generative text models at 8B and 70B scale that are optimized for dialogue while adhering to ethical standards and safety best practices. Their work marks a significant step forward in AI and natural language processing, aiming to provide a robust, reliable tool for both commercial and research applications.