Phi-2

Jan 9, 2024

In a recent breakthrough at Microsoft Ignite 2023, CEO Satya Nadella unveiled "Phi-2," a remarkable development in the world of small language models (SLMs) by Microsoft Research's Machine Learning Foundations team. This article dives into the intricacies and achievements of Phi-2, a 2.7 billion-parameter language model that stands out in the realm of base language models with fewer than 13 billion parameters.

Phi-2 represents a significant leap in language modeling, demonstrating impressive reasoning and language understanding capabilities. It matches or outperforms models up to 25 times its size, a testament to the innovative scaling and training data curation techniques employed by Microsoft Research. Phi-2 follows the 1.3-billion-parameter models Phi-1 and Phi-1.5, which already showed state-of-the-art performance in Python coding and common sense reasoning, respectively.

Status: Current
License: MIT License
Model type: Base model (not instruction-tuned)

Comparison 

Sourced on: January 9, 2024

Phi-2’s performance is particularly noteworthy when compared against existing models such as Llama-2 and Mistral. Across benchmarks covering common sense reasoning, language understanding, math, and coding, Phi-2 consistently outperforms or matches these larger models. Notably, it surpasses the 70-billion-parameter Llama-2 model on multi-step reasoning tasks and competes effectively with Google Gemini Nano 2 despite being the smaller model.

| Benchmark              | Llama-2 7B | Llama-2 13B | Llama-2 70B | Mistral 7B | Phi-2 2.7B |
|------------------------|------------|-------------|-------------|------------|------------|
| BBH                    | 40.0       | 47.8        | 66.5        | 57.2       | 59.2       |
| Reasoning              | 62.2       | 65.0        | 69.2        | 66.4       | 68.8       |
| Language Understanding | 56.7       | 61.9        | 67.6        | 63.7       | 62.0       |
| Math                   | 16.5       | 34.2        | 64.1        | 46.4       | 61.1       |
| Coding                 | 21.0       | 25.4        | 38.3        | 39.4       | 53.7       |

Team 

The Phi-2 project demonstrates a remarkable blend of expertise and innovation. The team’s approach goes beyond the development of language models: it pioneers the integration of these models into advanced technologies, changing the way we think about remote communication and interaction. The team’s commitment to optimizing performance without compromising the model’s compact size is evident in its careful choice of training data and its use of sophisticated training methodologies. This not only enhances the model’s capabilities but also ensures its applicability across a broad range of practical scenarios, paving the way for further advances in the field of AI.

Contributors

Marah Abdin, Jyoti Aneja, Sebastien Bubeck, Caio César Teodoro Mendes, Weizhu Chen, Allie Del Giorno, Ronen Eldan, Sivakanth Gopi, Suriya Gunasekar, Mojan Javaheripi, Piero Kauffmann, Yin Tat Lee, Yuanzhi Li, Anh Nguyen, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Michael Santacroce, Harkirat Singh Behl, Adam Tauman Kalai, Xin Wang, Rachel Ward, Philipp Witte, Cyril Zhang, Yi Zhang

Community 

Phi-2 is a Transformer-based language model trained with a next-word prediction objective on 1.4 trillion tokens drawn from a diverse mix of synthetic and web sources for NLP and coding applications. Training took 14 days on 96 A100 GPUs. As a base model, Phi-2 has not been aligned through reinforcement learning from human feedback (RLHF), nor has it undergone instruction-based fine-tuning. Nevertheless, it exhibits reduced toxicity and bias, surpassing other open-source models that have undergone alignment. This improvement is consistent with what was observed in its predecessor, Phi-1.5, and is attributable to a specialized approach to data curation.
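Because Phi-2 ships as a standard causal language model, it can be loaded and queried with a few lines of the Hugging Face transformers API. The sketch below is a minimal illustration rather than an official usage guide: it assumes the transformers, torch, and accelerate packages are installed and uses the public Hub id microsoft/phi-2.

```python
# Minimal sketch: load Phi-2 and run next-word prediction (greedy decoding).
# Assumes `transformers`, `torch`, and `accelerate` are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # at 2.7B parameters, fits on a single GPU in fp16
    device_map="auto",          # requires the `accelerate` package
)

# Phi-2 is a base model (no instruction tuning), so it works best as a
# plain text completer: it continues the prompt rather than following
# chat-style instructions.
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that very old transformers releases required trust_remote_code=True to load this checkpoint; recent releases support the Phi architecture natively.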

Phi-2 is available in the Azure AI Studio model catalog, inviting researchers to explore areas like mechanistic interpretability and safety improvements. The community around Phi-2, while still growing, is poised for significant involvement, particularly in fine-tuning experiments across various tasks.
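For readers interested in such fine-tuning experiments, a parameter-efficient setup using the peft library might look like the following. This is a hedged sketch, not the Phi-2 team’s method: the LoRA hyperparameters are illustrative, and the target module names (q_proj, k_proj, v_proj, dense) match the Phi implementation that ships natively in recent transformers releases; older remote-code versions of the checkpoint used different names, so verify against model.named_modules() on your install.

```python
# Hypothetical LoRA fine-tuning setup for Phi-2 using the `peft` library.
# Hyperparameter values are illustrative, not tuned figures from the Phi-2 team.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")

lora_config = LoraConfig(
    r=16,            # rank of the low-rank adapter matrices
    lora_alpha=32,   # scaling applied to the adapter updates
    lora_dropout=0.05,
    # Attention projections in the natively supported Phi architecture;
    # check model.named_modules() if your transformers version differs.
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapters train: a tiny
                                    # fraction of the 2.7B base weights
```

Freezing the base weights and training only low-rank adapters keeps the memory footprint small, which suits a 2.7B-parameter model intended to run on modest hardware.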

Designed as an open-source model, Phi-2 is accessible to the broader research community and has been downloaded more than 174,000 times, reflecting its popularity and utility. The model serves as a valuable tool for researchers focused on addressing critical safety challenges in AI, such as reducing toxicity, understanding societal biases, enhancing controllability, and exploring other key aspects of safe and responsible AI development.

Active members: 1-10
Engagement level: Medium