Phi-2

Jan 9, 2024

In a recent breakthrough at Microsoft Ignite 2023, CEO Satya Nadella unveiled "Phi-2," a remarkable development in the world of small language models (SLMs) by Microsoft Research's Machine Learning Foundations team. This article dives into the intricacies and achievements of Phi-2, a 2.7 billion-parameter language model that stands out in the realm of base language models with fewer than 13 billion parameters.

Phi-2 represents a significant leap in language modeling, demonstrating impressive reasoning and language understanding capabilities. It matches or outperforms models up to 25 times its size, a testament to the innovative scaling and training data curation techniques employed by Microsoft Research. Phi-2 follows the 1.3-billion-parameter models Phi-1 and Phi-1.5, which already showed state-of-the-art performance in Python coding and common sense reasoning, respectively.

Status: Current
License: MIT License
Model type: Base model (not instruction-tuned)

Comparison 

Sourced on: January 9, 2024

Phi-2’s performance is particularly noteworthy when compared against existing models such as Llama-2 and Mistral. Across benchmarks covering common sense reasoning, language understanding, math, and coding, Phi-2 consistently outperforms or matches these larger models. Notably, it surpasses the 70-billion-parameter Llama-2 model on multi-step reasoning tasks and competes effectively with Google Gemini Nano 2 despite being the smaller model.

| Benchmark              | Llama-2 7B | Llama-2 13B | Llama-2 70B | Mistral 7B | Phi-2 2.7B |
|------------------------|------------|-------------|-------------|------------|------------|
| BBH                    | 40.0       | 47.8        | 66.5        | 57.2       | 59.2       |
| Reasoning              | 62.2       | 65.0        | 69.2        | 66.4       | 68.8       |
| Language Understanding | 56.7       | 61.9        | 67.6        | 63.7       | 62.0       |
| Math                   | 16.5       | 34.2        | 64.1        | 46.4       | 61.1       |
| Coding                 | 21.0       | 25.4        | 38.3        | 39.4       | 53.7       |

Team 

The Phi-2 project demonstrates a remarkable blend of expertise and innovation. The team’s approach goes beyond the development of language models: it pioneers the integration of these models into advanced technologies, changing the way we think about remote communication and interaction. The team’s commitment to optimizing performance without compromising the model’s compact size is evident in its careful choice of training data and its use of sophisticated training methodologies. This not only enhances the model’s capabilities but also ensures its applicability across a broad range of practical scenarios, paving the way for further advances in the field of AI.

Contributors

Marah Abdin, Jyoti Aneja, Sebastien Bubeck, Caio César Teodoro Mendes, Weizhu Chen, Allie Del Giorno, Ronen Eldan, Sivakanth Gopi, Suriya Gunasekar, Mojan Javaheripi, Piero Kauffmann, Yin Tat Lee, Yuanzhi Li, Anh Nguyen, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Michael Santacroce, Harkirat Singh Behl, Adam Tauman Kalai, Xin Wang, Rachel Ward, Philipp Witte, Cyril Zhang, Yi Zhang

Community 

Phi-2 is a Transformer-based language model trained with a next-word prediction objective on 1.4 trillion tokens drawn from a diverse mix of synthetic and web sources for NLP and coding applications. Training took 14 days on 96 A100 GPUs. As a base model, Phi-2 has not been aligned through reinforcement learning from human feedback (RLHF), nor has it undergone instruction-based fine-tuning. Nevertheless, it exhibits reduced toxicity and bias, surpassing other open-source models that have undergone alignment. This improvement is consistent with what was observed in its predecessor, Phi-1.5, and is attributable to a specialized approach to data curation.
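Because Phi-2 ships as a standard causal language model, it can be loaded and queried with a few lines of the Hugging Face transformers API. The sketch below is a minimal illustration rather than an official usage guide: it assumes the transformers, torch, and accelerate packages are installed and uses the public Hub id microsoft/phi-2.

```python
# Minimal sketch: load Phi-2 and run next-word prediction (greedy decoding).
# Assumes `transformers`, `torch`, and `accelerate` are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # at 2.7B parameters, fits on a single GPU in fp16
    device_map="auto",          # requires the `accelerate` package
)

# Phi-2 is a base model (no instruction tuning), so it works best as a
# plain text completer: it continues the prompt rather than following
# chat-style instructions.
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that very old transformers releases required trust_remote_code=True to load this checkpoint; recent releases support the Phi architecture natively.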

Phi-2 is available in the Azure AI Studio model catalog, inviting researchers to explore areas like mechanistic interpretability and safety improvements. The community around Phi-2, while still growing, is poised for significant involvement, particularly in fine-tuning experiments across various tasks.
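For readers interested in such fine-tuning experiments, a parameter-efficient setup using the peft library might look like the following. This is a hedged sketch, not the Phi-2 team’s method: the LoRA hyperparameters are illustrative, and the target module names (q_proj, k_proj, v_proj, dense) match the Phi implementation that ships natively in recent transformers releases; older remote-code versions of the checkpoint used different names, so verify against model.named_modules() on your install.

```python
# Hypothetical LoRA fine-tuning setup for Phi-2 using the `peft` library.
# Hyperparameter values are illustrative, not tuned figures from the Phi-2 team.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")

lora_config = LoraConfig(
    r=16,            # rank of the low-rank adapter matrices
    lora_alpha=32,   # scaling applied to the adapter updates
    lora_dropout=0.05,
    # Attention projections in the natively supported Phi architecture;
    # check model.named_modules() if your transformers version differs.
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapters train: a tiny
                                    # fraction of the 2.7B base weights
```

Freezing the base weights and training only low-rank adapters keeps the memory footprint small, which suits a 2.7B-parameter model intended to run on modest hardware.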

Designed as an open-source model, Phi-2 is accessible to the broader research community and has been downloaded more than 174,000 times, reflecting its popularity and utility. The model serves as a valuable tool for researchers focused on addressing critical safety challenges in AI, such as reducing toxicity, understanding societal biases, enhancing controllability, and exploring other key aspects of safe and responsible AI development.

Active members: 1-10
Engagement level: Medium