PHI 3.5 MoE Instruct Model

Sep 29, 2024

The PHI 3.5 MoE model is a multi-lingual Mixture-of-Experts (MoE) model with 16 experts and 42B total parameters, of which only 6.6B are active for any given token. This sparse design lets it match or outperform many larger dense models while keeping inference cost low.

PHI 3.5 MoE is designed as a multi-lingual, multi-task model: of its 42B parameters, only 6.6B are activated per token at inference time. It is optimized for reasoning, translation, multi-lingual understanding, summarization, and code generation. The MoE routing specializes different experts in distinct domains such as STEM and the social sciences, which keeps task handling efficient. The model also follows safety protocols aligned with Microsoft’s Responsible AI principles, making it suitable for large-scale, responsible AI deployments.
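
To make the efficiency claim concrete, the sketch below shows how top-2 expert routing activates only a small subset of a layer's parameters for each token. This is a minimal illustrative PyTorch example, not the actual Phi-3.5 MoE implementation; the layer sizes, expert design, and routing details are simplified assumptions.

```python
# Minimal sketch of top-2 Mixture-of-Experts routing.
# Dimensions and expert design are illustrative, not the real Phi-3.5-MoE config.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Expert(nn.Module):
    """One gated feed-forward expert (SwiGLU-style MLP)."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.gate = nn.Linear(d_model, d_ff, bias=False)
        self.up = nn.Linear(d_model, d_ff, bias=False)
        self.down = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))


class Top2MoE(nn.Module):
    """Routes each token to 2 of n_experts experts, so only a fraction
    of the layer's parameters is used per token."""
    def __init__(self, d_model: int, d_ff: int, n_experts: int = 16, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([Expert(d_model, d_ff) for _ in range(n_experts)])
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        logits = self.router(x)                               # (tokens, n_experts)
        weights, idx = torch.topk(logits, self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)                  # normalize over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                         # tokens whose k-th choice is expert e
                if mask.any():
                    w = weights[mask, k].unsqueeze(-1)
                    out[mask] += w * expert(x[mask])
        return out


tokens = torch.randn(8, 512)                                  # 8 tokens, toy hidden size 512
print(Top2MoE(d_model=512, d_ff=2048)(tokens).shape)          # torch.Size([8, 512])
```

In the full model, 16 such experts per MoE layer account for the 42B total parameters, while routing each token through only two of them keeps the active parameter count near 6.6B.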

Status: Current
License: Commercial License
Type: Pretrained, Instruction-tuned

Comparison 

Sourced on: September 29, 2024

PHI 3.5 MoE Instruct surpasses models such as Mistral-Nemo-12B-instruct and Llama-3.1-8B-instruct on multi-lingual and reasoning benchmarks. With 42B total parameters but only 6.6B active at a time, it is highly efficient at language modeling and understanding, and it balances scalability, performance, and safety while remaining cost-effective for commercial use. In the benchmarks below it is compared against several high-performance models and shows strong results in multi-lingual, summarization, and code-generation tasks.

| Category | Benchmark | Phi-3.5-MoE-instruct | Mistral-Nemo-12B-instruct-2407 | Llama-3.1-8B-instruct | Gemma-2-9b-It | Gemini-1.5-Flash | GPT-4o-mini-2024-07-18 (Chat) |
|---|---|---|---|---|---|---|---|
| Popular aggregated benchmark | Arena Hard | 37.9 | 39.4 | 25.7 | 42.0 | 55.2 | 75.0 |
| Popular aggregated benchmark | BigBench Hard CoT (0-shot) | 79.1 | 60.2 | 63.4 | 63.5 | 66.7 | 80.4 |
| Popular aggregated benchmark | MMLU (5-shot) | 78.9 | 67.2 | 68.1 | 71.3 | 78.7 | 77.2 |
| Popular aggregated benchmark | MMLU-Pro (0-shot, CoT) | 54.3 | 40.7 | 44.0 | 50.1 | 57.2 | 62.8 |
| Reasoning | ARC Challenge (10-shot) | 91.0 | 84.8 | 83.1 | 89.8 | 92.8 | 93.5 |
| Reasoning | BoolQ (2-shot) | 84.6 | 82.5 | 82.8 | 85.7 | 85.8 | 88.7 |
| Reasoning | GPQA (0-shot, CoT) | 36.8 | 28.6 | 26.3 | 29.2 | 37.5 | 41.1 |
| Reasoning | HellaSwag (5-shot) | 83.8 | 76.7 | 73.5 | 80.9 | 67.5 | 87.1 |
| Reasoning | OpenBookQA (10-shot) | 89.6 | 84.4 | 84.8 | 89.6 | 89.0 | 90.0 |
| Reasoning | PIQA (5-shot) | 88.6 | 83.5 | 81.2 | 83.7 | 87.5 | 88.7 |
| Reasoning | Social IQA (5-shot) | 78.0 | 75.3 | 71.8 | 74.7 | 77.8 | 82.9 |
| Reasoning | TruthfulQA (MC2) (10-shot) | 77.5 | 68.1 | 69.2 | 76.6 | 76.6 | 78.2 |
| Reasoning | WinoGrande (5-shot) | 81.3 | 70.4 | 64.7 | 74.0 | 74.7 | 76.9 |
| Multi-lingual | MMLU (5-shot) | 69.9 | 58.9 | 56.2 | 63.8 | 77.2 | 72.9 |
| Math | MGSM (0-shot, CoT) | 58.7 | 63.3 | 56.7 | 75.1 | 75.8 | 81.7 |
| Math | GSM8K (8-shot, CoT) | 88.7 | 84.2 | 82.4 | 84.9 | 82.4 | 91.3 |
| Math | MATH (0-shot, CoT) | 59.5 | 31.2 | 47.6 | 50.9 | 38.0 | 70.2 |
| Long context | Qasper | 40.0 | 30.7 | 37.2 | 13.9 | 43.5 | 39.8 |
| Long context | SQuALITY | 24.1 | 25.8 | 26.2 | 0.0 | 23.5 | 23.8 |
| Code Generation | HumanEval (0-shot) | 70.7 | 63.4 | 66.5 | 61.0 | 74.4 | 86.6 |
| Code Generation | MBPP (3-shot) | 80.8 | 68.1 | 69.4 | 69.3 | 77.5 | 84.1 |
| Average | | 69.2 | 61.3 | 61.0 | 63.3 | 68.5 | 74.9 |
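
The Average row appears to be the unweighted mean of the 21 benchmark scores in each column; the short Python snippet below reproduces the 69.2 figure for the Phi-3.5-MoE-instruct column from the values in the table above.

```python
# Unweighted mean of the 21 Phi-3.5-MoE-instruct scores from the table above.
phi_scores = [
    37.9, 79.1, 78.9, 54.3,              # popular aggregated benchmarks
    91.0, 84.6, 36.8, 83.8, 89.6,        # reasoning
    88.6, 78.0, 77.5, 81.3,
    69.9,                                # multi-lingual MMLU
    58.7, 88.7, 59.5,                    # math
    40.0, 24.1,                          # long context
    70.7, 80.8,                          # code generation
]
print(round(sum(phi_scores) / len(phi_scores), 1))  # 69.2, matching the Average row
```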

Team 

The PHI 3.5 MoE model was developed by a large team at Microsoft specializing in AI and responsible machine learning. The team worked across multiple global regions, drawing on Microsoft’s AI expertise and focusing on ethical development to ensure robust safety and performance.

Community 

Microsoft provides substantial community support for the PHI 3.5 model, including a dedicated GitHub repository and Azure AI Studio resources. The model is available for developers and data scientists to experiment with through accessible API deployments, and the community offers ongoing updates and discussions around best practices.

Active Members: 1001-5000 Members
Engagement Level: High Engagement
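
For local experimentation, the sketch below loads the model with the Hugging Face transformers library. It assumes the checkpoint is published as microsoft/Phi-3.5-MoE-instruct and that you have enough GPU memory for the 42B-parameter weights; adjust the dtype, device map, and generation settings to your setup.

```python
# Minimal local-inference sketch using Hugging Face transformers.
# Assumes the checkpoint id "microsoft/Phi-3.5-MoE-instruct"; device_map="auto"
# additionally requires the accelerate package.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3.5-MoE-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # reduce memory; change to suit your hardware
    device_map="auto",
    trust_remote_code=True,       # allow the repo's custom modeling code if needed
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the benefits of Mixture-of-Experts models."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```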

Resources

List of resources related to this product.