Llama 3.2 introduces lightweight models optimized for mobile and edge devices, including the 1B and 3B parameter versions. These distilled models maintain high performance while minimizing computational resources. With a 128K token context length, they excel in text-based tasks such as summarization and translation, and they also handle instructions effectively. These models are particularly suited for applications requiring low latency, including augmented reality, healthcare diagnostics, and environmental monitoring.
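The 128K-token window still has a ceiling, so long-document summarization typically means splitting input to fit the context budget. A minimal sketch of that chunking step, assuming a rough characters-per-token heuristic rather than an exact tokenizer (the constants and function name here are illustrative, not part of Llama 3.2):

```python
# Hedged sketch: split a long document into chunks that fit a context
# window such as Llama 3.2's 128K-token limit.
# CHARS_PER_TOKEN is a rough heuristic, not an exact tokenizer ratio.
CONTEXT_TOKENS = 128_000
CHARS_PER_TOKEN = 4        # rough average for English text
RESERVED = 2_000           # tokens held back for the prompt and the reply

def chunk_for_context(text: str, context_tokens: int = CONTEXT_TOKENS) -> list[str]:
    """Split text into pieces that each fit the usable context budget."""
    budget_chars = (context_tokens - RESERVED) * CHARS_PER_TOKEN
    return [text[i:i + budget_chars] for i in range(0, len(text), budget_chars)]
```

In practice you would measure lengths with the model's own tokenizer; the fixed ratio above only approximates the budget.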
The 1B and 3B lightweight models are text-only and are benchmarked against comparably sized competitors such as Gemma 2 2B IT and Phi-3.5-mini IT. As the table below shows, they hold up well across general, tool-use, math, reasoning, long-context, and multilingual tasks while operating with fewer resources.
| Type | Benchmark | Llama 3.2 1B | Llama 3.2 3B | Gemma 2 2B IT | Phi-3.5-mini IT |
|---|---|---|---|---|---|
| General | MMLU (5-shot) | 49.3 | 63.4 | 57.8 | 69 |
| General | Open-rewrite eval (0-shot, regular) | 41.6 | 40.1 | 31.2 | 34.5 |
| General | TLDR9+ (best, 5-shot, regular) | 16.8 | 19 | 13.9 | 12.8 |
| General | IFEval | 59.5 | 77.4 | 61.9 | 59.2 |
| Tool use | BFCL V2 | 25.7 | 67 | 27.4 | 58.4 |
| Tool use | Nexus | 13.5 | 34.3 | 21 | 26.1 |
| Math | GSM8K (8-shot, CoT) | 44.4 | 77.7 | 62.5 | 86.2 |
| Math | MATH (5-shot, CoT) | 30.6 | 48 | 23.8 | 44.2 |
| Reasoning | ARC Challenge (0-shot) | 59.4 | 78.6 | 76.7 | 87.4 |
| Reasoning | GPQA (2-shot) | 27.2 | 32.8 | 27.5 | 31.9 |
| Reasoning | Hellaswag (3-shot) | 41.2 | 69.8 | 61.1 | 81.4 |
| Long Context | InfiniteBench/En.MC (128k) | 38 | 63.3 | 39.2 | |
| Long Context | InfiniteBench/En.QA (128k) | 20.3 | 19.8 | 11.3 | |
| Long Context | NIH/Multi-needle | 75 | 84.7 | 52.7 | |
| Multilingual | MGSM (8-shot, CoT) | 24.5 | 58.2 | 40.2 | 49.8 |
The Llama 3.2 project involved a collaboration between Meta’s AI research team and Qualcomm, focusing on optimizing AI for mobile and edge platforms. The team consists of a mix of AI engineers and researchers specializing in model distillation and edge computing solutions. The efforts align with Meta’s broader goal of making AI more accessible and efficient for everyday devices.
The Llama 3.2 community is active, particularly on Hugging Face, where developers and researchers collaborate on fine-tuning the models and deploying them in low-latency environments such as mobile devices.
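A minimal sketch of loading the 1B instruct model through the Hugging Face `transformers` text-generation pipeline, which is the usual starting point in that community. The model id, system prompt, and generation settings below are assumptions for illustration, not details from this article:

```python
# Hypothetical sketch of calling Llama 3.2 1B via Hugging Face transformers.
# The repo id and settings below are assumptions; gated models also require
# accepting the license and authenticating with a Hugging Face token.
MODEL_ID = "meta-llama/Llama-3.2-1B-Instruct"  # assumed repo name

def build_messages(user_text: str) -> list[dict]:
    """Build a chat-format request in the shape transformers pipelines accept."""
    return [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": user_text},
    ]

def summarize(text: str) -> str:
    # Heavy import kept local so the helper above stays dependency-free.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    out = generator(build_messages(f"Summarize:\n{text}"), max_new_tokens=128)
    # Chat-style input returns the full message list; take the last reply.
    return out[0]["generated_text"][-1]["content"]
```

On-device deployments would typically go further, e.g. a quantized export, but the pipeline call above is the common first step for experimentation.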