In this video, Brillibits discusses the new Mixtral model released by Mistral AI, an 8x7B mixture-of-experts (MoE) model that outperforms Llama 2 70B while offering significantly faster inference. The router activates only two of the eight experts per token, so only a fraction of the model's parameters (roughly 13 of its 47 billion) is used in each forward pass. Brillibits gives a detailed overview of the model and explains how to fine-tune it on custom datasets. The hardware requirements for fine-tuning are roughly 48GB of VRAM (two RTX 3090s or RTX 4090s) and at least 32GB of system RAM.

The video covers building an instruct dataset from the Dolly 15K dataset and the prompt format the instruct model expects. Brillibits then walks through the fine-tuning process with the Finetune_LLMs software, highlighting the important flags and options. The performance characteristics of the fine-tuned model are discussed, and a demonstration of serving it with Text Generation Inference is provided. Brillibits also shares thoughts on the future of mixture-of-experts models and the potential to improve quality by routing to more experts at a time. The video concludes with a call to action for viewers to like, subscribe, and join the Discord community for further discussion.
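
To make the "two experts per token" idea concrete, here is a minimal sketch of top-2 MoE routing in PyTorch. It is illustrative only, not the Mixtral source: the layer dimensions, the simple two-layer expert MLPs, and the class name `Top2MoELayer` are assumptions, but the routing logic (score all experts, keep the top two, renormalize their weights, and mix their outputs) matches the mechanism described in the video.

```python
# Illustrative top-2 mixture-of-experts routing (simplified, not the Mixtral implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoELayer(nn.Module):
    def __init__(self, hidden_dim=4096, ffn_dim=14336, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router scores every expert for every token.
        self.router = nn.Linear(hidden_dim, num_experts, bias=False)
        # Simplified expert MLPs (Mixtral's real experts use a gated SwiGLU block).
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_dim, ffn_dim, bias=False),
                nn.SiLU(),
                nn.Linear(ffn_dim, hidden_dim, bias=False),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (num_tokens, hidden_dim)
        scores = self.router(x)                            # (tokens, num_experts)
        weights, idx = torch.topk(scores, self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)               # renormalize over the chosen 2
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                   # tokens sent to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out
```

Because only two expert MLPs run per token, the compute per forward pass is far below what the full parameter count suggests, which is why the model can be faster than a dense 70B model.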
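For the dataset step, the sketch below converts Dolly 15K into plain instruction/response training text with the Hugging Face `datasets` library. The "### Instruction / ### Context / ### Response" template and the output filename are assumptions for illustration; the exact prompt format used in the video's Finetune_LLMs pipeline may differ.

```python
# Sketch: turning databricks/databricks-dolly-15k into instruct-style training records.
from datasets import load_dataset

dolly = load_dataset("databricks/databricks-dolly-15k", split="train")

def to_instruct_text(example):
    # Dolly rows have "instruction", optional "context", and "response" fields.
    context = f"\n\n### Context:\n{example['context']}" if example["context"] else ""
    text = (
        f"### Instruction:\n{example['instruction']}{context}"
        f"\n\n### Response:\n{example['response']}"
    )
    return {"text": text}

instruct_ds = dolly.map(to_instruct_text, remove_columns=dolly.column_names)
instruct_ds.to_json("dolly15k_instruct.jsonl")  # one {"text": ...} record per line
```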
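And for the inference demonstration, a fine-tuned model served with Text Generation Inference can be queried over its REST `/generate` endpoint. The host, port, prompt, and generation parameters below are assumptions; adjust them to match your own TGI deployment.

```python
# Sketch: querying a running Text Generation Inference server.
import requests

resp = requests.post(
    "http://localhost:8080/generate",  # assumes TGI is listening locally on port 8080
    json={
        "inputs": "### Instruction:\nExplain what a mixture of experts model is.\n\n### Response:\n",
        "parameters": {"max_new_tokens": 256, "temperature": 0.7},
    },
    timeout=120,
)
print(resp.json()["generated_text"])
```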

Brillibits
July 7, 2024
Finetune LLMs GitHub
22:35