LoRAMoE is a plugin-style Mixture of Experts (MoE) that uses multiple LoRA adapters as experts combined by a router, effectively preventing world knowledge forgetting in large language models (LLMs) during supervised fine-tuning.
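For concreteness, below is a minimal sketch of what a LoRAMoE-style layer could look like in PyTorch: a frozen pretrained linear layer augmented with several LoRA experts whose outputs are mixed by a router. The class and parameter names (LoRAMoELinear, num_experts, lora_rank, lora_alpha) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRAMoELinear(nn.Module):
    """Sketch: frozen pretrained linear layer plus a mixture of LoRA experts."""

    def __init__(self, base_linear: nn.Linear, num_experts: int = 4,
                 lora_rank: int = 8, lora_alpha: float = 16.0):
        super().__init__()
        self.base = base_linear
        # Freeze the backbone weights so the world knowledge stored in them is untouched.
        for p in self.base.parameters():
            p.requires_grad = False

        in_dim, out_dim = base_linear.in_features, base_linear.out_features
        self.scaling = lora_alpha / lora_rank
        # Each expert is a low-rank (LoRA) adapter: down-projection A and up-projection B.
        self.lora_A = nn.ModuleList(
            [nn.Linear(in_dim, lora_rank, bias=False) for _ in range(num_experts)])
        self.lora_B = nn.ModuleList(
            [nn.Linear(lora_rank, out_dim, bias=False) for _ in range(num_experts)])
        for B in self.lora_B:
            nn.init.zeros_(B.weight)  # experts start at zero, so the layer initially equals the frozen backbone
        # Router produces a soft weighting over experts for each token.
        self.router = nn.Linear(in_dim, num_experts, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gate = F.softmax(self.router(x), dim=-1)              # (..., num_experts)
        expert_out = torch.stack(
            [B(A(x)) * self.scaling for A, B in zip(self.lora_A, self.lora_B)],
            dim=-1)                                           # (..., out_dim, num_experts)
        moe_out = (expert_out * gate.unsqueeze(-2)).sum(-1)   # router-weighted sum of experts
        return self.base(x) + moe_out
```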
Key features
LoRAMoE preserves the integrity of world knowledge by freezing the backbone model during training and updating only the LoRA experts and the router.
LoRAMoE uses a localized balancing constraint to softly partition the experts, dedicating part of them to preserving world knowledge and the rest to downstream tasks; a sketch of this constraint follows below.
Experiments show that even dramatically increasing the amount of instruction data does not lead to world knowledge forgetting.
LoRAMoE also improves performance on downstream tasks, indicating the potential of our approach for multi-task learning.
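To illustrate the localized balancing constraint, the sketch below computes an auxiliary loss that up-weights the routing importance of experts whose group matches the type of the current sample and penalizes imbalance with a squared coefficient of variation. The function name, the grouping scheme, and the delta coefficient are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def localized_balancing_loss(gate_probs: torch.Tensor,
                             expert_group: torch.Tensor,
                             sample_group: torch.Tensor,
                             delta: float = 0.1,
                             eps: float = 1e-8) -> torch.Tensor:
    """
    gate_probs:   (batch, num_experts) router weights per sample (e.g. averaged over tokens).
    expert_group: (num_experts,) integer group id of each expert
                  (e.g. 0 = world-knowledge preservation, 1 = downstream tasks).
    sample_group: (batch,) integer group id of each training sample.
    """
    # Up-weight experts whose group matches the sample's type and down-weight the rest,
    # so the router specializes groups of experts while staying balanced within each group.
    match = (expert_group.unsqueeze(0) == sample_group.unsqueeze(1)).float()
    coeff = (1.0 + delta) * match + (1.0 - delta) * (1.0 - match)
    weighted = coeff * gate_probs
    # Penalize dispersion of the weighted importance (squared coefficient of variation).
    return weighted.var() / (weighted.mean() ** 2 + eps)
```

During fine-tuning, such a term would typically be added to the task loss with a small coefficient, e.g. loss = task_loss + beta * localized_balancing_loss(...).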