8x7B 32k
The Mistral ‘Mixtral’ 8x7B 32k model is a sparse Mixture of Experts (MoE) architecture with 8 expert feed-forward networks per layer, of which a router selects two per token, and it supports a 32K-token context window.
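
To make the routing idea concrete, here is a minimal, hypothetical sketch of a top-2 sparse MoE layer in PyTorch. The class name, dimensions, and the plain SiLU MLP experts are illustrative assumptions, not Mixtral's actual implementation (which uses gated SwiGLU experts and other details omitted here); only the 8-experts / top-2-routing structure mirrors the description above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Illustrative sparse MoE feed-forward layer (hypothetical sketch).

    A linear router scores 8 expert MLPs per token; each token is sent
    to its top-2 experts, and their outputs are combined with
    softmax-normalized router weights.
    """

    def __init__(self, dim=512, hidden=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, n_experts, bias=False)
        # Simplified experts: plain two-layer MLPs stand in for
        # Mixtral's gated feed-forward blocks.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, dim)
        logits = self.router(x)                        # (tokens, n_experts)
        weights, idx = torch.topk(logits, self.top_k)  # top-2 experts per token
        weights = F.softmax(weights, dim=-1)           # normalize over the chosen two
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                  # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

# Quick check: 4 tokens of width 512 in, same shape out.
layer = SparseMoELayer()
print(layer(torch.randn(4, 512)).shape)  # torch.Size([4, 512])
```

Because only two of the eight experts run per token, the layer's active compute per token is roughly a quarter of a dense layer with the same total parameter count, which is the core appeal of the sparse MoE design.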