The page introduces Moondream, a 1.6B-parameter vision language model built from the SigLIP vision encoder and the Phi-1.5 language backbone, trained on the LLaVA dataset. Because the LLaVA training data was used, the model weights are licensed under CC-BY-SA. A live demo is available on Hugging Face Spaces.

Vikhyat
March 3, 2024
Moondream: Tiny Vision Language Model on GitHub
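Since the model is distributed through Hugging Face, a minimal sketch of querying it with the transformers library follows. The hub id `vikhyatk/moondream1` and the `encode_image`/`answer_question` helpers (exposed through the model's custom code via `trust_remote_code`) are assumptions based on how such models are typically packaged, not verified against the repository.

```python
# Minimal sketch: querying Moondream through Hugging Face transformers.
# Assumptions (not verified against the repo): the hub id
# "vikhyatk/moondream1" and the encode_image()/answer_question()
# helpers exposed by the model's custom code via trust_remote_code.
from transformers import AutoModelForCausalLM, AutoTokenizer
from PIL import Image

model_id = "vikhyatk/moondream1"
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Encode the image once with the SigLIP vision encoder, then ask a question
# answered by the Phi-1.5 text backbone.
image = Image.open("example.jpg")
image_embeds = model.encode_image(image)
answer = model.answer_question(image_embeds, "Describe this image.", tokenizer)
print(answer)
```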