Vision

Moondream Vision Model: 1.6B Parameters, SigLIP, Phi-1.5, LLaVA Dataset

Posted by Fede Nolasco | Mar 19, 2024 | TLRD, Vision Model

Explore a 1.6B-parameter vision model built with SigLIP and Phi-1.5 and trained on the LLaVA dataset. It is available to try on Hugging Face Spaces, and its weights are licensed under CC-BY-SA.

State-of-the-Art Natural Language and Computer Vision Models by LLaVA

Posted by Fede Nolasco | Mar 18, 2024 | TLRD, Vision Model

Explore cutting-edge natural language and computer vision models for text summarization, image captioning, and sentiment analysis.