Image Captioning Archives

Florence 2 VLM: Best Small Vision-Language Model?

Posted by Fede Nolasco | Oct 14, 2024

Explore Florence 2 VLM by Microsoft, a small but capable vision-language model trained on a dataset of 5.4 billion annotations. Learn about its capabilities in image captioning, object detection, and segmentation.