meta-llama/Llama-3.2-90B-Vision-Instruct Image-Text-to-Text • 89B • Updated Mar 4, 2025 • 23.3k • 352
Chat With Janus-Pro-7B 🌍 • A unified multimodal understanding and generation model. • 2.02k
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated Dec 10, 2025 • 334k • 1.58k