Meta Releases Llama 3.2 Models with Vision Capability For the First Time

Sep. 25, 2024



Llama 3.2 Models Are Optimized for On-Device Tasks

First of all, Llama 3.2 includes two smaller models, Llama 3.2 1B and 3B, for on-device tasks. Meta says these small models are optimized to work on mobile devices and laptops.

The Llama 3.2 1B and 3B models are best suited for on-device summarization, instruction following, rewriting, and even function calling to create an action intent locally. Meta also claims that its latest Llama models outperform Google’s Gemma 2 2.6B and Microsoft’s Phi-3.5-mini.

Basically, developers can deploy these models on Qualcomm and MediaTek platforms to power many AI use cases. Meta further says the Llama 3.2 1B and 3B models are pruned and distilled from the larger Llama 3.1 8B and 70B models.

Now coming to the exciting vision models, they come in larger sizes: Llama 3.2 11B and Llama 3.2 90B. They replace the older text-only Llama 3.1 8B and 70B models. Meta goes on to say that the Llama 3.2 11B and 90B models rival closed models like Anthropic’s Claude 3 Haiku and OpenAI’s GPT-4o mini in visual reasoning.

These new Llama 3.2 11B and 90B vision models will be available through the Meta AI chatbot on the web, WhatsApp, Instagram, Facebook, and Messenger. Since these are vision models, you can upload images and ask questions about them. For example, you can upload an image of a recipe, and it can analyze it and give you instructions on how to make it. You can also have Meta AI capture your face and reimagine yourself in tons of different scenarios and portraits.

The vision models also come in handy for understanding charts and graphs. On social media apps like Instagram and WhatsApp, the vision models can also generate captions for you.

Overall, Meta has released multimodal models for the first time under an open-source license. It is going to be pretty exciting to test the vision models against the competition.
