Thanks to Faith's invitation, I attended the Meta Connect conference in the U.S. Faith was very kind and gave me plenty of pointers to the AI-related sub-forums, but I still couldn't find the sub-venue for the Llama launch. So I ended up watching the introduction of the latest model, Llama 3.2, from outside the venue on my phone, roaming on my Singapore number (I couldn't find the WiFi either). A lonely experience 😂
Llama 3.2 comes in two versions: lightweight and multimodal.

Lightweight 1B and 3B models
These are Meta's lightest and most efficient models yet, capable of running on mobile and edge devices. They excel at multilingual text generation and tool calling, and they let developers build personalized, agentic applications that run entirely on-device, with strong privacy protection because data never leaves the device. For example, an application could summarize the last 10 received messages, extract the key to-do items, and use tool calls to send calendar invitations for follow-up meetings, as sketched below.
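To make that concrete, here is a minimal sketch of on-device tool calling, assuming Ollama is serving the 3B model locally via the ollama Python package; the model tag, the tool schema, and the send_calendar_invite helper are illustrative assumptions, not part of Meta's announcement.

```python
# Sketch: local tool calling with a lightweight Llama 3.2 model via Ollama.
# Assumes `pip install ollama` and `ollama pull llama3.2:3b` have been run.
import ollama

def send_calendar_invite(title: str, time: str) -> str:
    """Hypothetical local helper that would talk to the calendar app."""
    return f"Invite '{title}' scheduled for {time}"

# OpenAI-style JSON schema describing the tool to the model.
tools = [{
    "type": "function",
    "function": {
        "name": "send_calendar_invite",
        "description": "Schedule a follow-up meeting on the user's calendar",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string", "description": "Meeting title"},
                "time": {"type": "string", "description": "ISO 8601 start time"},
            },
            "required": ["title", "time"],
        },
    },
}]

messages = [{
    "role": "user",
    "content": "Summarize my last 10 messages and schedule any follow-ups.",
}]

response = ollama.chat(model="llama3.2:3b", messages=messages, tools=tools)

# If the model chose to call the tool, execute it locally; nothing leaves the device.
for call in response["message"].get("tool_calls") or []:
    if call["function"]["name"] == "send_calendar_invite":
        print(send_calendar_invite(**call["function"]["arguments"]))
```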
There are two main advantages to running these models locally. First, prompts and responses feel almost instantaneous because everything is processed on the device. Second, keeping messages and calendar data off the cloud makes the whole application more privacy-friendly. And since processing happens locally, the application can explicitly control which queries stay on the device and which are escalated to larger models in the cloud, as in the routing sketch below.
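A possible shape for that routing decision, again assuming a local Ollama endpoint; the is_sensitive heuristic and the cloud_chat stub are hypothetical placeholders rather than anything Meta ships.

```python
# Sketch: route personal-data queries to the local model, everything else to the cloud.
import ollama

PRIVATE_KEYWORDS = ("message", "calendar", "contact", "photo")

def is_sensitive(query: str) -> bool:
    """Crude heuristic: queries touching personal data stay on the device."""
    return any(word in query.lower() for word in PRIVATE_KEYWORDS)

def cloud_chat(query: str) -> str:
    """Placeholder for a call to a larger hosted model (e.g. 90B)."""
    raise NotImplementedError("wire up a cloud provider here")

def answer(query: str) -> str:
    if is_sensitive(query):
        # Handled entirely by the local 3B model; data never leaves the device.
        reply = ollama.chat(model="llama3.2:3b",
                            messages=[{"role": "user", "content": query}])
        return reply["message"]["content"]
    # Non-sensitive queries can be escalated to a larger cloud model.
    return cloud_chat(query)

print(answer("What's on my calendar tomorrow?"))  # stays local
```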
Multimodal 11B and 90B models
Llama Stack
Llama Stack provides a complete, seamless toolchain for building agentic applications.
This codebase contains the Llama Stack API specifications, along with API providers and Llama Stack distributions. Llama Stack aims to define and standardize the core building blocks of generative AI applications across the entire development lifecycle: from model training and fine-tuning, through product evaluation, to building and running AI agents in production. Beyond defining the standard, Meta is also developing providers for the Llama Stack APIs, both open-source implementations and partner integrations, so that developers can assemble AI solutions from consistent modules across platforms.
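For a feel of what that consistent surface looks like, here is a minimal sketch against the inference API, assuming the llama-stack-client Python SDK and a Distribution Server on localhost:5000; parameter names have shifted between releases, so treat this as indicative rather than authoritative.

```python
# Sketch: calling the Llama Stack inference API through the Python client SDK.
# Assumes `pip install llama-stack-client` and a running Distribution Server.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5000")

response = client.inference.chat_completion(
    model_id="Llama3.2-3B-Instruct",  # model id depends on your distribution
    messages=[{"role": "user", "content": "What is Llama Stack?"}],
)
print(response.completion_message.content)
```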
The full release includes:

- Llama CLI: for building, configuring, and running Llama Stack distributions
- Client code in multiple languages: Python, Node, Kotlin, and Swift
- Docker containers: supporting the Llama Stack Distribution Server and the Agents API Provider
- Multiple distributions:
  - Single-node Llama Stack distribution, via Meta's internal implementation and Ollama
  - Cloud Llama Stack distributions, supporting AWS, Databricks, Fireworks, and Together
  - On-device Llama Stack distribution, implemented on iOS via PyTorch ExecuTorch
  - Locally deployed (on-prem) Llama Stack distribution, supported by Dell
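Since every distribution exposes the same Llama Stack APIs, moving between them should, in principle, only require pointing the client at a different endpoint; both URLs below are illustrative placeholders, not real deployments.

```python
# Sketch: the same client code targets a local or a hosted distribution.
from llama_stack_client import LlamaStackClient

# Single-node distribution running on this machine (e.g. backed by Ollama).
local_client = LlamaStackClient(base_url="http://localhost:5000")

# Hosted distribution (e.g. AWS, Databricks, Fireworks, or Together).
cloud_client = LlamaStackClient(base_url="https://llama-stack.example.com")

# The chat_completion call from the previous sketch works against either client.
```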