Venture Step

Unlocking Llama 3.1: Meta’s Open-Source AI Revolution

Meta's Llama 3.1 is a powerful open-source AI that rivals closed models like GPT-4, offering new tools for creators and enterprises. Listen to the full episode to learn more.

Dalton Anderson

30 Jul 2024 • 11 min read

TL;DR

Meta's Llama 3.1 isn't just an update; it's a powerful open-source AI that now competes head-to-head with closed models like GPT-4, empowering developers and creators like never before. #VentureStep #OpenSourceAI #Llama3

INTRODUCTION

Have you ever imagined an AI that acts not just as a tool, but as a true partner in innovation—one you could customize and run on your own terms? For the first time, this is a reality. The release of a highly capable, open-source foundational model means that enterprises, creators, and even individuals can now download and operate a state-of-the-art AI, breaking free from the "one size fits all" constraints of closed systems.

In this episode of VentureStep, host Dalton Anderson dives deep into Meta's groundbreaking release of Llama 3.1. Drawing on his background in programming and data science, Dalton unpacks the significance of this moment, where an open-source model is not only catching up to its closed-source counterparts but, in many cases, outperforming them. This isn't just another tech update; it's a fundamental shift in the AI landscape.

Dalton explores the technical achievements of the 405-billion parameter model, its performance against OpenAI's GPT-4, and the immense investment required to build it. He also discusses the strategic vision behind Meta's open-source push, the new suite of creator tools designed to build a powerful ecosystem, and the critical safety features like Llama Guard that aim to ensure responsible innovation. This episode is a must-listen for anyone interested in the future of AI and the power of open collaboration.

KEY TAKEAWAYS

Llama 3.1’s 405-billion parameter model is now highly competitive with top-tier closed-source models like OpenAI's GPT-4, often tying or winning in head-to-head comparisons.
The open-source nature of Llama democratizes advanced AI, allowing businesses, universities, and governments to build on foundational models without the prohibitive cost of initial development.
Meta is building a powerful ecosystem around Llama with intuitive creator tools like Segment Anything 2 and Audio Box, aiming to win creators to attract and retain users.
The massive investment in training Llama 3.1—including over $600 million in GPUs and millions in electricity costs—highlights the significant barrier to entry that Meta is helping others overcome.
Built-in safety features like Llama Guard and Prompt Guard are crucial for mitigating misuse, preventing harmful content, and establishing a framework for responsible open-source AI development.

FULL CONVERSATION

Dalton: Welcome to VentureStep podcast where we discuss entrepreneurship, industry trends, and the occasional book review. ¹Have you ever wondered what AI would be like if it wasn't just a tool, but a partner in innovation and something that you could carry around in your pocket, in your glasses, on your watch, or at your home and have your own setup for your AI? ²

Dalton: Well, you can do that now. For the first time ever you have a highly capable foundational model that you could download and run at your house. 3You only need the compute of multiple computers and a plethora of other things to get it going, but you could do it if you wanted to. 4

Can Open-Source AI Finally Compete with Closed Systems?

Dalton: Many enterprises are doing so. Llama has 3,500 enterprises using their models and Llama recently just surpassed 300 million total downloads. ⁵And that was before they had the 405 billion parameter model, which is insanely good for being open source. ⁶There was a lot of question on whether or not open source would ever be able to compete with a closed source system. ⁷And Meta, after using 16,000 H100s to train their cluster and almost 3 trillion training tokens at an 8K context and 800 billion tokens at 128K context, it seems like they've done it. ⁸

Llama 3.1 vs. GPT-4: How Does It Stack Up?

Dalton: The range of the results vary, but I won't go through all of the benchmarks. ⁹I'll just talk about a very simple diagram they have. ¹⁰They compare Llama 3.1 405 billion versus ChatGPT 4.0 and 4 Omni. ¹¹Against ChatGPT 4, Llama 3.1 won 23.3% of the time and tied 52% of the time, so it lost around 24%. ¹²Against ChatGPT 4 Omni, the newer model, it won 19.1% of the time and tied 51% of the time, so it lost around 29% of the time. ¹³

Businesses no longer have to send their data to this closed source model and they can control the weights and the parameters of these AI models, whereas in a closed source model, it's one size fits all. ¹⁴

Dalton: That's crazy since businesses no longer have to send their data to this closed source model and they can control the weights and the parameters of these AI models. ¹⁵In a closed source model, you are not given the opportunity to distill the model, to edit the weights, to do things as you wish, and to run it on your own infrastructure. ¹⁶You can do that now, and I'm super excited about it because I think it will have a positive effect. ¹⁷

Dalton: I try to just be very positive in my remarks about these releases. ¹⁸There are some things that aren't so good, and I'll call them out. ¹⁹I've called out Google several times for misleading people with their demos. ²⁰But for the most part, I try to be positive as these companies are putting themselves out there and they keep pushing each other and the rate of innovation is incredible. ²¹Every four to six weeks, there's some big announcement. ²²

The technology doesn't have any evil use cases, it's the users. ²³

Dalton: Before we dive in fully, I'm your host Dalton Anderson. ²⁴My background is a mix of programming, data science and insurance. ²⁵Offline you can find me running, building my side business, or lost in a good book. ²⁶You can listen to this podcast in video and audio on YouTube, or find the audio on Spotify, Apple Podcasts, or wherever else you get your podcasts. ²⁷

A Deep Dive into Llama 3.1's New Features

Dalton: Today, we're going to be talking about Meta's Llama 3.1. ²⁸We'll discuss some feature releases, my thoughts about the new model, and where I think things are going with open source. ²⁹They released updated models for their 70 and 8 billion parameter models, and then there is the brand new 405 billion parameter model. ³⁰

Dalton: They took their 405 billion parameter model and used about 2.75 trillion tokens for their 8K context and 800 billion tokens for their 128K context window. ³¹If you're not familiar with the context window, that's how much stuff you could put in that text box. ³²128,000 tokens is like 20 to 30 pages of text. ³³

Exploring Meta's New AI Creator Tools

Dalton: They also released these AI studio demos. ³⁴They have Segment Anything 2, which creates video cutouts and other visuals from a video you upload. ³⁵It's going to be a game changer for creators. ³⁶³⁶Seamless Translation can translate your voice to any language using your own voice as a recording; it mimics your voice, which is pretty cool. ^37373737With Animated Drawings, you can upload your own drawing and it'll animate it for you. ^38383838

Dalton: Then there's Audio Box, which is really cool. You can create an AI audio speech-generated story. 39I asked it to create a sci-fi story, and you can also add in additional sound effects, additional voices, and even record your own voice and add yourself into the story. 40They also have an AI agent platform where you can make your own AI agents and share them with your friends. 41It's just too much to put in one episode. 42

Dalton: On Facebook and Instagram, there are about 200 million people who consider themselves creators. ⁴³I guess by definition, I am a content creator. ⁴⁴These tools would definitely be useful for a professional content creator. ⁴⁵It might be that who wins the creators with these best AI features, allowing people to easily create content on your platform, you win the content creators. ⁴⁶If you win over the content creators, you win the users. ⁴⁷

Dalton: So, Llama 3.1 405 billion had an increased context window of 128K, up from 8K. ^48484848That's huge. ⁴⁹What does an increased context window do for the user? ⁵⁰It allows Llama to understand more complex topics, remember previous interactions, maintain context, and generate more accurate responses. ⁵¹It also supports long-form text summarization and document analysis. ⁵²There's now multi-language support for up to eight languages. ^53535353

The Prohibitive Cost of Training Foundational AI Models

This model release of the 405 billion is the first open source frontier model to exist. ⁵⁴

Dalton: This hasn't been done before. I mentioned earlier that Meta used 16,000 H100s from Nvidia. 55To put that in perspective, that's about $600 million worth of training equipment. 56I think it's about $35,000 per one H100. 57So they spent $600 million on just buying GPUs for their training clusters to release this model free to the public. 58It's a game changer because many countries have the technical support to utilize these models, but they don't have the infrastructure to put up $600 million for GPUs. 59

Dalton: Those 16,000 H100s run at 700 watts each. ⁶⁰That's 1.12 megawatts, and a megawatt is enough power for a thousand homes. ⁶¹So it's the consumption of a small town. ⁶²At the going rate of electricity, it would cost around $34,944 per day. ⁶³It took them months to train that AI, so they spent millions of dollars in electricity just training it. ⁶⁴

Dalton: The upfront costs are immense. ⁶⁵Even governments can't afford to do these things. ⁶⁶But once you get the model created, it's not that hard to fine-tune it. ⁶⁷ Getting the training data is also a huge barrier. Meta trained on almost three trillion tokens. ⁶⁸What this does is it allows a level of innovation that you normally wouldn't see for closed source models. ⁶⁹

Building with Llama: The Open-Source Ecosystem

Dalton: You can download it on Hugging Face, Kaggle, or directly from Meta. ⁷⁰They also have partners that support the AI via the cloud, so you can use an API. ⁷¹Google, AWS, and Databricks all have offerings. ⁷²But the two places that have support top to bottom—real-time inferencing, fine-tuning, model evaluation, safeguard rails, synthetic data generation—would be Databricks and NVIDIA. ⁷³Those are the only ones that have a full suite of offerings. ⁷⁴

Dalton: For the first time, you can edit these models yourself and fully customize it with your own data. ⁷⁵They're calling this the Llama Stack API. ⁷⁶It's a standardized inference that allows developers to create applications on top of 3.1. ⁷⁷

Building a Safer AI: Llama Guard and Prompt Guard Explained

Dalton: Within the Llama stack, you can download your safety guide rails and Llama Guard. ⁷⁸Llama Guard has the ability to detect and prevent potential misuse like hate speech, harassment, and explicit content. ⁷⁹Then they have this thing called Prompt Guard, which tries to prevent people from gaming the system. ⁸⁰A lot of times with these models, you can force them to do things they wouldn't normally do, like telling you information it's not supposed to. ^{818181818181818181}

Dalton: Prompt Guard tries to prevent, flag, and categorize the prompts you're sending to Llama. If it detects that you're trying to misuse the AI by crafting special prompts to jailbreak it, it will block you. 828282These are important because it gives companies peace of mind. 83There's criticism regarding open source AI, with the idea that it gives bad actors a chance to use these products maliciously. 84This will help protect against those bad actors. 85

Meta's Vision: Winning the AI Ecosystem

Dalton: The perspective of Mark Zuckerberg and others is if you open source these things, you can potentially protect governments and companies from bad actors because they'll have this sophisticated AI model to launch on their own systems. ⁸⁶Meta wants to ensure that the open source model wins. ⁸⁷They don't want to be reliant on a closed source model from another company. ⁸⁸They want to build a wonderful ecosystem to allow creators, companies, and innovators to use their product. ⁸⁹

Dalton: It also allows them to be a front runner in governmental conversations regarding compliance and regulation. ⁹⁰They want to be one of the most used AI assistants in the world, which they're on track to do before the end of the year, according to Mark. ⁹¹They have this AI assistant in WhatsApp, Instagram, Facebook, and Messenger. ⁹²

If you win over the content creators, you win the users because people want to interact with the content creators and interact with content. ⁹³

Dalton: Open source allows Meta to control the ecosystem and become the industry standard. ⁹⁴The example Mark Zuckerberg uses often is Linux. ⁹⁵A lot of companies were building closed operating systems, but Linux came about and became popular because you could do whatever you want with it. ⁹⁶Since it was open source, the innovation of Linux was much faster than the closed source models. ⁹⁷

Dalton: When you release your product as open source, you can have feedback on your code, features, bugs, and security from millions of people. 300 million people have downloaded Llama. 98

You put something in the hands of 300 million people, something is bound to happen that is potentially spectacular. ⁹⁹

Dalton: I'm excited to see what comes of it all. ¹⁰⁰This open source movement is going to push closed source to the brink. ¹⁰¹Closed Source is going to make some big releases, and Open Source is going to push right back. ¹⁰²

This is going to be this constant tug of war and this tug of war and competition will bring out the best of both parties. ¹⁰³

Dalton: And I just am excited to see what comes of it all. ¹⁰⁴

Live Demos: Animated Drawings and Seamless Translation

Dalton: I animated this little one-eyed alien guy. ¹⁰⁵You can upload a drawing and it comes alive. ¹⁰⁶I think it would be really awesome for a kid to see this. ¹⁰⁷ Next is the seamless translation. Here is my original voice, then a standard translation, and then the translation in Spanish that is supposed to sound like my voice. ¹⁰⁸The accent is a lot better in this translation, obviously. ¹⁰⁹

Segment Anything 2: A Game-Changer for Video Editing

Dalton: Next is the Segment Anything demo, which I think is the biggest one they've released. ¹¹⁰This platform allows you to upload videos and make changes to them on the fly. ¹¹¹You can select which objects you want to edit. ¹¹²I did the person in a race I did last weekend. ¹¹³I could erase the background, make it black and white, or add a green screen. ¹¹⁴You just click on an object to select it. ¹¹⁵

Dalton: If you're not familiar with video editing, this stuff takes a long time. With Adobe Premiere, you have to segment it and then go through the key frames of the video. 116The AI was able to do it on its own. 117It's mind blowing because something like this would take me a long time to do manually, like 40 minutes just to get the segment put together. 118

Audio Box: Creating AI-Generated Stories with Your Own Voice

Dalton: Finally, I asked this storytelling machine, Audio Box, to build me a fictional sci-fi story. ¹¹⁹It came up with this whole thing, and I added my own voice into it. ¹²⁰This is super cool and super easy to use. ¹²¹You can see how this could be really powerful if you had a larger context window. ¹²²You could make an entire cartoon just with these couple of tools using AI. ¹²³It unlocks possibilities that you normally wouldn't be able to do in that short amount of time. ¹²⁴

Dalton: These tools are just getting started; they're just demos. ¹²⁵But they're very impressive. ¹²⁶Next week, we're going to continue this Meta marathon and discuss making some AI agents. ¹²⁷I appreciate you listening in today. ¹²⁸Wherever you are in this world, good morning, good afternoon, good evening. ¹²⁹ Thanks for listening and please listen again. I'll talk to you next week. Goodbye. ¹³⁰

RESOURCES MENTIONED

Meta (Facebook, Instagram, WhatsApp, Messenger)
Llama 3.1
OpenAI (ChatGPT 4, ChatGPT 4 Omni)
Google
Nvidia (H100 GPUs)
Databricks
Hugging Face
Kaggle
Adobe Premiere
TikTok
Linux
Apple
Windows
Bloomberg
GitHub

INDEX OF CONCEPTS

128K context, 405 billion parameter model, Adobe Premiere, AI agents, AI Studio Demos, Animated Drawings, Apache 2.0, Apple, Audio Box, AWS, ChatGPT 4, ChatGPT 4 Omni, closed-source AI, context window, Dalton Anderson, Databricks, Facebook, GitHub, Google, H100 GPUs, Hugging Face, Instagram, Kaggle, Linux, Llama 3.1, Llama Guard, Llama Stack API, Mark Zuckerberg, Messenger, Meta, Nvidia, open-source AI, Prompt Guard, Segment Anything 2, Seamless Translation, TikTok, VentureStep, WhatsApp, Windows