Meta's AI Vision: From VR Headsets to Smart Glasses
Meta just unveiled its ambitious AI future at Connect 2024, from multimodal AI in Quest to live translation in Ray-Ban glasses. Listen to the full episode to learn more.
TL;DR
Meta's latest AI updates are wild: AI agent video calls, authentic voice dubbing for Reels, and real-time translation in smart glasses. They're betting everything on an open-source future to dominate the next era of tech. #VentureStep #MetaAI #FutureTech
INTRODUCTION
Meta's 2024 Connect event showcased a company moving at incredible speed, deeply integrating advanced AI across its entire product ecosystem. What started as a pivot to the metaverse has evolved into a full-scale AI offensive, leveraging 500 million active users across WhatsApp, Messenger, Instagram, and Facebook to deploy its technology at unprecedented scale. From a position of playing catch-up, Meta is now arguably leading the pack with some of the most popular and accessible AI offerings available.
In this episode of VentureStep, host Dalton Anderson breaks down the most significant announcements from the event. With a background in programming and data science, Dalton offers a unique perspective on the technology behind the headlines, separating the hype from the truly transformative developments. He dives into everything from hardware updates for the Meta Quest and Ray-Ban glasses to the jaw-dropping new AI features that seem pulled from science fiction.
This conversation explores not just the "what," but the "why" behind Meta's strategy. Dalton unpacks the new multimodal AI capabilities, the creation of photorealistic AI agents, automatic video dubbing that preserves a user's "authentic voice," and the company's profound commitment to an open-source future. It's a comprehensive look at how Meta plans to shape the next generation of human-computer interaction.
KEY TAKEAWAYS
- Meta is leveraging its massive user base, with 500 million active users now engaging with its AI platforms across WhatsApp, Messenger, Instagram, and Facebook.
- New AI models are now multimodal, allowing users to interact with AI using text, voice, photos, and even video for a more natural and collaborative experience.
- Groundbreaking features were demonstrated, including AI agents that can create a live, interactive video version of a person and automatic video dubbing for Reels that translates content while maintaining the speaker's authentic voice.
- The Ray-Ban Meta glasses are evolving into powerful AI assistants, with upcoming features like memory recall, real-time video feedback, and live multi-language translation.
- Meta is making a strategic, long-term bet on an open-source model for AI and VR to challenge the "closed" ecosystems of competitors and accelerate innovation.
FULL CONVERSATION
Dalton: Welcome to the VentureStep podcast, where we discuss entrepreneurship, industry trends, and the occasional book review. Meta recently had their 2024 Meta Connect, and they announced quite a few changes to their Quest line and their Ray-Ban Meta glasses, and added some AI features to their social media products. I find that the biggest standout was something I talked about a couple episodes ago when I reviewed Meta's "The Llama 3 Herd of Models" paper. I found it interesting that the majority of the audio and video training content was in a one-minute to minute-and-a-half time frame.
Initial Reactions to Meta Connect 2024
Dalton: And I was like, they're training on Reels, or they're trying to do something with Reels, and they'll hopefully be releasing something like that soon. And they did release something quite interesting that we'll get into later in the episode. But overall, very exciting, very exciting progress from Meta in integrating AI into their products. I think how quickly they're doing it, and the scale they're at, is incredible.
They are apparently at 500 million active users for their AI platforms.
Dalton: That includes the aggregation of users from WhatsApp, Messenger, Instagram, and Facebook. To go from, I would say, embarrassment all the way to leading-class models, to now being the most popular AI offering, and they're doing it at scale, for free, integrated into all these platforms that have billions of active users. Really cool.
What's New with the Meta Quest Lineup?
Dalton: So Meta made some changes to their Meta Quest lineup. The Meta Quest 3S is rolling out. The 3S is going to have the same processor as the Quest 3, but it's going to be cheaper, and it's going to have the same controllers too. It just won't have 4K. And the Quest 3 really isn't 4K either, because it's 4K shared between the two optical lenses. So the optical lenses for your eyes are 2K each. It's not really 4K. They say 4K because once you combine the 2K and the 2K, it's 4K, but it's really 2K per eye.
Dalton: Anyways, I ordered a Meta Quest while I was out of town, so when I get back, I'll be able to try some of these features out, and I'll do an episode on it. One thing I'm really interested in learning more about is working remotely with virtual monitors. I think it would be sick. Being able to just pick up your computer, go to a library or go somewhere, and bust out a couple hours' worth of work using virtual monitors wherever you're at, I think, is so cool.
Dalton: They enhanced their content offerings. So now you can watch Netflix, Amazon Prime Video, YouTube, and Twitch. Twitch is rolling out, and you can use Xbox Cloud Gaming. Not only can you do those things, but you can also do them in their enhanced theater mode, which is supposed to be more immersive, like the screen will wrap around you. They are partnering with Microsoft to enhance the integration with the Remote Desktop app. They also are enhancing the AI avatars, and they're allowing you to watch YouTube and Meta Horizon with your friends. And then you can also capture photorealistic spaces using your smartphone.
The Leap to Multimodal AI
Dalton: So now let's go into a little bit more detail, and we're going to start with the AI enhancements. One of the first things they announced was that their models are now multimodal. And if you're new here, multimodal means that the models can now accept data from different avenues. So they can accept text, voice, photos, and, in some of the examples they showed, even video. These things will allow you to interact with Meta's AI more naturally than you could before.
It's so much more natural to just talk to these AI platforms and get the results that you're looking for.
Dalton: It's super useful to be able to use your voice. I feel like if there aren't voice capabilities on these systems, I personally don't use them. I would still use Google Gemini because it was so much easier to interact with; I could just use my voice, and it felt so natural. And I really appreciate Meta opening that capability up to the public and launching it.
How Celebrity Voices are Making AI More Approachable
Dalton: They also integrated AI voice. So now you can talk to the AI with your voice, and it can respond with its own voice, and you can pick the voice that you want. A different approach that Meta took was leaning into their social status and social media network by using public figures as their AI voices. So they are licensing the voices of actors and celebrities, and you can pick those people to be the AI voice that you communicate with, which I think is really cool and a different approach. It's got a level of novelty that might convince people to try it out just for the giggles. Like, I want to see what Snoop Dogg or so-and-so says, and just try it out for fun. And then, like, wow, this is really useful. And they get hooked.
AI Studio: Training Your Own Digital Twin
Dalton: Now we're talking about AI Studio and AI agents. With the Studio and agents piece, public figures have enhanced capabilities. One of those is that your agent can respond to DMs for you. You can train your AI agent to understand what kind of person you are by training it on your threads, the captions that you post, and your comments. It will then understand who you are and how you want to interact with others. And then you can set the agent to automatically intercept certain questions or messages and respond how you would respond, but without you responding, to automate that piece.
Dalton: One thing that they showed, which was incredible, was that you can make a legitimate AI agent of yourself. They did a live demo of Mark Zuckerberg video calling this AI agent and asking it questions about a book that he had recently published. And the AI agent was able to respond; they asked a couple of questions, and it responded in real time. The head was moving around. It had facial expressions.
I mean, you can tell that it was an AI agent, but it's probably 80% there. Like, it's pretty close. Surprisingly close.
Dalton: I think they would have tight restrictions on who they'd roll that out to, because you don't want someone snatching photos from a public figure's profile, making fake accounts, and then making AI agents that impersonate these people.
Game-Changing AI Feature: Automatic Video Dubbing
Dalton: The next thing is these AI features that they're talking about rolling out. The first one was the public figure personas I talked about. The second one was automatic video dubbing for Reels, starting with English and Spanish. But the interesting thing was they emphasized that the dubbing tries to use your authentic voice.
So, like, what would you sound like if you spoke Spanish? Or whatever language you spoke.
Dalton: They showed some examples with people speaking English being turned into Spanish, and Spanish into English. And you're like, wow, I could really picture her sounding like that, or this dude, he probably would sound like that if he spoke English. So they're saying you'll have the ability to reach more users, and you'll have an authentic voice, and that voice will be yours. Pretty cool. Wild. So crazy to see in the live demo. It was nuts.
Upgrades to the Ray-Ban Meta AI Glasses
Dalton: Now we're moving over to the next product, the Ray-Ban Meta AI glasses. They're changing their approach to summoning Meta AI: instead of saying these long action phrases, the hot word is now just going to be "Hey Meta," and after that, it understands that you can continue the conversation like you're talking to a normal person. They also are going to be rolling out a feature where the glasses can remember things for you, like flyers or billboards that you saw, or, my favorite, where you parked.
Dalton: They are also going to allow you to look at a flyer or an advertisement or whatever, and ask it, "Hey Meta, scan this QR code," or "Hey Meta, call this person." You could look at a flyer, and it would know this is a QR code, or this is a phone number, and it would be able to process that information and take action for you. Another thing that they said isn't out yet, but will be coming, is real-time video feedback. So Meta AI will be able to give you feedback while you complete a task. Like, how do I put this together? Or, what should I wear today?
Dalton: The third thing that I thought was incredibly impressive, just overall mind-blowing, is the real-time live translation in multiple languages. You're able to do live translation across many languages. What will happen is the other person talks, Meta AI listens to what they say, translates it for you, and sends it to your speakers. And then you reply. It's pretty quick: the person's talking, and it's translating in real time.
A Partnership for Accessibility
Dalton: Another feature that I thought was really cool is that they're partnering with Be My Eyes. Be My Eyes is a phone app that people who are blind or have low vision can use to request a volunteer's eyes; people volunteer to be the eyes of the visually impaired. And instead of doing it on your phone, now you can do it with the Meta AI glasses. I think that's a more natural thing, because if it's on your face, it's just, look at what you need me to look at, and I can help you. It's just really nice.
Project Orion: A Glimpse into Holographic Glasses
Dalton: Another thing that they announced was Orion, their future fully holographic glasses. These are potentially the smartest glasses in the world right now. They are completely rebuilt from the ground up: all custom sensors, custom silicon, custom lenses from scratch. They've spent close to 10 years working on these glasses. Orion has things like hand tracking, eye tracking, and a wrist tracker you wear that can sense your movements by monitoring the electrical signals in your wrist. It's not being launched to the public at the moment. It's going to be limited to certain reviewers and partners, and it's also going to be used for development purposes.
It's currently too expensive... I'm guessing probably right now maybe like $3,000, which is too much for a pair of glasses for them to make any money on it, or at least to get mainstream adoption.
Meta's Big Bet: Why the Open-Source Model Must Win
Dalton: This last part is about Meta's vision for the future. There's a big commitment to making technology more accessible and user-friendly. And they always go back to the example of computers: Linux is open source, macOS is not. And for mobile phones, the closed model won, with Apple dominating pretty much the whole market. It particularly bothers Mark because there have been some hiccups with Apple over certain things that Meta wants to launch in the App Store, where Apple said, "Hey, well, we won't let you do that."
Dalton: So there's a big emphasis on making technology more accessible and user friendly, and especially open source.
They want the open-source model to win. They're open sourcing their VR platform, they're open sourcing their AI models, and they're giving them to billions of users for free.
Dalton: They want to build technology that emphasizes human connection and experience, and to do that with the open-source model, to allow innovation from different perspectives and companies. They spent a good amount of time at the end talking about how important it is for the open-source model to win. And I agree. It would be weird if something so important to everyone were closed source and you had no idea what was going on.
RESOURCES MENTIONED
- Meta
- Meta Connect 2024
- Meta Quest
- Ray-Ban
- Messenger
- Netflix
- Amazon Prime Video
- YouTube
- Twitch
- Xbox Cloud Gaming
- Microsoft
- Google Gemini
- OpenAI
- Snoop Dogg
- Be My Eyes
- Nothing Phone
- Apple
- Linux
- Windows
- Cursor
INDEX OF CONCEPTS
Meta Connect 2024, Meta Quest 3S, Ray-Ban Meta glasses, AI features, multimodal models, AI agents, AI Studio, Meta Horizon, celebrity AI voices, public figure personas, automatic video dubbing, live translation, Be My Eyes, Project Orion, holographic glasses, open source, closed source, Mark Zuckerberg, Dalton Anderson, The Llama 3 Herd of Models paper, remote desktop, virtual monitors