Google's New AI: Gemini Canvas & Consistent Image Models

Google's latest AI updates introduce consistent character image generation with Flash and real-time document editing with Canvas. Listen to the full episode to learn more.

Google's New AI: Gemini Canvas & Consistent Image Models

TL;DR

Google's new AI tools can now create consistent characters across multiple images and let you build websites in real-time with Canvas. #VentureStep #GoogleAI #Gemini

INTRODUCTION

Creating compelling and consistent visual assets has long been a significant hurdle in marketing and design. How do you tell a continuous story when your AI image generator produces a different character every time? Similarly, how can you rapidly prototype a website or document without constantly switching between a chatbot and an editor? These challenges slow down creative workflows and create frustrating inefficiencies.

In this episode of Venture Step, host Dalton Anderson dives into Google's latest AI releases that directly address these problems. Setting aside recent news from other players, Dalton focuses exclusively on the new capabilities within Google's ecosystem, showcasing tools that are pushing the boundaries of what's possible with generative AI for practical business applications.

Dalton explores two groundbreaking features: an experimental image generation model, Flash 2.0, which can maintain a consistent subject across multiple prompts, and Canvas, a new function within Gemini that allows for the real-time creation and editing of websites and documents. He walks through several powerful examples, demonstrating how these tools can be used for everything from product modeling to instant document creation, offering a glimpse into a more integrated and efficient creative future.

KEY TAKEAWAYS

  • Consistent Subject Generation is Here: Google's new AI model can maintain a consistent character or object across multiple images, a game-changing feature for storytelling, marketing campaigns, and product design.
  • Prototype Websites and Docs with Canvas: Gemini's Canvas feature enables users to create and edit websites, app mockups, and rich-text documents in real-time directly within the chat interface, streamlining the development process.
  • Powerful, Context-Aware Image Editing: The AI can understand complex prompts to modify existing images, such as correcting a person's posture or generating realistic hands that weren't in the original photo.
  • Practical Applications for Business: These tools offer immediate value for creating marketing materials, modeling products with consistent subjects, and rapidly generating formatted business documents.
  • Experimental Tools Come with Bugs: While powerful, new features like Canvas are still in their early stages and may have bugs; Dalton specifically notes that the versioning or "undo" function can break the prompt sequence.

FULL CONVERSATION

Dalton: Welcome to Venture Step podcast, where we discuss entrepreneurship, industry trends, and the occasional book review. 1Unfortunately, I had some audio issues last week, so I will be re-recording this episode. 2Hopefully, when you go to look at the most recent episode and the previous episode, they both have audio. 3If you were trying to listen to the episode last week, it was because I was having audio issues, and unfortunately, during the week, I was just slammed and could not re-record. 4So here I am recording the same episode that I did last week. 5I hope that you'll enjoy it this time when it has sound. 6

Apologies and Episode Overview

Dalton: So what are we discussing in this episode? 7We're pretending like there wasn't a recent release of ChatGPT's image model. 8In this world, I don't know this information. 9And it's really just talking about Google and some of the Google releases that they've had. 10Of those releases, they had some really cool releases related to image generation and native image generation. 11

...the model understands the context of the image and can have a consistent character use throughout your image creation. 12

Dalton: Not only can you create images, but the model understands the context of the image and can have a consistent character use throughout your image creation. 13So if you provided a character, you can tell a story with it. 14You can make marketing materials. 15You could do product designs. 16You can make models. 17And so that is something that you couldn't do before, especially natively in an app. 18Keep in mind that you can only do this in what they call Google AI studio. 19

Dalton: They also released another thing called Canvas, and Canvas is supposed to be used for generation of documents and websites and previews of apps that you create. 20This isn't a particularly new feature. 21Anthropic has had their version of Canvas, which they call artifacts, for quite a while. 22So don't think it's brand new, but I do think the take of rich text editing within a document, within your chat, is pretty interesting. 23

Dalton: That's what we're going to be going over today. 24And I hope once again that this episode has sound. 25I tested it a couple of times, so we'll see. 26Fingers crossed. 27

Native Image Generation with Consistent Characters

Dalton: All right, so the first thing that we'll be discussing is the native image generation. 28I have some examples that I'm going to put on the screen. 29If you're listening via audio only, that's fine. 30I'll narrate a little bit of what's going on and you'll be able to understand. 31

Example 1: Correcting Posture Instantly

Dalton: Okay, so in this first example, it has a model that is slouched over, not looking very confident, and she's got really bad posture. 32This could be from a disability or this could just be because she's not having a good day. 33Say that you wanted to change your image and make yourself have good posture. 34There's a side-by-side where one person is doing all the work that would be required to do this in Photoshop. 35And then there's another version to the right of the screen that just says, "Make the girl stand in correct standing posture." 36And then it just comes back six seconds later and she's standing up straight with good posture. 37So that's one example. 38I thought that was interesting. 39

Example 2: Creating Multiple Views of a Product

Dalton: This one I really like, and this emphasizes the consistent subject model that's created. 40The first prompt was to create a transparent futuristic vehicle. 41It has these big off-road tires and it's got this futuristic look. 42It's transparent so you can see all the parts in it. 43It looks really cool. 44The next thing that was prompted was, "Okay, now create different perspectives of this subject." 45And so it does a front view, the three-quarter view, and then does the side view. 46It does a really good job. 47This one's my favorite. 48It looks sick. 49

Example 3: Generating Hands from a Selfie

Dalton: This one is of a woman and there's an original selfie and she looks like she's in some kind of library, I think. 50It's a selfie of her and her arms are out, so you can't even see her hands. 51The only thing that is sent in the selfie is her smiling, her hair, and the top that she's wearing. 52So then it's prompted, "Make her create a heart shape with her hands." 53And then she does that with the hands that didn't exist in the selfie. 54And they even gave it French tips. 55The AI model gave the woman in the image French tips. 56It successfully makes the heart shape, generates hands that don't exist, and it matches her skin tone pretty well. 57

Dalton: Then the next prompt was, "Make her give a thumbs up," and the thumbs up works pretty well. 58The hands are once again generated because there are no hands in the original selfie. 59It's just using the tone of the body to figure out what the inside of your hand should look like. 60It does a great job. 61

Example 4: AI-Powered Storytelling

Dalton: This next example emphasizes telling a story. 62And I see this as a marketing plan. 63

All of marketing and sales is really telling a story. 64

Dalton: A bigger piece is community building, having in-person events, and also building in public, and also creating this compelling marketing campaign where people want to know what happens next. 65

You can create those things now with some prompts. 66

Dalton: The original prompt is saying something like, "I want a scene of a lonely man on Pluto, imagining a happy life." 67So it makes the first frame of this lonely man in the middle of nowhere. 68And then it creates another frame of the same man, and now he's holding hands with his partner. 69696969He's eating dinner and it's the same person. 70It's five shots. 71And then it flips back to where he is, actually on Pluto in this made-up story. 72It keeps going back and forth between the different memories. 73Compelling stuff. 74

Example 5: Modeling Products with AI

Dalton: This is also one of the key things that you can do if you can have a sustained subject in your image generation: you can use it to model products or you can have your products being used by models. 75

If you have the same subject modeling the item in different ways or using it different ways, then it's more compelling. 76

Dalton: In this example, it says, "Create image, make the girl in the photo wear the jewelry in the second photo." 77It's this expensive-looking piece of jewelry, a gold emerald necklace with pearls and gold emerald earrings. 78And then the image generates an image of the model wearing the jewelry that was required, which is great. 79

Dalton: Here's another really good one. 80The original image is a woman wearing a top and some jeans, and she's smiling. 81And from that, they created four reference images for the model. 82One is with her smiling, looking to the side; one is with her being playful; another of her smiling and drinking a cup of coffee; and another one from a side angle where she is smirking. 83You could take a model, hopefully with their permission, and you can provide an image of an object that you want them to be wearing. 84848484Then from there, you can create these consistent subject variations of your original idea. 85

Introducing Gemini Canvas for Prototyping

Dalton: Okay, so the next thing that we're going to touch on is Canvas. 86Canvas is within Gemini, at gemini.com. 87And Canvas, as I mentioned, isn't a new idea. 88Other companies like Anthropic have had artifacts. 89I do think it's a great addition to Gemini and it allows for quick prototyping of websites or apps or creating documents and then editing those documents in real-time with your adjacent AI buddy. 90

Live Demo: Building a Website with Canvas

Dalton: So I created a very simple website using a prompt that I created earlier. 91I'm going to be creating a maritime insurance coverage website. 92I'm going to select Canvas. 93Canvas opens up and takes up about two-thirds of your screen and creates this website. 94So in this prompt, I said, "Can you help improve the website, add whatever you want? I'm presenting this to my boss, asking for a demo on Monday." 95They replied back that they want to do content expansion, enhanced styling, and responsive design. 96

Dalton: All right, let's see. 97I can see that the website looks visibly better. 98They added a solid header styling. 99When I click on the headers at the top, like Cargo Insurance, Hull Insurance, it brings me exactly to that area on the website, which I think is great. 100So I'm happy with this. 101I'm sure my imaginary boss would love it. 102

Live Demo: Creating a Document in Canvas

Dalton: So this one, I want to create a document. 103It's the same gist where you click Canvas and Canvas creates a website, an app, or documents for you. 104I already created a wonderful outline for AI to process and turn into a document that I could share. 105Okay, so it breaks down the different types of coverage in maritime insurance. 106We have transport cargo, hull insurance, offshore energy, marine liability. 107But one of the things that it doesn't show is what are things that typically aren't covered. 108So let's add that. 109

Dalton: Okay, so it said it added a common exclusion section. 110Now I'm asking it to add the exclusions to each section. 111Okay, so it updated the document. 112This is exactly what I'm looking for. 113Each section of the coverages now has common exclusions. 114For transport cargo insurance, exclusions often include inherent vice, spoilage of perishable goods, improper packing, and delay. 115If we go to hull insurance, it's wear and tear, gradual deterioration, and damage due to lack of maintenance. 116

A Word of Caution on Canvas Bugs

Dalton: You can export in Docs if you wanted to change the formatting. 117

This canvas feature allows you to edit the documents and allow you to format it in the manner that you want... in real time, which is something I have not seen before. 118

Dalton: And I think it works pretty well. 119It has the same issue with the versioning though. 120The versioning seems to be broken on Canvas and is something that they need to work through because it doesn't seem to work. 121I had issues when I originally recorded this episode where when you try to undo and then re-prompt, it would break the prompt. 122122122122I couldn't prompt anymore. 123So I think that's just a bug. 124I would only use the previous version button when you're completely finished. 125

Final Thoughts

Dalton: So that was Google's recent release. 126Once again, sorry about the audio issues that I was having last week. 127Hopefully, that doesn't happen again anytime soon because it is definitely a pain to re-record an episode. 128I hope that you enjoyed this episode and appreciate you listening in every week. 129Wherever you are in this world, have a great day. 130Good morning, good evening, or good afternoon. 131Thanks for listening. 132Hope you listen in next week. 133Next week we'll be discussing OpenAI's recent release of their model and the things that you can do with that one. 134Have a great day. 135

RESOURCES MENTIONED

  • Google AI Studio
  • Gemini (gemini.com)
  • Canvas (Google Gemini feature)
  • Flash 2.0 Experimental
  • Anthropic
  • Artifacts (Anthropic feature)

INDEX OF CONCEPTS

Dalton Anderson, Venture Step, Google, Google AI Studio, Canvas, Gemini, Flash 2.0 Experimental, Anthropic, Artifacts, Photoshop, Pluto, maritime insurance, cargo insurance, whole insurance, offshore energy insurance, marine liability insurance, OpenAI