Leveraging AI

276 | The AI Automation Playbook: Turn One Idea into 1000 Assets (Without Technical Skills) with Ross Symons

Isar Meitis, Ross Symons Season 1 Episode 276

What if you could turn one AI idea into a fully scalable business system—without writing a single line of code?

Most leaders are still stuck experimenting with AI… testing prompts, playing with tools, and hoping something sticks. But the real opportunity isn’t in experiments—it’s in building repeatable, scalable systems that drive real business results.

In this episode, you’ll learn how to move from one-off AI wins to powerful automation engines that consistently produce high-quality outputs, from marketing assets to full creative campaigns, on demand. Our guest is Ross Symons, a creative technologist and co-founder of Zen Robot who blends deep technical expertise with creative execution.

If you want to stop dabbling and start leveraging AI as a true business advantage, this episode gives you the blueprint.

In this session, you'll discover:

  • Why most AI projects fail to scale—and how to fix it
  • The mindset shift from “experimenting” to building repeatable AI systems
  • How node-based AI tools simplify complex workflows into visual processes
  • A step-by-step approach to creating scalable AI automations
  • How to generate professional-grade images, videos, and assets at scale
  • The power of combining LLMs with generative media tools
  • How to reverse engineer outcomes—even if you don’t know where to start
  • Why AI is amplifying creativity (not replacing it)
  • Practical examples: emotion engines, content machines, and automated campaigns
  • How to drastically reduce production time and costs using AI

Connect with Ross on LinkedIn:
 https://www.linkedin.com/in/rossmsymons/

About Leveraging AI

If you’ve enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!

Isar Meitis

Hello and welcome to Leveraging AI, the podcast that shares practical, ethical ways to leverage AI to improve efficiency, grow your business, and advance your career. This is Isar Meitis, your host, and I have a really great episode for you today. One of the biggest unlocks that you need to start learning as you start building things in AI is learning how to go from a single, small experiment into an actually working solution that you can use and build around in a professional environment, one that will be repeatable and will generate the same results every single time. And in today's episode, we are going to show you the mindsets, the concepts, and the processes you need to think about once you start building AI automations in order to allow them to be scalable and actually support real business use cases. So that's number one. But number two, one of the really magical capabilities of AI right now is being able to generate professional-grade visual assets at scale, so images, videos, and so on, for whatever need you have in your business, but do this in a way that is, again, repeatable and automated as much as possible. So if you can combine these two things, you're getting the best of both worlds. If you know how to build automations well, it can apply to anything. And then also learning how to do this in a visual environment allows you to generate any visual assets you need for your business, and it does not require any previous technical skills. So consider this episode a BOGO where we are going to show you both how to build and think through building node-based automation processes, and at the same time how to build really amazing, professional-grade digital assets you can use in your business for marketing, or for your website, or for whatever it is that you need. Now, our guest today, Ross Symons, is probably the perfect person to take us through this process. And why am I saying he's the perfect person?
Well, he started his career as a developer, so he's very structured, well organized, and well planned when it comes to approaching business problems. But he spent most of his career running creative companies and creative agencies, which means he has the perfect balance between left brain and right brain, which is not very common. It's actually pretty rare, but it's absolutely perfect for the topics that we're going to cover today. And since both of these aspects, knowing how to build automations and knowing how to build professional visual assets, are really important and valuable capabilities, I'm really excited to welcome Ross to the show. Ross, welcome to Leveraging AI.

Ross Symons

Thank you. So, yes, good to be back again. Uh, second time I'm on, and yeah, always a treat. Thanks for having me.

Isar Meitis

Thank you for coming back. I really think this topic is fascinating and timely, and it's evolving. So we had node-based automation tools and we had visual tools, and now they're converging into this really unique set of solutions that allow you to do things that were, well, impossible two years ago, difficult a year ago, and now a monkey like me can do it. So I'm very, very curious to see what you're gonna walk us through today. Like I said, it's a magic wand that, once people know they have it, can do very interesting things for their businesses.

Ross Symons

Hmm. Yeah, for sure. I think that, you know, the node-based interfaces are not new. They have been around since animation and 3D rigging, and I think there's some architecture software as well that uses the same interface. And so some very smart person, when they saw the AI sort of explosion happening, was like, you know what, let's just give this a go and see what we can do there. And I personally, I mean, I try to be on the forefront of what's going on in the gen AI space, particularly with the creative tools, you know, image and video and audio. And from where I'm sitting, it looks like the node-based applications and the node-based interfaces are definitely the way forward. I mean, as part of the company I run, we train creatives on how to use AI, and we take them through the basics of image generation and then how to get consistency across those images, then into video, but outside of a node-based application. And at the end of the course, or at the end of the curriculum, we show them Weavy, or Flora in this case. That's one of the tools I'm gonna show you as well. And you can see there's this thing that happens. They're like, it's all under one roof. Everything you've shown us up until now, where we've had, you know, multiple tabs open, you've gotta click there and then download this, upload it there, that is, I don't wanna say it's a thing of the past, but when you see everything housed on a single canvas, things just happen. And I mean, I can't pinpoint what it is, but it's just looking at the entire project in a single space and having all these little nodes and everything connected.
Um, initially when you do look at it, it's extremely intimidating because you see these blocks and you think maybe it's this new AI sort of alien interface that's now been put in front of you. But when you start explaining it to people and you start breaking it down to, like, each of these blocks represents an image or a text block or a video block, and this is how you connect them, and when you start connecting the strings and pulling, you know, nodes across the canvas, that's when some really cool stuff happens. I mean, to what you mentioned earlier, being essentially a creative technologist, I think that I'm someone that does appreciate, I guess, what a good user interface is and what you can do technically with something like that, and that makes it more appealing. Well, I don't wanna say it makes it more appealing to me, but there are things that I know from a technical standpoint, being the developer that I used to be. I can look at that and go, oh, I know how to add a variable in there. I know how to iterate over that. I know how to create an array, and these are all coding terms. But when you explain it to people on a very basic level, it's like, cool, well, this is how you make multiple images from one image. And yes, the technical tools are still the same. You're still using an array, you're still providing, I guess, a single variable that changes and something switches on. So yeah, I'm super excited about this space. I mean, we touched on this earlier, but the node-based landscape is definitely, like, people see that it is a thing. They don't know why.
And no one can really pinpoint why it is, because it doesn't really logically make sense to build this, like, spaghetti block interface to connect images and videos together. But when you see it working and you see how scalable it is, I think that's another thing. I mean, Weavy refers to it like you're creating Weavy machines or design machines. And when you've got these blocks and you can duplicate that and just swap the product, swap the character, swap anything out, drag all the nodes, hit play, and it just rolls it out for the rest of them, you kind of see, okay, whoa, this is very powerful stuff. So yeah,

Isar Meitis

I think the big thing here is transparency. It allows you to understand what's actually happening in a really complex process. And I think that's why people connect to this very easily, because initially it allows you to understand, and then it allows you to build, right? Because once you understand, you're like, oh, now I know what all these Legos can do. Yeah. And now the Legos make a lot more sense. So previously, like you said, if you teach people, oh, you need to do this, and then you need to copy this there, and then download this file, and then you grab the file and you upload it to the other tool, it's like, oh my God, this is so complex. So that still gives you the Legos, but you don't really know how to build with these Legos. Mm-hmm. And these node-based tools that let you see everything that's happening on a single screen, it's kind of like having the instructions for the Legos. Like, oh, now I know. And the other really cool thing is that with all these tools, you know, Weavy for sure, and before that ComfyUI, which was a very, very high-tech, geeky kind of solution but the same kind of concept, there's a huge community, meaning more or less anything you want to build, somebody already built and is sharing. Like, oh, I want to build this really complex thing and it needs to do these 17 different steps. And you're like, oh yeah, somebody already built 16 out of those 17 in their process. You can add your one step into that, and then in 10 minutes you'll be up and running with a solution that, again, two years ago was not possible and six months ago was very complex. Mm-hmm. So let's really go into this and show it. And as I mentioned, I think it'll be great seeing examples, but also thinking, you know, one step back about the design concepts that go behind it.
So other people don't just see the example, but they can actually understand why it was built the way it was built, and what the benefits are of building it that way.

Ross Symons

Hmm. Okay. Cool. So let's take a look at this one first.

Isar Meitis

And for those of you, by the way, who are just listening to this and not watching: (a) you can watch this on our YouTube channel, and (b) if you're on Spotify, you can watch this on Spotify. But if you don't, because you're driving your car or walking your dog or running on a treadmill or whatever it is that you're doing, we'll explain everything that's on the screen. So you can still stay with us on the podcast, sound only, and we'll walk you through everything that we are seeing on our screen.

Ross Symons

Cool. So if you arrive at something that looks like this, initially you'd be like, what the hell am I looking at? Even when I look at these workflows, it's always still this, like, okay, cool, I'd better just, you know, take it easy, take a breath, and move slowly into it. 'Cause I think what's nice about this is you can look at the whole thing. It's not like looking at a piece of sheet music or looking at, you know, the parts of a car that are being put together, if you were a mechanic. It's very raw. It's kind of like, what the hell's going on here? But the more you zoom into it, it's almost like this world that you're slowly zooming into. And as you go into it, you start looking at the blocks. And you know, when I'm looking at something like this, I think naturally, the way these interfaces are designed, I mean, just as a quick example, you have a prompt box. I'm not gonna even try and explain how many of these little nodes there are, 'cause there's hundreds of them. But for the most part, you have a prompt box. You pull a string out of here into another box, and that could either be an image or another prompt box. And in this case, it's like an "any LLM" block, where you get the option to select the large language model that you want to run the prompt that you've put in. But the more you start using it, the more you realize it's pretty much the same as a ChatGPT search with your little search button there. You've got the search block, and you pull it out from the right-hand side of the block into another block, which is to the right of it. So you're working from the left over to the right. So I think that naturally most of these workflows would start there.
So it's not like they're just placed randomly and, um, they, they're all over the place. There is some sort of structure. And only once you start using it regularly do you realize that there is, um, there's a bit of, I don't wanna say magic, but there is a bit of, um, structure and, and, um, yeah, just, uh, I guess design thinking behind how you put all this together.

Isar Meitis

Yeah. Again, for those of you who are not watching, what we are looking at on the screen is three entries to a more complex process. Two of them are text boxes, which we'll explain in a minute, and then there's an entry that is an image, and they all go into a large language model as an entry point. Mm-hmm. So the way this works is that every one of these blocks can be either an input, a process, or an output, and every one of these could be an image or text or video, right? So as an example, in this case, you can enter a prompt and a piece of code and an image into a large language model. And because it knows what to do with these, it just does, and then it can create whatever output you're asking it to create in the prompt. Now, the other cool thing here: because it's not necessarily one-to-one, it could be one-to-many or many-to-one, there are many cool things that you can do. I don't know if Ross is gonna jump into this, but just think about the fact that you're saying, well, I don't know if GPT-5 or Claude 4.5 or whatever is gonna give me a better solution for the thing that I'm trying to do. You just drag another line from the prompt and you connect it to another box, and in one box you select Claude 4.5, or 4.6 or whatever, and in the other box you select GPT-5.2. And you see the output of both of them on the same screen. You don't have to go and open and copy and paste and have licenses. All of this goes away, and you're like, oh, consistently when I'm doing this particular process, GPT-5.2 gives me a better output. I'm gonna use GPT-5.2. I don't care about Claude for this particular thing moving forward. And the same thing with image generation, and the same thing with video. All these things are available with a very simple user interface. You can just see the blocks on the screen and you can very quickly understand what each and every one of them does.
And then I'll give it back to Ross to kind of walk you through one of these processes so you understand how amazing this is.
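
As a rough sketch of the one-to-many wiring described above, where one prompt box feeds two model boxes so the outputs can be compared side by side, the idea fits in a few lines of Python. This is only an illustration, not how Weavy or any other canvas tool is implemented: `model_a`, `model_b`, and `fan_out` are hypothetical names, and the two "models" are stubs standing in for whatever hosted models a real canvas would call.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for two LLM nodes. A real canvas would call hosted
# models here; these stubs only tag the prompt so the outputs are distinguishable.
def model_a(prompt: str) -> str:
    return f"[model-a] answer to: {prompt}"

def model_b(prompt: str) -> str:
    return f"[model-b] answer to: {prompt}"

def fan_out(prompt: str, models: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Send the same prompt to every connected model node and collect the outputs."""
    return {name: run(prompt) for name, run in models.items()}

# One prompt box wired into two model boxes.
outputs = fan_out(
    "Describe this product in one sentence.",
    {"model-a": model_a, "model-b": model_b},
)
for name, text in outputs.items():
    print(f"{name}: {text}")
```

Adding a third model to the comparison is one more entry in the dictionary, which mirrors dragging one more line out of the prompt box.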

Ross Symons

Yeah. So I mean, this is quite a technical one. It's got a system prompt as well as a standard prompt. I'm not gonna dive too much into what that means. Basically, the system prompt gives the instructions as to what the large language model needs to do. The prompt box essentially is the variable data. So this system prompt will remain the same. The variable data you can swap out for multiple examples. In this case here, I've created what I've called an aspect ratio machine. So I can drop any image into this block here. That's gonna be the reference image that comes in here to our large language model. The system prompt says, okay, cool, you are a JSON expert. JSON is the type of code. Don't worry about whether you know what JSON code is. It's just a different way of structuring a piece of text. Personally, I find it a lot easier to iterate over and read. Anyway, it's quite a technical thing. If you're a developer, you'll know exactly what it is. If you're not, just know that it's this different way of structuring information. So in the system prompt, I've got all the rules. It's like, this is what you're gonna create, the crop must be X, no part of the image must be regenerated, no inpainting allowed, and I've given all these instructions. Bear in mind I did not write all these instructions. I obviously used a large language model to write all of this. Outside of this, I was giving ChatGPT just examples of, like, this is what I want to achieve, help me create the system prompt, so that when the system prompt does come in here, I can then just iterate over it and it's just easier for me to create.
Then in the prompt box itself, I've said, okay, I want you to create different aspect ratios, so the standard, you know, one by one, four by five, three by four, 16 by nine. And the idea behind this is, you know, often, particularly for social media, if you're creating social media content or banner content or anything digital, it requires a square image, or maybe it starts as a square image, but you're gonna need, for example, a YouTube aspect ratio, which is 16 by nine, and then a nine by 16 or four by five or two by three for the other channels. So your TikToks, your LinkedIns, although LinkedIn has different aspect ratios. Anyway, I just wanted something that I could drop an image into and illustrate the fact that you can turn that single image into multiple aspect ratios while keeping the image almost exactly the same. I say almost because these models are generative. They are generative AI models, and no generative AI model, even using the same prompt and the same seed, will generate exactly the same thing. There is always something slightly off, slightly different, and it is something to just bear in mind. So I think that's where the big difference comes in and where a lot of designers particularly struggle with this technology, because they're like, it's not exactly like I wanted it. You have to be a little bit forgiving, or know how to comp what it is you want to be exact back into the image. Um, yeah.

Isar Meitis

I'll say two things about this. One is the models are getting much better at keeping consistency, especially if you know how to give them the right references and prompt them. Two is the marginal cost is so low that you can create 40 versions of the thing until there's one image you actually like, and it's still gonna be almost instant and it's still gonna be cheaper than actually going and doing a photo shoot of the thing that you wanna get.
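
The split Ross describes for the aspect ratio machine, a system prompt that stays fixed plus variable data that gets swapped out, can be sketched roughly as follows. The prompt wording, the `build_request` helper, and the dictionary keys are all assumptions made for illustration; the seven ratios are the ones named in the episode.

```python
import json
from typing import Dict, List

# Fixed instructions: identical on every run. The wording is illustrative,
# not Ross's actual system prompt.
SYSTEM_PROMPT = (
    "You are a JSON expert. Return a JSON array with one image prompt per "
    "requested aspect ratio. Do not change the subject of the reference image."
)

# Variable data: the part that gets swapped between runs.
ASPECT_RATIOS = ["1:1", "4:5", "3:4", "16:9", "9:16", "2:3", "21:9"]

def build_request(ratios: List[str]) -> Dict[str, str]:
    """Pair the fixed system prompt with the variable data, serialized as JSON."""
    return {
        "system": SYSTEM_PROMPT,
        "user": json.dumps({"aspect_ratios": ratios}),
    }

request = build_request(ASPECT_RATIOS)
print(request["user"])

# Swapping the variable data leaves the instructions untouched.
square_only = build_request(["1:1"])
```

Keeping the rules in one place and the data in another is what lets the same machine be rerun with a different list without touching the instructions.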

Ross Symons

Exactly. Or paying a designer to go and take this image of, you know, this hessian bag or whatever it is, this material bag, and, like, please put this into different scenarios so that I can put it onto my newsletter, I can put it onto my website, I can put it across all five social media channels, which all have different aspect ratios. It becomes tedious, and it's somebody's job. Maybe some people enjoy that, but for me, I'm just like, no, there must be an easier way to do this. So that's what kind of inspired this. So, running out of the LLM, it goes into an array block. And this is what I touched on about, you know, just understanding a bit about code. An array is basically just a list of text items, if you wanna call it that, and each of those items runs out into a different box. Again, I think these technical aspects of Weavy specifically are gonna get easier over time, or maybe they stay like this. I think the more you use it, the more it just becomes second nature. You're kind of like, okay, well, if I want multiple prompts coming out of this box, then they are gonna have to go into an array, and then they're gonna have to go into a list item of some kind. And for each of these lists, you can then select which one you want, and each of those essentially is a prompt. So it's exactly the same prompt, but the only thing I've changed is I've just swapped out the aspect ratio for each of them. So they have multiple inputs here coming from the large language model. And then, running out of each of these, we have the list running into, in this case, Nano Banana 2, which is the latest craze. Everyone's using it, everyone's talking about it. It's amazing. And it is cheaper and faster than Nano Banana three or Nano Banana, uh, oh,

Isar Meitis

whatever it was called. Yeah,

Ross Symons

the one before. So Nano Banana 2; the one before was slightly more expensive and a bit slower. And just a snapshot here: on close inspection, we've got a square version. We've got a two by three. This one is a four by five. There is a nine by 16. Here we have our 16 by nine, and this is a wider one, which is 21 by nine, which is a lot more cinematic and wide. But it's pretty much the same object. I mean, I didn't specify where within the frame it needs to be, but I think if you had to show this to somebody five years ago, they'd be like, how did you do that? Honestly, it might not look like much because it's only seven images, but to not only create those seven images, but now have a machine where I can just go out here and swap the image, right? So I'd swapped the image there. I ran all of these earlier, so now I've placed a different image here. What I'm looking at now is just a box of vegetables in a nicely lit, what looks like a kitchen area. And essentially I just run all of those again. And you've got it at scale, you know, just by the click of a button. It took probably the best part of 20 seconds to run this again. And we've got all of these over there, and you can do whatever you want.

Isar Meitis

I wanna add two very important things here. One is to explain a little bit, and the other is to share with the audience that is not watching what we are looking at. So really what we have is a system, a process that is repeatable, where the only thing that needs to change is the input image, right? You put in a new image, you get all the outputs regardless of what they are. And I'm sure Ross will show us other examples in a minute, but all you need to change is the input, and you get all the outputs that you built the machine to create. In this particular case, what the machine does is create multiple aspect ratios of the same thing. Now, what you're not seeing, going back to describing what's on the screen, is that both images had really unique backgrounds. So there's one object in each of them. In one it's a bag, and in the other one it's like a basket of vegetables. But in both cases, the background has intricate details of shadows and countertop, and it's not just a gray, random background. And in all images, the AI manages to capture it and represent it and, conceptually, invent, because it doesn't have that information in the original image, what's above, below, left, or right of the original basket in order to be able to make it a 16 by nine and so on. The way it actually works is the prompt, which then becomes multiple prompts, asks it to describe the picture and create a new prompt that will describe the picture in the same level of detail as the original prompt, just in a new aspect ratio. And then you give Nano Banana that new prompt together with the original image. So Nano Banana is getting two things. It's getting the original reference image, which it already knows how to reference, plus a very detailed prompt in a new aspect ratio. But you build all of this once, and now you replace the original image and you get whatever number of outputs you want out of it.
And that's why this is so magical, and that's why these systems, if you understand how to build them, allow you to create, as I mentioned, resources at scale with practically zero effort after you've built it and troubleshot it.
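
The flow Isar walks through here (describe the image once, turn that into one prompt per aspect ratio, then hand each prompt to the image model together with the original image) can be sketched roughly like this. Everything named below is hypothetical: `run_machine`, `fake_describe`, and `fake_generate` are stubs so the sketch runs without any model access, and `expanded_canvas` is only an assumed way of working out how much new area the model would have to invent around an uncropped original. Real tools handle sizing internally.

```python
import math
from fractions import Fraction
from typing import Callable, Dict, List, Tuple

def expanded_canvas(width: int, height: int, ratio: str) -> Tuple[int, int]:
    """Smallest canvas of the given ratio that contains the original uncropped."""
    ratio_w, ratio_h = (int(part) for part in ratio.split(":"))
    target = Fraction(ratio_w, ratio_h)
    if Fraction(width, height) >= target:
        # Original is relatively wider than the target: keep width, add height.
        return width, math.ceil(width / target)
    # Original is relatively taller than the target: keep height, add width.
    return math.ceil(height * target), height

def run_machine(
    reference_image: str,
    ratios: List[str],
    describe: Callable[[str], str],
    generate: Callable[[str, str], str],
) -> Dict[str, str]:
    """Describe the image once, then send each per-ratio prompt plus the original image."""
    description = describe(reference_image)
    outputs = {}
    for ratio in ratios:
        prompt = f"{description} Recreate the scene at aspect ratio {ratio}."
        outputs[ratio] = generate(prompt, reference_image)
    return outputs

# Stub nodes (assumptions, not real APIs).
def fake_describe(image: str) -> str:
    return f"Detailed description of {image}."

def fake_generate(prompt: str, image: str) -> str:
    return f"image<{image} | {prompt}>"

ratios = ["1:1", "16:9", "9:16"]
# Build once, then swap only the input image.
bag = run_machine("bag.png", ratios, fake_describe, fake_generate)
vegetables = run_machine("vegetables.png", ratios, fake_describe, fake_generate)
print(len(bag), len(vegetables))
print(expanded_canvas(1000, 1000, "16:9"))
```

The two `run_machine` calls differ only in the file name, which is the "replace the original image and get all the outputs" behavior described above.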

Ross Symons

Yeah, totally. And I think the only real hurdle is understanding, firstly, how to kind of sit with the large language model and get it to a point where it's actually delivering consistent results. 'Cause that's the thing that there's no one-click solution for. Yes, once the machine is built, it's going to reliably do pretty much what you've asked it to do every single time based on the image or multiple images that you put in. But it's really working with the LLM to kind of construct the prompt in a way that you are gonna get consistent results. And you would think, okay, we'll just add as much detail as possible, or, you know, surely it can just reference the image, but it's not actually that. I mean, for me to get it to this point, I had to, you know, do it at least five or six times before it got some sort of consistency. I mean, I'm still not a hundred percent happy with it. You know, some of the angles change just very slightly, although it is the same. I mean, at face value, it looks like a photo shoot where they did maybe move the camera slightly, which I'm okay with. And, you know, you have to be a little bit forgiving in terms of what it is. But for the most part, this is, in my opinion, miraculous technology. It really is.

Isar Meitis

I'm with you a hundred percent. I'll add one more thing, and then maybe we'll dive into another example. Going back to what I said in the beginning, don't be intimidated by not knowing what that means, what Ross said right now, or how do I get to that first starting point. Copy somebody else. Yep. On all these platforms, there are hundreds of thousands of examples where people are sharing the entire process with their prompts, with their connectors, with their arrays, with, like, everything. Go and search for "I need a Weavy that does 1, 2, 3, 4," and there's a very, very high chance you'll get something that either does exactly what you need it to do or is 80% there. And that's also the best way to learn, because now you can dissect what this thing is doing. You can see exactly how the prompt is working. It's like, oh, all I need to do is change this one thing in the prompt to make it mine. Mm-hmm. And you've solved the problem. So don't be afraid of staring at a blank sheet and thinking, I don't even know how to get started. Start with three or four that are already there because the community is sharing them, and then reverse engineer from that.

Ross Symons

Yeah, exactly. That term you've just used, I think that is what AI is, in my opinion, built for: to reverse engineer ideas. Like, we look at all these things now and, you know, certain tools make it easier, but you look at this concept and you're like, we wanna create an ad campaign that looks like that, that, and that mixed together. We know what we want it to look like. How do we work backwards through the machine to get our raw materials? Like, we've got these brand assets, we've got images, we've got brand identity or corporate identity. How do we plug it into a machine that spits out what we have in the format that we want? And I mean, that's the simple computer process: input, process, output. It's no different. But what AI allows us to do is now gather a lot of reference material and a lot of ideas and reverse engineer, like, break this image down, break this video down, break this concept down into its simplest form so that a large language model can understand what it is, take those key points, and feed them back into the system so that you've got the raw data, and every time you send it back into the system, it produces exactly the same results. And I think we've never had technology like this before, and that, to me, is what's very exciting.

Isar Meitis

A hundred percent.

Ross Symons

one, which is quite a Okay. So let's take a look at, I've got another one, which is quite a similar one. This one I've actually shared on, um, on LinkedIn. This is an emotion machine, so very similar concept except we are not putting, we're not putting, um, a, an object or a product photo. And here we're putting a person, so it's a face of a person. So here we have these images are created in Midjourney. So it was just a portrait shot of a woman. She's got, uh, dark hair, kind of gray streaks in her hair, some freckles, and she's looking directly at the camera with a pretty plain, plain look on her face. Same idea. We've got our system prompt and we've got our. Uh, JSON Prompt, which has got all the expressions. So I've got nine different expressions here. We have laughing joyfully with head tilted back, something just in terms of the language being used here. I make this as generic as possible so that I can put he, she, her, it, they, it doesn't matter what I put in there or which object or which subject goes into the reference image block, it'll always produce the same result. So yeah, looking thoughtfully with a hand on her, like I said her in there, that example, but it didn't really affect it pouting, sadly. And looking down, wring, uh, wink winking playfully with a slight smile. So these are my nine emotions that I have. Cool. They run into the large language model, spit out into an array. Each of those get spat out into a list item. Each of these get selected, pulled into a Nana banana block, and now we've got exactly the same person, but with completely different emotions. Essentially. This is a photo shoot like happening right in front of us now. So

Isar Meitis

This is absolutely brilliant. So I play with Weavy a lot. Uh, not play. I use Weavy a lot, and play. It's a really fun tool, right? If you have any creative ideas, it's just a fun tool to use. But I've never tried to do emotions with it. And again, for those of you who are not watching this, this is freaking incredible. It literally looks exactly like the original person, and the emotions look fantastic. And it's like you're saying, it's like the perfect photo shoot, because now you have the person in multiple angles and multiple expressions. Some are funny, some are sad, some are deep thinking. It's just incredibly good.

Ross Symons

Yeah, it's amazing. Like honestly, I didn't think that the results were gonna be as good as they were. And I thought, okay, cool, I got lucky with the first round. So, exactly the same as we've done in the previous example, I swapped the character out. I grabbed this guy. He's an Asian guy. He's got like a tattoo on his face, sort of blondish hair, bit of a goatee, and I was like, okay, cool, let's see how close it gets to that. So running across all of these, I'm just switching to the next. I've obviously run all of these already. We could do a test and run it with a different face, but it's probably just gonna take too much time. But it's exactly the same character consistency across the entire thing, and he's just changed his emotion. I then did it with a third person, which was an old lady with gray hair. And yeah, I mean, as you're watching these change, what's fascinating is if I zoom in on one of them and just cycle through the three faces that we've created, you see how similar they are. Like, it's captured the exact emotion.

Isar Meitis

It is incredible. It's literally incredible. Like, I have no words to describe it. For those of you not seeing it, it's amazing how good this is at capturing emotion on three very, very different people, as far as faces go, which shows you how good these models are right now. Now I will say something else that I find fascinating, and maybe you have an example, maybe you don't. Weavy, and all these other tools, like, it's not just Weavy, and we're not picking on it, it's just a really great tool, also allows you to build videos, which means if you're trying to create a multi-scene section of a video and you need the person to go through different emotions, you now have the initial building blocks that you can use in the next step to go from "this person is surprised" to "then they're laughing," because now you have the person surprised and the person laughing. And if we combine the two things that we've seen before, you can now take the surprised face in another step and generate it from three different angles. So now for each and every one of the expressions, you have the same person from three different angles. Now you can use these in the videos or whatever. Once you have the process that knows how to take one image, turn it into an array of prompts, and combine them with the original image into an output, you can go four or five steps deep and generate all the assets you need for anything you can imagine: entire campaigns, videos, or whatever it is you want.
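
The "go several steps deep" idea can be sketched as a reusable fan-out stage whose outputs become the inputs of the next stage. This is only an illustration under stated assumptions: the `stage` helper, the prompt templates, and the output file-naming scheme are hypothetical and do not come from any specific tool.

```python
def stage(inputs: list[str], variations: list[str], template: str) -> list[dict]:
    """One fan-out step: every input image x every variation -> one job."""
    return [
        {
            "reference_image": image,
            "prompt": template.format(variation=v),
            # Hypothetical naming scheme so later stages can refer to outputs.
            "output": f"{image.rsplit('.', 1)[0]}__{v.replace(' ', '-')}.png",
        }
        for image in inputs
        for v in variations
    ]


# Step 1: one portrait -> one image per expression.
expression_jobs = stage(
    ["portrait.png"],
    ["surprised", "laughing"],
    "Same person, {variation} expression.",
)

# Step 2: every expression image -> one image per camera angle.
angle_jobs = stage(
    [job["output"] for job in expression_jobs],
    ["front", "profile", "three-quarter"],
    "Same image, seen from a {variation} angle.",
)

print(len(expression_jobs))  # 2
print(len(angle_jobs))       # 6
```

Each extra stage multiplies the number of assets, which is also why it is worth estimating credit usage before running a deep chain.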

Ross Symons

Exactly, and no, totally. What I did, just as a cheeky test, you know, I don't think this was a stress test, but this was definitely a consistency test. So I thought, what would happen if I threw something completely random in here? So I did a series of these, like, grumpy croissants. It's a croissant against a blue background, but he is very sort of anthropomorphic. He's got a face and he's closing his eyes, but yeah, he doesn't look like he's too happy. So I thought, well, let's just test and see how these would come out. And I thought this was very fun, and I thought I'd show it because, again, this is now moving into a completely different space. I mean, think about if you're a coffee shop, or you could be a brand of some kind that sells, you know, baked goods. There's a whole campaign idea in this, and it came purely from dropping in one image and just watching. Because, you know, if you think to yourself, what does a sad croissant look like? What does a pensive croissant look like? It's like, I don't know. Let the machine make it for us, you know? So yeah, I think this was quite a fun, again, just a very different version of what I created before, but a classic example of how powerful AI is.

Isar Meitis

This is fantastic.

Ross Symons

So these guys were definitely a lot of fun.

Isar Meitis

I wanna add something to this. You know, I hear a lot of people say, oh my God, AI is gonna kill the creative space because, you know, now anybody can. And I'm like, it's exactly the other way around.

Ross Symons

Mm-hmm.

Isar Meitis

AI is becoming such an incredible enabler of the creative space, because doing what you just did with two clicks used to require a highly trained professional

Ross Symons

mm-hmm.

Isar Meitis

who would spend many, many hours building croissants with different expressions. Yeah, exactly. It literally would've taken weeks for a graphic designer to either paint this or manipulate this one way or another. And now anybody with a creative idea can execute whatever they want. And I think it's an explosion of creativity, versus the other way around.

Ross Symons

Totally. I completely agree with you. I mean, I fully understand the concern and why people are up in arms about it. I feel like there's less pushback lately. There is still a lot, but people are starting to realize, okay, these are just tools and I need to learn how they work, so I might as well get on. Okay, let's see. I did a similar thing here with a pose machine. Same concept, almost exactly the same machine. I'm throwing a character in here, and I just wanted to see, you know, how I would drop a person in and then change out their poses. Now, you could swap out their garments. You could swap the background out. You could swap anything. You can swap the model out. I mean, I changed the model, again, very easily. What's great about this as well, just from a technical standpoint, is you can grab all of these nodes and just hit "run selected," and it just runs all of them again. So it really is not this tedious process where you're sitting in there clicking buttons all day. It's not really. It's click play.

Isar Meitis

You click play on the whole thing.

Ross Symons

Right. Exactly.

Isar Meitis

Exactly. I'll say one more thing about this. One of the ways that I'm using it, which I absolutely love, is the same concept, only it takes it to a different level from a creative perspective. So I'm not a creative person; Ross is. And so when I'm showing this to clients and I need to show them what they can do with this, one of the things that I'm doing is adding another step in this process that's basically saying: okay, I've got this croissant. Give me 20 ideas for what kind of scenarios I can come up with for ads that will use this person, thing, item, object, concept, service, whatever the thing is. And then what happens, in addition to what Ross is showing you of angles, directions, people, things, is I get the actual ideas. I get the same thing, the croissant, on a plate in the street. Or if it's a magical croissant like the one Ross created, you know, flying business class on a jet, or riding his bicycle, or sitting tanning at the beach, like whatever crazy idea you have. But it will come up with the ideas for you. So clients are like, oh, these four ideas are fantastic. And if you don't like them, you just click the cycle button and it will come up with 20 new ideas that fit whatever the thing is. Because it's a large language model and it understands what you explain to it, right? So it understands: Who's the target audience? What is the campaign about? What are we trying to promote? What kind of emotions are you trying to evoke? What is the thing that we are pushing? Is it a garment? Is it a service? It understands all of that, and it will come up with relevant ideas on its own. So if you replace the thing that is the input, it will give you different ideas that fit the target audience and so on. And so you can really build not just the machine that generates the output, but also the machine that helps you come up with the creative ideas that you can use.
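
The ideation step Isar describes amounts to packing the campaign context into one prompt and asking for a fixed number of scenarios. A minimal sketch follows, assuming a hypothetical `CampaignBrief` structure and prompt wording; the actual prompt used in the episode is not shown, and the call to a language model is left out on purpose.

```python
from dataclasses import dataclass


@dataclass
class CampaignBrief:
    """Hypothetical container for the context the model needs."""
    subject: str    # the person, product, or object in the reference image
    audience: str   # who the campaign is for
    goal: str       # what is being promoted
    emotion: str    # the feeling the ads should evoke


def build_idea_prompt(brief: CampaignBrief, n_ideas: int = 20) -> str:
    """Assemble an ideation prompt from the brief."""
    return (
        f"You are a creative director. Propose {n_ideas} distinct ad scenarios "
        f"featuring {brief.subject}.\n"
        f"Target audience: {brief.audience}\n"
        f"Campaign goal: {brief.goal}\n"
        f"Emotion to evoke: {brief.emotion}\n"
        "Return a JSON array of short scene descriptions."
    )


brief = CampaignBrief(
    subject="a grumpy croissant",
    audience="commuters who buy coffee on the way to work",
    goal="promote a bakery's new breakfast menu",
    emotion="amusement",
)
prompt = build_idea_prompt(brief)
print(prompt)
```

Replacing only `subject` (or any other field) and regenerating is the code equivalent of swapping the input and clicking the cycle button for a fresh batch of ideas.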

Ross Symons

Yeah, exactly. And I think one thing that we teach, which is, I think, the fundamental principle of how to start with all of this, is: if you don't know, just ask the machine. Literally. If you don't know all the touchpoints of your campaign, or if you don't even know how to get to those touchpoints, ask ChatGPT, ask Gemini, ask Claude to ask you what it needs from you in order to get that ball rolling, in order to get those ideas out. And I think from a contextual perspective, it's all about context, right? Relevant context. And context doesn't mean the more context you put in, the better it is. Sometimes that's actually the exact opposite. It's relevant context. And again, some people are like, oh, I dunno how to prompt for this, or which is the best tool to use? It's like, I don't know, ask the machine. Just literally ask ChatGPT. Say, I don't know the answer to this. How would I do it? Work with me to try to get to that answer. And to me that's a very liberating understanding, because then you don't need to know. The next example I'm gonna show does exactly that. So I built this. This is like a trailer for an academy that's part of our gen AI offering. We have an academy, it's like a subscription service where we have tutorials every week. And for one of these tutorials, I wanted to take a sort of hand-drawn sketch of a car and then turn it into a real-life car, and then spin around the car, do a whole bunch of stuff, and have the car ride off. 
So I also love, I dunno what it is, the concept of starting with a single image and creating an entire video. To me, it's one of the most amazing, not features, but amazing things that we are able to create: literally just from a single image, going as deep as we wanna go. So what I did here is, you know, just kind of taking the initial image, which was the hand-drawn car. I actually started with the original car. So it was like, cool, let's start with that and then work backwards from there. So how do I get a sketched car, a hand-sketched car?

Isar Meitis

Um, I wanna pause it just for one second, 'cause I wanna connect it to what you talked about in the very, very beginning, right?

Ross Symons

Mm-hmm.

Isar Meitis

The reverse engineering process, right?

Ross Symons

Mm-hmm.

Isar Meitis

And the really cool thing, and it's two things that you just said now that connected back to this point in my head: you know what you want to get. You don't have a clue how to get there, and you don't even have a clue how to get started, but you know what you want to get.

Ross Symons

Mm-hmm.

Isar Meitis

I want to have a video of a car that starts with a hand-drawn car and then becomes this full video of the realistic car. How do I do that? And then you just work backwards from there.

Ross Symons

Yeah.

Isar Meitis

And you ask the AI and you implement and you iterate, and you ask the AI and you implement and you iterate. And again, this kind of thing used to take months, and now within minutes, or if it's a really complex process, hours, you will have a decent version, and in some cases the final version.

Ross Symons

Yeah, exactly. And that's the crazy thing. It's kind of like just understanding that you don't know what you don't know, but the machine can probably help you find out what you don't know. Because it's not like the machine is sitting there and it knows everything. It's just very good at reverse engineering. It's very good at looking at something and being like, well, this is how that was created, and this is how you go about recreating something. So anyway, back to this. So I created that, then created a sketch version of it, which was pretty straightforward. I then wanted a hand in the shot as well, so kind of doing a hand-drawn thing. So I then did the exact same sketch, but now I needed a blank sheet. With video generation, whether you know this or not, it's about key frames. You start with the start frame and the end frame. So it basically starts with a start frame and then morphs into the last frame through a process of interpolation, filling in the frames in between. That's just how the video generation process works. But what I've found, and this is something I wanted to bring up, is that we are at a point in, I guess, the gen AI space where there are so many new models coming out that I think something we must not forget as creators is that some of the previous versions are actually better than the latest versions at certain tasks. And it's a very simple example. I was not able to create the effect of this hand drawing. I had done something before which was very similar, which was this hand drawing a building, and it worked perfectly. And I did this maybe a year ago. 
And one of the models that I used was, I think, Luma Ray 2. Now, if you look at this, it didn't really do a great job with these. I then tried to...

Isar Meitis

I just wanna pause again for those of you who are not watching, and again, to connect the dots to what Ross was saying previously. Most of these video models know how to take a first frame and a last frame,

Ross Symons

Mm-hmm.

Isar Meitis

along with a prompt that tells them what happens in between, to create the actual video.

Ross Symons

Yes.

Isar Meitis

Exactly. So in this particular case, Ross has a hand holding a pencil on an empty piece of paper, and then a beautifully drawn old Mustang, I think it is, already drawn, with the hand in it. And the prompt basically says something like, you know, a hand holding a pencil, drawing a Mustang on a blank sheet of paper. And so it knows the beginning and it knows the end, because it has the two actual frames that the automation created. And then it's supposed to know how to make it look like the hand is actually creating the, you know, the scribble or the painting of the car. And there's a very big variance between how one model creates it versus another.

Ross Symons

Yeah, exactly. And I think this was the one I went with, which wasn't too bad. Or, yeah, it was this one. Sorry, this was the final one. But this is Seedance 1, which is, you know, not a model that you would think would actually create such a good result. I tried Kling version 3. I tried another one called Grok. There are tons of them that are new models, so you would think, okay, obviously it's gonna listen to the prompt and it's gonna do exactly what I asked it for. But again, it's just something I've recently found. I've also been using Nano Banana 2.5 for some images because it's more reliable. So again, just because it's the new model does not mean that it's gonna be the best for your use case. Just something to bear in mind there. So anyway, you create all these videos, put all this stuff together. I then said, okay, cool, I want this car. So I said, "You're a professional photographer," and gave it a prompt: I want this exact car from multiple angles. And then I'm gonna kind of just move the camera around to different angles of the car. So we've got a profile shot, a sort of low-angle three-quarter shot, a shot from behind, and then the original shot, which is kind of three-quarter front, top-down sort of thing. Those become my key frames, my first and last frames. And here, this is where I wanted to create some sort of dynamic camera angles. I don't know how to express to a video model what those dynamic camera motions need to be. So just ask the model. Just say, I've got the start and end frame. Make it cool, make it do something, make it go from this frame to that frame and make it engaging, make it fun. The content is just supposed to be cool, fun content. So just do your worst, you know? 
And from there, then just getting the model to create essentially all the clips that I would then use in sequence in video post-production. And another thing with post-production: although these models create five-second clips, you can move faster through them in post-production. A clip might be quite slow if you're trying to create something a bit more punchy and dramatic, and then you've gotta kind of move through the scene a little bit faster. So again, these are just things that you kind of learn over time. So it was these videos, and then the final one was just the car kind of loading up, revving and, you know, speeding off, and then just putting it all together. It was pretty straightforward. Let me share.

Isar Meitis

Yeah. So the one thing this tool doesn't do yet is the editing, right? So you end up with all these clips and you gotta stitch them together in an editing software. There's a gazillion of them. I am a fan of CapCut, but there's like a million other tools that you can use. And literally all you've got to do is take the last frame and first frame. Now, the other thing that I will say that makes it very easy is if the last frame of video one is the first frame of video two, and the last frame of video two is the first frame of video three, you have no job in stitching them. You literally just put them in sequence and it looks like a continuous video, because there are no cuts. It's okay to have cuts if you want to, but if you don't want to, that's the way to get the video to flow seamlessly from one five-second or ten-second video to the next. The other thing, going back to what Ross was saying, that I found relatively easy to do is I always like to add music in the background. Mm-hmm. And then just squeezing the pace of the videos, so making them a little faster or a little slower to align with the music in the background, also makes it a lot more polished and professional.
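
The shared-frame trick can be sketched as a small helper that turns an ordered list of key frames into clip specifications, where each clip's last frame is the next clip's first frame. The function name, the dictionary fields, the file names, and the prompts are assumptions for illustration, not the interface of any particular video model.

```python
def chain_clips(keyframes: list[str], prompts: list[str]) -> list[dict]:
    """Build first-frame/last-frame clip specs where adjacent clips share a frame."""
    if len(prompts) != len(keyframes) - 1:
        raise ValueError("need exactly one prompt per pair of adjacent keyframes")
    return [
        {
            "first_frame": keyframes[i],
            "last_frame": keyframes[i + 1],
            "prompt": prompts[i],
        }
        for i in range(len(prompts))
    ]


# Hypothetical key frames for the car video described in this episode.
keyframes = ["blank_page.png", "sketch.png", "real_car.png", "car_profile.png"]
prompts = [
    "A hand holding a pencil draws a car on a blank sheet of paper.",
    "The sketch transforms into a real car.",
    "The camera sweeps around to a profile view of the car.",
]

clips = chain_clips(keyframes, prompts)

# Every clip ends on the frame the next clip starts with, so the clips can be
# placed back to back in an editor without a visible cut.
seamless = all(a["last_frame"] == b["first_frame"] for a, b in zip(clips, clips[1:]))
print(len(clips), seamless)  # 3 True
```

Four key frames give three clips; adding one more key frame adds exactly one more clip to generate.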

Ross Symons

Totally.

Isar Meitis

It doesn't happen automatically yet. I would be really surprised if that's not the next thing, where you can say, oh, here's the music that I want, just make sure the video aligns with the beat and whatever, and it does it automatically. Right now there's some manual process, but it's not that hard.

Ross Symons

But honestly, I find that manual process of putting the music in, putting the sound in and matching it all together, to me that just feels very... maybe for some people it's not fun, but I just enjoy it. So, yeah.

Isar Meitis

I'm with you.

Ross Symons

Yeah. So let me just show you the video. I might have to kill my sound just so the audio can come through. So just let me... Okay.

Isar Meitis

This is awesome. So again, for those who are not watching, it's a full video of the car being drawn on paper, and then it turns into the real car, and then the camera changes direction, just showing it from different cool angles, and then it just, you know, takes off with a lot of smoke behind it. So it's really, really cool.

Ross Symons

Just cool and fun, man. Yeah, and that's the thing. I think that it really has allowed us to just make anything, you know.

Isar Meitis

So I wanna connect a few of the dots that we talked about, first in terms of concepts and then in terms of practical usage. From a concept perspective, we said start with the end in mind, right? So what is it that you're trying to produce? And then reverse engineer from there. The other thing that we said is that the reverse engineering you don't have to do on your own. Let's say you don't know how to reverse engineer. You can do one of two things, or a combination of both. One is to take existing examples where people already built something similar and start from that. Or ask the AI: this is what I'm trying to create, here are the resources I have, here are the resources I don't have, or you tell me what else I need that I don't have right now. And it will help you figure out the process. So that's number two. Number three is you gotta think from a process perspective, meaning, if we go back to the original example, Ross showed us, he said, okay, what do I need? I need 12 angles of this thing. I need 12 emotions. This is what I need. What do I need in order to get there? Okay, so I need 12 prompts. How do I explain what these 12 emotions are? How do I create 12 prompts? Well, there's a tool that does it. It knows how to create 12 prompts from one prompt based on the inputs that I'm giving it. So you gotta think through what the process needs to have in order for it to be the most consistent with the least amount of work for you.

Ross Symons

Mm.

Isar Meitis

Exactly. And I think it's the combination of all these things: thinking with the end in mind, working together with AI, using preexisting use cases. I'll say something else about the preexisting use cases. Something that I've done in multiple cases in Weavy, especially when I got started. Now I kind of know what I'm doing, but when I got started, I didn't have a clue. So what I would do is I would find a process that is like a 20-step process, and I needed two of these steps. So I would copy just these two steps into a brand new canvas. I'm like, okay, now I've got these two steps that do this one thing that I needed to know how to do. I go and take another example that knows how to do another thing, and it has 20 steps. I only need four. So you copy these four steps and connect them back. So it's Legos. You can reuse whatever components you want and just connect them in a different way to build something new. And so that's the other kind of unlock for me: I don't have to actually find a process with everything that I need. I just need this step that knows how to go from this to that, and then I can connect it to the rest of my process. Any other big thoughts from you that people need to know or think about when they're building these kinds of things?

Ross Symons

Um, you know, I think you just need to start. It's difficult not to think, I mean, if I zoom out and look at this whole entire workflow, when you land on this, you're kinda like, what is this? Where do I start? It's got all these blocks and these lines. I think that if you are gonna be using a tool like this, just start with the basics. Like you said, just have those two steps and move from there to there, and then just get good at that, and then try and add on. But intuitively you actually very quickly work it out. Ah, okay, cool. It helps if you have some, I wanna say, basic understanding of the difference between an image generation model, or diffusion model, and a large language model, and how the two of them can be used together to create better images. But once you start using it, just play. Just have a structure, have an idea in mind. I think, know what you wanna create, because the problem with not knowing what you wanna create is you end up just using all your credits on nothing. And it's fun. I mean, it does feel like gambling sometimes, 'cause you put in the prompt, you hit the button, you're like, ooh, did I get the thing I wanted? And then you don't, and you're like, oh, okay, try again. So I think having an idea helps, and this is what I've found over time, because, you know, credits cost money and money is not always just abundantly available. So I don't just dive in and start prompting. I really think about what it is I want to do. And then I test one example and I'm like, okay, cool, that used 10 credits or whatever. What am I using this for? Are those 10 credits gonna scale to like a hundred? 
So if I make 10 of these, that's a hundred credits gone. Am I able to use those, or am I just testing? And if I am testing, what am I testing for? Is the testing gonna help me down the line to use only 10 credits next time, or am I still gonna always be using a hundred credits? So it's the scalability of it. But I also do have a kind of, you know, scarcity mindset sometimes, which is probably not the best thing, but it does definitely save me credits and it saves me process, because before I start working I would rather think about, okay, cool, which tool can I use for free? Or which tool can I use that I have unlimited access to, to help me get to where I want to, long before I step into this arena and start playing around? Because you can test a lot of things. I mean, you get free credits on some platforms. But when you start to work particularly in Weavy, it's not the cheapest tool to use, but from a production perspective, in my opinion, it's one of the best. So get here with quite a clear idea and some prompts and some ideas as to what it is you wanna create.
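
Ross's credit arithmetic (one test costs 10 credits, ten runs cost 100) can be written down as two tiny helpers for planning a run before spending anything. This is a back-of-the-envelope sketch; the function names are hypothetical and real platforms price different models differently.

```python
def credits_needed(credits_per_run: int, runs: int) -> int:
    """Total credits for repeating the same workflow a number of times."""
    return credits_per_run * runs


def runs_affordable(budget: int, credits_per_run: int) -> int:
    """How many full runs fit into a credit budget."""
    if credits_per_run <= 0:
        raise ValueError("credits_per_run must be positive")
    return budget // credits_per_run


print(credits_needed(10, 10))    # 100, the example from the conversation
print(runs_affordable(250, 10))  # 25
```

Running the numbers on one test result first makes it clear whether a workflow scales to the volume you actually need.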

Isar Meitis

I think that's a very, very good idea, right? So don't come here and kinda experiment your way to a solution. Come up with a plan that you can develop with another AI that you're paying $20 a month for, regardless of how much you're using it. And then come back and work in here toward a solution that now, instead of costing you $50, is gonna cost you 10. And then once you run it, it's gonna cost you five to run it every single time. The other thing that I will say, going back to what we said in the beginning, is you can do this for fun. I've done a lot of really just fun projects in Weavy, just because it's fun. You're paying, just like when you go to the movies. It's not free. Just the popcorn is like $20. It's fricking ridiculous. So you can use it for fun, but when you use it for work, the ROI is gonna be there regardless of how many tokens or credits you spend, because the other option is not even in the same range. It's three orders of magnitude bigger. Like, if you need to do a real photo shoot, even of static objects, and then pick out of hundreds of images and then edit them, it's gonna be, if you're lucky and it's something small, thousands of dollars. If it's humans in a scene, in an actual place, it's, if you're lucky, tens of thousands of dollars. And if it's a huge production, it's hundreds of thousands of dollars. And here, if you go crazy and you run things on the most expensive models and you run them 50 times, it's gonna cost you a thousand dollars to do the same thing. And that's if you really, really go overboard and use, you know, the latest video models and you run them multiple times. Compared to the real thing, it's still gonna be, again, two, three orders of magnitude cheaper. 
And so while I agree with Ross that you should come with a plan and think through how to minimize the cost, if you're running a real campaign, this is negligible.

Ross Symons

Yeah, exactly. It really pales in comparison to what you would pay for a big production.

Isar Meitis

Awesome. Ross, this was fantastic. Obviously very well thought out, very well executed, and you obviously know what you're doing. As you mentioned, you don't just do this for clients, you also teach this. Mm-hmm. So if people wanna learn from you, follow you, work with you, what are the best ways to do that?

Ross Symons

Yeah, for sure. So look, I'm very active on LinkedIn, so you can look for me: Ross Symons, S-Y-M-O-N-S. And the company that I am a co-founder of is Zen Robot, so quite easy to remember as well. So zenrobot.ai. We have quite a few training products, so packages, one being a four-week masterclass that we run every month. We are not running it in April. So, for March, we kick off next Monday for the next cohort. And that's basically the basics of gen AI for content creation, so going from image generation all the way through to video and putting it all together using CapCut and Suno and all the fun stuff that goes with creating content. So yeah, that's where you can find me. Or Ross@zenrobot.ai is my email address.

Isar Meitis

Awesome. Thank you so much. This was really, really fantastic. I'm sure a lot of people are gonna find this very, very helpful. Go follow Ross, go take his classes. It's really, really good if you're in this field and you wanna learn what, forget about the future, what the present looks like, 'cause I don't even know what's gonna come out tomorrow. You think of an idea and it creates the video for you. That's like the next step, right? We're almost there. Thank you so much. Really fantastic. I appreciate you, and thank you for sharing everything you know with us.

Ross Symons

Thanks for having me. Thank you.