Leveraging AI
Dive into the world of artificial intelligence with 'Leveraging AI,' a podcast tailored for forward-thinking business professionals. Each episode brings insightful discussions on how AI can ethically transform business practices, offering practical solutions to day-to-day business challenges.
Join our host Isar Meitis (4-time CEO), and expert guests as they turn AI's complexities into actionable insights, and explore its ethical implications in the business world. Whether you are an AI novice or a seasoned professional, 'Leveraging AI' equips you with the knowledge and tools to harness AI's power responsibly and effectively. Tune in weekly for inspiring conversations and real-world applications. Subscribe now and unlock the potential of AI in your business.
148 | AI Tools Mastery: How To Choose The Right AI Model For Any Task with David Wilson
AI is no longer just a buzzword—it's a competitive advantage. But with so many tools and models, how do you know which one fits your needs? David Wilson, founder of Hunch, has tested the latest AI advancements across thousands of use cases and is here to share what works.
In this webinar, we’ll explore:
- How to match AI models to specific business tasks.
- Common pitfalls and how to overcome them with clever prompting.
- Real examples of workflows that save hours and drive results.
David brings unparalleled expertise, having built Hunch to simplify complex AI workflows. With experience running over 1,000 AI tasks weekly, his insights will give you the edge to implement AI with confidence.
About Leveraging AI
- The Ultimate AI Course for Business People: https://multiplai.ai/ai-course/
- YouTube Full Episodes: https://www.youtube.com/@Multiplai_AI/
- Connect with Isar Meitis: https://www.linkedin.com/in/isarmeitis/
- Free AI Consultation: https://multiplai.ai/book-a-call/
- Join our Live Sessions, AI Hangouts and newsletter: https://services.multiplai.ai/events
If you’ve enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!
Hello, and welcome to another live episode of Leveraging AI, the podcast that shares practical, ethical ways to leverage AI to improve efficiency, grow your business, and advance your career. This is Isar Meitis, your host, and I'm really excited about today's episode for multiple different reasons. One is a very personal reason: I'm a geek, and I like new AI tools, especially cool ones that allow me to do a lot of flexible stuff. Our guest today is David Wilson. He's the CEO of Hunch, which is maybe one of the most unknown, yet one of the coolest and most capable tools that exist out there today for stringing and connecting multiple AI tools and other tools together to do basically whatever you want. It's this really cool playground. But the other reason is that the focus of today's show is actually not going to be the tool itself, but rather what you can learn from it, which I find even more exciting. One of the things I get asked all the time, when I teach courses, when I do workshops, by people who just meet me and know me, by CEOs of companies I work with, is: okay, so which is the best large language model? Which one should I use? And the sad answer is that it depends. It just depends. What is the use case? What is it that you're trying to do? Each and every one of them has pros and cons. Some of them have longer context windows, some are better at reading documents, some are better at summarizing, some are better writers, and so on and so forth, and it's very hard to figure out. So some people just say, okay, I'm going to commit to one and I know it's going to be okay, and that's fine. But if you want best of breed and really want the best results across all of them, there are ways for you to explore it for yourself. And that's going to be our topic for today.
How can you figure out which of the large language models, open source, closed source, big, small, whatever you want, do better at specific use cases, and how can you figure it out for yourself? I assume if you're listening to this podcast, you care about large language models and how to use them for business, so this is a very important topic that a lot of people are struggling with, and that's why I'm very excited about it. Now, David himself is a serial entrepreneur. This is not his first company, so he's been in the tech world and running businesses for a while. He understands the business benefits of all this, but he's also somewhat of a geek like me, so he really enjoys playing with this kind of stuff and testing things out. He's literally the perfect person to share his knowledge with us. So I'm very excited to welcome David Wilson to the show. David, welcome to Leveraging AI.

Thank you very much, excited to be here.

Awesome. David, share your two cents about yourself and your product, and then we'll dive right into how to really compare large language models and what the best ways are to do that.

Sure. Just very briefly: I'm the founder and CEO of Hunch. We're an AI workspace that allows people to connect any AI models, all the best ones, together to do much more than what they're capable of doing with typical chat tools, and we can get into the reasons why that is the case rather than giving the full spiel up front. Personally, over the last year, and I just ran the numbers a couple of days ago, I have run more than 50,000 prompts with 50-plus different models. So I'm excited to share what I've learned here. And what we can do, because Hunch is a way of actually showing different models side by side, is explore them live.
We can go and explore, and just tell me if there's anything you'd like me to run or any different types of models you'd like to try. What I'm going to do is switch over to share my screen, and we can start off with a little summary of the landscape of the different models, if you want, and how we've seen models evolve over the last 18 months to two years since ChatGPT came out.

Yeah, absolutely. I will say two things first. For those of you who are listening to this and not watching it: we will describe everything that's on the screen. If you want to watch this, there's a LinkedIn version and there's a YouTube version you can go and watch. For those of you who are with us live, obviously you can follow along on the screen on Zoom and on LinkedIn. And if you're not with us live and you want to be, we do this every Thursday at noon Eastern, every week with a different amazing expert like David. You should join us live, because then you can ask questions and see everything we're seeing. But if you are listening to the podcast, and I'm an avid podcast listener myself, we're going to narrate everything that's on the screen so you can follow along.

Awesome. Yeah, thanks. So I think the story of the last year or two of these models, since the original ChatGPT, is that there's been huge development in different capabilities for different models. One of the biggest things that's happened is that the very best models have gotten so much cheaper to use. Prices have come down, sometimes by 99 percent: it's 99 percent cheaper to run GPT-4o mini today than it was to run GPT-4 when it came out about 18 months ago. A second thing is that there are now some models that are incredibly fast.
And then there are also models that have developed totally different types of capabilities. We'll go through a few of those, but I just want to start off with a general rule of thumb, and I think it's been this way for about six months: Anthropic's Claude 3.5 Sonnet is probably the first place to go for a really good large language model for most tasks. I think it is the best model out there for writing. The previous Claude 3 Opus, which is more expensive and slower and theoretically part of the previous generation, or half a generation back, is also still very good if you just want to write things. But in general, for the majority of knowledge work that people want to do, Claude 3.5 Sonnet is really good. It's pretty fast, it's highly capable, it can code. It really is still extraordinary how good it is.

So, one thing about that. In general I agree, and I really like 3.5 Sonnet. But OpenAI just came out with the latest version of GPT-4o, literally a few days ago (they have had several different versions of 4o since it came out), and it's specifically upgraded for creative writing. It took back the top of the leaderboard, the LMSYS large language model leaderboard, for creative writing as well. So while, like you, I'm personally a believer in and a fan of 3.5 Sonnet, it's always worth continuing to test, because they come up with these new models all the time, and what was true the day this was recorded may not be true the day after the episode comes out. You've got to keep on testing. But in general I agree with you. I think Claude 3.5 Sonnet is an awesome tool.

Yeah. And look, I'm very skeptical about benchmarks. There's a whole lot of different reasons to be skeptical about benchmarks, and I think the best way to really get a sense for yourself is to try different models and keep trying them, with different prompts. We can talk about prompting at some point, but here's one overarching tip. I see a lot of people using different models, and I work with a lot of people using different models, and I think the mistake people make, especially ones that have used AI models quite a bit, is that they'll try to create an overwrought prompt from the very beginning, like "you are a whatever" and so on. I think it's actually very easy to start testing models: just put in the tersest possible request you can write, see what it gives you, and iterate from there. Not only does it frequently surprise you with how good the responses are, even to very vague requests with typos and things like that, but because the models are relatively fast and affordable, you can just try again. If it doesn't give you exactly what you want, note in what ways it is deficient, so you can iterate on your prompt. So that's an overarching tip, but we can get back to that.

I agree with you. Let's go back to the list, go through it quickly, and then get to actually how we can test stuff.

Yeah, sure. I would say the one model that really exceeds Claude in important ways is o1. o1-preview is what's out at the moment; the full o1 model might come out in the next week or two. It's really good at math and hard science problems, and at coding it's better than Claude for some coding challenges. It is very slow, though.
So I've seen people switch their ChatGPT over to o1 and then get really frustrated when it takes a really long time to respond.

It's funny how we became addicted to immediate gratification, right? When we say "really slow," it takes 30 to 60 seconds to respond, where GPT or Claude take about six seconds. It's not like you're going to wait an hour. You're going to wait a minute, in the worst-case scenario.

Look, I do think GPT-4o is really clearly crafted by OpenAI to support their chat product, with very interesting trade-offs made there, primarily toward latency, whereas o1 is clearly oriented toward capability, reasoning capability. But something that most people I talk to haven't yet discovered with o1 is that you can get it to do a lot of work for you. Instead of prompting it with instructions on how to do things, where with Claude and typical LLMs you want to talk through step by step how to get what you want, with o1 you ask it for exactly what you want, almost telling it the appendix to give you, and it will do a tremendous amount of work in one shot. It might be slow, but it can do a huge amount. It's really, I think, the model today that is hardest for people to wrap their heads around in terms of what exactly it can do.

As for some of the other models, this almost forms the typical project management triangle: you can have things fast, cheap, or high quality, and this is how the models are bifurcating across these different capabilities. High quality would be Claude 3.5 Sonnet and o1, and then fast and cheap would be things like the Llama models hosted on Groq, which are extremely fast and cheap. I think GPT-4o mini is really affordable per token, with pretty good capabilities. It's not that far behind GPT-4o for most things, in my view.
And then we have something of a strange family of models, which is the Gemini models. They have by far the biggest context windows, still, and they have multimodal inputs, so you can feed them virtually anything: videos, images, PDFs. They really are quite good at certain kinds of tasks. There are some kinds of writing, for example UX writing, which is kind of niche, that I've found Gemini to be very good at. And Gemini 1.5 Flash, which is the faster, cheaper version, is very capable. It's almost in a class of its own with respect to capabilities, because of just how large the context window is. Another thing it's very good at, in my opinion the best across the models, is image interpretation and transcription from images. So it's very good at a bunch of different tasks.

Yeah. I'll say two things about what you said. First of all, a great summary. You mentioned Llama running on Groq. For those of you who don't know Groq: Groq is a hardware company that creates the fastest inference chips on the planet right now. These are computer chips that are not designed to train models (the best way to do that is still GPUs from NVIDIA), but to actually run them. They have taken models onto their platform, so you can sign up for a license on the Groq platform. You don't need your own chips, you don't need to buy them; you can just use them on Groq's data servers. And they have customized Llama 3.2 to run optimized on their hardware, and it's insane. There's a free way to test it out; you can just go there. It's nothing like you've ever seen before: two pages of output just show up as soon as you hit enter. It's just incredible.

Well, we can actually put that to the test right now.
So we have a prompt here connected to a bunch of different models, including a model hosted on Groq, and I'm just going to zoom in and read the prompt very quickly. I'm going to edit it a little bit so that it reruns for us: "Create the most surprising insights an LLM can come up with. Make them as novel and inventive and yet as plausible as possible. So draw upon the most surprising and obscure of connections to yield a novel insight."

Before you hit go, I want to pause you to explain what we're seeing on the screen for people who are not seeing it.

Yes. This looks like a big canvas board, for lack of a better term, that you can zoom in and out of and move things around on. The prompt lives in one box, but then there are lines connecting it to, I don't know how many boxes we have here, a lot. And each and every one of them is basically connected to a separate large language model, which is now running this prompt.

Yeah. Sorry about that, itchy trigger finger. I'll zoom out and then rerun it so we can see the speed, see exactly how it goes. So we have all the different models we just discussed, as well as a few others, connected. We have eight models here, also including Mistral Large. Mistral is a French company, and it's a very interesting model. It's also quite different. I wouldn't place it at the top of the leaderboard in any of those categories, but it has very different guardrails from the others. There are times when you're surprised how you can get rejected by Gemini, for example, because of guardrails. I would say that's actually something that's gotten a lot better over time, over the last 18 months, across the model providers, which makes a lot of sense and is fortunate. But Mistral is really interesting.
I'm going to change this and rerun all these models so you can see them all running. The Groq one immediately fills up the fastest. GPT-4o is behind that. GPT-4o mini, I'm not sure why that hasn't come through yet; that's usually pretty fast. Sonnet was actually quite fast. And of course OpenAI o1 is still thinking, even though all the other models are done. And now it has yielded its answer.

I want to ask you, since you gave us a quick preview of your product: this is really fascinating. Again, for those of you not watching, we put a prompt in one box and we got answers in basically as many boxes as we want to connect. All we did is connect each of those other boxes, each with a different language model, with a line. So I have a bunch of questions. The first one is: how easy is it to connect each of these models? Is it as easy as just getting my API token and putting it in there?

No, actually it's easier than that. They're all our API tokens. You can just walk in, use it, and get going. You don't even need to log in to get started: you can go to our tool at app.hunch.tools and start using the product immediately, and then after a certain amount of tokens you need to log in, at least to save your work. But yeah, you just get going.

Very cool. Okay, so we discussed a little bit about the tool and a little bit about the models. How do you go and actually compare them for different scenarios? I think this gives us an idea, but then how do I combine the best of both worlds, if this model is really good at this and that model is really good at that?
And then there's something you already mentioned: there's a cost behind it. Just to give people an idea once you start dealing with the API: a lot of people hear the letters API and they get sick, because they don't know what it means, but it's pretty simple. It's just servers talking to each other in a server-to-server connection language, which you don't need to know anything about. The biggest difference from an end-user perspective between using the chat interface and using the API is that with the chat interface you're paying a subscription fee. You're paying $15 a month, $10 a month, $20 a month, $25 a month, depending on which model you're using, and they usually don't cap you, or they cap you at an amount that is not a problem for most users. When you use the API, you pay per token, meaning every input you put in and every output you get out, you get charged for. Now, these are very small numbers. To give you perspective: if you're using Llama, which you mentioned before, Llama 3.2 is about 3.5 cents for every million tokens, both in and out. For those of you who don't know what tokens are, a million tokens is about 750,000 words. So you're going to pay 3.5 cents for 750,000 words. That's multiple books, many books, a lot of books. And that's on Llama. On the very far end of the scale is Claude 3 Opus, which is about $70 for a million tokens. That's still not a lot, right? You're paying $70 for something to generate 750,000 words for you. Again, that's not expensive, but it's way more expensive than three and a half cents for the same amount of work. And so when you deal with these kinds of tools, where the backends are running the APIs, there are several considerations. One of them is: what is it effective at? The other is: can I chain it together with the other tools?
And the third is, okay, if two tools do the same job, which one is cheaper, because again the differences can be significant. So with that, I will leave it to you to walk us through how you actually do this magic and compare things and combine things.

Yeah. I think everything you said is right, and ultimately it's use-case dependent, but this is why a couple of rules of thumb are really helpful. The first one, to me, is that the best model for most tasks is Claude 3.5 Sonnet. You can access it, as you mentioned, on claude.ai, and you can also access it on hunch.tools. Even though we use the API, what we're doing right now is making it free for our users, because it's really useful. We don't train on anyone's data; we have fairly strict privacy and security policies and so on. What's very interesting to us is just how people use these different models, and we're actually using Hunch ourselves to do a lot of our own work, which we can get into if you want. But yeah, Claude 3.5 Sonnet is a really solid place to start. If you want something cheaper, GPT-4o mini is a really good place to start, and between those two I think you get a pretty good set. Honestly, you just have to iterate from there. You can try different models, and you can use this template, by the way: everything you see on the screen, everything we just talked about, is a template that is available. We can share the URL and put it in the show notes if you like, so anyone can try it and iterate on it themselves to learn more. We can go into the very specific niches that different models are better at, wherever you'd like to take this.

Yeah, let's do this.
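The per-token arithmetic above is simple enough to sketch in code. Here is a minimal Python helper using the illustrative prices quoted in the conversation (roughly 3.5 cents per million tokens for a Groq-hosted Llama model, roughly $70 per million tokens at the Claude 3 Opus end); real prices vary by provider and usually differ between input and output tokens, so treat the figures as placeholders:

```python
def api_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Cost of processing `tokens` tokens at a given per-million-token rate."""
    return tokens * price_per_million_usd / 1_000_000

# Rule of thumb from the episode: ~750,000 words is about a million tokens
# (roughly 0.75 words per token in English text).
words = 750_000
tokens = int(words / 0.75)

llama_cost = api_cost_usd(tokens, 0.035)  # ~3.5 cents per million tokens
opus_cost = api_cost_usd(tokens, 70.0)    # ~$70 per million tokens

print(f"Llama: ${llama_cost:.3f}  vs  Opus: ${opus_cost:.2f}")
# The two ends of the scale differ by a factor of 2,000 for the same volume.
```

The point of the comparison: the absolute numbers are small either way, but when you chain models together and run thousands of tasks, the ratio between the cheap and expensive end starts to matter.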
And I think what will be interesting is two things. One is if you can show us one, two, or three examples of specific niches where specific tools do really well. Maybe we go extreme: o1 as a slow yet deep-thinking kind of solution, versus Groq, where something happens in two seconds with Llama and speed matters. And the other thing that I think will be interesting is to see how, in your tool, you can mix and match: how you can do one step with this model and then hand it over to another tool to do the next thing. I think that will be even more powerful, because most people think about it as either/or: I'm either going to use Llama, or Claude, or ChatGPT, when the reality is that in tools like Hunch you can combine different steps with different tools and get the best of all worlds.

Yeah, let's do both of those things. Let's start off with this prompt, let's go back to it. This is initially from Ky Chen, who is the founder of a really interesting company and has automated a lot of stuff with AI. What I'm going to do is augment this prompt to say: follow up with a table detailing maybe up to 20 potential insights, with an evaluation of each. Now, let me just turn this off quickly and look at the current answers. The current Claude 3.5 Sonnet answer is just a paragraph and a little bit. The current o1 answer, let me open that up, is longer, about a page: a paragraph plus bullet points plus implications. Then we have GPT-4o, also a similar kind of length. The reason I'm looking at these will be clear in a second. And then let's go to Groq: it's actually the longest, just over a page. When I say Groq, I'm talking about the Llama model. So now I'm going to run this again, and we're going to see Groq generate the fastest, so let's open that up. What's interesting is that the answer it's given us now is shorter, about half the length of the original answer, because now it's creating a table as well. So this is the answer, and here's the table, with a little bit of information within each row. This is a pattern I would expect to see across the majority of the models: a shorter answer, then the table on the bottom, with one exception. Oh, actually, o1 in this particular case is a little bit shorter than it was before, but the table is much more detailed. In my experience, o1 is not that great at writing, for example, at creating polished products, but it's very good at doing a lot of thinking and evaluating for you.

Research and summarization kind of work.

Yeah, that's right. So that's an illustration of what different models are good at. Let me go into a different example. Maybe it's worth mentioning: we wrap whatever we think is the best advanced model at any given time, give it our own system prompt, and call it simply "the advanced text model." I'm going to open it up here. It's prompted to give briefer, more direct outputs without the preamble of "Oh, here's what you asked for," so it's more business-like. But it actually has a secret chain of thought beforehand.
It has a whole series of thinking steps that we don't show, because that's not necessary for people, but it is useful for generating a higher-quality answer. So if you're not using o1, I highly recommend telling a general LLM to think before answering. Because of the way these autoregressive models work, every token the model outputs, all of its initial output, helps steer the subsequent outputs in some way. So if you get it to think from first principles, or from the pieces that you want, it helps get the answers that you want. And that's actually why chaining together different models is so useful and so powerful. I'm going to click through to a different template. Let me just see if this is the one that I want. Exactly, yes. Okay. What this template does is take a prompt and run it on GPT-4o, Gemini 1.5 Pro, and Claude 3.5 Sonnet in parallel. It does a bunch of quote-unquote thinking with Claude 3.5 Sonnet, and then it critiques all the answers it gets from the different models. Then it pushes all of that thinking, all of the initial models' outputs, and all of that critique into an OpenAI o1 task. And we don't tell o1 where those different inputs came from. We just say: here are some original ideas and thinking on the topic, and now basically you take it from here. And that's a really effective way of getting interesting output. So I'm actually going to go back, take the original prompt, just for the sake of getting something, and put it into the instructions.

One second, again for the listeners, let me describe what we're looking at with this new template.
Think about a dynamic canvas with a big flowchart, but in every box in the flowchart there's also the input and the output. So it's not just what's going to happen; the actual results from the actual language models show up in the boxes. When David says he wrote the prompt, it shows up in one box, but then there are literally three lines connecting it to three different boxes with three different language models. Those then connect to the box that puts everything together, and all of them flow to the last box, which is o1, the final model, which takes the inputs from all of them. So you literally build your own workflows, but it's not just the workflow: it's also the user interface. The input and the output all happen within this one canvas page, so you don't need to go back and forth between different tools. You can work with all the large language models together in really sophisticated workflows, all in one place, never leaving this universe and still enjoying the benefits of all of them together, which I personally find really powerful.

Yeah, and what's really great here is that there are some tasks that are really important, where you want the best possible outcome, and for those, being able to feed all the different language models at once is super useful.

I'll give an example that I do all the time. Actually, I'll give two examples, and then we can dive into your process. One example is when I think about content ideas, or new things I want to add to the course, and I want to brainstorm, I use a tool. The tool I was using all the time before I knew Hunch, and I still use it, is called ChatHub. It's a paid Chrome extension that allows you to open several different language models all at once, run the prompt once, and chat with all of them at the same time.
And the benefit when you're brainstorming is that you now have an advisory board, right? Now I have ChatGPT and Claude and Gemini, and I get ideas from all of them, which is fantastic. You just get more ideas; some of them are going to be the same, and some of them are not. But the cool thing here, which ChatHub does not allow you to do, is to then pick the best ideas based on whatever criteria you define, or summarize them so you don't have all the duplicates and don't have to read all of them. You can now create a unified version that takes the best of all worlds and does more work for you. So some of the work I have to do with ChatHub, I don't have to do here, because Hunch itself can do it for me. Which again just saves you more work, which is the whole point of using AI tools.

Yeah, 100 percent. Brainstorming is actually one of the best reasons to use multiple models, because each of the models comes up with different ideas, and it's great to have quick ways of doing that. And not to make this point specifically about Hunch, but that template we just opened, with all the best models together, can be accessed as a single block if we want to, and we can run those sorts of prompts easily on another canvas. We have something similar for brainstorming with all the different models as well.
And you can basically run that in the same way, and it is super powerful. But whether you're doing it in a tool like this, across multiple other tools, or in a Chrome extension like ChatHub, which sounds very useful, just using multiple models and seeing the outputs is a really good way of getting a sense of them. Just like you get a sense from talking to people what their responses are going to be, and from working in a team who's going to be the best person for different things, you get that sense very quickly from working with different models. And the more you do it, the more attuned you are, when new models come out, to trying those very quickly. People talk about a vibe check for models. It's absolutely real. Very quickly, within a couple of prompts, you can get a very good sense of a model's capabilities, what it's going to be good at, and what you want to use it for. I agree. Ah, API overloaded. We may have overloaded Anthropic with all the requests. There we go. Yeah, here's a brainstorm from a bunch of different models, basically. That's very cool. Anyway, we've spoken about text models this whole time, but there are other very useful categories of models. And I think one of the big stories in AI for the next year plus is going to be the disappearance of the distinction between these categories. This is clearly one of the things GPT-4o was designed for: multimodality, being able to take in anything and output anything. But a lot of those capabilities aren't actually available yet, whether in the ChatGPT product or by API. So we're going to see more of that. But for now, the best way of creating different sorts of images, or doing different things in general, is with very different types of models. We could skip into images, text to speech, image generation stuff. All right, great.
So I think there are three image models I'd like to mention, and among them, two are really notable. The first is a pretty new model called Flux 1.1 Pro, from Black Forest Labs, and it really is a very good text-to-image model. They've now released a bunch of image-to-image models and other models too, to fill in details in images, transform images, upscale them, outpaint, and all of that. This is just the actual image generation model. You can use this model on fal.ai, but it's also, I believe, the model used in Grok, with a k, on x.com. You know, the xAI model? It is the model, because it's a really powerful open-source model, and it's now driving the image generation of Grok, of Perplexity, of Mistral. So yes, it's available through many different sources, and yes, it's an awesome tool. It's pretty good with text, but overall it really is very good. Flux 1 Schnell is a different part of the same family, and it's really fast and much cheaper. I think this is one of the things about image models over the last year plus: they have become much more capable, but they have not become much cheaper. It's still a relatively similar price per image, with some of the models even getting more expensive. And then the last one is Ideogram 2. It's one of the models from Ideogram, and it was really designed for typography. It's a general image model and can generate all kinds of images, but it is still the best today for mostly accurate text, which most image models really struggle with. So those three together are really the top individual models.
And of course there's Midjourney. You can't really talk about text-to-image without mentioning Midjourney, which may still be the best underlying model, but it's not available anywhere except in the Midjourney UI, which is undergoing development at the moment. But the gap between Midjourney and the next best model, which used to be huge, has really narrowed significantly, between Midjourney and Flux in particular. I agree with you a hundred percent. I think the biggest benefit Midjourney still has is control, because they've now developed their website, there's an actual better user interface, you can do a lot more stuff, and they have their parameters. So if you're an advanced user, Midjourney still gives you more ability to control the output. But I also agree that from a pure quality perspective, Flux 1.1, definitely the Pro model, is up there with Midjourney as far as what the output looks like. Yeah. And it's going to be really interesting. There was a very recent announcement and demo of text-to-world models, and I've heard Midjourney is also developing something like that, where it doesn't just create an image; it creates an image that gets turned into a kind of immersive world that you can then explore. That's also really exciting. It's going to be interesting to see how things develop from there. Today it's really mostly text to image. I want to see your example of the image generation. But then there's a question from Gwen, which I think will be very exciting for everybody. So talk to me about your example with the image generation, and then I'll ask you a question related to text and data in general.
So here's, I think, the first prompt I put in here, and I think it's a pretty useful comparison of the outputs you'd get. The prompt was "a holiday greeting card," very simple, and we fed that into the three image models we talked about. Flux 1.1 Pro has created a really quite beautiful scene, almost a fairy-tale kind of feel with stars and the trees, an almost old-timey feel. Flux 1 Schnell has created something that looks a little more like a typical holiday card, even down to a little fake URL at the bottom, which doesn't really make sense, but "Happy Holidays" is misspelt. "Mappy Holiday." Yes, exactly. So I think this is inadvertently a very good example of the state of the art at the moment for these faster models. Whereas Ideogram 2, immediately, first shot, has created a really nice little illustration of Santa and a little boy, with an accurate "Merry Christmas" speech bubble. So, something else we do: just like we can put together really good chains of models with text, we've done the same thing for images. What it actually does is it takes a prompt, brainstorms ideas using Llama and GPT-4o mini, writes a prompt using Claude, and then generates an image using Flux 1.1 Pro. That's what's over here in this final panel, which we can rerun so you can see how it works, if you're interested. It just brainstorms with the different models and creates one image. But yeah, one thing about this, and then the question, and then I think we'll be done, because we've touched on many different things. About this specifically: this is something I do today quasi-manually when I create presentations. So when I create a presentation, I will use any of the tools, doesn't matter.
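The image chain just described, brainstorm ideas, have a stronger model write one detailed image prompt, then hand that to a text-to-image model, can be sketched like this. Both `call_model` and `generate_image` are hypothetical stubs, not real APIs; the model names simply mirror the ones mentioned:

```python
# Sketch of a sequential image chain: brainstorm -> write prompt -> generate.
# `call_model` and `generate_image` are placeholders for real API clients.
def call_model(model: str, prompt: str) -> str:
    return f"[{model}] {prompt}"  # placeholder for a real text-model call

def generate_image(model: str, prompt: str) -> str:
    return f"image://{model}/{abs(hash(prompt))}"  # placeholder image URL

def image_chain(brief: str) -> str:
    # Step 1: brainstorm concepts with two fast, cheap models.
    ideas = [call_model(m, f"Brainstorm image concepts for: {brief}")
             for m in ("llama-3.1", "gpt-4o-mini")]
    # Step 2: have a stronger model distill one detailed image prompt.
    image_prompt = call_model(
        "claude-3-5-sonnet",
        "Write a single detailed image prompt from these ideas:\n"
        + "\n".join(ideas),
    )
    # Step 3: generate the final image with a text-to-image model.
    return generate_image("flux-1.1-pro", image_prompt)

url = image_chain("a holiday greeting card")
```

The design point is that each step uses the model best suited to it: cheap models for volume of ideas, a strong writer for the prompt, a dedicated image model for the render.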
Usually ChatGPT, for this particular reason, and I'll explain in a minute: to brainstorm the flow of the presentation, what needs to be in it, and so on. And then I will ask ChatGPT, okay, what do you recommend as relevant images for each of the slides we just discussed? The reason I do it in ChatGPT is that it understands the context of the overall presentation, which the image generation tools on their own don't. So it will give me ideas, and then I will pick the ideas that I like, or run with them; it gives me a creative starting point that I can further develop. Then I ask ChatGPT to create the images for me. And when I get stuck, because ChatGPT's image generation is not good enough, I'll take those one or two images, out of the 25 I'm generating, to Midjourney or Flux and create better versions. And what you just showed really does it in one step, right? I can literally build a process in Hunch that will do all these things, that will give me multiple options for the images right the first time I click the button, and will generate the outline of the presentation and so on, and the images and the examples, across multiple tools. For that one use case I mentioned, that's nothing short of magic. I'll go back to the question, which I think is very interesting: can you use it to check data in order to reduce hallucinations? Can I say, okay, run this through this model, then run the output back against the actual data I gave it, so the data is there to verify and give me a better-checked version of the output? Yeah, yes. And this is actually a really interesting topic that I love talking about. I think hallucinations are becoming less and less of a problem.
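The cross-checking idea in the question, run the same task through several models independently and only trust an answer they agree on, can be sketched as a simple consensus check. `call_model` is again a hypothetical stub standing in for real API calls:

```python
# Sketch of a consensus check across models: collect one answer per model
# and accept it only if enough models agree. `call_model` is a stub; a real
# version would call each model's API.
from collections import Counter

def call_model(model: str, prompt: str) -> str:
    # Placeholder: deterministic stub so the example runs offline.
    return "42"

def consensus_answer(prompt: str, models: list[str], threshold: float = 1.0):
    # Ask every model independently.
    answers = [call_model(m, prompt) for m in models]
    # Find the most common answer and the fraction of models that gave it.
    best, count = Counter(answers).most_common(1)[0]
    agreement = count / len(models)
    # Accept only when agreement meets the threshold; otherwise flag it.
    return best if agreement >= threshold else None

result = consensus_answer(
    "What is 6 * 7? Answer with the number only.",
    models=["gpt-4o", "gemini-1.5-pro", "claude-3-5-sonnet"],
)
```

A disagreement (a `None` result here) is exactly the signal to route the task to a human or to a verifier prompt, rather than shipping a possibly hallucinated answer.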
And in fact, I think for the majority of tasks we have, you don't really need a hundred percent accuracy, because you don't actually get that from humans anyway. I think we are conditioned to want that whenever there's a computer involved in a task. But if you ask a friend or someone else to do something for you, it's probably not going to be a hundred percent either. So there's an expectations adjustment there that actually reduces one's stress a little bit when dealing with these things. But yeah, absolutely, I think this is the advantage of using multiple different models, at the same time, in parallel, and also as a check. So if it's something that's really important to get right, you can then prompt a model and say, okay, here's the answer, double-check everything and write out what it may have missed. I think where people keep seeing hallucinations is, honestly, GPT-4o in ChatGPT. It just isn't as good at avoiding hallucinations as Gemini 1.5 Pro and Claude 3.5 Sonnet, for sure, in my experience. So I think people experience the problem more than they need to. But yeah, getting other models to double-check output is great. An example from history: before computers were widespread, if you take civil engineers building a bridge, you'd often have two or more teams of people doing all the same calculations in parallel, and only if they arrived at the same answers did you have full confidence. You can replicate that workflow by running the same prompt twice, or by running it with different models and comparing the outputs. Fantastic. David, this was really valuable and really exciting, I think. And there are a lot of comments you're probably not reading because you're talking, but people are saying, oh my God, this is awesome.
I can use it for this, I can use it for that, this is so exciting. So people are definitely enjoying the idea. If people want to find Hunch, if people want to find you, if people want to connect with you, what are the best ways to do that? Yeah, go to hunch.tools, and from there, connect with us on Discord. We're on Discord, we have a great community there, and we're on LinkedIn and on Twitter. I'm david db wilson on Twitter; would love to connect. Awesome. Thank you so much, and thanks to everybody who joined us. We had a bunch of people on LinkedIn and a lot of people on Zoom, great participation and conversation all across. So again, if you are listening to this after the fact, come join us on Thursdays. We do this session every Thursday, unless it's a holiday or I'm traveling for business, and even then I usually try to squeeze it in. Come and join us so you can do this as well. And again, David, this was fantastic. Your tool is really unique, fascinating, and very valuable for many use cases. Thank you so much for sharing your knowledge and your time with us. You're welcome. Thanks very much.