Leveraging AI

71 | OpenAI's GPT-4.5 Turbo leaked launch, E.U. Passed the World's First Comprehensive AI Law and many more important AI news for the week ending on March 16

March 16, 2024 · Isar Meitis · Season 1, Episode 71

In this episode of Leveraging AI, Isar Meitis shares the hottest recent news in the AI world.

  • AI's role in documenting conferences
  • The impact of GPT-4 Turbo and its availability
  • Microsoft's new GPT builder and its integration with Office tools
  • Anthropic's Haiku model and its speed and vision capabilities
  • The competitive landscape of large language models with Claude-3 and GPT-4.5 Turbo rumors
  • The ethical concerns and advancements in AI-generated videos
  • The evolution of image and video generation tools in AI
  • Enhancements in Google Slides and CRM tools integrating AI
  • The future of humanoid robots in the workforce

Check out Claude's Prompt Library here

Take Action Now!

Don't miss out on the future of AI. Subscribe to our newsletter for the latest insights and updates. Stay ahead in the AI revolution!

About Leveraging AI

If you’ve enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!

Welcome to a weekend news episode of Leveraging AI, the podcast that shares practical, ethical ways to leverage AI to improve efficiency, grow your business, and advance your career. This is Isar Meitis, your host. I just came back from a few days in New York, where I was doing something very interesting at a conference. In addition to my regular keynote AI introduction talk that I give at many conferences, I was actually helping them document the conference with AI tools. We connected video cameras and microphones to all the speakers, to the attendees asking questions from the audience, and to the conversations, and we documented and summarized everything in real time, releasing short snippets of what happened during the conference to the community shortly after each and every session. I haven't done anything like this before. It was challenging from a technical perspective, but I'm glad to say the results are very interesting, and I haven't seen anybody do that before. If you're running a conference and you're looking for ways to expand its impact beyond what you're doing today, that's an interesting approach you can definitely take.
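The episode doesn't detail the exact stack used for the conference documentation, but a minimal sketch of one plausible pipeline, transcribing each session with OpenAI's Whisper API and condensing the transcript into a shareable snippet with a chat model, might look like this. The model choices, file names, and prompt are illustrative assumptions, not what was actually used:

```python
# Hypothetical sketch of a "document the conference in real time" pipeline:
# transcribe each session's audio, then summarize it into a shareable snippet.
# Assumes the openai Python SDK (v1.x) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def summarize_session(audio_path: str, session_title: str) -> str:
    # 1. Transcribe the recorded session audio with Whisper.
    with open(audio_path, "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
        )

    # 2. Condense the transcript into a short recap for the community.
    response = client.chat.completions.create(
        model="gpt-4-turbo-preview",  # assumed model choice
        messages=[
            {"role": "system",
             "content": "Summarize this conference session into a short, "
                        "engaging recap with 3-5 key takeaways."},
            {"role": "user",
             "content": f"Session: {session_title}\n\n{transcript.text}"},
        ],
    )
    return response.choices[0].message.content

print(summarize_session("keynote.mp3", "Opening Keynote"))
```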

Isar Meitis:

This episode is brought to you by the AI Business Transformation course. It's a course we've been running successfully since last year. We're running two cohorts in parallel every single month, and we're fully booked through the end of March, but the next cohort opens on April 1st, and you can sign up on our website. There's going to be a link in the show notes so that you don't forget. If you're looking to advance your understanding of AI, to advance your career, or to drive growth in your business or your department, it's an incredible course that has been taken by hundreds of people at this point, all of them leaders in different businesses. So check out the link right now. And now, to this week's news.

We will start with Microsoft. Microsoft made two announcements this past week. One is that they are now including GPT-4 Turbo, OpenAI's most powerful model, in their free Copilot access, which is interesting for two different reasons. One, it was previously only available to their Pro subscribers who pay $20 a month, and now it's available to everyone. But the other reason it's weird is that on OpenAI's side, it's still only available to users who are paying $20 a month. So basically, you can now get access to GPT-4 Turbo either by paying OpenAI $20 a month or by using Copilot for free. This interesting frenemy relationship, this weird partnership between competitors OpenAI and Microsoft, continues to be interesting. For those of you who don't know, GPT-4 Turbo is the most powerful model by OpenAI. It was released in November of 2023, and it has the largest context window at 128,000 tokens, which is about 100,000 words, roughly a 300-page book. So as I mentioned, if you're interested in using this model, you don't have to have the paid version of ChatGPT; you can now just use it in Copilot for free. In addition, Microsoft announced that they're rolling out a GPT builder within Copilot. This was announced earlier this year, in January, but now they're actually starting to roll it out to users. Same kind of thing: it's first going to be rolled out to all the paid users, both business and personal, and it's very similar in capabilities to the GPT Store you get with the paid version of ChatGPT. Right now, I think it does not provide any additional benefit over using OpenAI's GPTs as is. Where I think this will get very interesting is when they start integrating GPT capabilities with the operating system itself and the Office suite. Think about the ability to include documents, Excel files, PowerPoint presentations, or things from the operating system, such as computer settings, within GPTs. That will be extremely powerful, and there's no doubt in my mind that's the direction Microsoft is going. Once that's available, I think there will be a huge benefit to using the GPT builder within Microsoft versus just using it on the OpenAI platform. From Microsoft and OpenAI to the other really successful model that came out recently. I told you last week that Anthropic released Claude-3, their latest model, whose most powerful version competes with, and some people say (myself included, on some use cases) is better than, GPT-4 Turbo, which was the reigning king of large language models for a very long time. Anthropic announced three different models, but initially only released two, and this week they finally released the smallest version, called Haiku. Haiku was optimized for the best performance on speed and cost for enterprises; that's what they had in mind, so low latency and fast responses. If you want to know what high speed means, it can process 21,000 tokens, about 30 pages of text, in just one second. This is obviously very impressive, and it's probably going to be the fastest strong model out there that's available to the entire public in easy-to-access ways. In addition, they're saying it has advanced vision capabilities, allowing it to process and analyze visual data such as charts, graphs, and photos at that speed, while maintaining all the enterprise safety measures that Anthropic is known for. It is going to be available through Anthropic's API, to Claude Pro subscribers through the Claude.ai interface, on Amazon Bedrock, and shortly after on Google Cloud through their Vertex AI platform.
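For developers who want to try Haiku's speed and vision capabilities, a minimal sketch of a call through Anthropic's Messages API might look like the following; the model ID and file name are assumptions based on Anthropic's published conventions:

```python
# Hypothetical sketch: asking Claude 3 Haiku to analyze a chart via
# Anthropic's Messages API. Assumes the anthropic Python SDK and an
# ANTHROPIC_API_KEY in the environment.
import base64
import anthropic

client = anthropic.Anthropic()

# Encode a local chart image so it can be passed inline to the model.
with open("revenue_chart.png", "rb") as f:
    chart_b64 = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-3-haiku-20240307",  # assumed Haiku model ID
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64",
                        "media_type": "image/png",
                        "data": chart_b64}},
            {"type": "text",
             "text": "Summarize the main trend in this chart."},
        ],
    }],
)
print(message.content[0].text)
```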
The goal is obviously to give enterprises and individuals access to this new model through any interface and any platform they want. I really like Claude and I use it all the time for many different things, and having this new capability will be interesting to test. I personally do a lot of visual analysis, for myself and for some of my clients. I didn't get a chance to test it yet, as I said, I was in New York at a conference, but I will test it out and share my results with you. Another interesting thing Anthropic did this week is release a prompt library they have created, with about 60 prompts that do a lot of different things, from personal well-being to business to nutrition, fashion, leisure, etc. It's available now, and I'll put the link in the show notes so you can get access to it. Just a few interesting examples from the release. One is Punderful Adventure, which allows you to create puns on any topic you want. Another one helps you make meals: you tell it your nutritional needs or dietary preferences and what ingredients you have at home, and it can offer customized, personalized meals and recipes you can make with the stuff you have, according to your preferences. They've also released a website wizard: you tell it what you want on a landing page, and it will create the entire landing page for you, including HTML, JavaScript, and CSS. It's a huge variety across multiple aspects of life, and I think they did it for two reasons. One is to show people how capable Claude-3 is across multiple aspects of life, but the other is maybe as a way to compete with GPTs, by showing you that you can create quote-unquote very specific, customized use cases without really using a GPT, just by using a prompt library. I would be really surprised if they don't come up with a GPT-like creation tool sometime in the near future, but for now, it's a very powerful, very capable, huge variety of prompts for Claude-3, with some tips on how you can use it. So go check it out. Definitely worth it. Now, in something that might be a response to the release of Claude-3, there has been leaked information about GPT-4.5 Turbo being released. That information was not released by OpenAI; it was actually indexed by search engines, Bing and DuckDuckGo, before any official announcement. But when trying to follow the link to that page, it goes to a 404 page, which, for those who don't know, is basically a web page that doesn't exist. In the text from OpenAI, it says it's going to be their fastest, most accurate, and most scalable model to date. This is obviously really exciting, and in another interesting aspect as far as when that might happen, there is an interview scheduled between Sam Altman, the CEO of OpenAI, and Lex Fridman, one of the best-known podcasters out there, and that interview is scheduled for the one-year anniversary of the release of GPT-4. So there are a lot of rumors that that might be the date when they announce GPT-4.5. Some of that leaked information says it's going to have a context window of 256,000 tokens, which is about 200,000 words, double what GPT-4 Turbo has right now.
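For context, the token-to-words-to-pages conversions quoted throughout this episode follow from two common rules of thumb, roughly 0.75 words per token for English text and roughly 300 words per printed page; a quick back-of-the-envelope sketch:

```python
# Back-of-the-envelope context window math, using the rough heuristics of
# ~0.75 words per token and ~300 words per printed page.
WORDS_PER_TOKEN = 0.75   # common rule of thumb for English text
WORDS_PER_PAGE = 300     # typical printed page

def window_in_pages(tokens: int) -> tuple[int, int]:
    words = int(tokens * WORDS_PER_TOKEN)
    pages = words // WORDS_PER_PAGE
    return words, pages

for name, tokens in [("GPT-4 Turbo", 128_000), ("rumored GPT-4.5 Turbo", 256_000)]:
    words, pages = window_in_pages(tokens)
    print(f"{name}: {tokens:,} tokens ~ {words:,} words ~ {pages:,} pages")
# GPT-4 Turbo: 128,000 tokens ~ 96,000 words ~ 320 pages
# rumored GPT-4.5 Turbo: 256,000 tokens ~ 192,000 words ~ 640 pages
```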
Now, the interesting thing is that these rumors were denied by people at OpenAI, who are suggesting they're most likely going to skip 4.5 and go straight to GPT-5, which, they also announced this week, has finished its training and is now going through a red-team process. So I don't know, obviously, which one is true; we'll have to wait and see. Either way, we should expect OpenAI to release something, either in the very near future if it's 4.5, or sometime later this year with GPT-5, which is supposed to be a complete game changer from everything we've heard so far. Still on the topic of OpenAI, Chief Technology Officer Mira Murati was interviewed about Sora, and there are good and bad aspects to that interview. On the good side, she stated that they're definitely releasing it this year, most likely within the next few months. She also mentioned that they're going to integrate audio into Sora, giving it a more realistic feel. They're also looking for ways to allow users to edit the content of the videos generated by Sora, which I find very interesting. Right now, these videos are what they are, and if you could go back and say, I want to change this and that in a specific scene, it would be extremely powerful. It's definitely something a lot of people could use, and I would like to see OpenAI release it. Now, releasing Sora obviously raises very significant concerns about the usage of very realistic AI-generated videos in general, but especially in an election year. This is going to be the largest election year in history: more countries around the world are holding elections in 2024 than ever before, including the U.S. And there are obviously serious concerns about how fake videos that look highly realistic can be used to manipulate elections. This is something we need to be aware of in our lives in general. You've heard me say it before: don't believe anything you see on digital media or social media, because it can be fake, and there's really no way to tell right now. In the same way, don't share this kind of information before you verify it's actually real, because it may or may not be, and it's something we'll have to learn to live with, at least until somebody figures out a solution. In that interview, Mira Murati also shared that, in the beginning, they are going to limit Sora's capability to create images of known figures, and the videos will carry a watermark, in order to fight exactly these kinds of issues. That being said, watermarks are a very limited protection, because you can just crop the video to not show the watermark and then release that. And we already know there are workarounds to make models such as DALL-E create almost any image despite their limitations, just by working around them and wordsmithing what you're trying to get. So this definitely raises a lot of questions that I don't think anybody has answers to. Beyond this negative aspect, the other negative aspect, which is somewhat of a scandal: Mira Murati was asked in the interview what data they used to train Sora. If you haven't seen Sora, go check it out; it's absolutely mind-blowing: highly realistic, high-resolution, full-minute videos generated from a very short prompt, and the outcome will absolutely blow your mind. She basically said that they trained it on publicly available and licensed data. When the interviewer tried to push her and asked, does that mean YouTube, she said: I don't really know.
We just used publicly available and licensed data. The interviewer kept asking her about other sources, such as Instagram and so on, and she kept dancing around it, saying she doesn't really know. Now I must say, my opinion on this: if she really doesn't know, it's a very embarrassing position to be in as the CTO of a company that does these kinds of things. So I have to assume she knows, and it means they just don't want to share where they got the data to train these models. That would obviously be very problematic, especially since the EU just signed its AI Act into law, and one of the things the AI Act says very clearly is that the developers of these models will have to share how they've trained their models and which data they've used. So while right now OpenAI is trying to avoid sharing this information in order not to get into a battle with whoever they would be getting into a battle with (they already have several lawsuits against them over scraping data, the most well-known probably being the lawsuit from the New York Times), I do think they will run into issues releasing Sora in the EU. And I hope similar laws come to the US, which would mean the same problems here. So while Murati and OpenAI are trying to avoid sharing how they trained Sora, I think they won't have a choice but to actually share it, which may open a whole other can of worms. Staying on the topic of video, Pika Labs, which makes one of the most advanced tools right now for creating AI videos, just released another sound-related feature. You can now add sound effects to your video in the prompt. This is not the first audio feature they've released; they've also released lip syncing, which enables users to create videos where the characters actually speak. These sound effects can mimic real-life sounds like sizzling bacon, roaring lions, and footsteps, to enhance the realism of the video. If you look at the demos, they're generally impressive and relatively realistic. The only problem is that their sync with the video is still not perfect, but I'm sure the next few versions will enhance these capabilities until we get to the point where it's absolutely perfect. The next big thing these AI video companies are working on is video-to-sound generation: basically using the video itself as the input to a sound-generating AI model that can create the sound and the effects just based on what's happening in the video, without having to manually prompt it. So the race is on. I've told you several times that 2024 is the year of video in AI, and everything we're seeing leads to that, with companies like Pika Labs, Runway, LTX Studio, FinalFrame.ai, and obviously Sora all moving very fast and adding more and more features. Still, I think the one thing I haven't seen solved yet is consistency, meaning the ability to have the same character across different scenes, the same background across different scenes, the same lighting, and so on. By the way, it's the same problem we see in image generators: you cannot regenerate an image of the same thing from a different angle. It literally just regenerates the image, trying to mimic it, but it's never exactly the same. So I think this is the last hurdle that needs to be resolved.
Now that I've seen Sora's ability to create highly realistic, long-form videos, I think that once consistency is solved, this will completely democratize the creation of videos: from where we are right now, needing videographers, lighting people, camera operators, editors, actors, et cetera, to anybody being able to create any kind of video they want, for any purpose, on their own computer. And going from video to images: Midjourney had an outage on Saturday night, and Midjourney shared that this outage was caused by botnet-like activity that pointed back to Stability AI employees. What they're basically saying is that Stability AI was scraping Midjourney's existing database of images and prompts, which is available to any user, in order to train their models. Stability AI's CEO denied these allegations but said he's going to open an internal investigation, and Midjourney founder David Holz said they have provided information that will support that internal investigation. This is not a big surprise to me, and I must say it's even ironic, because both Midjourney and Stability AI were scrutinized and criticized before, including facing different lawsuits, for using scraping to get the data to train their models to begin with. So the fact that they're now scraping each other's data is not much different from anything they've done so far; only this time they got caught. That being said, Midjourney is still the leading model out there, and from what I've heard, Stable Diffusion 3 produces images as good as Midjourney's. So both of these models are extremely capable, and obviously there's fierce competition between them. Whether it's legitimate or not, you can decide for yourself, but this is currently the situation in the battle for dominance in the image generation field. Staying on images, but going to Google: Google is adding more and more capabilities to Google Slides, its presentation tool. They have now added the ability for all Google Gemini paid users to remove the background from images straight within Google Slides. This is a very useful feature that I'm personally very happy about. The way I do this right now: every time I have an image I want to use in Google Slides, which I use all the time, and I need to remove its background, I either go to some external tool, whether that's Canva or remove.bg, a website that does exactly this (see the sketch after this segment for what its API looks like), or use the Mac's built-in option to remove the background from a file by right-clicking on it. Being able to do this straight within Google Slides will be a great time saver. This comes shortly after Google announced that you can now record yourself and show up in a little bubble, speaking overlaid on the slides, built into Google Slides as well. I must say, I'm really excited about all these things, because if you think about the office suite as a whole, whether it's from Microsoft or Google or somebody else, it was more or less stagnant for the last, I don't know, ten years; nothing new happened. And now this wave of AI capabilities is finally bringing new capabilities and new efficiencies to tools we all use every single day. I think it's a very good step in the right direction as far as creating day-to-day business efficiencies.
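As referenced above, for anyone automating background removal outside of Slides today, remove.bg exposes a simple HTTP API; a minimal sketch, assuming you have an API key from their site:

```python
# Minimal sketch of removing an image background with remove.bg's HTTP API.
# Assumes the requests library and an API key from remove.bg.
import requests

API_KEY = "YOUR_REMOVE_BG_API_KEY"  # placeholder

with open("slide_photo.jpg", "rb") as image_file:
    response = requests.post(
        "https://api.remove.bg/v1.0/removebg",
        headers={"X-Api-Key": API_KEY},
        files={"image_file": image_file},
        data={"size": "auto"},
    )

response.raise_for_status()
with open("slide_photo_no_bg.png", "wb") as out:
    out.write(response.content)  # PNG with transparent background
```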
Speaking of business efficiencies, there is a CRM company from Boston called Creatio (I hope I'm pronouncing it properly) that makes CRM software, and they have now launched a whole set of large language model integrations. It comes with out-of-the-box solutions and capabilities for sales, marketing, and customer service, things such as intelligent customer scoring, campaign flow design, personalized response generation, and so on and so forth; really, everything you need within a CRM, built out of the box. But they're also introducing a copilot studio, which enables users to create their own little mini apps to do basically anything they want using the data inside the CRM. This is something that both Salesforce and HubSpot said they are going to do; Salesforce has started, HubSpot not yet. But I definitely see every platform we use regularly coming up with these capabilities. What I really like about the solution this company announced, and again, I think everybody will go in that direction, is that they're both giving you out-of-the-box, canned approaches to do one, two, three, or four things, and providing this copilot builder or studio that allows any company, any user, to build whatever use cases they want, like GPTs in ChatGPT. This will allow every person, in every company, in every department, to build the workflows they specifically need, which will make efficiencies within businesses even more impactful than just getting something generic. So expect to see that in probably every tool you use, especially the bigger ones. And from software to hardware. There have been more and more advancements in the last few weeks, and even this past week, from some of the leading manufacturers of humanoid robots. I will share a few pieces of news that I find fascinating. First of all, Figure is a company that came out of stealth in 2022. They build humanoid robots, and as part of the announcement of their Series B, they also announced a partnership with OpenAI to power the cognitive aspect of their robots. They just shared a bunch of videos that are absolutely amazing, showing the robot's ability to perform actions it wasn't capable of before the integration with OpenAI. In the videos, you can see the robot conversing in real time, acting on various requests, and performing different tasks together with a person standing next to it. All of this in a relatively short amount of time, assuming the integration with OpenAI really started when they announced it; but even if it started a little earlier, it's still extremely impressive. In another announcement, Mercedes, the car company, announced a partnership with an Austin-based robotics startup called Apptronik, and they are looking at ways to test the humanoid robots built by the Austin-based company at their manufacturing facilities to, and I'm quoting, "automate low-skill, physically challenging manual labor tasks." If you combine these two pieces of news, it shows you that the robotics world is catching up very quickly. Companies who could already build the hardware, and there are more and more of them, are now integrating the capabilities of these large language models and starting to perform day-to-day tasks, whether household tasks or manufacturing tasks.
What does that mean? It means that while we're looking at a huge transformation, or revolution, or whatever we want to call it, in white-collar knowledge work, a tsunami that's going to wipe out so many of the jobs we have right now within the next few years, it is also going to come, probably not that much later, meaning still within the next three to five years, for blue-collar jobs: manufacturing jobs, cleaning jobs, or any other physical job you can think of, as these robots become cheaper and able to do more and more things. So what does this mean for society? I don't think anybody really knows. I think the impact on knowledge work is going to be almost immediate, meaning within the next three years we'll see incredible advancements that will allow AI to do most of the knowledge work we do today. I haven't shared this with you before, but about two years ago, a quote from Sam Altman was released. In that interview, Sam Altman was asked what the impact of AGI would be on marketers specifically, and I'm quoting: "95 percent of what marketers use agencies, strategists, and creative professionals for today will easily, nearly instantly, and at almost no cost be handled by the AI, and the AI will likely be able to test the creative against real or synthetic customer focus groups for predicting results and optimizing. Again, all free, instant, and nearly perfect. Images, videos, campaign ideas? No problem." When asked when this is going to happen, he said about five years, give or take. Now I want to point out two things about this answer. One, he was asked specifically about marketing, but I'm pretty sure this applies to any knowledge work. So that's problem number one. Problem number two: it doesn't just happen in five years. It's continuous advancements, happening almost on a weekly basis, that are going to gradually get us there. So within the next year, and then the next year, and so on, more and more of these tasks will be performed perfectly, and in most cases better than humans, across every kind of knowledge work. As a society, nobody is ready for the implications of that, across everything I can think of. This could lead to very few companies generating huge revenues by being able to drive these successes, while most people lose their jobs, with unemployment probably bigger than we had in the Great Depression of the 1930s. So where is this going from a social perspective? I don't know. From a technological perspective, it's obviously really exciting, but it raises a lot of questions. And if you add robots to this mix, and you think about the things robots will be able to do, maybe not in three years but in five to seven, then we have an even bigger impact on society, work, and personal fulfillment as we know them today. So if you never thought of that, I apologize if it's terrifying to you, but either way, I'd rather you know, be ready, and start thinking about it, and maybe acting and talking to people, so that jointly, as a society, we can come up with solutions for how to benefit from the positive aspects and hopefully avoid, or at least reduce, the negative aspects of this revolution. There is a lot of other news that happened this week, but I don't want to make this episode too long, so check out our newsletter or join our Slack channel and you can get access to all of it.
And on Tuesday, we have another fascinating interview episode coming to you, so don't miss that. And until then, have an amazing rest of your weekend.