Leveraging AI

182 | Sam Altman’s Vision of the AI Future, ChatGPT Goes All-In on Vibe Coding, Enterprise AI Disruption & more Top AI News from the week ending on Apr 18

Isar Meitis Season 1 Episode 182

Is the AI race moving too fast for its own good?
 

In this episode of Leveraging AI, we unpack Altman’s revealing TED Talk, OpenAI’s silent safety rollback, and why ChatGPT just went all-in on vibe coding, enterprise disruption, and social media dominance.

This week’s AI news is anything but boring — and has massive implications for business leaders.

 In this session, you'll discover:

  • Why Sam Altman believes ChatGPT will become your lifelong digital companion — and why that’s both exciting and terrifying
  • What Altman didn’t say about AI safety and the future of AGI
  • How OpenAI is aggressively chasing developer dominance with its new GPT-4.1 family
  • The $3B coding acquisition OpenAI is reportedly chasing — and what that means for the future of software
  • Why OpenAI skipped publishing a safety report for their newest models (yep, really)
  • How ChatGPT is crushing TikTok in app downloads — and quietly testing a social network
  • The enterprise AI gold rush: from supply chains and email to avatars and customer service

👉 Fill out the listener survey - https://services.multiplai.ai/lai-survey
👉 Learn more about the AI Business Transformation Course starting May 12 — spots are limited - http://multiplai.ai/ai-course/  

About Leveraging AI

If you’ve enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!

Hello and welcome to a Weekend News episode of the Leveraging AI Podcast, the podcast that shares practical, ethical ways to leverage AI to improve efficiency, grow your business, and advance your career. This is Isar Meitis, your host, and before we dive into the main topics today and the rapid fire topics today, I have some exciting news. The first piece of the exciting news is that we just hit 250,000 downloads for this podcast this week. This is an insane number that I never dreamt I would ever get to, and it's all thanks to you. So I appreciate every single one of you listening to this podcast, sharing it with your friends, sharing it on social media, et cetera. I cannot be more grateful than having you as listeners of this show. But to make it even more interesting, I wanna make the show even better. I wanna adapt it to your needs. And hence what we decided to do is to create a survey, which will take you less than a minute and will give us feedback on what you like and don't like in the show: what you would like to add, change, remove, improve, et cetera. So in your show notes right now, if you open your phone and click on the show notes, you will find a link to the survey, and we would really appreciate it if you spend the minute or maybe two to take it and help us make this podcast better for you.

Now to today's episode. There are two main topics that we're going to talk about. The first is the TED appearance of Sam Altman and everything he shared over there. The second is ChatGPT's growing ambitions in the coding world, and then there's gonna be a lot of rapid fire items, many of them about OpenAI. They've been on fire this past week with releases and news and things that they're doing, and apparently there's more coming. We probably could have done two news episodes just about OpenAI, but we're gonna try to run through that very fast. And there's obviously a lot more updates from other companies as well. So let's go.

On April 11th, a week before this podcast goes live, Sam Altman was a guest at TED, interviewed by head of TED Chris Anderson on the main stage, and it was a very interesting conversation that revealed a lot about how Sam thinks about himself, about the future with AI, and so on. I don't think I heard any major new things in it, but there are a few topics that I really want to cover and dive into that are very, very interesting. Chris pushed Sam on several different topics, and Sam wasn't happy about it, but he was still trying to give answers, at least in some cases, and I will get into that in a minute as well. He pushed him, as an example, on the topic of what's happening to creatives: is it okay or not okay to, quote unquote, steal their work and use it for other purposes, and what would be the model to compensate people for their creativity? And Sam basically said that it would be really nice to figure out a new model to compensate creatives, but he couldn't suggest even a single idea of how exactly that would happen. He just thinks it's not necessarily a bad idea. And that would be the theme of the rest of the conversation. He was pushed very hard about the future safety of AI models, and he even responded to Chris, saying, hey, you're not really a pro-AI guy, right? And Chris actually said, no, I actually use AI every single day, I think what you're doing is remarkable, but I'm terrified. And again, Sam provided many generic answers to these questions, saying, oh, it's gonna be okay.
We believe that, like in previous revolutions, everything is going to be fine. People find what to do. The human race figures it out. There was no concrete "here's our plan," or "here's what we're trying to do," or "here's our collaboration with groups one, two, three in order to achieve these things" that he believes or wishes or assumes are gonna go well.

They also talked about their push toward open source participation. We talked about this in several past episodes recently, but they are planning to release, and now I'm quoting, a "very powerful" open source model, and Sam, while admitting they're late to that game, is claiming that it's gonna be near the frontier and that it's gonna be better than any other open source model out there. That's a very significant claim. I hadn't heard that one before this particular interview. Another statement that Sam made in that interview is that they are now reaching about 10% of the global population. That puts them at around 800 million users, which is an absolutely insane number.

With regards to personalization of AI, and specifically the new memory features in ChatGPT, Sam had a very interesting quote: "but you will talk to ChatGPT over the course of your life and someday, maybe if you want, it will be listening to you through the day and sort of observing what you're doing and it will get to know you and it will become this extension of yourself, this companion, this thing that just tries to help you be the best, do the best you can." So the vision that Sam Altman and OpenAI have for these models, and specifically for ChatGPT, is something that will record literally everything that we do and will be able to pull the relevant information at the right time to help us make better decisions and take better actions. Now, like with everything else in AI, I am really excited and really terrified by this vision, and you start questioning a lot of things, like privacy and our ability to make our own decisions without being driven by an external force that may or may not be influenced by different things. But the fact that this is a direction they're going, and the fact that they're actively pursuing it and not even being shy about it, tells us that this is where they're trying to go. They're trying to get to a situation where this thing has limitless memory and can literally capture everything that we do, and use that information to help us do better things.

Now, in addition, he was asked about AGI, and Sam said that we're not at AGI yet. The main reason he gave is that these models cannot learn on their own and cannot make improvements without us helping them, versus humans, who can. He also said that it's very obvious that while these tools are getting very good, they cannot yet do everything that a person at a desk can do, and hence we're not at AGI yet, which, again, is a system that will be able to do everything humans do, at least on the cognitive level, at or above human level. So if they cannot do everything that we can, we are not there yet. That being said, Chris asked him specifically, okay, what is the internal definition of AGI? And they don't have one. The answer was that if you put ten engineers in a room, you're gonna get fourteen different definitions of what AGI is. And so despite the fact that AGI is the holy grail they're chasing, and beyond, Sam clearly stated that AGI is not a final target.
It's just somewhere on the timeline: we're gonna get there, and then we're gonna cross it. It's not the final target anymore, but they don't have a very clear definition of what it is.

When it comes to agents and safety, again, Chris pushed him pretty hard, and Sam came back and basically said that he understands that safety is a concern, but he also understands that without safety, they don't have agents. Basically, what he's saying is that if agents are not going to be safe, nobody's gonna use them, and hence they won't be able to be developed further. So his claim, again, is more of a logical explanation of why he thinks agents will have to be safe, versus actually explaining why they'll be safe, or how they are, or what steps they are taking in order to make them safe. So, not great answers in my eyes, and it will be very interesting to see how that evolves. More on that from many different angles in the rest of this episode.

Now, when he was pressured about all the people who left OpenAI with safety concerns and spoke loudly about it, Altman basically acknowledged that there is internal disagreement about the required levels of AI safety. But he's saying that if we go back and look at their track record when it comes to delivering safe AI systems, that track record shows they're doing a good job. Now, while I agree with the statement that so far their systems have been safe, that is not a logical argument to use in this particular case, because AGI and beyond will have significantly more powerful capabilities than anything we know today. They will be able to manipulate us in ways we cannot even think about. It's basically saying, hey, we were really good at keeping things safe when we had bullets, so we'll be safe with atomic weapons. That's essentially the argument he's making, which I don't buy. I think the fact that previously they were able to do things in a safe way means very little for their ability to deliver safe, really advanced AI systems once we get to AGI and beyond, and it really, really scares me that that's the answer he was able to provide.

Another very interesting statement that I heard him say for the first time is that, in general, he believes that moving away from centralized control and decisions is a good direction to go. And I'll be more specific about what he might mean by that. He was asked, should there be a summit or a committee, a collaboration between leading companies and governments, to decide and help define a better future for AI? And he basically said that maybe having a small elite summit is not the right way to go. Maybe allowing the 800 million users that they have right now to more or less express what they think is right and wrong is better than picking a select few and following their definition. Now, he acknowledged that this has never been a successful approach in the past, but I tend to think that there's a happy medium between these two approaches: one saying, okay, these 10, 12, 50 people will decide the future of AI and humanity, versus, okay, let's involve a hundred million people, two hundred million people, and have them have a say on where they see the future going and what they would like to see. And I think they have the opportunity to do that right now.
I think it could be very simple for OpenAI right now to put out a one- or two-question questionnaire once a week that all their users have to answer in order to continue, and learn what people actually want, what they're afraid of, what they would like to slow down, and do the same thing on all the other platforms: on Gemini, Claude, DeepSeek, et cetera. Then we can learn collectively, because we have access to the masses, what people actually want and don't want, and maybe then have a summit to discuss those findings and continue in the right direction. Do I see that happening? No, but I think it's not necessarily a bad idea.

Sam was also asked about the impact of becoming a parent on his decisions, and he basically said that having a child has profoundly impacted his personal approach to life, but that he was just as committed to the safe delivery of AI even before having a kid. And speaking about parenthood and Sam's views on where the world is going with AI, he had two profound quotes, or I dunno if they're profound; they're logical on one hand and very interesting on the other. Quote number one: "my kids hopefully will never be smarter than AI." Basically, he's anticipating that in the near future, AI will be smarter than any of us, including his kids, and if they have his genes, they're gonna be smart kids. The other very interesting quote was: "I hope that my kids and all of your kids will look back at us with some like pity and nostalgia and be like, they lived such horrible lives. They were so limited. The world sucked so much." He truly believes that AI will enable a future of abundance like nothing we have today, both in terms of personal reach and capabilities, as well as the overall success of humanity and the planet. And while this is great, it might be blinding him to the difficulties and the risks that are actually there. And I don't think he is overlooking them. I think he's looking at them, but I think he deeply feels that the benefits will overcome the risks and the dangers, with which I personally don't necessarily agree.

And now to our second topic, which is OpenAI's extreme push toward leadership in AI code development. We've spoken many times on this show about the trend of vibe coding: basically using natural language, in this particular case English, to create code across multiple platforms. This phrase was coined by Andrej Karpathy, a former OpenAI engineer, and it caught like wildfire, and the trend of writing code with just simple English is growing aggressively, both in the developer community and among people like me, who have never written code before and can now create applications. More on that in a future episode in the next couple of weeks, about an application or applications that I'm developing and that you will get to experience. I'll give you a little hint: it will allow you to find any information you want from any past episode of this podcast and get answers about anything that happened in the news, or specific workflows, or specific technologies or tools, anything that was mentioned in the podcast, and get an answer, but also get a link to listen to that particular segment in a specific episode. It's already in testing, and I've developed it myself using English only. And like I said, I will have a whole separate episode about that. But OpenAI has been pushing very, very hard in that direction.
And in this past week, they made two huge steps toward dominance in that aspect. One is they've introduced four different models: three in the GPT-4.1 family, so GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano, as well as o3, the full model (so far we only had access to o3-mini). The GPT-4.1 model family is geared specifically toward developers. It's available through the API, allowing developers to use these models and integrate them into anything they want. They excel at coding tasks: they outperform every previous OpenAI model by a big spread in coding. And they come with a 1-million-token context window; that's roughly 750,000 words of code. So a huge jump in context window, aligning with some of the leading competitors right now. Gemini 2.5 Pro has a 2-million-token context window, and Llama 4, which was just released last week, has a 10-million-token context window, though I don't know how good it is at coding. So definitely a huge jump forward.

Now, the other very interesting thing about these models is that they can manipulate images in their reasoning process. Meaning, when you want to create code or do anything else for that matter, you can upload images, including diagrams, charts, notes that you've taken on a piece of paper or the back of a napkin, flow charts, things like that, and the AI knows, as part of its reasoning process, to zoom in, zoom out, crop, and look at specific segments in order to better understand the task ahead. Think about how powerful this is for developers, when you can upload your Kanban board as is, or a flow chart that you've created to describe the user interface or the user flow, and the system actually understands it, can write code accordingly, and can break things into the right components based on it. So this is an extremely powerful capability that is now available in GPT-4.1.
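To make the developer angle concrete, here is a minimal sketch of calling a GPT-4.1 family model through the API with a mixed text-and-image prompt. This assumes the official OpenAI Python SDK and an API key in your environment; the prompt and image URL are placeholders of my own, not anything from the episode.

```python
# Minimal sketch: a GPT-4.1 API call that mixes text with an image
# (e.g. a photographed flow chart). Assumes the official OpenAI
# Python SDK (`pip install openai`) and OPENAI_API_KEY set in the
# environment; the prompt and URL below are placeholders.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4.1",  # also available: gpt-4.1-mini, gpt-4.1-nano
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Turn this flow chart into a component breakdown."},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/flow-chart.png"}},
        ],
    }],
)
print(response.choices[0].message.content)
```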
Now, if you remember, a few weeks ago we shared with you the huge growth in Claude's revenue. They have grown dramatically this past year, which is allowing them to raise a lot more money right now, and the majority of their growth was coming from API connections to the main code writing tools, because most developers in the world right now prefer Claude Sonnet 3.5 and 3.7 over any other model. Most of these code writing tools let you pick which large language model runs in the background, and Claude Sonnet 3.7 and 3.5 were the leading choices, which was driving huge revenue to Anthropic. That is one of the reasons why OpenAI wants to be more aggressive in that game, and hence the introduction of these new models. If you remember, they said they're not going to release more models until we get to GPT-5, and now, instead of GPT-5, we have o3 and GPT-4.1 in three different variations. So that doesn't quite count as "we're not gonna release any other models." But again, these are geared specifically toward developers.

To show you how that competition is driving them to do stuff that we may or may not agree with: this is the first time that they released a model without a safety card. These system cards that OpenAI has released every time they released a model are there to provide transparency, to basically tell us what measures they took from a safety perspective and a testing perspective to verify that these models are okay and carry very little risk, if any. And yet the 4.1 family of models was released without a safety report.

Now, when OpenAI was approached about this, they basically said, well, this is not a frontier model, hence it's not a necessity. Basically saying, don't tell us what to do; we will decide when to do safety checks for models and when not to. Again, I'm not sure I agree with that, but it's very, very obvious that the fierce competition in this field is pushing companies to do stuff that is beyond what should be acceptable from a safety and security perspective. We'll have another point on this later, but I want to continue on the topic of OpenAI and its push into the coding world.

There are already a lot of examples, mostly people posting on X, saying how powerful this model is, that it's game changing, and how it writes significantly faster and significantly cleaner and better code than any previous OpenAI model, but also better than most other models out there. So definitely a big improvement in code writing for ChatGPT. But that's not the big coding news from OpenAI this week. The big news is that there are rumors they're in advanced talks to acquire Windsurf for $3 billion.

So let's go back for a second to the whole topic of vibe coding and the platforms that enable it. These platforms are divided into two main categories. One is platforms that enable people who do not write code to create applications; these are the types of tools that I use, like Lovable and Replit and several others. The other is tools built for developers, allowing actual software engineers to write code faster, better, and more efficiently. The leading one in the world, the one that really took the coding world by storm, is Cursor, and apparently OpenAI was trying to acquire Cursor first. Putting things in perspective, Cursor is now assumed to have $200 million in ARR, while Windsurf, the company that will probably get acquired, has $40 million in ARR, so Cursor is five times bigger. Now, Bloomberg reports that OpenAI met and had conversations with 20 other AI coding firms before moving in this direction. So they are definitely going to acquire somebody, and it seems like it's going to be Windsurf. Windsurf was founded in 2021, and they have raised $243 million so far. And like I said, they're one of the most popular coding platforms for actual code writers and developers.

I think this is a very interesting move by OpenAI: to basically integrate the AI coding tool and its audience with OpenAI's ability to drive the models behind the scenes. It will not surprise me if this is followed by other leading AI companies buying other leading coding platforms. It just makes sense, because this is now the most advanced, most widely used application of large language models beyond the models themselves, and acquiring it lets you control that part of your destiny versus letting people pick whatever model they want. Now, this may or may not go smoothly, because there are several reports that the FTC may raise concerns, given two different aspects: one is OpenAI's close ties to Microsoft, which controls its own side of the development world, and the other is OpenAI's previous investment in Cursor through its startup fund. So the upshot is that if they acquire Windsurf, they will have their hands in the three top AI coding platforms in the world today, which may or may not be acceptable under FTC regulations.
Now, in addition to these new models and the acquisition rumors, OpenAI just released Codex CLI, which is a lightweight open source coding agent that runs directly in terminal environments. It works in conjunction with o3 and o4-mini, and it's another step in the direction of an agentic software engineer that can not just create code, but can also deploy that code directly into projects as required. Because it's based right now on o4-mini and o3, the tool supports the multimodal inputs we just discussed, allowing users to provide screenshots and sketches alongside text in order to enhance its understanding of what you're trying to develop. Users can control the level of autonomy of this tool, from approval mode, meaning "just show me what you're trying to do and let me approve it," all the way to a full auto-approval mode, where it's just going to deploy the code it creates as it sees fit. From a security perspective, Codex CLI maintains privacy by running everything locally on your machine rather than calling out to remote services for every prompt that you write. And to encourage the adoption of this tool, OpenAI set aside $1 million in API credits for eligible software development projects, in blocks of $25,000 each. So they're gonna put money into companies who are going to use these capabilities, in order to drive adoption. Again, showing you how all in they are on this topic.

So, in quick summary, OpenAI is all in on the coding world, both in terms of developing models specifically for coders and in terms of spending a lot of money on acquisitions to get a significant lead in that aspect of AI usage. They definitely have the money; they just raised $40 billion, so $3 billion is small change for them. But it will be very interesting to see how this evolves from here, both in terms of their integration with Windsurf, if it actually happens, and obviously what the implications are going to be for the rest of the industry.

And now to the rapid fire items. This past Monday, OpenAI announced that they're going to phase out GPT-4.5 and replace it in their APIs with GPT-4.1. That is not surprising at all. If you've listened to this podcast for a while, then you know that when they came out with GPT-4.5, I said it made absolutely no sense to release that model. It wasn't significantly better than GPT-4o, and it was significantly more expensive for them to run. And on the coding side, it definitely wasn't doing any better than the previous models. So now that they know a lot of the API usage is going toward coding, and that 4.5 costs them a fortune while 4.1 actually delivers better results, they're going to phase 4.5 out. To put things in perspective, when I say it's significantly cheaper, it's a better deal for you as well: GPT-4.1 is going to cost $2 per million input tokens; the same thing will cost you $75 instead of $2 if you use GPT-4.5. And on the output side, it's $8 per million tokens on GPT-4.1 and $150 on GPT-4.5. And that's while OpenAI said that even at the 4.5 pricing they put out, they're still losing money. So they built a model that was extremely inefficient, and I assume they distilled that model in order to create 4.1, or maybe used other techniques. But they're going to phase out 4.5 and just keep 4.1 on the API.
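To turn those list prices into a worked example, here is a quick back-of-the-envelope calculation. The sample job size, a full 1-million-token context in and 100,000 tokens out, is my own illustrative assumption.

```python
# Cost comparison at the per-million-token list prices quoted above.
# The example job size (1M tokens in, 100k tokens out) is illustrative.
PRICES = {  # dollars per 1M tokens
    "gpt-4.1": {"input": 2.00, "output": 8.00},
    "gpt-4.5": {"input": 75.00, "output": 150.00},
}

def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single job at the quoted list prices."""
    p = PRICES[model]
    return input_tokens / 1e6 * p["input"] + output_tokens / 1e6 * p["output"]

for model in PRICES:
    print(f"{model}: ${job_cost(model, 1_000_000, 100_000):.2f}")
# gpt-4.1: $2.80
# gpt-4.5: $90.00  -> roughly 32x the price for the same job
```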
A cool feature that was added to ChatGPT this week, because of all the craze with images: they added an image library. On the left menu, just below Explore GPTs, there is now a Library section that shows the images you've created with ChatGPT since the launch of image generation, and you can click on it and see all your images in a gallery. I find this very useful and very cool. And since OpenAI sees the craze this has generated, they are apparently developing a social network built around the image generation capability. The project is still in early stages, only an internal prototype, and it focuses specifically on creating a social feed around AI generated images. They saw the craze that happened on traditional social media with their image generation tools, and they said, why not ride that wave and have something in-house? They gain several different benefits from it. First of all, they get to poke Meta and X, two companies that are their direct competitors in this field, not to mention sticking it to Elon Musk at his own game. But in addition, it will provide them data to train their models, similar to what Meta is doing with its platforms and X is doing with its platform. So you can see that this actually makes sense. Now, does it have a chance of being successful as a new social platform? I don't know. What I do know is that there's an insane craze right now around ChatGPT. We just said they got to 800 million global users, they're growing at an insane speed, everybody's downloading their app, and adding a social feed as a feature, where people can see the images other people are generating, comment on them, and share them, might actually work.

Now, speaking of trends from ChatGPT that are taking over social media, there's a new trend after the Ghibli insanity and then the action figure craze: people are now doing reverse location search on images. Many people have shared taking images from multiple sources, uploading them to ChatGPT, specifically o3, and asking it where the picture was taken. And it's showing really good results at understanding, from small nuances and things in the image, where the picture was taken, based on landmarks or different hints that it finds. While this is really cool, it obviously raises serious concerns, because any picture of yourself that you put online, let's say on social media, can now be used to tell people where you were when the image was taken. It's also showing you how powerful these tools are right now at understanding nuances and little hints in images, to get a much better understanding of more things than we probably realize. Those of you who have seen the very first demo of Gemini, where a Google employee was demoing Gemini Live: she was wearing the glasses, she looked out the window, and all you could see was the rooftops of other buildings around. She asked Gemini where it thought she was, and it told her that she was probably at King's Cross in London, which is where she was. So these tools exist, this capability exists. It's not exposed in a formal way, but it's there and you can use it. And now that people know it's there, it will be very interesting to see what other applications can be developed through the API to benefit from that capability.
If you think about the negative side, there's a lot of negative stuff, but if you think about the positive side, this could let you ask where you are in order to get better navigation when you don't know where you are, or help you find the one thing you're looking for on a street where you don't have the exact address, just by seeing what you're seeing. That's the direction this is going.

On the last two items from OpenAI, I want to touch on the hallucination and mistake side of things, because I think it's very important. According to OpenAI's internal testing, the o3 and o4-mini models that were just released hallucinate more than both the company's previous reasoning models and its traditional models, so more than GPT-4o, o1-mini, and o3-mini. To make it even more confusing, or less promising, OpenAI themselves do not know why these models have higher levels of hallucinations. They said, and I'm quoting, "more research is needed." So they do not understand why, when they add additional capabilities to the models, they get additional hallucinations. And by a pretty big spread, by the way: on a benchmark called PersonQA, o3 hallucinated on 33% of the questions, while o1 and o3-mini hallucinated on only 16% and 14%, respectively. So it roughly doubled the rate of hallucinations. That's obviously not a good thing, and as of now, it doesn't seem that OpenAI has a good solution for it.

Now, on a more interesting aspect of this: OpenAI researchers published a paper last week about their deep research technology, which I've started using all the time, not just in ChatGPT but from the other providers as well, and they found two very interesting facts. They were testing it against a very hard benchmark, and the idea was to, A, compare it to previous technology, plain GPT-4o versus deep research, and B, compare it to actual human researchers on those same really difficult tasks. What they found is that deep research did dramatically better than humans, and was on a whole different level from the old tools. The old tools basically failed miserably: if you gave GPT-4o one of these tasks, it couldn't complete the research effectively. Humans gave up on 70% of the questions after two hours of effort without finding answers, and the people who did continue all the way still gave 14% incorrect answers. Deep research, on the other hand, attempted all the questions, so it didn't give up on any of them, but it only came back with correct answers about 51% of the time, basically just a smidge more accurate than flipping a coin. Now, that's not very promising, but to be fair, this was on the hardest set of questions they could find, which, again, took human researchers hours to actually complete.
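To make that comparison concrete, here is a rough scoring sketch. Treating a give-up as a wrong answer, so humans and deep research are scored over the same set of questions, is my own assumption, not necessarily how the paper scores it.

```python
# Rough effective-accuracy comparison from the numbers above.
# Assumption (mine, not the paper's): a give-up counts as incorrect,
# so both humans and deep research are scored over all questions.
human_completion_rate = 0.30      # humans gave up on 70% of questions
human_accuracy_if_done = 0.86     # 14% of completed answers were wrong
deep_research_accuracy = 0.51     # attempted everything, ~51% correct

human_effective = human_completion_rate * human_accuracy_if_done
print(f"Humans, effective:        {human_effective:.0%}")         # ~26%
print(f"Deep research, effective: {deep_research_accuracy:.0%}")  # 51%
# On this basis, deep research roughly doubles the human success rate.
```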
So what do I think about this? Personally, I have become more and more dependent on these deep research tools, and it is very alarming to me to find out that they could be wrong 50% of the time. What this tells us is that while these tools are incredible and really do amazing research, you still need to go and click on the links of the sources they provide and verify the information. So the research is not gonna take you just five minutes; it's probably going to take you 25 minutes, to actually go through the relevant links and see what's reliable information and what's not. But it's still gonna save you a few hours of doing the research yourself, because in many cases these tools will check 70, 80, a hundred, 200 websites to get you the answers, which, if you had to do it on your own, would take hours of work.

As I mentioned, there's been a lot of other news from OpenAI, which moves us to the rapid fire items. So let's dive into those. First of all, according to Appfigures, ChatGPT's app surged to 46 million downloads in March 2025, a 28% jump from February, making it the most downloaded app in the world, overtaking Instagram and TikTok. Now, the interesting thing is what drove that madness: obviously their new image generation tool, all the Ghibli style images, and then the action figure madness that happened shortly after. Which is ridiculous to me, because there are so many better, real-value use cases for ChatGPT in your personal life and in business that give you way more reasons to download the app and use it regularly than creating anime style images. But I guess it doesn't really matter, because it drove this complete craze of downloads of the ChatGPT app. And it is amazing to see their brand dominance, where ChatGPT has basically become synonymous with AI, just like Google became synonymous with search in the 2000s, in almost exactly the same way. Most of the people in the world are not like me or you, the listeners of this podcast, who know multiple chat platforms and use them for different use cases. Most people just know ChatGPT, and that's the only thing they're using, and it's an incredible brand power that OpenAI has right now over anyone else.

An interesting piece of news that is actually good news, from my perspective, when it comes to using OpenAI's APIs: OpenAI is planning to implement a new verified organization process for access to its most advanced models. This is not gonna impact the models that exist right now, but it will be required for future models. It means that any organization that wants access to these future APIs will need an approved government-issued ID from one of the countries supported by OpenAI's API, and without that ID, you won't be able to get access to these tools. I think this is a great step in the right direction. I don't know if it will solve all the problems, because there are many bad people and bad players within those countries who hold actual legit IDs, but at least it reduces the risk of the wrong people and the wrong groups using these really powerful APIs for the wrong things. Now, this might also be a response to what happened, or presumably happened, with DeepSeek, when they were more or less scraping, or if you want the professional word, distilling, OpenAI's models by basically running them against their API. That was probably blocked in other ways before, but having an approved government ID as a prerequisite for access to advanced models is, I think, a great step in the right direction of higher security as better models come out.

Now, I told you before that I'd come back to the safety risk that is driven by the crazy competition in the AI universe. Well, OpenAI just updated its preparedness framework, and they're stating that they may adjust their safety requirements if competing AI labs release high risk systems without similar protections in place.
Now, to be fair, they are claiming that they will only make these adjustments after rigorously confirming that the risk landscape has actually changed, publicly acknowledging that they're making the adjustments, verifying that the adjustments do not meaningfully increase the overall risk of severe harm, and still keeping safeguards in place. In the TED interview, Chris made an interesting statement. He basically said that all the leading labs, because of the competition, are basically saying: we gotta run the fastest that we can, because it's inevitable that somebody will get there, and if we get there first, it will be safest, because we're safer than everybody else. And he was questioning the "inevitable" part of that scenario. What this is telling us is that it is inevitable. It is inevitable because the labs themselves are claiming on paper, in their formal security guidelines, that if other labs cross the line, they will cross the line too. And this is not even spoken in closed rooms and behind closed doors; this is their formal statement, basically saying: if other labs move forward with less caution, we will do the same. This is really scary to me, because it basically means that this competition can drive us over the edge, and all the leading labs will play their role in that scenario.

In another interesting acquisition by OpenAI, they just hired the team, not the company, a move that has started becoming the norm in the AI world. They acquired the team of Context.ai, a startup founded in 2023 by two former Google employees. What their tool does is look into the actual workings of AI language models, to understand them better and use that understanding to develop better and safer AI solutions. So OpenAI just acquired the team, and obviously all their knowledge; the company is going to wind down its previous operations, and Henry Scott-Green, one of the founders, is now a product manager at OpenAI working on evals. For those of you who don't know what evals are: evals are the ability to evaluate what the AI is doing, and they are a requirement for building high performing AI applications, but they're very hard to get right, because the model is, quote unquote, a black box. What Context.ai's tooling allows you to do is look into that black box and get a better understanding of how the AI is working, in order to develop better and better AI systems faster.

Like I said, this could have been a whole episode about just OpenAI and ChatGPT, but there are a few interesting pieces of news from other companies. The biggest one comes from Microsoft, and they just unveiled three really interesting things. The first one is really promising for the future: they have released a 1-bit, really small model that can run effectively on a CPU. It's a very capable small model that actually delivers comparable results to Llama 3.2, Google's Gemma 3, and Alibaba's Qwen 2.5, so not the latest and greatest, but the level just before that, on a very small footprint that can run locally on smaller computers with significantly less memory. Why do I think this is promising? Because the world's compute and resources are now all being drained to support future AI models, and if we find ways to scale models without that level of compute, I think the planet will thank us.

Now, on to the more practical things that you can start using right now. Microsoft had two big releases this past week. One of them is AI agents that can control your computer.
This is the concept of computer use, the same as we got from Claude a while back and from OpenAI a few months ago. You can now use Copilot Studio to build agents that will take over everything on your computer, so anything in your browser as well as applications. Basically, it looks at the user interface and can perform anything that a human can perform on the screen. Now, I'll say two things about these kinds of agents and computer use solutions. One, I think in the long run it'll open the door for a lot more automation and will probably completely eliminate many of the tedious tasks we have to do today. So from that perspective, it's really good. Two, any person who has tried to use the full Copilot Studio knows it's an absolute nightmare. First of all, they have two products called Copilot Studio: one is basically custom GPTs wrapped as a Microsoft product; the other is their old-school automation tool with some more AI capabilities, which is absolutely impossible to use. It's the worst user experience in the world. I consider myself an advanced user, a techie and a geek, and every time I try to build something there, it's very hard to do, and I've given up more than once. So I really hope that part of the process is allowing the AI to create the AI, so I won't have to fight the tools that they have created. I'm not sure how the implementation is on the Microsoft side; it will be very interesting to test, and I'll keep you posted as I do.

The other thing I will say is that there's a huge risk in allowing these tools to take over your computer, because you don't know what they will do. You don't know when they're gonna go crazy. You don't know whether, maliciously on purpose or just by a random fluke mistake, they will do stuff like changing your passwords, or locking you out of specific pieces of information that you actually need, or anything else that you don't want to happen on your computer or on your company network. What I have done in order to test these kinds of agents is I've actually created a virtual machine on Google Cloud. I'm using it to run a browser, and I'm testing everything in that virtual machine. So if an agent goes rogue, nothing happens to my real universe, and I can safely test different use cases, including different agent capabilities. I highly recommend that to anyone, and I will create a full episode about this, showing you exactly how I did it so you can do it as well.

The other thing that Microsoft released this week: they made Microsoft Copilot Vision available to anybody on Edge. Vision is, if you will, a subset of an agent: it can see everything on your screen, it just cannot take actions. This was available to the paid users of Copilot, and now any user of Copilot, including free users, can show the AI everything in the Edge browser, and you can even activate your microphone and talk to it. That allows you to ask the AI about anything that you're doing, whether it's emails or spreadsheets or research, anything on your screen, and collaborate with the AI. I've done this multiple times with Gemini's experimental capability, because that's been available for a while. It's been working, I would say, about 50% of the time effectively, and the other 50% it crashes before you get to a solid outcome. That's at least on the Gemini platform.
You can also do this with OpenAI, but only on their mobile app, which is weird to me, because it would make a lot more sense to have it on desktop, either in the browser or in the desktop app. But right now on OpenAI, it's only available on the mobile app. Anyway, it's now available for Microsoft users: anybody who's using Edge can activate this and use it. It currently works only with several specific websites, such as Amazon, Target, Wikipedia, Tripadvisor, Food & Wine, and OpenTable. So basically websites you use for your day-to-day personal things, and not necessarily business tools. And as I mentioned, I see a long time passing between now and the time an organization will allow these models to deal with actual enterprise systems, because I don't know how reliable it is yet, and I don't know when it'll be a hundred percent reliable, or at least as reliable as humans doing data entry. I think we still have time to go until that point.

Going from Microsoft to another widely used platform that has released interesting AI capabilities: Notion officially released Notion Mail. It is an AI powered email client that connects to your Gmail account and allows you to do really cool things. First, it helps you manage your email: it reads all your emails and categorizes them into different buckets, so they're easier for you to review. It also knows how to suggest replies, and it knows how to look for suggested meeting times. So if somebody says "let's meet," it will look through your calendar and make suggestions for when you are available, and things like that. This is obviously the direction all these tools will go. I think this is a great move from Notion, moving in that direction and offering that integration. I must admit I'm surprised, and I've shared this before, that this doesn't already exist in Google itself as part of Gemini. But I think once Google and Microsoft figure out how to integrate all their Gemini tools and all their Copilot tools into one Gemini that can see and connect everything, and the same thing with Copilot, we will gain huge benefits in efficiency in our daily business usage.

Now, broadening from everybody's day-to-day work to what AI is doing in the broader scheme of things, from an application perspective, and which aspects of the world it is going to touch: two former Tesla supply chain leaders have started a company called Atomic, and their goal is to develop an AI powered platform focused on streamlining inventory planning and supply chain management. These two leaders experienced the craziness of Tesla's near collapse when they were scaling Model 3 production in the beginning; if you remember, Elon Musk was sleeping on the factory floor at Tesla to get through that period. So they deeply understand the problems and the need for better supply chain control and management. Most supply chain management in the world today is done manually. I have several clients who have warehouses and inventory and supply chains, and the amount of work that is done by copying and pasting data from multiple Excel files and emails into a unified environment, just to understand what is actually happening in the company, is incredible. We've been solving this problem with automations and custom GPTs, and they're gaining amazing benefits. So first of all, if you have issues like that in your company, reach out to me on LinkedIn.
I'll gladly help you out with stuff that you can start doing tomorrow, versus in the months or years until this company takes off. But their goal is to basically build AI software that allows users to quickly simulate multiple scenarios that would normally take hours or days to calculate, and based on that, change the whole supply chain structure. In early pilots, they have shown several examples, such as reducing inventory by half while maintaining a 99% in-stock rate for the relevant needed components. This is obviously very promising, and I'm sure a lot of companies need a solution like this.

Another field where this is happening very aggressively is, obviously, customer support. Zendesk's CEO was just interviewed and shared a lot of interesting information about AI in customer service. He sees a very near future where a hundred percent of customer service interactions involve AI, and 80% are solved by AI without any human intervention. Now, the interesting thing is, according to a survey that Zendesk themselves did, 51% of consumers say they prefer interacting with bots over humans when seeking immediate service. I wanna unpack that for a second. First of all, the survey was done by Zendesk, so they have a vested interest in saying that, because that's where they're pushing their platform. That being said, and now I'm speaking from my personal perspective: every time I have to deal with any of the big companies, such as AT&T, or any medical group, insurance company, or airline, it drives me crazy, the amount of calls, communications, and people I have to talk to in order to solve problems that seemingly shouldn't be that complicated. Just last week I spent two hours on the phone trying to get my tickets from StubHub. That is insane. That shouldn't happen. And I agree that if these tools are done correctly, and they're connected to the right systems, and they can find the right information and take the right action quickly and effectively, I would rather talk to one of them than a traditional customer service agent, any day, any time, and only escalate to a human when it cannot solve my problem, which happens anyway in more than 50% of the cases. An actual example that I found: the city of Buenos Aires is developing AI chatbots to manage over 2 million queries per month without human intervention, and that has reduced the burden on their actual staff by 50%. That basically tells you that a huge number of queries are being resolved successfully, because they never get escalated to humans. It is very clear that the customer service universe is changing dramatically as we speak. I already predicted, at the beginning of last year, that the concept of call centers or contact centers will disappear from the world. And I know that's a scary thought, for an entire industry to disappear, especially in countries like India and the Philippines, where there are millions and millions of people who make their living through that industry, but I don't see this evolving any other way.

Another interesting data point about AI implementation in enterprises comes from Johnson & Johnson. They just released the findings of internal research they've done, and they found that 10 to 15% of artificial intelligence use cases delivered 80% of the value. That's from the story shared in the Wall Street Journal this week. That's not too surprising to me, but I think it's not the right way to measure it, and I'll explain.
The reason it's not too surprising to me is that the 80/20 rule always works, or in this particular case the 80/15 rule: it's always true that the big projects are going to deliver the most value. But that doesn't mean the other 85% of projects do not provide value or a positive ROI. I can tell you that many of the things I'm doing with my clients, or that people who take my courses do in their companies, are small day-to-day initiatives that improve efficiency by a small percentage but free up a lot of time for specific individuals to focus on bigger things. None of these things will be visible as a significant line item on the bottom line of a large-scale company, right? If you wanna move the needle for Johnson & Johnson, the benefit needs to be in the billions. If you just created a small custom GPT that removes the need to do a one-hour task once a day, that might be worth something like a million dollars a year to the bottom line once enough people use it, but it's not gonna move the needle for Johnson & Johnson, so they're going to say it didn't qualify. But if you do 20, 30, 50, a hundred of these in an organization, well, that adds up very quickly to a significant amount. So while I agree that the 80/20 rule works, I think every use case needs to be evaluated on its own ROI, and if there's a positive ROI, continue doing it, even if it's not a huge addition to the bottom line. And I think there's an important distinction to make here, between company initiatives that require a huge amount of resources, where you do want to focus on the ones that will produce the most value, versus education and training for your employees on how they can apply AI tools to their day-to-day work to gain these small benefits, and then just encourage anybody who can get a positive ROI out of that process.
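Here is a rough sketch of that "one hour a day" arithmetic. The loaded hourly cost and the headcount are my own illustrative assumptions, not figures from the episode.

```python
# Back-of-the-envelope ROI for the "one hour a day" custom GPT above.
# The loaded hourly cost and the headcount are illustrative assumptions.
HOURS_SAVED_PER_DAY = 1
WORKDAYS_PER_YEAR = 250
LOADED_HOURLY_COST = 80   # assumed cost of one employee-hour, in dollars

per_person = HOURS_SAVED_PER_DAY * WORKDAYS_PER_YEAR * LOADED_HOURLY_COST
print(f"One person: ${per_person:,}/year")         # $20,000
print(f"50 people:  ${per_person * 50:,}/year")    # $1,000,000
# Invisible as a line item at J&J scale, but a real, positive ROI.
```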
Staying on the topic of industry and new tools: LightSource has emerged from stealth mode. They just raised $33 million in seed and Series A funding, and the company addresses a critical industry gap on the procurement side. So again, we're staying in that universe of supply chain, and they're stating exactly what I told you before: that 70% of procurement teams still rely on manual processes, using no sourcing software tools, and instead managing what is, if you aggregate it across the world, billions of dollars through emails, spreadsheets, and randomly formatted documents, exactly as I mentioned earlier. They want to transform that aspect of the business. So a different company with a slightly different flavor, but I'm sharing all of these with you so you can start understanding where the world is going. AI is going to be in everything that we do, and it will dramatically improve everything we know in every aspect of business. As a business owner or business leader, all you have to do is think about what your biggest pain points are and what you would like to solve, and start looking for these kinds of solutions, because they exist, and if they don't exist yet, they will in the next few years.

Staying on the topic of new, interesting developments in the enterprise space: Adobe just made a strategic investment in Synthesia. For those of you who don't know Synthesia, they're a British startup that allows you to create and use AI avatars. This comes at the same time that Synthesia announced a hundred million dollars in ARR, which positions them as a leader in this field. The two leading companies in the AI avatar universe are Synthesia and HeyGen. Both of them have very advanced capabilities, but the investment from Adobe is very interesting, because it potentially brings these kinds of capabilities into the Adobe universe in the future, which would allow interesting combinations of the Adobe tools with avatar video generation. To quote Synthesia's CEO, their vision is aligned with Adobe's: to democratize high quality content creation and make enterprise communication faster and more effective. I couldn't have said it better. I think it's a very interesting partnership, and it will be very interesting to see if Adobe actually starts leveraging this inside their tools. That being said, despite the amount of money Synthesia is making right now, and despite the fact that they saw a hundred percent growth year over year, they still lost over 25 million pounds this past year.

And continuing on the topic of impact on industry: Hugging Face, the open source behemoth that hosts most of the open source models in the world, just made a very interesting move and acquired a French robotics company called Pollen Robotics. The sum wasn't disclosed, but the robot, called Reachy 2, is now available for sale for $70,000. The interesting thing about this particular robot is that it comes ready to go, but with an open source platform that you can continue to develop on your own. Hugging Face's team lead for robotics is Remi Cadene, a former Tesla robotics scientist who worked on the Optimus program. So they're definitely very serious about their move in that direction, and the fact that they're releasing the world's first advanced open source robot makes it very, very interesting, and shows you how many huge advancements are happening and are going to happen in the robotics space. I think it'll be very interesting to see what companies do with an infrastructure and architecture that is now open source. It's a very interesting move by Hugging Face, putting them in the hardware field as well, in the very competitive and highly lucrative robotics universe. There are a lot more robotics updates, and if you wanna find out about them, you can read about all of them in our newsletter, which you can sign up for via the link in the show notes. All you have to do is open your phone right now, click on the link, and sign up; over there you'll find a lot more news than we can share on the show.

But on to our next topic. After talking about enterprise and what's happening in that field, I wanna share something interesting from Grok. Grok just released Grok Studio, their version of Canvas in ChatGPT or Artifacts in Claude. It's a side by side split view where on the right you have a document that you can edit, and on the left you can continue the chat with Grok. I started using it yesterday, and I must admit, it is not on par with Canvas, and not even with Artifacts. It can handle documents and code at the same time.
It has one cool benefit: more document-style editing capabilities, where you can change things to bold or italics and apply different headings right there in the user interface. But it's lacking a lot of the other things that make Canvas so helpful, the main one from my perspective being the ability to highlight a specific segment of the text and get the AI to work on just that segment. To me, that's the most magical aspect of Canvas. The other problem with the way Grok Studio works is that it writes everything, the full answer, which for me is sometimes like three pages, on the left side, basically in the regular chat, and then copies it to the other side, which takes twice the time, and that absolutely drives me crazy. So, a good step in the right direction. I think these collaborative environments with AI are fantastic; I just think the current implementation in Grok is lacking and needs to be improved.

That's it for today. Don't forget, right now, to open your app and fill out the survey and tell us what you want this podcast to be. I want to listen to what you have to say and make adaptations to this podcast to serve you better, but for that, please fill out the survey. Also, don't forget there's an AI Business Transformation Course starting on May 12th. So if you need better, structured training on how to implement AI to drive your career, to improve your team, your business, your company, it's an incredible opportunity. We teach these courses all the time, but usually for closed groups, and only once a quarter do we open them to the public, so the next one after May will probably be in September. Don't think twice, come and join us; again, the link for that is in the show notes as well. On Tuesday, we'll be back with a fascinating episode that will show you 15 different use cases for the new AI image generation capabilities; 14 of them are business oriented and not just for fun, so it will teach you a lot of valuable stuff. And for now, have an awesome rest of your weekend.
