Leveraging AI

176 | ChatGPT Changes Everything (Again) with a New Image Generator, AI Boosts Productivity by 37%, Capabilities Doubling Every 7 Months, AI Doctors & Teachers Will Be as Common as Cellphones – and more top AI news for the week of March 28, 2025

Isar Meitis Season 1 Episode 176

Is your business ready for an AI teammate who works faster, costs less, and never takes a sick day?

This week, the AI landscape didn’t just evolve—it exploded. From Harvard-backed research showing AI-powered individuals outperform entire teams, to OpenAI’s shockingly good new image generator, the future of business just took another quantum leap.

AI isn’t just an assistant anymore—it’s becoming the MVP. And the pace? AI capabilities are now doubling every 3–7 months. If you're a business leader and you’re not keeping up, you’re already behind.

In this tightly packed, high-signal episode, I break down the most critical AI developments business leaders need to know: real research, new capabilities, and the societal shifts no one's prepared for.

In this AI news, you'll discover:

  • How a solo employee using AI can outperform a 2-person team
  • Why 37% productivity gains might be just the beginning
  • AI's doubling speed — and what that means for your job, team, or company
  • GPT-4o’s jaw-dropping new image generation capabilities (and what it kills off)
  • Why Canva, Photoshop, and even designers should be paying attention
  • Bill Gates’ prediction: AI doctors & tutors as common as smartphones
  • The hidden emotional risks of using AI and how it’s impacting users’ mental health
  • The rise of humanoid robots that lift 65 lbs, cook eggs, and might be your next warehouse crew
  • Latest copyright battles (and contradictory rulings) shaking up AI law
  • Deep dives into Claude 3.7, Gemini 2.5, DeepSeek, and what benchmarks actually matter
  • Why you’ll soon have voice agents handling customer service, sales—and sounding eerily human

 Registration is now open for the Spring 2025 AI Business Transformation Course, designed to help leaders like you implement AI company-wide with confidence.
🎓 Use code LEVERAGINGAI100 for $100 off https://multiplai.ai/ai-course/ 

About Leveraging AI

If you’ve enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!

Speaker:

Hello and welcome to a weekend news episode of the Leveraging AI Podcast, the podcast that shares practical, ethical ways to improve efficiency, grow your business, and advance your career. This is Isar Meitis, your host, and we have a packed show today. There are so many big things to talk about that I could have probably done two episodes this weekend, but we're going to try to make it as efficient as possible for all of you. Our three main topics today, even though they could have easily been six, are two research papers and one major release. The first is research from Harvard together with Ethan Mollick on how AI improves the efficiency of individuals and teams. The second is research showing how fast AI is doubling its ability to perform business tasks. And the third is an incredible new release from ChatGPT. Then there are a lot of other releases that are still really important, and many other things to discuss, including copyright lawsuits, improvements in robotics, and more. So let's get started.

As I mentioned, our first topic is the impact that AI can have, and is having, on people who know how to use it at work. A study done by Harvard together with Procter & Gamble revealed that individuals using AI can match the performance of a two-person team. If you want the specifics, that's a 37% boost for a solo performer. The study tested 776 Procter & Gamble professionals in several different configurations: people working alone, people working in teams, individuals working with AI, and teams working with AI. It compared the quality and the quantity of the work, as well as the participants' emotional wellbeing during the work. The tools they were using were GPT-4 and GPT-4o.
Participants were given clear training on how to use AI properly for these tasks, which, if you've been listening to this podcast, I've told you many times before is the number one key for success in AI implementation. What they found is, as I mentioned, very interesting. A single individual who knows how to use AI properly matches the results of a team of two people. Basically, you can save 50% of the workforce and get the same results. They also found that teams working with AI dramatically outperformed teams working without AI, and the best output came from teams working with AI, which outperformed solo individuals working with AI by 40% and created three times more top-tier ideas and solutions than teams working without AI. In addition to generating better results, they also worked faster: individuals shaved 16% off their task time working with AI, and teams cut 12% of their time. I can tell you from the things that I'm doing with clients and myself that once you deploy this company-wide, and not just on very specific use cases, these numbers can be significantly higher, shaving 50% off some tasks and 90% off others.

There were also a few benefits that maybe weren't expected. One is that the teams working without AI got stuck in their areas of expertise. The tech teams came up with tech ideas, and the salespeople came up with more commercial ideas. But the groups that worked with AI came up with more balanced solutions: tech teams working with AI also came up with commercial ideas, and vice versa. That tells you that bringing in a quote-unquote AI teammate that is more well-rounded in its approach can generate value for the company well beyond what a team working without AI can deliver.
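To make those efficiency figures concrete, here's a tiny illustrative calculation. The 16% and 12% reductions come from the study as quoted above; the 1,000 annual task-hours baseline is my own assumption for illustration, not a number from the research.

```python
# Illustrative savings math using the study's quoted figures: individuals
# with AI shaved 16% off task time, teams 12%. The annual baseline of
# task hours is an assumed number, used only to make the scale tangible.

def annual_hours_saved(annual_task_hours: float, reduction: float) -> float:
    """Hours saved per year at a given fractional time reduction."""
    return annual_task_hours * reduction

BASELINE = 1_000  # assumed hours/year an employee spends on AI-assistable tasks

print(f"Individual with AI: ~{annual_hours_saved(BASELINE, 0.16):.0f} hours/year saved")
print(f"Team member with AI: ~{annual_hours_saved(BASELINE, 0.12):.0f} hours/year saved")
```

At that assumed baseline, the study's numbers translate to roughly 120 to 160 freed-up hours per person per year, which is several full work weeks, before counting the quality gains.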
In addition, participants who worked with AI reported a more positive emotional experience compared to people who worked without AI. So what does that tell us? It tells us what we already knew, but now there's scientific proof: working with AI breaks the old rule of faster, better, cheaper, pick two. That has been the common state of every task in every company in the world. If you want to do something faster, it's going to cost you more money; if you want to do it better, it's going to cost you more money or take you more time, et cetera. Now, for the first time in history, literally any person, group, or company can work faster and better and cheaper, and apparently have a better work experience from an emotional perspective, by working with AI.

Now, what do those improvements in capacity and capability mean for the broader workforce? That's not something this particular study tackled, but think about it: even if we're only talking about a 12 to 16% improvement in efficiency, you can use that to grow the company by 12 to 16%. But if your competitors do the same and the market can't absorb all that extra output, it means 12 to 16% of the company's capacity will no longer be necessary, which can have really bad implications for the global workforce. There is not endless demand. Every time I hear people say, oh, so you can grow the business, that's true right now. But when everybody does it and everybody has that extra capacity and capability, or when these tools become significantly better and provide not 12% improvement but 30, 40, 50%, then we have a serious problem on our hands as a society.

Now, to pour some more gasoline on what I just said: in a new study, an organization called METR researched the length of tasks that AI can handle autonomously.
What they found is that for the past six years, this capability has doubled every seven months, and in recent months it has been doubling every three months. So not only is it doubling very fast, way faster than Moore's Law, it is now accelerating. But let's dive a little deeper into how this research was done and what exactly it's trying to show. What they measured is the length of a task, in human time, that AI can perform at a 50% success rate. Basically, they took humans and measured how long it takes the average human, or in some cases great humans, to perform a specific task. That by itself raises an interesting question about the benchmark: is it comparing to an average human or a very good human at that task? The tasks were mostly coding related. Back in 2019, AI could hit a 50% success rate only on tasks that take a human just a few minutes; anything longer than that, it would fail miserably. Now, Claude 3.5 Sonnet has hit the 50-minute mark, so almost a full hour of coding on its own, without any human input, while still performing at a 50% success rate.

They shared another interesting statistic: over 80% of successful AI runs cost less than 10% of a human software engineer's wage for the same exact task. Going back to what I said before, that is a 90%-plus savings if you let the AI code it versus having a human code it. And this wasn't just one task. It spanned the software development universe, with 66 different software tasks including coding, debugging, and other software-related work. So while it's still within the software development domain, it's not just one narrow type of task, like writing one specific kind of code.

What they are stating is that if this trend holds, and we're going to talk about the many ifs behind it, then somewhere between 2027 and 2029, AI might be able to manage eight-hour days, maybe even full weeks, and eventually months of work unsupervised. That represents a huge amount of potential, and a huge amount of risk and fear if you ask me, but it is where this is going if we follow the existing trajectory. Now, they themselves clearly state that they're not sure about any of this, including the fact that they looked at tasks at a 50% success rate, which is obviously completely unacceptable in any workforce. Think about your employees failing at everything they do 50% of the time. So that's obviously not a good enough benchmark. But what they were trying to show is not AI's ability to solve today's business problems; it's AI's ability to improve at this over time. I think it will be very interesting to see what happens if they, or somebody else, pick up this concept and run similar tests at the same success rate as humans. It doesn't need to reach a hundred percent; it just needs to be as good as the average coder, not even the best coder, because in any domain you have a few great performers and then most people are around the average. So it will be very interesting to see where that goes.

They also say it's not clear that the scaling laws will continue, for multiple reasons: we've run out of data, there aren't enough chips, and many other factors could slow the acceleration of AI. The flip side, and this is not from the research, is that we are seeing many new innovations driving acceleration beyond the amount of chips, GPU capacity, power, and data: reasoning, better algorithms, and so on, like all the latest updates we talked about from companies like DeepSeek. So it may actually go the other way, and they state that as well.
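To make the doubling math concrete, here's a minimal back-of-the-envelope sketch. The starting point (a roughly 50-minute task horizon in 2025, per the Claude 3.5 Sonnet figure above) and the constant 7-month doubling time are simplifying assumptions; METR's actual analysis fits a trend across many models and notes the doubling may be speeding up.

```python
# Back-of-the-envelope extrapolation of the doubling trend described above.
# Assumptions (for illustration only): a ~50-minute task horizon in 2025
# and a constant 7-month doubling time.

def task_horizon_minutes(years_from_2025: float,
                         start_minutes: float = 50.0,
                         doubling_months: float = 7.0) -> float:
    """Projected task length (minutes) AI completes at ~50% success."""
    doublings = years_from_2025 * 12 / doubling_months
    return start_minutes * 2 ** doublings

for year in (2027, 2029):
    hours = task_horizon_minutes(year - 2025) / 60
    print(f"{year}: ~{hours:.0f} hours, i.e. ~{hours / 8:.1f} eight-hour workdays")
```

Under these assumptions, the trend lands at roughly a full workday by 2027 and a couple of unsupervised work weeks by 2029, which is exactly the range the researchers describe.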
So that timeframe, within a few years, in which these tools might handle full days, full weeks, or even months of independent work, may actually arrive faster rather than slower. Why is this important? Especially when you tie it back to the previous topic, humans working with AI delivering significantly better results than humans working without AI, the world we know is changing, and it's changing pretty fast. They even state themselves that even if they're wrong by an order of magnitude, this will still change everything we know; it will just take a little longer. One thing I can tell you is that nobody is ready for that, because it may mean we need significantly fewer people working. It may mean we need a completely different education system. It may mean we need a way for people to find things to do and get paid, so they can buy things and the economy continues to work. This is just another wake-up call about the trajectory of this thing, which is potentially accelerating. And if you combine all these points with what the leading labs and their leadership are saying, that AGI is coming, some say within this year, some within the next two years, but in the very, very near future, then we have to start taking action as a society and figuring out what that means on a broader scale.

And now from research to very practical news: the latest release by OpenAI. OpenAI released a new version of GPT-4o this week. It does two things remarkably well, and I'm going to dive into one of them and then mention the other. The thing it does incredibly well is generate images, a capability that is replacing DALL-E. You've heard me say many times before that DALL-E over the past year was an embarrassment.
All the other tools out there, whether open source like Flux, or Midjourney, or other professional tools, were running circles around it. DALL-E could generate images, but none of them were great. This new release is just baked into GPT-4o; there is no separate tool. You just use it within regular ChatGPT, choose GPT-4o in the dropdown menu, and you can create images. It is nothing short of magical and incredible. I've tested it across dozens of business and fun use cases in the three days since it came out, and it's blowing my mind every single time: its ability to understand context, to follow instructions, to keep consistency in characters, objects, rooms, and views, to combine things it understands from the chat and apply them to the images, to iterate on images while keeping consistency, to follow style guides, and so on. Literally everything I threw at it, it handled. It renders text very accurately, including the capability to generate entire infographics from scratch based on whatever data or topic you give it. It just does things that were not possible before. I tried combining several different images into one, and it does that very, very well. It creates text on objects accurately, so it understands the volumetric nature of things. It can change the point of view: I took an existing image from the internet and asked it to create a God's-eye view of that room, and it did, with the sofa and the cushions and the pictures on the wall and the carpet and all the design details. It wasn't a hundred percent accurate, but it was good enough to fool a person, or to be one of those puzzles where you need to find the differences between two images. It's really good across a huge variety of capabilities that were simply out of reach before unless you were a professional designer.
Now, this broke the internet, because literally everybody went in and started generating images. Most people started creating images of themselves, their family, and their coworkers in Ghibli-style anime, which is really cute but not very helpful from a business perspective. But there were also a lot of people who, like me, tried actual business use cases. All of this drove Sam Altman to tweet the following: "It is super fun seeing people love images in ChatGPT, but our GPUs are melting." Creating these images is a high compute demand, and the fact that more or less every ChatGPT user went in and started creating images drove demand to a very, very high peak. The first time I tried it, shortly after they announced it, it generated images a lot faster. Now it generates relatively slowly; sometimes it gets stuck in the middle, sometimes you have to refresh the page, and so on. It's bringing ChatGPT to a crawl just because of demand. I think this will die down in a week or two or three, after everybody has exhausted their need to just play with it, and we'll start using it for actual business use cases. In the meanwhile, they stopped access for free users. Initially it was available to free users as well; now you have to be a paid member, either the $20 a month or the $200 a month plan, but they're saying they will provide it to free users eventually.

In addition, it allows you to create images that are questionable, and now I'm going to quote Sam's tweet on X: "This represents a new high watermark for us in allowing creative freedom. People are going to create some really amazing stuff and some stuff that may offend people. What we'd like to aim for is a tool that doesn't create offensive stuff unless you want it to, in which case, within reason, it does." Now I'm jumping a little bit forward in the quote.
"We think respecting the very wide bounds society will eventually choose to set for AI is the right thing to do, and increasingly important as we get closer to AGI. Thanks in advance for understanding as we work through this." So what does that tell us? It tells us that Sam Altman is leaning more toward the approach of his nemesis, Elon Musk, with Grok. If you think about why people really like Grok, and I've started using Grok all the time, it's that it's more edgy and punchy and cares less than the average AI about what is politically correct. This move by OpenAI goes in that direction. It's basically saying: we are not going to be the people who draw the red line in the sand on what's acceptable and what's not, and what people can and cannot generate, at least in most cases. So you will be able to create more or less anything you want with this image generator. It will be very interesting to see. I haven't seen it blowing up on the internet yet, but I'm sure it will, because it now allows you to do things that other closed-source models will not let you do. They're also asking for feedback on this, so it will be interesting to see how it evolves and how they course-correct.

Now, in addition to this amazing image generation capability, which I'm going to cover in an entire episode together with a similar capability that Gemini released on its experimental platform in Google's AI Studio, which actually came out before ChatGPT's and has similar but less advanced capabilities. So I'm going to do a complete Tuesday episode diving into how to use these tools and what you can do with them. Beyond images, the new GPT-4o is also a very powerful coding model. Artificial Analysis, an independent analysis group that evaluates AI models and API providers for different use cases, weighed in, and now I'm quoting from their tweet:
"Today's GPT-4o update is actually big. It leapfrogs Claude 3.7 Sonnet (non-reasoning) and Gemini 2.0 Flash in our intelligence index and is now the leading non-reasoning model for coding." This makes GPT-4o the second highest scoring non-reasoning model overall, coming just behind DeepSeek V3 0324, released earlier this week. So what does that mean? It means that in addition to all the stuff GPT-4o did before, it is now one of the top coding models in the world, second only to, surprise, surprise, the latest Chinese release from DeepSeek, and ahead of the tools that led before it. It is also now ranked number two on the Chatbot Arena leaderboard that we've talked about many times in the past, which is based on votes from people who don't know which model they are using. It surpassed GPT-4.5 and many other models recently released by both OpenAI and other companies, and it's now second only to Google's latest release, Gemini 2.5 Pro, which launched on the same day.

If this doesn't make you feel like everything is accelerating, I don't know what will. Think about it: in the same week, we got a new model from Qwen, a new model from DeepSeek, a new model from Gemini, and a new model from OpenAI. All of them are dramatically better than the models before, which is shuffling who's on top of the leaderboard, but all of them are ahead of the models we had just a week ago. Now, beyond the acceleration, this new image generation capability is a dramatic change from everything we've seen before. From my perspective, it's a new ChatGPT moment. The reason I'm saying that is it's now good enough, actually a lot better than good enough, to replace many SaaS products designed for image generation and editing, and to replace a lot of the human hours currently spent on this work. So the implication is not just a cool image generation capability.
It's the fact that a lot of professional work that previously required dedicated tools can now be done inside ChatGPT as a feature. I think the next evolution of this is practically already here, just without the right user interface. The tool identifies every aspect of the image. I did some incredible things with it in the past few days. It knows every component of what it sees, because it understands the image, it understands depth, it understands what you're asking of it, and you can make small, nuanced changes and get the same image back with only those changes applied.

To tell you how crazy this is, I decided to test a campaign for a sunscreen lotion. I took a low-quality image of the sunscreen bottle from the internet and asked it to create a female hand holding it with a beach in the background. It did. The product looks perfect, the hand looks perfect, the beach looks perfect; it looks like professional work. I asked it to extend the image. I asked it to add text. I asked it to paint the woman's nails in the colors of the US flag and turn this into a 4th of July campaign, and it did. It knows how to find the fingernails and paint them with the US flag. The nuance I was able to get, across multiple use cases, and this is just one example, is insane.

So I think the next evolution will be tools that also let us visually pick specific components. Think about doing what you can do in Photoshop, but literally just by asking for it: let me grab this piece of text, let me manipulate this component, make it brighter, make it darker, change its direction, and so on. Just grabbing components out of an image and moving them around, manipulating them any way we want, either through a user interface or through words, literally just by talking to it.
That will be the end of many professional design tools, definitely tools like Canva, and I'll be really surprised if that doesn't happen very, very shortly. There is the issue that it will stop you every now and then from doing things that aren't aligned with its policy. Well, as we've seen before, every time one of these tools comes out, the open-source universe catches up. So I'll be really surprised if within a few months we don't have this capability in open source, which means you can do whatever you want.

We talk a lot in this podcast about the importance of AI training. Multiple research studies from leading companies have shown that this is the number one factor in the success of AI deployment in businesses large and small. I'm excited to announce that we just opened registration for our spring cohort of the AI Business Transformation course. I've been teaching this course for two years, starting in April of 2023, and hundreds, maybe thousands, of business leaders have gone through it. In the recent cohort that ended in February, we had people from India, the Emirates, several different countries in Europe, South Africa, many places in the US, Canada, and even Hawaii. So regardless of where you are in the world, this could be a great opportunity for you. In previous courses, we had people as far away as Australia and New Zealand, so weird hours of the day, but still getting a lot of value from the course. The course is four sessions of two hours each, spread over four weeks, on Mondays at noon Eastern time, starting May 12th. If you are looking for ways to accelerate your personal knowledge and career, or to change the trajectory of your team or your entire business, this is the right course for you. It is really a game changer: within four weeks, only eight hours plus some homework, you will dramatically change your understanding of how to use AI in a business.
We give multiple hands-on examples and use cases, teach you the tools and the processes for using them, and end with a detailed blueprint for actually implementing AI successfully business-wide. If this is interesting to you, go check the link in the show notes. You can open your phone and click on it right now and see all the information about the course. And because you are a listener of this podcast, you can get $100 off the price of the course with promo code LEVERAGINGAI100. I would love to see you join our course in May. And now back to the episode.

Now we're going to switch to rapid-fire items, and there's a lot to talk about. The first one relates to everything I've been saying about how the world is going to look. Bill Gates, Microsoft's co-founder, was interviewed on NBC's The Tonight Show, and he stated that within 10 years, AI will render humans unnecessary for most things. He calls it the era of "free intelligence." A few specific examples Gates shared with Jimmy Fallon: the expertise of top doctors and teachers will become commonplace. Basically, AI will deliver great medical advice and great tutoring at scale, potentially slashing the cost of and boosting access to these really important services. But what does that mean for human doctors and human tutors? That's not clear to anybody. The interesting thing is that he's painting a future where AI tutors, AI doctors, and, if you broaden that, AI everything, will be as normal as smartphones. Think about it: 20 years ago nobody had smartphones; it was a concept we couldn't even grasp. Now everybody has one, and we probably cannot imagine our day-to-day without them. Well, he's stating that within 10 years, AI will be like that.
While he gave those specific examples, if you generalize, this means manufacturing, logistics, farming, management, writing code, literally everything humans can do, AI will be able to do. The only things Gates admits will probably survive are things like sports, basketball players, baseball players, people who do theater and so on, which people will still want to watch humans do. But more or less everything we know as a professional job outside of sports or the arts will be done by AI, probably far more broadly than by humans.

Now to a few quick news items from OpenAI. On March 24th, OpenAI revealed a major leadership reshuffle. Sam Altman will pivot to focus more on technical research and product development, while Brad Lightcap, their COO, will step up to run day-to-day operations. In addition, Mark Chen rises to chief research officer to steer the scientific breakthroughs, and Julia Villagra takes the chief people officer position to manage their talent. So a lot of shuffle at the top of OpenAI, which obviously follows the departures of a few big names in the past 12 months: people like Mira Murati, who was the CTO, and Ilya Sutskever, who was one of the leading, or maybe the leading, scientists, and many others. And it makes sense. The company has grown dramatically in the past 12 months, it has different needs right now, it has 400 million active users, it's delivering new products, and it's in the process of converting from a non-profit to a for-profit. So a reshuffle makes sense. It will be interesting to watch what impact it has on the company's performance.

Another big piece of news from OpenAI, and for the AI world in general: OpenAI is adopting Anthropic's Model Context Protocol (MCP), an open-source standard that connects AI models and AI agents to data and tools.
The protocol was released by Anthropic a few months ago and has become extremely popular with agent developers. Now OpenAI is basically saying: this is a great standard, we love it, and we're going to join and use the protocol as well. As of now, it's already available as part of OpenAI's Agents SDK, with desktop app and API support coming soon, per Sam Altman's tweet, which said, "People love MCP and we are excited to add support across our products." Anthropic's chief product officer, Mike Krieger, obviously was very happy, and he said, "Excited to see the MCP love spread to OpenAI. Welcome!" What does that mean? It means that MCP's approach is succeeding. It is open source, it is open to everybody, and it creates a unifying environment for agent development. The more standardization we see across this industry, the better, because it will allow collaboration, and it will allow companies to choose, switch, and replace the agents they're using, since the underlying infrastructure and connectors between them will be standardized. So overall, I myself am very excited about this as well. OpenAI is not the first company: Block, Apollo, Replit, Codeium, Sourcegraph, and other companies have already adopted MCP. So it seems it's going to be the infrastructure that lets agents connect with data, with each other, and with tools moving forward.

On a different topic from OpenAI: it seems they are talking to several different providers about purchasing billions of dollars' worth of data storage hardware and software, aiming to build their first ever data center of their own, per a report in The Information from March 26th. Based on that article, they're planning to purchase five exabytes of storage. That number means absolutely nothing to me and probably to you, but it rivals Apple's iCloud capacity from just a few years back. So this is a huge amount of data storage that OpenAI wants to purchase and own.
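To put five exabytes in perspective, here's a quick back-of-the-envelope conversion. The comparison items (1 TB consumer drives, a ~5 GB HD movie) are my own illustrative choices, not figures from the report.

```python
# Rough sense of scale for 5 exabytes of storage.
# Using decimal (SI) units: 1 EB = 10**18 bytes.

EXABYTE = 10**18
storage_bytes = 5 * EXABYTE

one_tb_drives = storage_bytes / 10**12       # 1 TB consumer drives
hd_movies = storage_bytes / (5 * 10**9)      # assuming ~5 GB per HD movie

print(f"{one_tb_drives:,.0f} one-terabyte drives")   # 5,000,000 drives
print(f"{hd_movies:,.0f} HD movies at ~5 GB each")   # 1,000,000,000 movies
```

Five million terabyte drives, or a billion movies: that's the scale of storage one AI lab is reportedly shopping for.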
This will obviously work in tandem with their Stargate project, the data center project they're working on together with SoftBank, a $500 billion investment over the next few years. So these two will probably work hand in hand.

On a different topic, OpenAI, in collaboration with the MIT Media Lab, released research exploring ChatGPT's effect on the emotional wellbeing of its users, based on, as I mentioned, the huge population of 400 million people using it every single week. What they found is that only a small fraction of users were emotionally connected to ChatGPT, despite the very large volume. They ran two studies in parallel that measured different aspects of emotional connection and impact. But what they did find is that users who trusted and bonded with ChatGPT were more prone to loneliness and dependence, hinting that people can develop emotional dependency on these tools. That's something we all need to be aware of, especially those of us with kids who are already potentially addicted to social media. This might be a lot worse if it spreads, and it's again something we need to be aware of and at least educate people about.

Now, for those of you who are enjoying deep research in ChatGPT, and I'm definitely one of those people: if you're a regular paid user paying 20 bucks a month, you only get 10 deep research queries a month, and if you're paying the $200 a month, you're limited to 120 deep research queries. Until now, it was impossible to know where you were in the count. That has been solved: it now shows a pop-up that tells you how many queries you have left every time you run one. But something even cooler: if you hover your cursor over the deep research button, it will tell you where you are in your current count, how many you have until the reset date, and what the reset date is. This is actually very helpful.
Just hover your mouse over it and you will know how many deep research queries you have left. Now for some not-so-exciting news for OpenAI: on March 23rd, Kai-Fu Lee, the ex-head of Google China and the founder of 01.AI, a really successful AI startup, said his company is ditching proprietary models, mostly OpenAI's, in favor of DeepSeek's open source tech. He claims that running DeepSeek's free and efficient models costs him 2% of what it cost to run the OpenAI APIs in the backend for the same tasks. This is a huge push toward the really advanced capabilities of open source AI, and I think it will keep driving the price of AI down, going back to what Bill Gates dubbed the age of free intelligence. Now we're going to shift gears and talk about contradictory signs of where the AI copyright battle is going. The first one stays with OpenAI. On March 27th, 2025, a federal judge in New York ruled that the New York Times and other newspapers can move forward with their copyright lawsuit against OpenAI and Microsoft. As you may remember, there are multiple lawsuits against OpenAI and other leading AI labs over using copyrighted material to train their chatbots. The labs claim this is fair use; the owners of the copyrighted material obviously think otherwise. The judge is allowing the case to move forward, which means it may go to a jury trial. And as you know, a jury trial can go either way, because the jury will decide the fate of this lawsuit. That being said, this will most likely end up in the Supreme Court either way. So whatever the jury decides will stand for a very short while, then be questioned at the Supreme Court, and whatever the justices decide will set the direction. And the current political environment is pushing for AI innovation regardless of the costs and the consequences.
So in the current political system, and with the way the Supreme Court has been leaning since the previous Trump administration, I think the outcome will favor AI innovation. But I might be wrong. And to show you how confusing this is: Anthropic actually got a really big win on March 25th. In a case in California, a federal judge rejected a bid by music giants like Universal Music Group to stop the AI firm from using copyrighted song lyrics to train its chatbot, Claude. The judge called the claims vague, noting that the publishers couldn't prove irreparable harm to their business. The interesting thing here is that you have two states on the same side of the political map, both hardcore Democratic states, and yet two opposite rulings from the courts. That aligns with OpenAI's call last week for the federal government to step in and define rules and regulations at the federal level, instead of allowing the state-level patchwork that is already in the making across multiple states and multiple regulations. How that evolves, I don't know, but it will be very interesting to track, and I will keep you posted. Staying on Anthropic: they released a new update, I won't call it a final one, on their ability to look into Claude's brain and see how it works, introducing new tools and methodologies for, if you will, reading the mind of an AI. Historically, AI has been a black box: we know how it gets trained and we see the outputs, but it has been very hard to trace how and why it actually does what it does. Anthropic has been working on understanding how the AI brain works, and they have now come up with a tool they liken to a microscope that lets them trace Claude's internal steps as it thinks and solves different problems. It's worth reading their release.
It's written in non-technical terms, and it reveals that Claude thinks in universal concepts before translating them into a specific language. So for a simple question like "what is the opposite of small?", it may analyze the question in a shared conceptual space spanning several languages, such as English, French, and Chinese, and only then pick the language in which to phrase the answer for the user. They also found that when given a simple math question, Claude actually computes step by step, but when it gets a very complex mathematical question, it tries to wing it and, if you will, bullshit the user with an answer without actually solving the problem. The importance of this research goes way beyond these examples, because the goal is the ability to see what AI is doing as it's doing it, so we can prevent it from doing things we don't want it to do, both in current scenarios and, far more importantly, when we hit AGI or ASI and beyond, where AI will be significantly smarter than us. Is that really doable at that point? I don't know, but I think it's a very important step by Anthropic in the right direction, and I really hope these tools will also run on other AI models, and that other AI companies, such as OpenAI and the open source players, will adopt them the same way they adopted MCP, so we can see how these models work and stop them before they do something we do not want them to do. Still on Anthropic: they joined forces with Databricks. For those of you who don't know, Databricks is one of the more successful enterprise data management platforms, and this collaboration will allow Databricks' 10,000 corporate customers to run Claude models, including the latest Claude 3.7, to get insights from their data.
This is obviously the big promise of AI for corporations, and eventually for smaller companies as well: being able to look across the entire company's data and get insights that so far were either very hard or nearly impossible to get, and that will become obvious to anyone just by asking questions, or even by the AI itself suggesting insights based on what it's seeing. Anthropic also released version two of their Economic Index, which is their view of how people are using Claude and for which tasks. We covered the first edition when they shared it a few months ago. This one now includes Claude 3.7 Sonnet, which wasn't available when they ran the previous survey, and it's very clear that Claude is the king of coding, because coding now accounts for 37% of Claude 3.7 Sonnet usage. If you go to the report, you'll see a bar chart showing the different tasks people use Claude for, and the first ten or so are computer-related. Beyond those come education, science, and healthcare, but they are very far behind coding in volume. I use Claude 3.7 for coding small things, and it works very well for me. I'm not a coder, so it's hard for me to compare it to other tools, but I've had the most success building small applications with Claude. Now, from Anthropic to Google. As I mentioned earlier, Google launched Gemini 2.5 Pro Experimental, which they claim is their most intelligent AI yet, with built-in reasoning, and it jumped to the top of the LM Arena leaderboard, as I mentioned earlier. If you remember, a few weeks ago we talked about Humanity's Last Exam, which tries to assemble the hardest questions on earth to test AI capabilities. Previously, OpenAI's o3-mini scored 14% and DeepSeek R1 scored 8.6%. Well, Gemini 2.5 Pro Experimental now scores almost 20%, at 18.8%.
But beyond that, as I mentioned, it's on the day-to-day tasks people actually evaluate it on that it's scoring better than any other tool on the planet right now. It also has the same huge context window as the other Gemini models: a 1 million token context window today, soon to be expanded to 2 million tokens. And it handles text, images, audio, video, and code, making it probably the most versatile tool out there. It is still the only tool that can, quote unquote, watch video and understand what's happening in the actual frames, rather than just listening to the audio track and understanding what people are saying. So if you need to analyze video, this is a very powerful capability that did not exist before this tool was released. In the demo, Demis Hassabis showed significant improvement in coding, including a few demos of creating mini web apps and interactive games with just a few prompts. It is currently available in Google AI Studio, and if you're a paying Gemini user at 20 bucks a month, meaning Gemini Advanced, you have access to it as well. Staying on the topic of new models: as I mentioned, DeepSeek released a new version of its V3 model, dubbed V3-0324 for the date it was released. It carries an MIT open license, so it's an open source model you can take and use, and it's a significant upgrade over its predecessor, mostly in reasoning, code generation, and understanding user intent. It scores very high on all of these, and as I mentioned, it's right now, on several different benchmarks, the best AI coder out there, period. What does that mean? It means there are two races that are very active and aggressive: one between China and the US, the other between open source and closed source AI.
And it is very clear right now that both of these races are very close and are going to stay close, with both open source and China pushing models that are as good as, and in some cases better than, those from the leading labs in the US. And if that's not enough from China, Alibaba just released Qwen 2.5 Omni 7B, a 7-billion-parameter multimodal AI model designed for end-to-end processing of text, images, audio, and video. Again, something that will compete with US models across the board as a true multimodal AI solution. Despite being a relatively small model at only 7 billion parameters, it rivals many larger single-modality and multimodal models from the leading labs around the world. And like Alibaba's previous models, this is also open source, available on their website as well as platforms like Hugging Face. Continuing on this path of an insane week of releases: Mistral just launched Mistral Small 3.1. It has 24 billion parameters and scores better than some other open models like Google's Gemma 3, and even GPT-4o mini, across several benchmarks. It supports 21 languages, including European and East Asian languages, though it's still lacking on Middle Eastern languages. It has a 128,000 token context window, and it's open source under the Apache 2.0 license, which means it's free to tweak and deploy, and developers can do basically whatever they want with it. So what does that mean? It means more and more capabilities are becoming available as open source, with very powerful tools coming from everywhere around the world for free or almost free, and they're going to keep pushing the price of AI down while the capabilities keep moving up.
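To give the context-window numbers in this segment some scale, Mistral Small 3.1's 128,000 tokens and Gemini's 1 million tokens mentioned earlier, here's a rough sketch. The words-per-token and words-per-page ratios are my own rule-of-thumb assumptions, not figures from any of the announcements.

```python
# Rough sense of scale for context windows.
# Assumptions (not from the episode): ~0.75 English words per token,
# ~500 words per single-spaced page -- both common rules of thumb.

WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 500

def context_to_pages(tokens: int) -> float:
    """Approximate how many pages of text fit in a context window."""
    return tokens * WORDS_PER_TOKEN / WORDS_PER_PAGE

print(round(context_to_pages(128_000)))    # Mistral Small 3.1: ~192 pages
print(round(context_to_pages(1_000_000)))  # Gemini 2.5 Pro: ~1,500 pages
```

So even the "smaller" 128K window holds roughly a short book per prompt, and the 1 million token window holds several novels' worth of text at once.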
Staying on the open source topic: on March 19th, Hugging Face submitted an open source AI blueprint to the White House AI Action Plan, claiming that open source models can outperform closed source models. One of the examples they gave is OlympicCoder, a 7-billion-parameter coding model that out-codes Claude 3.7, which was considered the top-of-the-line coder until that point (a crown since taken by the new DeepSeek model). Hugging Face is definitely a powerhouse in open source: they host 1.5 million public models that are available for everybody to use and co-develop, and they claim these models can compete with closed source ones at a fraction of the cost. The strategy they're suggesting to the White House combines three components: collaborative innovation, resource-efficient models for smaller players, and transparent security. Basically, instead of following OpenAI's suggestion to the White House of a light touch on regulation, they're claiming the open source world provides a much stronger and safer foundation, because everybody can see what is actually going on inside these tools. I must admit I'm not sure which side of the fence I'm on. I don't think I know enough to make a fair judgment call, and I see the logic in both arguments. On one hand, letting anyone change and manipulate these models is scary. On the other hand, not knowing what OpenAI and Anthropic and the like are doing, and definitely not knowing what similar companies from China and elsewhere are doing, versus letting everybody see what's happening in the code, the systems, and the weights, also makes sense to me. I don't think this is going to get resolved; I think we're going to have both universes running in parallel, just like we did with a lot of other technologies, such as operating systems.
Now, how will that land with a Trump administration team that pushes innovation and very little regulation? It will be interesting to follow that as well. Now a few quick updates from the robotics world. A California company called Dexterity just introduced a new robot called Mech, and it's an interesting combination of old-school robotics and humanoid robotics: it's a moving platform rather than a walking humanoid, but it has arms that mimic human arms, just with more capability, and each of those arms can lift 65 pounds. That makes it the robot that can currently lift the most weight while moving around a factory. Their first use case is handling packages on supply chain lines and loading them onto trucks, conveyor belts, and so on, and in the demo they showed, it's very efficient at that, lifting heavy weights while staying very accurate. Now, the interesting thing is they're claiming one worker can oversee up to 10 Mech robots at the same time, which means, from a labor perspective, it's not just replacing the human workers who move the packages around, but also most of the supervisors required to manage that work. Their initial use case, as I mentioned, is truck loading, but they're definitely looking to scale the solution across the general logistics world. Staying on the US side of robotics: on March 25th, Figure, one of the leading humanoid robotics companies in the world, released a video showing that their Figure 02 robot can walk like a human. It doesn't have that robotic walk anymore. The way they achieved that is through a physics simulator that lets the AI compress years' worth of reinforcement learning into just a few hours by running its brain, if you will, in the sim. And now it moves much more like a human, with heel strikes, toe-offs, and arm swings, which makes it look very human.
The interesting thing here is not that it walks more naturally, or at least what looks more natural to us, but that you can train these robots in a simulated environment to do tasks that previously took months or years to learn, and now they can learn them in a very short amount of time. That means they'll be able to mimic literally any activity we want and pick it up very quickly. Moving from the US to China: as we've mentioned many times before, China has some of the most advanced robots in the world. AgiBot, one of the successful Chinese startups, is planning to produce 3,000 to 5,000 humanoid robots in 2025, this year. That's a jump from only a thousand units in 2024, a three-to-five-x increase. They are releasing three different models: the Lingxi X2, the Genie Operator-1, and the Yuanzheng A2. The Yuanzheng A2 is very interesting because it has very delicate precision capabilities; one of the things they demoed is the robot threading a needle, which obviously requires very gentle dexterity and an understanding of its surroundings. So: different robots for different scales, and they're scaling up production. The quote they released is: "We aim to deploy new products in industrial scenarios this year, replacing humans in specific tasks to deliver tangible customer value." I agree it will provide more value, because the products these robots handle will be able to get cheaper, at least from a supply chain and operations perspective. But the question, again, is what will happen to the people who held these positions before, and if they don't get paid and don't have money, who will buy the products the robots pack and ship? I don't think anybody has answers to that, but we're going in that direction faster and faster. And another interesting robotics story happened in China this week.
A Chinese influencer named Zhang rented a Unitree G1, Unitree's smaller robot, which we've talked about a lot in the past few weeks because it has learned ninja moves and kicks and other cool tricks. You can now rent one in China for about $1,400 a day, and he rented it, documented the whole thing, and showed the world how it did at kicking, cleaning, and even attempted dancing. It cooked eggs, swept floors, and even ran alongside Zhang in the park. It couldn't pull off the dance moves, but it was still very funny to watch, and you can find the video online. The bigger point is the concept: because people bought these robots, and their price is now around $14,000, a whole new market of robot rentals has been created. Think about it: you want to throw a party, or clean your garage, or tackle any other big project, and instead of buying a robot you rent one for the day to help with that particular task. Now, going back to what I said before: until I know for sure these robots won't destroy my house, or more importantly hurt my kids, there is no way one of them is coming into my home. But once those safety measures are in place, a $14,000 machine that can do all the chores in and around the house, plus things we can't even think of yet, is a no-brainer even at today's cost, which will drop dramatically as production scales up. Switching gears to a different topic: Reve (I must admit I'm not sure how to pronounce it) is a new company in the field of image generation. They just released their first image generator, code-named Halfmoon, and it runs on a different business model than most of the others: you just pay for credits, at 1 cent per image, so $5 buys you 500 images, which is cheaper than most of the tools out there today.
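A quick sketch of the per-image math, using only the episode's own numbers: Reve's 1 cent per image, and the $8-a-month entry subscription cited below as the comparison point. The break-even helper is illustrative, not anything from Reve's pricing page.

```python
# Comparing Reve's pay-per-image pricing with a flat monthly subscription.
# Numbers from the episode: $0.01/image; $8/month as the comparison plan.

PRICE_PER_IMAGE = 0.01   # Reve: 1 cent per image
SUBSCRIPTION = 8.00      # entry-level flat plan, per month

def monthly_cost(images: int) -> float:
    """Pay-per-image cost for a month with the given number of generations."""
    return images * PRICE_PER_IMAGE

# How many images per month before the flat plan becomes the better deal?
break_even = round(SUBSCRIPTION / PRICE_PER_IMAGE)
print(break_even)            # 800 images/month
print(monthly_cost(100))     # a casual user's month costs about $1
```

So under these assumptions, a user would need to generate around 800 images a month before a flat $8 subscription wins, which is exactly why pay-per-token pricing favors casual users.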
Think about it: a monthly plan in Midjourney starts at $8 a month; here you pay $5, and until you run out of credits you can keep using it, and most people are probably not going to generate 500 images in a month. And it's actually really good: in head-to-head tests by independent evaluators it was as good as, or better than, Midjourney, Flux, and Ideogram across multiple types and styles of image generation. So going back to what we started with: ChatGPT and Gemini now generate amazing images as part of their existing plans, and other, quote unquote, professional tools do it significantly cheaper. This will drive the cost of image generation down across the board, between open source and closed source models. Now, the cool thing is you can try it for free: you get 100 initial free credits and then 20 daily freebies, which should be enough for most users unless you're generating images professionally. I said this about a year and a half ago: I do not see a future in which stock image databases survive. There's absolutely no reason for them to exist anymore, because you can generate any image you want, in any style, fitting exactly what you need, in seconds. Why would you ever spend time searching an image database? So the whole mess that came out last week, with Google Gemini being able to remove watermarks from existing images on platforms like Getty Images, is completely irrelevant from my perspective, because I haven't downloaded or searched for a stock image in the last year and a half, and that was before the tools were as good as they are right now. That's another industry that will probably be wiped off the face of the earth as more people learn to use AI image generators, and the fact that they're now built into the actual chatbots will make them a lot more widely available. Staying on the creators' universe.
On March 18th, a US federal appeals court declared AI-generated music uncopyrightable, reinforcing its earlier position, aligning with the Copyright Act of 1976, and matching everything else the Copyright Office has said: if humans are not heavily involved in creating something, it's not copyrightable. I think that's actually a very good thing. As a musician myself, and somebody who loves to play music, I worry that these tools can generate hundreds of thousands of new songs every single day; even if very few of them become successful, we would drown in AI-generated music, and human musicians would not be able to make a living. You might ask what one has to do with the other: people can still create music, and people will still love it. Yes, but if AI-generated music can't be copyrighted, there's no way to monetize it at scale, and far less incentive to flood the market with it. That means the human side of the industry will still be able to make money. Maybe challenged a little, but not wiped out by AI-generated music. Staying on the topic of creating things with AI: two very interesting releases when it comes to voice models. The first is Maya, a model released by a company called Sesame. Maya is a very punchy, human-like, edgy, if you will, voice assistant, and it has blown up on the internet with really cool, crazy examples of how human-like and cheerful it sounds compared to other, relatively dull voice models that sound too robotic. Sesame has released the model as open source for anybody to use, so you can go to Hugging Face or GitHub, get the code behind Maya, run it on your own, and get a very human-like voice in any application you want. In addition, on March 26th, Groq, with a Q.
So Groq, the AI infrastructure and inference company, announced they're teaming up with PlayAI to release what they call Dialog, a text-to-speech model that runs roughly 10 times faster than real time, meaning it can synthesize speech significantly faster than it takes to play that speech back. They claim it's very flexible and uses the full conversational history to nail rhythm, tone, and emotion, making voices sound very natural. It's currently available on Groq's high-speed inference platform, makes very few mistakes, sounds very real, and supports English and Arabic. So far the Middle Eastern languages, Arabic and Hebrew, have usually not been available on these voice platforms, and Arabic is the fourth most spoken language on the planet, so this is obviously a big deal. If you're an Arabic speaker, or you have Arabic-speaking clients and want to use an AI voice agent, you can now do this on the new platform. What does that mean? It means we will see more and more voice agents working across many industries. I assume customer service will be the first one taken by storm, but later it may be salespeople and many other customer-facing or internal roles that simply require having conversations on any topic. Think about everything we said earlier about agents becoming part of the workforce: you'll be able to talk in natural language with an AI agent and not know the difference between it and a human employee on a conference call. Does that make sense from a productivity perspective? Yes. Does it connect to what we started with, the research from Harvard and Ethan Mollick? Yes. Is this going to be a very weird future? Absolutely. Now, we talk a lot about moving toward AGI and where we are in that process, given all the advanced capabilities we're seeing almost daily right now.
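Before moving on, it's worth seeing what "10 times faster than real time" actually means for a voice agent's latency budget. A minimal sketch: the 10x factor is Groq's cited claim; everything else here is illustrative.

```python
# What "10x faster than real time" means for a voice agent.
# Sketch only; the 10x figure is the claim cited in the episode.

REALTIME_FACTOR = 10  # seconds of audio produced per second of compute

def synthesis_time(audio_seconds: float) -> float:
    """Compute time needed to generate a clip of the given length."""
    return audio_seconds / REALTIME_FACTOR

print(synthesis_time(60))  # a 60-second reply takes ~6 seconds to generate
print(synthesis_time(3))   # a short 3-second utterance: ~0.3 seconds
```

For conversational turns, which are usually a few seconds of audio, sub-second synthesis is what makes the agent feel like a human on the call rather than a machine you wait for.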
We talked previously about the Arc Prize Foundation, whose ARC-AGI benchmark was eventually blasted through by the leading models. Well, they just came out with ARC-AGI-2, released on March 24th, and the goal is to push the limits of what AI can do from a reasoning perspective. Like ARC-AGI-1, it's a set of visual pattern puzzles geared to be hard for AI to solve, and the results are very interesting: a panel of over 400 humans averaged a 60% success rate on this test, while DeepSeek R1 scores between 1% and 1.3% accuracy, and GPT-4.5 and Claude 3.7 Sonnet are around 1% as well. So AI is significantly behind humans here, showing that throwing more compute and even reasoning capabilities at a problem has its limits, at least for now, compared to the human brain's agility on this kind of problem solving. The foundation is also offering the Arc Prize 2025, which will go to the first company or organization that hits 85% accuracy on this test while spending no more than 42 cents per task. What they're saying is that it's not just about being able to solve the benchmark, but about solving it efficiently enough to make sense as a financial investment. The last piece of news is an interesting bit of trivia: on March 20th, the Computer History Museum, together with Google, dropped the original 2012 AlexNet source code on GitHub. This was the first neural network that really worked at scale: it took image recognition error rates from about 25% down to about 15%, and it basically kicked off the current AI era we know. So this is very cool: the very first neural network that worked properly is actually really small and can run on a single computer with the right GPU, a GTX 580, so you can run it at home. If you're really geeky and want to play with it, it's now available. That's it for today.
We'll be back on Tuesday showing you exactly how to create amazing videos with AI, covering all the steps: researching what you need in the video, scripting it, generating the visuals, creating the audio for the people in it, and actually rendering and editing it, all with AI, all in less than an hour. So if you need to create videos, and most businesses do, this is going to be a fascinating episode. We're also opening a spring cohort of our AI Business Transformation course, so if you haven't had efficient and effective AI business training and you're looking for one, check out the course via the link in the show notes. Share this podcast with other people who can benefit from it, and have an amazing rest of your weekend.
