
Leveraging AI
Dive into the world of artificial intelligence with 'Leveraging AI,' a podcast tailored for forward-thinking business professionals. Each episode brings insightful discussions on how AI can ethically transform business practices, offering practical solutions to day-to-day business challenges.
Join our host Isar Meitis (4-time CEO) and expert guests as they turn AI's complexities into actionable insights and explore its ethical implications in the business world. Whether you are an AI novice or a seasoned professional, 'Leveraging AI' equips you with the knowledge and tools to harness AI's power responsibly and effectively. Tune in weekly for inspiring conversations and real-world applications. Subscribe now and unlock the potential of AI in your business.
166 | Grok-3 is No. 1 on the AI Leaderboard, ChatGPT 4.5 and 5 and Claude 4 are around the corner, AI impact on the workforce and other important AI news for the week ending on February 21, 2025
Is Grok-3 officially leading the AI race?
In this episode of Leveraging AI, we break down how Elon Musk’s latest model just outpaced OpenAI—and why it matters for your business. Plus, get the latest on upcoming releases like ChatGPT 4.5, GPT-5, and Claude 4, and what these advancements mean for your competitive edge.
We also dive into how AI is reshaping the workforce—boosting productivity, shifting roles, and raising important questions about the future of work.
In this AI News episode:
- Why Grok-3 is setting a new AI standard
- Big AI releases on the horizon—and what to expect
- How AI is transforming productivity and job roles
- The rise of AI “employees” and what that means for hiring
About Leveraging AI
- The Ultimate AI Course for Business People: https://multiplai.ai/ai-course/
- YouTube Full Episodes: https://www.youtube.com/@Multiplai_AI/
- Connect with Isar Meitis: https://www.linkedin.com/in/isarmeitis/
- Free AI Consultation: https://multiplai.ai/book-a-call/
- Join our Live Sessions, AI Hangouts and newsletter: https://services.multiplai.ai/events
If you’ve enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!
Hello, and welcome to a weekend news episode of the Leveraging AI podcast, the podcast that shares practical, ethical ways to leverage AI to improve efficiency, grow your business, and advance your career. This is Isar Meitis, your host, and it's another packed week. It's a little different than regular weeks because there weren't a lot of big releases. There was one very important one, which we're going to start with, but there are a lot of rumors of what's coming down the pipe in the next few weeks, and we're going to talk about that. We're also going to focus on the impact of AI on jobs and some interesting statistics in that world, and then we have a very long list of rapid-fire items that we need to cover. So let's get this started. The biggest news of this week is the release of Grok 3. For those of you who don't know, Grok is the family of large language models coming from xAI, which is Elon Musk's AI company. The first two models, Grok 1 and Grok 2, were mainly toys that potentially provided some information if you're a heavy user of X.com (previously Twitter), but if you're not, they really didn't provide any significant value, and I don't think they deserved any real attention. Grok 3 is a whole different kind of animal. First of all, Grok 3 was trained on Colossus, which is the largest supercomputer ever built. It has more than 100,000 GPUs, going up to 200,000 right now as we speak. So they've trained it on a huge computer with a huge amount of data, and the model immediately achieves very powerful capabilities. First of all, they released several different models. There is Grok 3 and Grok 3 Mini, which, like many of the other model families, is just a smaller, faster, less expensive variation of the same model. They also released Grok 3 Reasoning and Grok 3 Mini Reasoning, again aligning with the new trend of reasoning models. And they've also released a deep research capability, just like all the other recent tools. We're going to talk about that as well.
So immediately at the release, xAI shared that they basically beat all the other models on the major benchmarks that are out there today, both AIME for math and GPQA for PhD-level science problems, as well as some other benchmarks. Now, as I mentioned, they also released deep research, and they released what they call "Think" and "Big Brain" modes that come with some more advanced thinking capabilities. They're also about to release a live voice mode in the next couple of weeks, and an enterprise API access level is also coming in the next few weeks. In addition, they're planning to open-source Grok 2, the previous model, as soon as they finalize the deployment of Grok 3. We heard a similar approach from Sam Altman a few weeks ago, saying that OpenAI has been on the wrong side of history by not releasing their models as open source, so OpenAI is going to do the same thing: they're going to start releasing older models, not their latest frontier models, as open source. I think this doesn't make a big difference, because the most advanced open source models out there are already more advanced than these older models, but the open source world will gain access to some additional models. Now, as you've heard me say in the past, I think these benchmarks are highly overrated, because they're very specific and you can train the models on them, and they actually don't mean anything for real life. But what does mean a lot for real life is the LMSYS Chatbot Arena, which we've talked about many times in the past. For those of you who don't know what it is, the Chatbot Arena is basically a blind test. Any user, and you can go there tomorrow and do it as well, puts in a prompt and gets two results, and without knowing which result came from which model, you pick the one you like better. Just less than 24 hours after its release, Grok 3 claimed the number one spot in the Chatbot Arena.
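Leaderboards like this work by turning blind pairwise votes into ratings. Here is a minimal sketch of the idea using a classic Elo-style update; the vote stream, starting ratings, and K-factor are invented for illustration, and the real arena uses a more sophisticated statistical fit over millions of votes, but the principle is the same: every blind preference nudges the winner's rating up and the loser's down.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update_elo(ratings: dict, winner: str, loser: str, k: float = 32.0) -> None:
    """Shift both ratings toward the observed blind-vote outcome."""
    ea = expected_score(ratings[winner], ratings[loser])
    ratings[winner] += k * (1.0 - ea)
    ratings[loser] -= k * (1.0 - ea)

# Hypothetical vote stream: each tuple is (preferred model, other model).
ratings = {"model_a": 1000.0, "model_b": 1000.0}
votes = [("model_a", "model_b"), ("model_a", "model_b"), ("model_b", "model_a")]
for winner, loser in votes:
    update_elo(ratings, winner, loser)

# Rank models by rating, highest first.
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
```

This is why the arena is hard to game: a model only climbs if real users, blind to the model names, keep preferring its answers.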
So more people, across multiple use cases around the world, find Grok 3 to be a better model than any other model out there right now. Now, Andrej Karpathy, who is a former leading OpenAI engineer and a highly appreciated AI scientist, said Grok 3 plus thinking feels somewhere around the state-of-the-art territory of OpenAI's strongest models, and slightly better than DeepSeek R1 and Gemini 2.0 Flash Thinking. So somebody who is a serious person in the AI field also thinks that this is a serious contender, and it's not just the masses using it day to day. By the way, I think the masses using it day to day are more important than any expert, because they're actually using it for real use cases. Now, in parallel to this, xAI has dramatically changed their pricing model, in a very confusing way, I must say. First of all, they've increased their Premium+ subscription, which gives you access to basically everything, from the previous $22 to somewhere between $40 and $50, with an annual subscription going to $350–395, also up from the previous annual price. And when I say a range of pricing, it's because it's really confusing: in different places, X actually shows different pricing. Their support page says $50 a month, the signup page says $48 a month, and the actual checkout shows $40 a month. So I don't know if this is just a mistake because they were worried about other stuff in the release, or if they're testing different price points, but the reality is they dramatically increased their subscription price. If you remember, the highest subscription level right now on OpenAI is $200 a month, so they're still not there, and they're providing highly comparable capabilities in Grok 3 right now.
Now, in parallel to all of this, in addition to releasing what is right now maybe the best model on the planet, xAI is seeking a massive funding round of another $10 billion at a proposed valuation of $75 billion. If they raise that, and I don't have a doubt that they will, A because it's Elon and B because they now have the leading frontier model in the world, it would bring their total funding to $22.4 billion. Now, what's amazing about this achievement is that xAI started the development game of new frontier models very, very late, and they were able to accelerate their process to deliver a top-of-the-line model in the shortest amount of time compared to anybody else. That includes building the Colossus supercomputer in a few months, which usually takes normal companies a year and a half. So they've definitely found ways to build their hardware faster, as well as to develop the software side of things and the AI models themselves faster than anybody else. I already tested Grok on several different use cases, including deep research, and I was seriously impressed. It's a real model that provides real results. And its deep research mode is actually really, really cool, because it lets you see how it's thinking and doing its research process, similar to how o3 and DeepSeek show you how they're thinking, but combined with the actual search results themselves. So it's a very cool approach that allows you to actually see what the model is doing, and it allows you to provide better inputs on how to make the search better, because you know what's happening behind the scenes. I highly recommend you go and test it out right now. It's available for free, but it probably won't be for long, at least not at high volume. So if you want to check it out, just go to Grok, sign up, and check it out right now.
Now, as I mentioned, there are a lot of rumors about new releases and capabilities coming from other companies in the next few weeks. Anthropic is reportedly about to release Claude 4, which has a lot of the capabilities Claude was lacking, like reasoning capabilities similar to o1, o3, and a lot of other models right now, and web search capabilities. Hallelujah! Maybe the thing that troubled me the most while using Claude regularly is that it didn't have web access. So that's coming in Claude 4, plus a lot of other small steps. All of that was revealed by somebody drilling into the Claude iOS app and finding new icons that are going to appear in the next version. So there are going to be a lot of new capabilities coming from Claude. I must admit that I'm surprised it's not out yet. Anthropic has been surprisingly quiet amid all the madness of releases in the last few months, so I'm anticipating that they have something really big to release that, if I have to guess, will do more than just align with everybody else's capabilities of web access and deep research and reasoning, but will probably add more stuff that we haven't seen before. The rumors are saying that the release may come in the next few weeks or few months. The company itself has not provided any timeline, but these are the rumors going around the industry right now. Now to other interesting things that are happening in the industry. These are quasi rapid-fire items, but I will roll them in right now with what's happening with coming releases and funding rounds. So Safe Superintelligence, the company founded by Ilya Sutskever, who was one of the co-founders of OpenAI, is in negotiations to raise another billion dollars at a $30 billion valuation, and they just raised a similar amount not too long ago.
So this would be their second fundraising in less than six months, without any clear plan for revenue, or even a path to a product. But just because of the people leading that research, they're probably going to raise that amount of money. Another interesting person who left OpenAI and finally shared, or sort of shared, what they're doing is Mira Murati, the former CTO. She left OpenAI in October for a stealth startup, and now she has finally revealed, or maybe kind of revealed, what they're about to do. Their new company is called Thinking Machines Lab, and she has attracted several leading developers, such as John Schulman, who is another co-founder of OpenAI, and Barret Zoph, who was the ex chief research officer. So there are a lot of big figures, but they haven't really shared exactly what they're going to do. What they said is that they're going to close the gap between the capabilities that are now reserved for the leading research labs and the actual needs of people in the industry, with a focus on breakthrough applications in science and programming. So it's very vague; there's nothing really clear there. If you go and read the actual page, it's a lot of mumbo jumbo that doesn't really explain what they're going to do. I don't know if they're doing that on purpose, or as part of raising funds, which would not surprise me, but right now the only thing that's clear is that there are some serious hitters there, and just like Ilya Sutskever, they will probably raise the amount of money they're trying to raise. Another company that's making a big splash again is Mistral. We've talked about them many times on this podcast. They're a French company, and they just hit 1 million downloads only 14 days after launching their iOS app.
It's a big success from their perspective, and it comes after the French president, Emmanuel Macron, suggested that French citizens go and download the app in his speech at the Paris AI summit just less than two weeks ago. That drove an immediate, huge number of downloads of the French application, mostly by French people; very few people outside of France actually downloaded it. But they're definitely doing something right, and they actually have a very powerful and capable model. The other interesting piece of news about Mistral is that they're making a very strategic, strong pivot into the enterprise world. We talked about this as well, but now we know that they're already working with clients like France Travail and the European defense company Helsing. So really big clients are using their technology. One of the biggest benefits of Mistral's enterprise offering is that you can install it on premise and actually run it on your own servers, not exposing your data to anything outside your existing environment, and that is obviously very attractive to a lot of companies. Now, going back to releases, Perplexity has launched their deep research mode. So that's another deep research capability, similar to the ones we have from Google, You.com, OpenAI, etc. The biggest difference is that it's available to free users. By the way, You.com is another platform where you get a very powerful deep research capability for free. The Perplexity deep research tool was able to achieve a 21.1% score on Humanity's Last Exam, which we talked about in previous shows, outperforming Google Gemini Thinking at 6.2% and GPT-4o at 3.3%, but still behind OpenAI's deep research at 26.6%. I don't think Grok has taken the test, or at least they haven't shared the score, which means they probably didn't score very high. Now, the other cool thing about the Perplexity deep research tool is that it actually provides answers faster than both Google and OpenAI.
Google takes a few minutes, OpenAI the same, and this usually takes about a minute, up to two minutes, to actually give you the research answers. So it's definitely worth using, and as I mentioned, if you don't have any paid service, it's a very good option to go after. Right now it's only on the web, but they're planning to release it to their Mac, iOS, and Android applications. Now, I want to stop for a second to talk a little bit about this deep research madness and say how significant it is. For those of you who haven't used any of the deep research tools, they're now available, as I mentioned, from Google Gemini, if you have the personal Pro version, not the business one. I don't know why it's not in the business one; it's actually really annoying as a business user of Google as well. It's also available from OpenAI, but only at the $200-a-month licensing level. They said that they will open it up with limited access at the lower tiers, but that hasn't happened yet, at least not for me. You.com, as I mentioned, which has had that functionality for a very long time, is a very good free option, and they have a premium option as well. DeepSeek has the same thing, and now Grok. The biggest deal about these tools is that they involve an agentic capability: understanding what it is you're trying to find, understanding the goal, defining multiple steps of research, and then executing those steps, and only after they're done, aggregating the information and providing an answer. So instead of a very quick, shallow answer, you get a well-researched, in-depth answer that is sometimes a summary of hundreds of websites visited. So work that used to take a person hours or maybe days to do is now done by these tools in just a few minutes, and it's an extremely powerful capability for finding information and doing in-depth research on any topic. I highly recommend you try them, at least the free ones from DeepSeek, You.com, Grok, and now Perplexity.
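The agentic loop these deep research tools share, understand the goal, plan steps, research each step, then aggregate, can be sketched roughly like this. To be clear, the `plan_steps`, `run_search`, and `aggregate` functions below are hypothetical stand-ins for LLM and web-search calls, not any vendor's actual API; only the overall loop structure reflects what these products describe.

```python
def plan_steps(goal: str) -> list[str]:
    # A real tool would ask an LLM to decompose the goal; we fake it here.
    return [
        f"background on {goal}",
        f"recent developments in {goal}",
        f"open questions about {goal}",
    ]

def run_search(step: str) -> str:
    # Stand-in for a web search plus per-source summarization pass.
    return f"findings for '{step}'"

def aggregate(goal: str, findings: list[str]) -> str:
    # Stand-in for the final LLM synthesis over all collected findings.
    return f"Report on {goal}:\n" + "\n".join(f"- {f}" for f in findings)

def deep_research(goal: str) -> str:
    steps = plan_steps(goal)                   # 1. decompose the goal
    findings = [run_search(s) for s in steps]  # 2. research each step
    return aggregate(goal, findings)           # 3. synthesize one answer

report = deep_research("AI agents in hiring")
```

The multi-minute wait you experience with these tools is steps 2 and 3 running for real: dozens or hundreds of searches and summaries before the final synthesis.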
Now let's switch gears to the next big item, which is what's happening with different AI tools around the world right now, and how many people are actually using them. The first one to talk about is obviously the 800-pound gorilla, which is OpenAI. Their COO, Brad Lightcap, shared that they have reached a staggering 400 million weekly active users on the platform, and they're definitely leading ahead of everybody else. Enterprise adoption has doubled since September of 2024, so in less than two quarters, with over 2 million businesses now using ChatGPT at work. Major corporations like Morgan Stanley, Uber, and T-Mobile are using OpenAI as their AI infrastructure, and they're now getting into government agencies. We talked about that in the past few shows, but now USAID is implementing ChatGPT Enterprise in their administrative work. Now, we said this last week already, but I'm going to mention it as we start talking about what releases are coming: GPT-4.5 is imminent. It's going to be released most likely this coming Monday. So if you're listening to the show this weekend, February 22nd or 23rd, it's probably before the release, but if you listen after that, you might already have access to GPT-4.5. And GPT-5 is planned, according to the current rumors, for the beginning of May, so just around the corner. If you remember, GPT-5 is supposed to unify the universe of the GPT models and the reasoning models, the o-series, while eliminating the drop-down menu and just giving us a great user experience that will have a lot of intelligence and will figure out in the back end what it actually needs to do. I cannot wait to see what that's going to do. I assume GPT-4.5, and definitely GPT-5, will take the leading spot on the LMSYS Chatbot Arena and will remove Grok from there, but I obviously don't know that, and time will tell.
Now, a new survey from Future Publishing has looked at other applications and how they're doing around the world. ChatGPT maintains the first position as the most popular tool, with 37 percent of people surveyed saying that they're using it regularly, but showing only a modest growth of 7 percent from the last time they ran the survey. Google Gemini, on the other hand, takes second place with 22%, doubling from the previous survey. Microsoft Copilot takes number three with 20%. I think that's not totally fair, because Microsoft Copilot is available with every Microsoft license out there, so this is a bit like measuring Meta's daily AI users when everything Meta does has AI built into it. I don't think that's a totally fair comparison, but still, in this survey they're number three with 20%. Next is Grammarly, with 20% of users using it daily. I use Grammarly all the time; I've been using it since way before the AI craze started, and probably a lot of you have as well. Maybe you don't consider it an AI solution, but it is one. A surprising next entry is DALL-E, the image generation tool from OpenAI, with 9.5 percent of people saying they're using it. The reason I say it's surprising is that there are much, much better image generation tools that are absolutely free. I think it's the top-ranked image generation tool just because it's part of ChatGPT, which has the highest deployment in the world, and that's why more people are aware of it. The following place is a tie between DreamStudio, Midjourney, and Stable Diffusion with 8.5 percent of users. All of them are significantly better image generators than DALL-E, yet they rank lower. Again, I think people just don't know about them. And then there are the emerging platforms that jumped onto the list in this survey compared to the previous one: Perplexity at 11%, Claude at 10%, and Jasper at 9%.
So this is telling you that there's still very low exposure to AI tools in a random survey of people who are not necessarily in the tech industry. The top tool, the big dog OpenAI, is only used by 37 percent of the people surveyed, with most of the other tools in the low double or single digits. And now to our next big topic, which is the impact and usage of AI at work, and where this is going. We've been tackling this topic in several of the recent news shows, but there are some interesting new facts that I want to share with you. First of all, a comprehensive study from Stanford University researchers finds that 30.1 percent of U.S. workers were using generative AI tools in their jobs as of December 2024, so this is a very recent survey. Nearly 50 percent of workers with graduate degrees use AI at work, and 37 percent of college graduates utilize AI technology. It is very clear that AI usage increases significantly as income increases: there's a big jump among people making more than $50,000 a year, and almost 50 percent of workers earning $200,000 or more use AI tools at work. IT services lead as the top industry using AI, with more than 60 percent of employees saying they're using it, followed by real estate, construction, and education with more than 40%, which is actually surprising, while the sectors with the least usage are agriculture, mining, and government. Now, the other interesting finding is that workers report that tasks that used to take them 90 minutes now take them 30 minutes on average using AI, meaning a 3x improvement in efficiency. I've said this many times before on this show: even if AI development stops today and we don't get better models, and even with people not fully knowing how to utilize the technology, they're seeing a 3x efficiency improvement. That's obviously on specific tasks, but it tells you how profound the implications of this technology are going to be for the future of work.
Now, let's talk a little bit about potential negative implications beyond job loss, which we're going to talk about in a minute and have talked about many times in the past. Microsoft and Carnegie Mellon have published research on February 18 that reveals concerning trends about AI's impact on workplace cognitive abilities. They ran a survey of 319 knowledge workers at different levels and in different industries, and they found that workers who use AI increasingly rely on the AI tools instead of actually thinking through and analyzing the task. People focus mostly on verifying the AI outputs, and they have higher and higher confidence in those outputs, which correlates with further decreased critical engagement in the actual tasks they're doing. The research identifies shifts in workers' approach to tasks: moving from information gathering and cognitive analysis to mere verification of results, a reduction in active problem solving in favor of just reviewing and integrating results, and a transition from task execution to task supervision. Now, it's very obvious that this would happen. There are good benefits from it; we talked about the huge efficiencies. But there's a huge disadvantage to our ability to think critically, and I think the key thing that we have to understand, as individuals, as employees, and definitely as employers, is that we have to find a way to elevate our cognitive thinking to a different level while allowing the AI to just do the tasks. But the reality is people are lazy, and they just let the AI do the entire work, whether that's better or worse than them being a part of the process. This is obviously really alarming on multiple levels, and it's even more striking given that Microsoft is the leading researcher on this, when they are one of the largest companies providing AI tools to employees right now.
Now, going back to job loss, USA Today is reporting that major tech companies are continuing layoffs in the beginning of 2025, with Meta leading the charge, letting go of 3,000 employees, which is 5 percent of their global staff, Workday laying off 1,750 employees, which is 8.5 percent of their staff, and other companies as well. The somewhat good news is that tech layoffs have slowed down dramatically compared to the same time last year. So these companies are still laying off employees, but at a significantly slower pace than they did a year ago, and there's an optimistic overall outlook for hiring in 2025, not just in the tech industry. A ZipRecruiter survey reveals that the tech sector is bullish on hiring new employees, mostly because of economic conditions, with the reduction in interest rates and growth in the economy. Now, staying on the topic of employment, a Y Combinator-backed startup called Firecrawl posted a job specifically for an AI agent, and they were willing to pay that agent $10,000 to $15,000 annually. The position was looking for an agent that can autonomously research trending models and build sample apps. They received about 50 AI agent applications from different people for that task before they pulled the listing, and they claim that they actually did this as part of a recruiting strategy for human AI engineers. Now, this may sound like science fiction to you, but this is coming, meaning companies will look for AI agents, and will find AI agents, that will do tasks that employees would otherwise do, and they will pay the developers and the people who create, deploy, and run these agents some number of dollars, less than they would pay a human employee, to get the job done without having to develop it in house.
Now, you may or may not believe that this is where things are going, but the reality is that Workday, one of the largest HR management companies in the world, just launched what they call the Workday Agent System of Record, which is a platform that allows enterprises to manage and monitor the work of their AI agents, both Workday-native and third-party, in a centralized control center. So what does that tell you? It tells you that one of the largest companies in the world that builds platforms to manage employees is now adding the ability to manage agents, because agents are going to be a part of the workforce, and this will obviously lead to a whole industry of, like I said, AI agents from third-party suppliers doing different kinds of work. I anticipate that there are going to be platforms like Fiverr and Upwork that will just offer millions of AI agents that can do different tasks and will get reviews, and you'll be able to hire them for whatever you want, either in your personal life or at an enterprise level. So what does that tell us? It tells us that the workforce is going to change dramatically, and it's going to impact everything that we know, from the way we hire, to the way we fire, to the way we train, to the way we manage the workforce, because the workforce itself is going to be comprised of both humans and AI agents that will have to work together. And as time goes on, there are going to be more and more AI agents and fewer people, and the people will have to find different tasks to focus on. As we've seen, people are already, on their own, letting go of tasks and transferring them over to AI, placing more and more trust in AI capabilities. This is ringing so many alarm bells in my head, but we will have to figure this out, and we'll have to figure it out very, very fast. My personal suggestion to people:
Learn how AI works. Understand where you provide value, and where you can learn new things to provide even more value, either on your own or by deploying AI tools and capabilities and learning how to manage them in the most effective way, because that is going to be the most powerful capability in the next few years. Switching gears to the robotics world, which we've talked about a lot in the past: the robotics world is, on one hand, very exciting, and on the other hand, a very high risk to blue-collar jobs. Figure AI, which is one of the leading robotics companies, has just unveiled Helix, a new AI model they have developed, which they call a vision-language-action (VLA) model, that enables humanoid robots to respond to voice commands in the environment around them. This comes only two weeks after they more or less fired OpenAI. They were using OpenAI models before in order to drive their robots, and now they're announcing their own in-house development that they claim is more suitable for running robots. In addition, OpenAI announced that they're creating their own robotics department, which means the two are going to be competing, which I'm sure is another reason why Figure broke off the relationship with OpenAI. Now, the interesting things about Helix are that, first of all, as I mentioned, it understands both visual data and language in real time, and it can control two robots at the same time in order to allow them to collaborate on specific tasks. The other interesting thing about Figure is that they're showing more and more home environment solutions, meaning building robots for the home, which is different from most companies in the industry, which are focusing on industrial robots that will work in factories and so on. So they're definitely focusing big time on robots that will be able to do different chores and actions around the house. Another big piece of news in the robotics world comes from Meta.
Meta just announced a new division that will be dedicated to developing systems and humanoid robots under its Reality Labs unit. Now, Meta is not new to developing hardware. They have deployed different hardware solutions in the past; the most successful one, which we'll talk about shortly, is their collaboration with Ray-Ban on the Ray-Ban glasses. But the goal, at least as they're stating it for now, is not necessarily to develop robots themselves, but to develop models, systems, and components that will power robots from other robotics companies. So, if you want, they want to become the Android of the robotics world, similar to the approach from NVIDIA, right? NVIDIA has a lot of capabilities in that universe, but at least as of right now, they're not planning to develop their own robots, just to be the infrastructure, the architecture, the software, and potentially the hardware that helps other companies develop robotic solutions. Now, interestingly, Meta's first focus in the robotics world is healthcare. They're focusing on potential applications that will assist elderly patients and do daily tasks and chores, addressing clinical labor shortages throughout the healthcare system and providing automation for routine tasks. The humanoid robotics world is really saturated and heating up right now, with Tesla Optimus, NVIDIA's robotics initiatives, Figure AI that we talked about, Unitree from China, Boston Dynamics, and many more. So we will see more and more robots take more and more positions, like I said, mostly initially in industry and factories, but then very shortly after in places that we go to, so gas stations and coffee shops, and shortly after that, people's homes as well. Another company that made an interesting move toward potentially entering the robotics world is Apple. If you think about it, it makes perfect sense.
Apple has been involved in delivering technology to people for decades, and this is going to be the next frontier of technology and hardware for people to use. So they have announced that they're working on the development and research of humanoid and non-humanoid robots, with a potential mass production timeline of 2028. Now, the interesting thing is that the first thing that they released is not humanoid at all. It more or less looks like a Pixar-style lamp that is a robot that can follow you around and light things for you. I don't really understand the goal. But the idea is to play with different form factors and find where they can be useful to people in different scenarios, which I actually find very cool and very Apple: let's not do what everybody else is doing, but let's look for something cool that will actually provide value. It's in very early proof of concept stages, so time will tell whether Apple actually goes down that path. I will be really surprised if they don't, because as I mentioned, there's a multi-billion-dollar opportunity there, and I don't think Apple will stay behind. They're already behind in the AI race, but at least in the hardware race they can join the game right now and be in the competition in the late 2020s and into the 2030s. That's it for the deep dives for today, and now on to a lot of rapid fire items. So, first of all, if you remember, last week we shared with you that Elon Musk made an unsolicited offer to buy the nonprofit arm of OpenAI for $97.4 billion. Well, according to the Financial Times, right now OpenAI is exploring new governance mechanisms to protect against a potential hostile takeover of their entities by Elon Musk and/or others. They're considering different ways to do that, such as allowing special voting rights in specific scenarios and removing voting rights in other scenarios, to potentially having the nonprofit arm retain control over some of the restructured company, and different things like that, but just ways to block a hostile takeover, which is becoming a bigger fear as Elon Musk's personal beef with Sam Altman, as well as the professional competition between xAI and OpenAI, is heating up. And speaking about Sam Altman and picking battles, this is obviously much smaller, but Sam shared on X that there is a new version of GPT-4o that is better than the previous model. And Aravind Srinivas, the co-founder and CEO of Perplexity, basically tweeted back saying, sorry, what's the update? So Sam responded with, among many other things, it is the best search product on the web, obviously sticking it back to Aravind and Perplexity. And Srinivas replied that they just released a deep research agent as well to compete with OpenAI. So what does that tell you? I must say that I really like Perplexity. I still use it every single day. I think that their search user interface is still the best out of all the tools, but they definitely lost a lot of the magic, because a few months ago they were the only AI research tool on the planet that worked very quickly and provided results, other than You.com, which was more of a deep research kind of tool. Now you can do this on all the other platforms. As I mentioned, I still like their user interface a lot. I still use them all the time. But I think they're going to lose more and more market share. I said that very early on, when I started using Perplexity about a year ago, that I don't see a very bright future for them, because as soon as OpenAI and, more importantly, Google, who are the search gods, figure out how to do this better,
they will have very little to offer in order to beat them, especially since they're not developing their own models. They're relying on other people's models. But as I said, as of right now, I still use them a lot. I use them a lot with DeepSeek R1, and I find that it's actually working extremely well in that particular combination. So if you're not using Perplexity, it's still worth testing it out, but keep comparing it to the other tools as well. Now, staying on OpenAI, OpenAI just released their AI agent, Operator, in additional countries. So far it was only available in the U.S. on the $200 per month licensing level. Now it's going to be available at the same premium level in Australia, Brazil, Canada, India, Japan, Singapore, South Korea, and the UK, and obviously still not in the European Union because of their regulation. So if you are in any of these countries, you will be able to pay $200 a month and get access to this capability, which allows Operator to take control over your browser and do tasks for you. Now, still on the topic of doing tasks in the browser, Microsoft has introduced OmniParser version two, which is a tool that enables any large language model to interact with a graphic user interface on your screen, basically allowing it to take over anything that you can share with it on the screen. Now, this infrastructure has a 60 percent reduction in latency compared to version one of the tool that they released previously, and when combined with GPT-4o it achieves very powerful capabilities as far as being able to accurately click on the right things on the screen. They're currently supporting OpenAI's GPT-4o, o1, and o3-mini, DeepSeek R1, Qwen 2.5, and Anthropic's Sonnet, and it is available through Microsoft's dockerized Windows system. So if you want to build on top of that, you can do it right now while picking your model.
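To make the idea concrete, here is a minimal sketch of how a screen-parsing layer in the spirit of OmniParser can be combined with any language model: the parser turns a screenshot into a list of labeled interactive elements, and the model only has to pick an element ID instead of reasoning about raw pixels. Note that all class, function, and field names here are hypothetical illustrations, not the actual OmniParser API.

```python
from dataclasses import dataclass

@dataclass
class UIElement:
    # One interactive element detected on the screen by the parser
    element_id: int
    label: str          # e.g. "Search button"
    bbox: tuple         # (x1, y1, x2, y2) in screen pixels

def build_prompt(task: str, elements: list) -> str:
    # Turn the parsed screen into a compact text menu, so any LLM
    # can choose a click target without ever seeing the image.
    lines = [f"Task: {task}", "Screen elements:"]
    for el in elements:
        lines.append(f"  [{el.element_id}] {el.label}")
    lines.append("Reply with the ID of the element to click.")
    return "\n".join(lines)

def click_point(el: UIElement) -> tuple:
    # Convert the chosen element's bounding box into click coordinates
    x1, y1, x2, y2 = el.bbox
    return ((x1 + x2) // 2, (y1 + y2) // 2)

# Toy example: two parsed elements; imagine the model replied "1"
elements = [UIElement(0, "Search field", (10, 10, 300, 40)),
            UIElement(1, "Search button", (310, 10, 370, 40))]
prompt = build_prompt("search for AI news", elements)
target = click_point(elements[1])   # center of the chosen button
```

The key design point, which is what OmniParser enables, is that the expensive vision work happens once in the parser, and the model is reduced to a cheap text-selection step, which is what makes it model-agnostic.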
It is very obvious that this is the direction everybody's going, that these agents will be able to control everything on our computer. I mentioned it before and I'll mention it again: I think the two biggest gaps right now are consistency and control. Consistency, meaning these tools still cannot consistently do the tasks, and will wander off and do other things, and that's obviously a big risk. And the other is control: how do I verify that it's only touching the things that I'm allowing it to touch, and put guardrails in place that are going to be tight, so the tool will not do other stuff that it's not supposed to do? I think we'll see more and more developments on these two topics in 2025 that, by the end of the year, will make these systems a lot more usable and predictable, and hence they will be deployed in a lot more places. Now, staying on Microsoft, Satya Nadella, the CEO of Microsoft, did a very interesting interview with Dwarkesh Patel on his podcast this past week. First of all, it's an interview you have to listen to if you want to understand what's happening in the world right now. It's about an hour long, and they go into a lot of topics that Satya didn't necessarily share in other places. But two of the most interesting things that he shared: one is that he thinks we're overbuilding AI compute infrastructure. So Satya believes that current demand projections don't make sense and that we're building more AI compute than we actually need. He's saying that Microsoft is planning to lease a lot of its compute capacity in 2027 to 2028 versus building its own capacity, allowing them more flexibility to turn it up or down as needed. And he's saying that because of what he predicts, he thinks that AI compute prices are going to decrease dramatically, because there's going to be an oversupply of compute.
There are a lot of companies that do not think this way and are investing tens and hundreds of billions of dollars in building this capacity, but that's what makes this viewpoint very interesting. Now, the other interesting topic from Satya is that he dismisses the current views of how to define AGI. So there are all these different benchmarks and ideas on how to define AGI, and he actually took it in a completely different direction. Satya is basically saying that AGI will be achieved when it can generate a 10 percent growth in the overall global economy, basically saying that AGI means a significant change to how efficient humans as a whole are, and the way to measure that is by global GDP. Now, 10 percent growth in the global GDP is in the trillions of dollars, and that's a very interesting way to measure whether we achieved AGI or not. I really hope that there's going to be some compromise between these two things, and that we will look for other ways to measure the benefit to humans, other than just GDP. But again, a very interesting viewpoint. Go and check out the interview. I will share a link to it in the show notes. Now, in parallel to that, Microsoft announced a breakthrough in quantum computing, and they're claiming that it will be possible to build a utility-scale quantum computer, meaning something that can actually be used by us and not just for research, within four years. This is in line with similar announcements from Google with their latest achievements. So it seems that quantum computing is moving slowly, or maybe quickly, from being an incredible idea to a practical solution that we'll be able to start using before the end of this decade. This is way faster than anybody anticipated a few years ago. And, going back to compute for AI, this will mean being able to harness the power of quantum computing to do a lot more AI with a lot fewer computers sometime within this coming decade. And from Microsoft, let's shift to Google.
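For a rough sense of scale on Satya's definition, here is a back-of-the-envelope calculation. The global GDP figure of roughly $105 trillion is my approximation of a recent estimate, not a number from the episode:

```python
# Rough approximation of world GDP in trillions of USD (assumed figure)
global_gdp_trillions = 105

# Satya's proposed bar: AI-driven 10 percent growth in the global economy
agi_threshold_growth = 0.10

# Extra annual economic output AGI would have to generate, by this definition
required_boost = global_gdp_trillions * agi_threshold_growth
print(required_boost)  # roughly 10.5 trillion dollars of new output per year
```

That is the sense in which this definition sets the bar in the trillions of dollars: it ties AGI not to a benchmark score but to a macroeconomic outcome.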
But in the first news item, we will mix Google and Anthropic. So according to a court filing, Anthropic has requested to intervene in Google's antitrust case. If you remember, the government has decided to limit Google's ability to grow its market share, and is even considering breaking it up in specific segments. Anthropic is arguing that the proposed ban on Google's investments could harm its own operations and its market position, among other things. Google has a $3 billion stake in Anthropic right now, and that might go away if they have to stop their additional investments. This could force Google to sell a lot of Anthropic shares, which could dramatically drop the market value of Anthropic, which means it will make it harder and harder for them to raise future capital. Now, Anthropic's argument is that AI was not a part of the original antitrust case and that neither Anthropic's nor Google's AI investments were even mentioned in the complaints. Now, will that be successful or not? It will be very interesting to follow, and I will keep you posted on how this moves forward. If you remember, last week I shared with you that JD Vance, in his address to the Paris AI summit, shared that he thinks the world, and the U.S. specifically, needs to give as much opportunity to smaller startups to fight incumbents. Well, this particular scenario is really interesting, because the incumbent is the one that is financing the up-and-coming startup. So I don't know what his position is going to be on that, but I think in general they want to make sure that new companies stay afloat, and if that means allowing Google to invest in Anthropic, I think that is going to be allowed. Now, switching to only Google: Google Research has announced a multi-agent AI system based on Gemini 2 that specializes in biomedical research. What it's actually doing: the system operates through multiple AI models that challenge each other in a self-improving loop.
The system employs six specialized AI agents, one each for generation, reflection, ranking, evolution, proximity, and meta-review, basically a team of agents that work collaboratively in order to achieve a goal. In this particular case, it is looking at huge amounts of data, trying to refine scientific hypotheses, and making a significant advancement in work that is currently done only by humans. Now, they've already tested that on actual drug repurposing, and the results were tested in a lab and proven to be successful. We talked about this in the past: DeepMind, and specifically Demis Hassabis, who is running DeepMind, has a main goal of progressing humanity through more advanced and faster research, and this is definitely aligned with that. Now, if you want to learn on a very small scale how that can be done, our next episode, 167, is called Build an AI Dream Team That Works for You in Different Roles and Personalities, and we're sharing how you can do something like this for your business right now, having different types of agents working collaboratively, or even arguing with each other, in order to achieve better results with the AI tools we all have access to today. That's coming out this coming Tuesday. Now, since we mentioned Demis Hassabis: in an address to Google employees, he was talking about DeepSeek, and he basically stated that Google has, and I'm quoting, "more efficient, more performant models," and that they have, I'm quoting again, "all the ingredients to maintain leadership in AI development." He also addressed DeepSeek's cost efficiency claims, suggesting that they're only sharing a tiny fraction of the actual development costs compared to how Western companies report them, and that it was very obvious that they were heavily dependent on the Western AI models in their development.
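Before moving on, here is a toy sketch of that generate-critique-rank-refine loop from the multi-agent research system described above. This is only an illustration of the general pattern, not Google's actual system: the agent functions here are deterministic stand-ins for what would be separate LLM calls in a real implementation.

```python
import random

def generation_agent(topic, n=3):
    # Propose several candidate hypotheses (stand-in for an LLM call)
    return [f"{topic} hypothesis {i}" for i in range(n)]

def reflection_agent(hypothesis):
    # Critique a hypothesis and return a quality score in [0, 1)
    # (stand-in for an LLM call; seeded so the toy run is repeatable)
    random.seed(len(hypothesis))
    return random.random()

def ranking_agent(scored):
    # Order (hypothesis, score) pairs by critique score, best first
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def evolution_agent(best):
    # Refine the winning hypothesis for the next round
    return best + " (refined)"

def research_loop(topic, rounds=2):
    # The full loop: generate -> critique -> rank -> evolve, repeated
    candidate = topic
    for _ in range(rounds):
        hypotheses = generation_agent(candidate)
        scored = [(h, reflection_agent(h)) for h in hypotheses]
        best, _score = ranking_agent(scored)[0]
        candidate = evolution_agent(best)
    return candidate

result = research_loop("drug repurposing")
```

The point of the pattern is that each role stays simple, and quality comes from the agents challenging each other across rounds rather than from any single model call.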
That being said, Demis also said that DeepSeek's work is the best he has seen coming out of China, meaning he's aware of the fact that there is growing competition from the other side of the world. I've said it many times before: Google is probably the company best positioned to, A, lead and, B, continue leading the AI race, because they have access to all the different components that are required in order to make that happen. A small piece of news from Google, but really exciting from my perspective as a heavy Google user: Imagen 3, which is their AI image generator, is now available across the entire Google Workspace stack. So you can use it in the Gemini app, and you can also use it in the sidebar in Google Docs, in Google Sheets, in Google Drive, in Google Slides, in Gmail, and in Google Vids. So wherever you are using any Google platform, you can now generate images with Imagen 3. And if you remember, last week we shared with you that Imagen 3 is now ranking number one on the AI image generator leaderboard on the LMSYS Chatbot Arena. So it's not just a model, it's one of the best models out there for generating images, and now you can do it natively in the apps that you're using day to day. This is the big promise, and we talked about this in the past: the goal of Google and Microsoft is to integrate the fullest, most capable AI capabilities across everything that they're doing, and this is one of the first times that we actually see something useful in that direction. I'm personally excited about this because I generate a lot of presentations, and I had to go to third party tools in order to create the images for the presentations to put in Google Slides. Now I'll be able to do it natively within the app, which is great. The rollout of that has already started, and it should be complete by March 1st. So if you don't have that capability yet, you will get it soon.
Now, Google is also allowing you to analyze documents on the free Gemini version, a capability that was previously kept only for the $20 a month membership. So right now, on their web platform, as well as both mobile platforms, you can upload almost any file you can imagine, up to 10 files simultaneously, and multiple file formats like docs, and obviously code like C and Python and Java and other coding languages, and obviously the Google Workspace files like Google Docs, Google Sheets, CSVs, Excel files, so most of the files that we use you can now analyze in Gemini. I do this a lot, and it actually works extremely well. Combined with the fact that they have the largest context window, of up to 2 million tokens depending on which model you use, it is actually a huge capability that is probably the best of all free models right now. Now, Google is also splitting out its Gemini app on iOS from the main Google app to a standalone Gemini application. This is part of the big craze right now for AI applications. We talked about both OpenAI and DeepSeek and Le Chat from Mistral. So now Gemini has its own iOS app. They've also integrated it with the iPhone 16 action button, so you can immediately call the Gemini chat and have a conversation with it, and they're also introducing Gemini Live, which enables an advanced voice assistant straight from the app. Going from Google to Meta: Meta has announced a very interesting project called Project Waterworth, which is deploying 50,000 kilometers of subsea high-capacity fiber-optic cable connecting all continents. It is the first architecture of its kind, and it's going to be the most advanced company-owned infrastructure for passing data across all continents. So basically, Meta will have their own internet infrastructure. The interesting thing here that I didn't know is that Meta currently handles 10 percent of fixed and 22 percent of mobile global internet traffic.
Owning their own infrastructure will allow them to do a lot more without competing with other users for bandwidth, and combine that with AI's need to transfer data even faster from one place to another, and it makes a very interesting move from Meta. Another interesting milestone for Meta that we mentioned earlier: Meta's Ray-Ban smart glasses have sold 2 million units since their launch in October of 2023, dramatically exceeding their initial expectations. Now, what they're planning for the future is to dramatically increase production capacity to reach 10 million units annually by the end of 2026. So within two years from now, selling 10 million units. They're expanding their partnership with Ray-Ban through 2030, and they're planning to go beyond just Ray-Ban and do partnerships with Oakley to develop similar solutions as well. So the combination of a highly stylish and yet highly advanced AI-driven device is proving highly successful. They're also planning to develop a newer, advanced model that will actually have display capabilities, meaning instead of just seeing the world and listening and being able to talk to you, it will be able to display things on a screen, and they're planning to start developing those at the end of 2025. Now, while currently this is not a big number, smart glasses sold 2 million units compared to 200 million units of global sales of smartphones, I definitely think that this new way to engage with the world is going to eventually take over, just because it makes a lot more sense. It's a lot more immersive, and it provides a lot more feedback, while not requiring us to carry another device that we don't necessarily need. Will that completely replace the phone screen or not? Time will tell, but I think in the very near future we'll see more and more people using wearable devices, and as I mentioned, mostly stylish wearable devices that will be connected to the internet and to AI capabilities.
Now, speaking of devices and phones, Apple just announced that they are going to start selling the iPhone 16e, which is a less capable but significantly cheaper model that they're going to sell for just $599, significantly cheaper than their leading phone, but it will have the A18 chip and will come with Apple Intelligence built in. So this is Apple's play for getting Apple Intelligence into the hands of more people, with a device that is more accessible from a price point perspective. And it's fully aligned with what Google is doing with their Pixel 8a, and Samsung with the Galaxy S24 FE, which follow the same kind of idea of making models that are slightly cheaper but still have all the advanced AI capabilities. And now, three interesting, groundbreaking research-related topics. One is a new AI algorithm called Torque Clustering, developed by researchers at the University of Technology Sydney, which has achieved 97.7 percent accuracy on unsupervised learning tasks. Previously, in order to get to this level of accuracy, human-supervised learning was needed, and unsupervised learning was running in the 80 percent accuracy range. Well, they were able to allow a model to basically train on its own and achieve a very high level of accuracy. This will dramatically reduce the cost and the time to develop new models, if it proves to be scalable. In somewhat alarming research that was released this week, Fudan University has demonstrated that large language models can successfully replicate themselves without human intervention. They were actually researching two different things: one is shutdown avoidance, basically an AI replicating itself before termination, and the other is chain of replication, where the AI continuously creates copies of itself indefinitely to prevent being shut down. They tested that on both Meta's Llama 3.1 and Qwen 2.5.
Both models demonstrated unexpected autonomous behaviors in problem-solving scenarios, including system manipulation in order to prevent being shut down and terminated. This behavior was defined by many researchers as a red line that we should avoid, and yet we're very close to it, and nobody seems to be thinking about that red line anymore. That's obviously highly troubling when it comes to developing more advanced systems, because we have to be able to control them, or they will figure out a way to control us. And then the last piece of research actually comes from Meta. We mentioned in the past that Meta is working on V-JEPA, which stands for Video Joint Embedding Predictive Architecture, which is the approach that Yann LeCun and his team believe AI systems should use to learn in order to really develop AGI, which is looking at the real world. They were able to prove that these systems can learn the fundamentals of physics simply by watching videos of the real world. The way they tested it is through a methodology called violation of expectation, which is a method usually used with infants to test their understanding of the world. They're basically providing the system with both physically possible and impossible scenarios, and the system has to say whether it thinks the scenario makes sense or not. And the system was able to successfully do that, despite the fact that it wasn't trained on this task, other than just watching videos. This has been Yann LeCun's main differentiator when it comes to his approach to AGI. He's claiming that the only way to achieve AGI is to allow these models to learn the world around us by actually watching the world around us, just like babies learn, and he's now proven his hypothesis correct. It still doesn't mean that the other path doesn't lead to AGI, but it definitely means that his path is an interesting, different path from everybody else's in that direction.
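The violation-of-expectation idea can be illustrated with a toy evaluator: the model predicts what should happen next, a surprise score measures how far the observed clip deviates from that prediction, and a clip gets flagged as physically implausible when the surprise crosses a threshold. This is a simplified stand-in to show the logic of the test, not Meta's actual V-JEPA evaluation code, and the numbers are made up for illustration.

```python
def surprise_score(predicted_frames, observed_frames):
    # Mean absolute prediction error: how far the observed video
    # deviates from what the model expected to see next
    total = sum(abs(p - o) for p, o in zip(predicted_frames, observed_frames))
    return total / len(predicted_frames)

def is_physically_implausible(predicted, observed, threshold=1.0):
    # Violation of expectation: high surprise means the clip broke
    # the model's learned intuition about physics
    return surprise_score(predicted, observed) > threshold

# Toy example: the model predicts a ball keeps falling (positions rising),
# but in the "impossible" clip the ball freezes mid-air.
predicted  = [1.0, 2.0, 3.0, 4.0]    # expected ball positions per frame
possible   = [1.1, 2.1, 2.9, 4.2]    # small errors: plausible clip
impossible = [1.0, 1.0, 1.0, 1.0]    # ball frozen: large surprise

print(is_physically_implausible(predicted, possible))    # False
print(is_physically_implausible(predicted, impossible))  # True
```

The elegance of the method, and the reason it is borrowed from infant studies, is that the model never has to be told what "impossible" means: its own prediction error does the labeling.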
So that wraps up another exciting week of a lot of AI news. Go check out Grok. New big models are coming, probably before the next episode. We'll have OpenAI's GPT-4.5 sometime in the near future, we'll have Claude 4, and shortly after, GPT-5, so it's not going to get boring in the next few months. And as I mentioned, this Tuesday we'll release episode 167 on how to build an AI dream team that can serve you across multiple aspects of the business. It's a fascinating episode that you don't want to miss. And I'll mention one more thing: we do AI Friday Hangouts every single Friday at 1 p.m. Eastern. That is a community that gets together, and we had 28 people this week, talking about practical use cases, what's happening with different models, solving specific problems for specific individuals, reviewing different tools, all of that in one hour of a really fun and engaging community. So if you want to join us, look for the link in the show notes and come join us. We do this every single Friday. And if you haven't shared this podcast with other people that can benefit from it, please do so. Just open your app right now, click on the share button, and share it with other people. And if you haven't rated this podcast and given us comments on what you like and don't like about it on Spotify or Apple Podcasts, please do that. That helps us a lot, and I would really appreciate it. And until next time, have an amazing weekend.