
Leveraging AI
Dive into the world of artificial intelligence with 'Leveraging AI,' a podcast tailored for forward-thinking business professionals. Each episode brings insightful discussions on how AI can ethically transform business practices, offering practical solutions to day-to-day business challenges.
Join our host Isar Meitis (4-time CEO) and expert guests as they turn AI's complexities into actionable insights and explore its ethical implications in the business world. Whether you are an AI novice or a seasoned professional, 'Leveraging AI' equips you with the knowledge and tools to harness AI's power responsibly and effectively. Tune in weekly for inspiring conversations and real-world applications. Subscribe now and unlock the potential of AI in your business.
160 | 🚨 RED ALERT 🚨 – DeepSeek R1 (a Chinese model) shocks the AI world, outperforming OpenAI o1 and crushing NVIDIA stock, sparking urgency across the U.S. AI scene. Plus, OpenAI o3 is here, Gemini unveils groundbreaking features, and more must-know AI news
Is the AI race shifting under our feet?
This past week has been nothing short of historic in the AI world. From China's DeepSeek shaking up the industry with an open-source model that rivals GPT-4 (at a fraction of the cost!) to the U.S. stock market taking a massive hit, it’s clear we’ve entered a new phase of the AI arms race.
So, what does this all mean for business leaders?
In this episode of Leveraging AI, we break down why DeepSeek’s rapid advancements sent Nvidia’s stock into freefall, how the U.S. government and AI giants are scrambling to respond, and whether this signals a fundamental shift in AI development. Plus, we cover the latest model releases, OpenAI’s pricing shakeups, and a surprising move by Google that’s making waves in AI search.
In this AI News, you'll discover:
- Why DeepSeek’s new AI model sent shockwaves through the stock market.
- The real reason Nvidia lost nearly $600 billion in a single day.
- How open-source AI is challenging the dominance of closed models like GPT-4.
- What Google’s latest AI updates mean for the future of search and productivity.
- New AI regulation shifts under the Trump administration – and what business leaders need to know.
- Why OpenAI’s next funding round could hit **$40 billion** – and what that signals for the industry.
👉 Join the AI Business Transformation Course!
Want to transform your business with AI? Learn how to integrate AI effectively with real-world strategies. **Sign up now** for the next cohort starting Feb 17 and get $100 off with promo code LeveragingAI100. https://multiplai.ai/ai-course/
About Leveraging AI
- The Ultimate AI Course for Business People: https://multiplai.ai/ai-course/
- YouTube Full Episodes: https://www.youtube.com/@Multiplai_AI/
- Connect with Isar Meitis: https://www.linkedin.com/in/isarmeitis/
- Free AI Consultation: https://multiplai.ai/book-a-call/
- Join our Live Sessions, AI Hangouts and newsletter: https://services.multiplai.ai/events
If you’ve enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!
Hello and welcome to the DeepSeek podcast. No, I'm just kidding. Welcome to a weekend news episode of the Leveraging AI podcast, but we are going to talk a lot about DeepSeek. If you haven't been following the complete madness around DeepSeek and other open source models this week, then you're in for a treat. If you have been following, I'm probably going to be able to provide you a lot more information, in my opinion, on where we are and where this might be going. But we also have a lot of other stuff to talk about: some interesting releases from Google, some crazy valuations, what Trump and his current administration are already doing in AI, and a lot of other good stuff. This week has been one of the craziest weeks in the history of AI, and I know I say that a lot, but it's really been a completely insane week. So let's get started. I'll start with an interesting private story. My dad is one of my best friends in the world. I talk to him more or less every single day. He's overseas and I'm here, so it's usually early in the morning. So Tuesday morning, around 7:30 AM, I talked with my dad about different things, and he said, hey, I've got a question for you: can you tell me what the whole thing with this DeepSeek situation is? Now, my dad is awesome. He's involved in a lot of stuff. He volunteers, he hikes, he learns photography, goes to university, and a lot more, but he's almost 80, and developments in the AI world are not top of mind for him. So when he's asking me at 7:30 in the morning what's going on with DeepSeek, that should have made all my alarms go off that something not normal was happening, especially about a company he had probably never heard of until that day.
That should also have given me the idea to short Nvidia stock, which sadly did not happen. But before we dive into all the madness of this week, let's talk a little bit about who DeepSeek is, what has happened in the past few weeks, and why it made the crazy impact that it has made. DeepSeek is a relatively young Chinese AI company. It was established in May of 2023, so it's not even two years old. It was founded by Liang Wenfeng, a serial entrepreneur who is also one of the co-founders of a highly successful AI-based Chinese hedge fund called High-Flyer. High-Flyer is the company financing everything DeepSeek does, and Mr. Wenfeng is also the CEO of this AI company, which started half as a hobby for him. Now, because they have no pressure to make any money, being funded by a very successful hedge fund, they have been focusing on developing the most effective way to get to AGI, artificial general intelligence, which we have talked about a lot on this podcast. It's a relatively small company that has been focusing on finding unique solutions that are different from everybody else's. They've been hiring really talented people straight out of universities in China rather than looking for big names in the AI world, which already tells you that their approach is to come up with solutions themselves rather than do what everybody else says should work. Their stated mission is to share their progress with the community and to see the gap between open and closed source models narrowing. They released DeepSeek 2.5, and that didn't make a lot of waves. But as I mentioned on the podcast two weeks ago, they then released DeepSeek V3, which was actually a very powerful and capable model. It's as good as, and in some specific use cases better than, GPT-4o and Claude 3.5 Sonnet.
The reason it caught my attention, and everybody else's, when they released V3 is not the fact that it's a Chinese model that is as good as the leading American models available right now, but the fact that they claim they spent only $5.6 million training that model. I talked a lot about this two weeks ago. Last week, I told you that they released a model called R1, which is their thinking model, basically similar to OpenAI's o1, and it's as good as, and in some cases better than, o1, despite the fact that it was trained for a really small amount of money. Now, I shared that news on Friday, but it took a few days for the rest of the world to understand what is actually happening and what it means. If you remember, I told you that there are probably multiple engineers in all the leading AI companies who are now dissecting everything DeepSeek has done, trying to understand exactly what they did, what claims they made, and so on. The immediate outcome was that on Tuesday morning, as this thing caught fire again, as my dad asked me what's happening and who DeepSeek is, Nvidia's stock and other chip manufacturers started collapsing. And when I say started collapsing: Nvidia took the largest single-day loss in value in the history of the stock market. If you want the exact number, it was $589 billion in valuation, lost on January 27th. Why did this happen? Because the assumption that drove Nvidia's stock through the roof is that the only way to develop bigger models, and to run those models, is to have more and more Nvidia chips. I've talked to you in all the recent episodes about the amount of money being raised, and about different projects, whether government or industry, that the leading organizations are either pouring money into or trying to raise money for in order to drive AI innovation and keep the U.S. in the lead.
So hundreds of billions, or potentially in the long run trillions, of dollars to drive that innovation. And here we have a Chinese company that, within a few months from their previous model to this one, while spending $5.6 million, and presumably without access to the latest Nvidia GPUs, was able to develop a model that is as good as, and in some cases better than, the leading U.S. models that cost billions to train. And hence, if you don't need hundreds of thousands of Nvidia chips to make this thing work, then obviously Nvidia is a lot less attractive. Other chip makers like ASML and Broadcom also saw significant declines, not as big, because they're not the 800-pound gorilla with a target on their back, but still big. Energy stocks such as GE Vernova and Vistra and other energy infrastructure stocks also fell, for the same reason: powering the gigantic data centers everybody is planning was supposed to require more power, which would mean more money for energy companies. I don't know, and I think nobody knows, whether this was intentional, to see what the impact on the U.S. economy would be. But I think it's very obvious that the Chinese found a very serious vulnerability in the U.S. stock market. Whether or not they did it on purpose, they found an angle from which they can make the U.S. economy, or at least the stock market, hurt. Now, what is driving that decline in the stock market is a combination of several things. Number one, DeepSeek is achieving really high scores on multiple benchmarks across the board, whether it's code writing, math, science, writing, data analysis, and so on. Number two, their mobile app climbed to number one in the app stores in the U.S., India, the UK, and Australia almost overnight. So literally the thing that everybody around the world wants on their phones is the DeepSeek app.
It has also climbed to number four, very close to the top three, on the LMSYS Chatbot Arena, meaning that in blind tests it has scored better than most models out there on the day-to-day tasks people test models on. So it's not just the benchmarks; it's actually doing really, really well across the board, across multiple types of tasks, for a model that presumably was trained for very little money. So if it is possible to create a model for a few million dollars, with just a few thousand GPUs, over just a couple of months, versus many months or a year-plus if you look at how long it takes to develop the next generation of models, and they can build an app that climbs to the top of the app store across huge markets like the U.S., India, and the UK, it means that Nvidia, and everybody else invested in the existing paradigm of how AI needs to be and can be developed, might be wrong, which would mean they don't deserve the valuation they have right now.
Isar Meitis: This show is sponsored by the AI Business Transformation Course. I've been teaching it personally since April of 2020, and since then I've taught probably thousands of business leaders how to transform their businesses using AI. The course starts with an introduction, so those of you who know nothing can get a lot out of it, but it also covers multiple use cases across different tools, how to use them, and how to apply them in business. And it ends with the actual blueprint I use with my consulting clients on how to implement AI from a company-wide perspective. The course is four sessions, two hours per week, and we just opened our next cohort, starting on February 17th. The number one thing that's going to decide whether your AI transformation is successful or not is training, education, and skills for employees and leadership. That has been proven through multiple research studies and surveys; it's the number one factor. If you don't have a means to train yourself or your team right now, don't miss this opportunity. I teach this course all the time, but most of these courses are private, meaning I teach specific organizations who invite me, and I do a public course like this only once every quarter or so. So don't wait till May of 2025 to get your act together when it comes to training yourself and/or your team on how to implement AI in a business. Come join us February 17th. There's going to be a link in the show notes, so you can go from there straight to the signup page. You can use the promo code LeveragingAI100, basically the name of the podcast plus 100, to get a hundred dollars off the course. But now, back to the show.
Speaker: But let's dissect some of the claims and see how people in the industry have responded to this dramatic shift. The first thing is that many groups within the U.S. have made allegations, or are suspecting foul play by DeepSeek, given the speed and cost at which they trained the model, and there have been many accusations across the board about what they might have done. Microsoft has established a group of researchers looking into how DeepSeek might have used OpenAI's API to train their model more efficiently. The technique is called distillation, and it's something many AI companies do with their own models, meaning they use bigger models to train smaller models. We talked about this with o1 and o3 being trained with variations of Orion and the other way around. So they're using models to train other models, basically benefiting from the fact that they have something that already knows the answers, and can reason, to train other models. But you are not allowed to use a third party's API to train your own model; that's prohibited by the API's terms and conditions. That being said, there is a suspicion, which Microsoft is now checking, that DeepSeek researchers used OpenAI's API in order to train their model. The recently appointed White House AI czar, David Sacks, stated that there is substantial evidence that DeepSeek distilled knowledge from OpenAI models in order to create their model faster and cheaper than it would have taken otherwise. To strengthen that claim, multiple researchers and reporters, including a report from one of TechCrunch's tech reporters, have noted that DeepSeek has self-identified as ChatGPT on multiple occasions.
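To illustrate the distillation idea described above, here is a minimal editorial sketch: a "teacher" model's outputs become the training data for a smaller "student" model. The teacher here is just a stub standing in for a call to a large model's API; all names in the snippet are hypothetical, not anyone's actual pipeline.

```python
# Toy sketch of distillation: collect (prompt, teacher-output) pairs,
# then fine-tune a smaller student model on exactly those pairs.

def teacher_answer(prompt: str) -> str:
    # Stand-in for querying a large "teacher" model (e.g., over an API).
    canned = {
        "What is 2 + 2?": "4",
        "Capital of France?": "Paris",
    }
    return canned.get(prompt, "unknown")

def build_distillation_dataset(prompts):
    # Each record pairs a prompt with the teacher's completion;
    # a student model would then be trained on these records.
    return [{"prompt": p, "completion": teacher_answer(p)} for p in prompts]

dataset = build_distillation_dataset(["What is 2 + 2?", "Capital of France?"])
print(len(dataset))  # 2
```

The point of the sketch: if the teacher already reasons well, the student inherits a lot of that capability far more cheaply than training from scratch, which is exactly why using a competitor's API this way is both attractive and against the terms of service.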
I don't know if this is complete proof, because these models do hallucinate, but the fact that this has happened on several different occasions at least suggests it may have been trained on data claiming to come from OpenAI, which is why it thinks it is an OpenAI model rather than a DeepSeek model. So there's a fair likelihood that this is what happened: they used OpenAI models, and maybe others, to train their models faster. Is that enough to say, okay, we can relax, because they only got here by taking data? From my perspective, no, and I'll give you several reasons. But before we dive into my reasons, let's look at the feedback from Dario Amodei, which I actually think is the most interesting in this whole situation. First of all, he's saying it's a great model, it's very capable, and kudos to the developers. We heard the same thing from all the big players, from OpenAI and Andreessen Horowitz and Microsoft and so on; everybody said it's an amazing achievement. But Dario puts it in a very interesting perspective. He basically said that the models DeepSeek is beating or matching are seven to ten months old; Sonnet and o1 were released a while back and were in training even before that. So what he's claiming is, first of all, that looking only at the cost of the training run is not the right way to think about it. He says Anthropic trained Claude 3.5 Sonnet for a few tens of millions of dollars, but also that, because of the very fast reduction in the cost of creating and running these models, if you run the cost-decline curve forward from the previous generation of models, GPT-4o and Sonnet 3.5, you get maybe not to $5.6 million, but to the same ballpark.
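To make that cost-curve argument concrete, here is a rough back-of-the-envelope sketch. The decline rate and starting cost below are illustrative assumptions for the sake of the arithmetic, not figures confirmed by any lab.

```python
# Rough sketch of the cost-decline argument: project an older model's
# training cost forward, assuming the cost of reaching a fixed capability
# falls by some factor per year. The 4x annual decline and $30M starting
# cost are illustrative assumptions only.

def projected_cost(cost_then_usd: float, months_elapsed: float,
                   annual_decline: float = 4.0) -> float:
    # Cost shrinks by `annual_decline`x every 12 months.
    return cost_then_usd / (annual_decline ** (months_elapsed / 12))

# A ~$30M training run from 10 months ago, under a 4x/year decline,
# projects to single-digit millions today:
print(round(projected_cost(30e6, 10) / 1e6, 1), "million USD")
```

Under those assumptions, a few tens of millions from seven to ten months ago lands in the high single-digit millions today, which is why Dario argues $5.6 million, even if accurate, is less shocking than it first sounds.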
He's also claiming that there are other companies achieving such results right now that, again, are in the same ballpark as DeepSeek. The other thing he's claiming, based on rumors from multiple different sources, is that it seems DeepSeek did not have just five or six thousand older GPUs, but somehow got their hands on 50,000 H100 GPUs, which was the most common GPU before the latest generation Nvidia came out with. Buying 50,000 GPUs would have cost them about a billion dollars. So they have invested a billion dollars in infrastructure, and then, presumably (again, we have no way to check or prove this), $5.6 million in just training the model. From that perspective, it is not dramatically different from the next generation of U.S. models. But that raises a question: if this is the case, if for training new models you really don't need so many GPUs and you can do it for $5.6 million or around that ballpark, why is everybody trying to raise tens or hundreds of billions of dollars and planning this incredible infrastructure effort of building data centers and power sources and so on? That just doesn't add up to me. So I think that while Dario probably has a lot of solid points, if he were entirely right, he wouldn't need to raise a round at the level he is trying to raise, talking about a $60 billion valuation and a lot of money in the process. The truth is probably somewhere in the middle. And honestly, I think it doesn't matter. What I mean by that is this: the technology competition between the U.S. and China has always worked roughly the same way. In most cases, the U.S. creates the innovation, in Silicon Valley or elsewhere; the Chinese copy it at first, learn how to do it on their own, and then run forward with it faster and cheaper, in some cases better, and in other cases not better but at very competitive economics.
And that is exactly what happened here. The bottom line is that the reality is the reality: whether they stole IP to do this, whether they distilled information from ChatGPT or not, they have a highly capable model that on several aspects is better than the models we have from OpenAI and Anthropic and so on, and that on inference, meaning the actual usage of the model (forget how much it cost to train and whether that figure is real), is about 17 times cheaper than o1. So you can use a reasoning model, DeepSeek R1, for 17 times less than it would cost you to do the same work with o1. That's a very significant spread. Because of that, more or less everybody under the sun understands the potential and has made it available on their platform. Since it's an open source model, it's relatively easy to take it and run it on any infrastructure, so both Microsoft Azure and AWS almost overnight started offering R1 as one of the large language models you can run on their platforms. The model obviously first underwent safety evaluations and security reviews to make sure it's not sending any information to China and can't be used in any malicious way. Because it's open source, it runs within a box that Microsoft or Amazon controls, and they've run the relevant safety evaluations on it, so we can assume that if you use it on those platforms, your data is not going to the Chinese government. That being said, if you are using the API from DeepSeek themselves, running on their servers, or if you're just typing into their chat app, there are a few things you should know, whether you're using it personally or from a company or business perspective. One is that it's running on servers located in China.
That means it's subject to Chinese government cybersecurity laws, which require cooperation with the government and give the government visibility into what's happening on those servers. Meaning, if the Chinese government wants access to the data on those servers, it can get it. The other thing is that their usage terms spell out what data they collect, how they share it, and what they do with it, and it's far broader than most U.S. companies'. They collect things like keystroke patterns, chat history, device data, payment information, and a lot of other stuff you may or may not want to share with a Chinese company and, because of that, potentially with the Chinese government. Now, let's talk about the national level for a minute. On the national level, in this race to AGI, Eric Schmidt a few months ago made the prediction, or assumption, that the U.S. is about two to three years ahead of China in AI development. In addition, as you probably know, there's a U.S. ban on selling Nvidia chips and other advanced AI chips to China, which presumably should have stopped them, or at least slowed them down, and made the two-to-three-year gap Eric Schmidt believes exists actually grow wider. Well, the reality we are seeing is that this is not the case, which means the current efforts to slow the Chinese down are either not working or not working well enough. And this has drawn opposing opinions and suggestions. Some people say the current ban on the sale of Nvidia chips is actually what created the necessity for innovation in China, which led to the amazing achievement of DeepSeek R1. Others, including Dario Amodei, say: just think what would have happened if they did have access to those chips; would they be ahead right now, and not the other way around? So there are different ways to approach and look at this.
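A quick aside for developers following along: the hosted R1 endpoints mentioned above generally expose an OpenAI-style chat-completions interface, so trying the model is mostly a matter of pointing a request at a different base URL and model name. The sketch below only builds the request shape; the endpoint path and model id are illustrative assumptions, not verified values for any specific provider.

```python
# Sketch of an OpenAI-style chat-completions request for a hosted R1
# deployment. The base URL and model id below are placeholders; check
# your provider's documentation for the real values.
import json

def build_chat_request(base_url: str, model: str, user_message: str):
    url = f"{base_url}/chat/completions"  # OpenAI-compatible path (assumed)
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return url, json.dumps(payload)

url, body = build_chat_request(
    "https://example.invalid/v1",   # hypothetical endpoint
    "deepseek-r1",                  # hypothetical model id
    "Summarize this week's AI news.",
)
print(url)
```

Note that this portability cuts both ways: the same request works against DeepSeek's own servers, where the data-residency and collection concerns above apply, or against a U.S.-hosted deployment, where they largely don't.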
As I mentioned before, it is very likely that they stole IP and used models in ways they're not supposed to in order to stay in the race. Now, if you remember, back in June of 2024 we talked about a guy called Leopold Aschenbrenner, who was an engineer at OpenAI and wrote a very interesting paper called Situational Awareness: The Decade Ahead. It's a pretty long document, and he even did an interview of a few hours on the topic. I listened to the full interview back then and read the paper, and it's very, very interesting. One of the things he argues there is that the free world must prevail. And what he focused on is not necessarily the crazy investments everybody's talking about right now, but security. He basically made the claim that the advanced leading labs are great at developing AI, but they're completely clueless at keeping their AI secrets safe against government espionage or large international companies such as Alibaba, other Chinese companies, and probably other players from places like Russia. And he argued that in order to keep it safe, there has to be a joint force led by government agencies like the CIA and the FBI, working together with industry, both the data security industry and the leading AI labs, to prevent, or at least make it a lot harder for, Chinese companies or the Chinese government to steal or exploit that data. I think that's the direction this is going to go. Given what we just saw happen, I think the approach of the current administration is going to drive the setup of some kind of U.S. government or quasi-government body that will work together with the leading labs to keep their data, their systems, their processes, and their IP safe
from international companies and other countries, so we can rebuild the gap between us and them and stay ahead in the race moving forward. Now, another interesting aspect of this scenario: as I mentioned before, DeepSeek is an open source model. Maybe the biggest supporter of open source models so far has been Yann LeCun, the head of AI at Meta. He has been a huge proponent of open source AI since day one, always saying it's the better approach, and he used this opportunity to strengthen his position, saying it's now obvious that open source is the right way forward. From his perspective, the proof is in the pudding: the fact that an open source model is now better, and was developed significantly faster, than other models is proof that open source is winning over closed source companies. The concept of open source success coming from China and other places is another topic we're going to touch on in a few minutes, once we finish some other important things. Almost everybody in the industry obviously had something to say about the situation. Eric Schmidt, who as I mentioned previously said the U.S. is two or three years ahead of China, said that DeepSeek is a turning point for the U.S. and the world of AI. He recommends policies like accelerating U.S. open source AI development and increased investment in infrastructure projects like Stargate, which we talked about last week; that, if it all happens, is eventually a $500 billion investment done mostly by SoftBank, and potentially other sources in collaboration with SoftBank, to create Oracle data centers and infrastructure for OpenAI. The other thing he supports is more collaboration and sharing of training methodologies and data between the leading U.S. AI labs, putting aside the competition between them and making sure that the U.S.
is ahead of China in that race for AGI and beyond. Another person who weighed in on the topic is Marc Andreessen of Andreessen Horowitz, one of the most influential investors in the tech world. He said that DeepSeek is AI's Sputnik moment. For those of you who don't know the reference, Sputnik was the first ever satellite. It was launched by the Soviets, and it accelerated the space race because it made it very, very clear to the U.S. that Russia was ahead in that race, versus what the Americans thought, which was that the Russians were behind. So he's claiming this is the same kind of moment for the AI world: a serious wake-up call for the U.S. to figure out how to dramatically improve the quality and speed, and potentially also reduce the required investment, of their game in order to stay ahead of China. So how did this crazy week and everything that comes with it, meaning Chinese open source models (and we'll talk in a minute about another one) that are as good as or potentially better than U.S. models, impact the U.S. government and things related to it? First of all, Alexandr Wang, the CEO and founder of Scale AI, sent a letter to President Trump on winning the AI war. That is literally the phrase he uses, in big fonts, and I will share the link in the show notes. His letter starts with "Dear President Trump: America must win the AI war," written in really big font. Then he lays out how he suggests we do that: allocate America's AI investments in the right places, build a workforce of the future in America, make federal agencies AI-ready, unleash American energy to support the AI boom, and ensure safety without stifling innovation. And for each and every one of those topics, he gives a few ideas on how it can be done.
If you remember, I told you last week that Sam Altman was supposed to testify behind closed doors to a Senate subcommittee, and in that presentation he urged $100 billion in federal AI investment over the next five years to counter China's rapid advancements. The measures he proposed include tax incentives for AI startups, streamlined regulatory processes, and public-private partnerships to build next-gen AI infrastructure. Now, AI has been a top-of-mind item for President Trump more or less since the moment he took office, or actually even before: he's the first president to have an AI czar working under him. Almost immediately after taking office, he also canceled Biden's AI regulations and gave U.S. agencies 180 days to come up with better regulations that on one hand will keep us safe and on the other will drive U.S. innovation. I think it's very, very clear that with this particular administration, because of all the billionaires behind it driving the decisions, we're going to see less regulatory action, a lot more freedom for these companies to act, and potentially huge investments from the government or from foreign investors; they will find ways to make it easier or more profitable to make such investments in the U.S. in order to drive more innovation and more AI in the U.S. compared to China. And before the dust even settled on the craziness from DeepSeek, Alibaba released Qwen 2.5, their latest model. Alibaba, again, is a gigantic tech company from China; they're basically the Amazon of China, but with a lot of other businesses, including the leading application that Chinese consumers use for everything, including making payments. Qwen 2.5 is their latest AI model, comparable with GPT-4o, and again, they've released it as open source. In addition to being open source, it's completely multimodal from the ground up. It can understand video.
It can understand images, it can create video, it can create images, it can write code, it understands multiple languages, it understands voice, and so on. And it knows how to do autonomous app and website navigation, similar to what we've seen released initially by Claude, then by ChatGPT, as well as Perplexity. So all of these things are baked into the same model: highly capable, fully open source, coming from China. So where does that leave all of us before we move to the next topic? It means there is serious, real competition coming from China. It also means there is hope that we will not need to destroy the planet in order to satisfy AI's needs for compute power and cooling. Now, that might be wrong: maybe the fact that AI will be significantly cheaper will actually let us integrate it into a lot more places faster, which means we're actually going to need more compute and more chips to support it in more places, at higher volume, and faster than anticipated. So this could go either way, but it's very, very obvious that the leading labs are paying attention. It's already been announced that Meta has four different war rooms looking into the implications of DeepSeek and what they have released. Obviously, Meta has a lot to gain and a lot to lose, because they had the leading open source model and now they don't anymore. On one hand, this supports their open source agenda; on the other, there's another big dog in the open source AI world that they now need to fight and counter. It has been announced that this is already happening at Meta, and I assume, even though it hasn't been announced, that the same exact thing is happening at Google, Microsoft, OpenAI, and so on. On to two other big open source model releases that happened this week. One is a new model from Mistral, a French company we've talked about several times in the past.
They've released a new model that matches and exceeds 70B-class models, but in a model that is three times smaller, runs significantly faster, and can run on a single GPU, meaning a significantly smaller footprint, great results, and less money, while being as good as other much, much bigger models that took a lot more time to develop. And there's another company that we don't talk about a lot called AI2. AI2 has released their open source model called Tulu 3, and it has 405 billion parameters. The interesting thing about this model, beyond the fact that it surpasses many other models on various benchmarks, is that it is completely open source, meaning they're not just releasing the model for people to use. They've actually released the training data, the model weights, and the customizable pipelines. So basically, they're allowing people to do everything that they are doing with the model. This raises a lot of concerns and questions: on one hand, Chinese companies; on the other hand, really powerful open source models, some of them not Chinese, available to anybody to manipulate, whether for good or for bad. Whether you like this or not, or you think this is promising or scary, it doesn't matter. This is the situation, and it's important for you to understand the potential implications, both on the economy as well as on the speed and cost of the AI revolution as we know it today. And from every hint that we're seeing, it's going to move even faster than we've seen so far, once the large leading labs learn everything that they can from DeepSeek and other open source models like Qwen, et cetera. Now, we mentioned Scale AI. I will mention one more interesting thing about Scale AI that happened. Scale AI just created a new test to evaluate the most advanced AI systems out there, and they're calling it Humanity's Last Exam.
And they're calling it, and I'm quoting, a "groundbreaking new AI benchmark that was designed to test the limits of AI knowledge at the frontiers of human expertise." Now, the questions were crowdsourced, so they basically had over 500 institutions across 50 countries come up with the hardest reasoning questions possible, to be able to test AI across multiple topics. The models scored the following on this new test: GPT-4o scored 3.3%, Grok 3.8%, Claude 3.5 Sonnet 4.3%, Gemini 6.2%, o1 9.1%, and DeepSeek R1 9.4%. First of all, we're seeing DeepSeek at the top and o1 second, but we're also seeing all these models scoring less than 10 percent on this test. And I assume, because it's crowdsourced, this test will continue to grow. This is everything we had to cover on DeepSeek. Again, my personal opinion is that it doesn't matter how we got here. We're already here. China has more or less caught up, or some would say not quite: they caught up to the previous generation of models, but they did it really, really fast and with significantly fewer resources. That means the U.S. government and the U.S. industry need to work together to reopen that gap, probably by doing two things in parallel: preventing China, as much as they can, from getting its hands on U.S. technology, and on the other hand, figuring out more innovation, versus just more money, in order to accelerate the innovation on our side of the pond. And now we're going to dive into a lot of rapid-fire items. So first of all, there are new rumors of a new OpenAI round, presumably raising $40 billion at a $300 billion valuation, led by SoftBank. It's going to be part of the Stargate initiative that we mentioned last week, the collaboration between Oracle, SoftBank, and OpenAI. The pre-money valuation of OpenAI is going to be $260 billion and the post-money $300 billion.
That's a 73% increase from the $150 billion valuation they got just a few months ago. Now, to put things in perspective, despite all this "maybe we can do this faster, better, cheaper" madness that we saw from DeepSeek, and in addition to these crazy amounts, $500 billion from Stargate and $40 billion just for OpenAI, Microsoft has pledged $80 billion for AI in this fiscal year, and Meta has committed $65 billion for developing their AI. So the companies with really deep pockets are continuing to pour tens of billions of dollars into AI development to stay ahead in this race. On the flip side, it is very obvious that the competition from DeepSeek is pushing prices down. OpenAI just announced that o3-mini is going to be released and that it is going to be available to Plus-tier users as well as Pro users. So that's good news for all of us who are paying the 20 bucks a month. They also announced that they're going to make o1 available, in limited quantity, to the free tier as well. This comes only a couple of weeks after Sam Altman said that they're losing money on the Pro tier that costs $200 a month, because it gives unlimited access to o1. So what does that mean for o3? Well, it means they're going to lose money on o3 as well, and they're going to finance that with other people's money, just in order to be competitive with DeepSeek and its recent capabilities. In a similar move, Microsoft started making o1 available on the free tier of its Copilot chatbot. So you can now go to Microsoft Copilot and use the o1 reasoning model completely for free, under the Microsoft umbrella, again to compete with DeepSeek's capabilities, which are significantly cheaper to use than the U.S. models have been until now. Now, since we mentioned OpenAI and Microsoft, the whole Stargate situation, with OpenAI partnering with Oracle, puts Microsoft in an interesting scenario, right?
They have been the only and exclusive provider of cloud solutions and compute to OpenAI, and now they've lost that exclusivity. That might mean the relationship is not in the best place. But to remove all these concerns, Sam Altman posted a close-up selfie of him and Satya Nadella, and he wrote that the next phase of the Microsoft-OpenAI partnership is going to be much better than anyone is ready for. Sam is known for his somewhat cryptic messages on X, and this is just one of them, but it seems that they're both brewing something for the next phase of the partnership. And the Oracle situation potentially relieves some of the pressure on Microsoft to support OpenAI with all this money and infrastructure, when they might need it for other stuff as well. So maybe everything is good, and probably within a few weeks we will know what he meant. Now, OpenAI made a few updates to their products this past week. First of all, they're testing a new functionality that integrates longer-term memory with search. Think about the parallel of cookies in your browser, but for AI search. It basically allows the AI to remember what you've done before and combine that with search results. It's currently only running on the macOS desktop app, and you can turn it off if you wish: there's a toggle you can set in your privacy settings. But that's the direction they're going, or at least testing, right now. Another very interesting feature that I'm personally really happy about is that they've upgraded Canvas. Canvas is their interactive, collaborative environment within ChatGPT that gets triggered every time you're writing something, whether text or code, and you can trigger it by asking for it to use Canvas, which I do all the time. It now supports the o1 model, which it did not before; previously it was just the 4o family of models. And it also can render the results of HTML and React code.
So previously, if you wrote code in Canvas, you had to take that code and run it somewhere else to actually see the results, which was a big disadvantage compared to Claude Artifacts, Claude's similar solution, where when you generate code, you can actually see the output right there and then. Now that functionality is also available in Canvas, which is absolutely fantastic, and I was really hoping for that to happen. The only disadvantage is that it's currently only available on the macOS desktop app. If you have a Mac like me, awesome. If not, it will probably show up on the rest of the platforms eventually. I must say that one of the things that drives me crazy with ChatGPT in general is that every aspect of their product has different limitations and different capabilities that don't carry across. Things you can do in the regular chat, you cannot do in custom GPTs; things you can do in custom GPTs, you cannot do in Projects. Things you can do on the desktop app are different from what you can do on the web, and different from what you can do on the mobile app. It's all very confusing, and I don't know why they're doing it, because it's the same models running in the background. Hopefully they will get their product act together and give us access to everything, everywhere, just limited by the tier that we want to pay for; no matter how we access it across the different tools, it should work. I really hope somebody is listening and they're going to make it happen, because it's becoming very, very confusing. Now, as I mentioned, Sam also announced that o3-mini is coming out almost immediately, so in the next week or two we'll start getting access to o3-mini.
He also shared that they're going to release more agent-based tools, not just Operator, and these are coming down the pipeline sometime. He didn't give an exact timeline, but I assume within the next quarter if he's already talking about it, and there are rumors about a potential software engineering agent that is going to be baked into ChatGPT as well. So what are we seeing from OpenAI? Huge dreams, involvement with the top leadership all the way up to the president, partnerships with the biggest companies on the planet, and giving stuff away cheaper and cheaper because of competition. And in addition to upgrading the models, they're also upgrading the tooling to provide better access and usage. I'm a huge believer that the quote-unquote tooling, the way we access these models, makes a huge difference. I absolutely love working in Canvas, which drove me to use ChatGPT more than Claude, which was the opposite before Canvas was available. So I think it's the right direction: figuring out how to make it more user-friendly and more applicable to more use cases. Now, before we move on from OpenAI, there's been some negative news about OpenAI this week from Gary Marcus, an AI expert who compared OpenAI to Theranos. For those of you who don't remember, Theranos was the medical company that made claims about what they could do, and it was found that it was all fake, and billions of dollars of investors' money went to the trash. Now, I don't think it's that bad with OpenAI, but the reason he's making the claim is that apparently all the information they shared about the success of o3 on the Epoch AI FrontierMath benchmark happened because they had access to the benchmark's data beforehand. So basically, they were cheating on the test, and they were the only people who actually got to test the model.
So he's claiming that the fact that they were cheating on the test, and the fact that they were the only ones in the room, shows that they're not sincere about the actual results: claiming 25 percent accuracy on the test versus 2%, which was the best any AI had scored before. Well, very soon we're going to find out, because they're going to release o3 to the world, and then we'll be able to see if the gap between o1 and o3 is really as wide as they've been claiming. And as I mentioned, we'll know very, very soon. Another interesting piece of news about models is that Google Gemini 2.0 Flash Thinking is now the top dog in the AI model ranking, also known as the Chatbot Arena. This model expanded the context window to 1 million tokens, which is by far the largest in any thinking model out there. They added code execution capabilities that did not exist before, and they improved the thinking-response alignment. It's a very capable model. I'm playing with it more and more, and I actually find it very useful and helpful across multiple use cases, and that is reflected in the Chatbot Arena. So since we've talked about many models in this past half hour, let's see how the table stands as of the recording of this episode, on Friday, January 31st. In first place, we have Gemini 2.0 Flash Thinking; in second place, Gemini Experimental 1206; in third place, ChatGPT-4o latest, which was released on November 20th; then DeepSeek R1. These four are relatively close together in the ranking. Immediately after that, Gemini 2.0 Flash, and then o1, followed by o1-preview, followed by DeepSeek V3. So DeepSeek has both of its models in the top 10, and then there are a lot of other models. The really interesting thing is that Google is holding the first two spots, plus the fifth position. Now, if you remember, in early 2024 Google had one failure after another in releasing different models.
And I told you: you cannot disqualify Google from this race, because Google has everything they need to be highly successful. They have an endless amount of cash. They have access to compute. They have potentially the largest distribution channel in the world. They have deep expertise in AI and deep expertise in search. And they have more data than potentially anybody else on the planet, because of the Google search engine and its crawling of the web, as well as YouTube. So they really have all the components to make this successful, and it was very obvious to me that it was just a matter of time until they take the top positions. At least right now, that has turned out to be correct. Now, Google released a few additional cool things this past week. First of all, they've dramatically upgraded the Gemini side panel within Google Sheets, and it can now write Python code to do analysis and visualization within Sheets. This has been more or less the promise since the beginning, both for Copilot in Excel and for Gemini in Google Sheets: you'll be able to ask questions about data that you have and get immediate, magical answers without knowing how to actually run a spreadsheet properly. And that was not the case until now. Copilot has made some moves in the right direction, but you still need very, very specific formats. Now Gemini has made something very dramatic possible. You could have done this before, both in ChatGPT with its data analysis capabilities and in the Gemini chat, but you couldn't do it within Google Sheets itself, or Excel, and now you can. I've actually tested it on several different things, and I'm going to create a post about it on LinkedIn so you can follow and see what I've done. You can open a large spreadsheet and ask a question, and it's going to generate charts for you based on the questions that you're asking, and you don't have to know anything about the data or how to manipulate it.
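To make that concrete, here's a minimal sketch, in plain Python, of the kind of analysis code an AI side panel might generate behind the scenes for a question like "what's the total revenue per region?" The column names and data are hypothetical, and this is an illustration of the pattern, not Google's actual generated code:

```python
import csv
import io
from collections import defaultdict

# Stand-in for the spreadsheet data the assistant would read from the sheet.
SHEET_CSV = """region,revenue
North,1200
South,800
North,300
West,500
South,200
"""

def revenue_per_region(csv_text: str) -> dict:
    """Aggregate a revenue column by a region column, the kind of
    grouping an AI side panel would write before charting the result."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["region"]] += float(row["revenue"])
    return dict(totals)

print(revenue_per_region(SHEET_CSV))
# {'North': 1500.0, 'South': 1000.0, 'West': 500.0}
```

The feature's value is exactly that you never see this code: you ask the question in plain English, and the generated chart comes back inside the sheet.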
The only disadvantage is that it's not really generating a native Google Sheets chart where you can go and change the colors or the font or whatever. It generates an image of the chart that is not editable. But I still find it very, very useful, and I also think that's going to change. Hopefully they will eventually create the data behind it, meaning the pivot table or whatever other table feeds it, as well as a chart or graph that you can actually edit. I assume this is just a step in that direction, but overall it's a very cool and very useful feature if you're doing a lot of data analysis and you're a Google user. Another very interesting, and troubling, feature that Google is testing is called Ask for Me, which is the capability of Google to literally call local businesses on your behalf to make inquiries about their services. The current feature, which I'm sure is going to be expanded as it gets better and better, is an automated calling system that calls local businesses to ask about the availability of services and specific timing, and it can report back either through email or through phone. What it does is call, let's say, the local barbershop on your behalf to see what kind of services they provide and, if relevant, what availability they have, and it can set up the appointment for you. It does disclose that it is an AI system when it calls these businesses, and businesses can opt out through their Google Business Profile. But this is another step into the agentic future that we're walking into, where we won't talk to businesses and won't set our own appointments; it will be done for us by these agents. This is just one small step in that direction. I find it really cool on one hand and really troubling on the other. But again, what I think about it doesn't matter. That's the direction everything's going.
We've talked about many models so far, and I'm not going to go deep on this one because it's not formally happening yet, but xAI, Elon Musk's company, accidentally released Grok 3 to several X users. It disappeared shortly after, but in that time, people already started sharing that it has improved logical reasoning and better AI coding capabilities. It also incorporates court filing data for legal understanding, which is very interesting because it opens up a whole set of use cases if you're in the legal world. Just as a quick reminder, this new model was trained on the largest GPU cluster on the planet, 100,000 GPUs, in their Memphis data center. Now, what is the actual release and launch timeline going to be? Nobody knows, but since it's already been pre-released by mistake to some X users, we should expect it to be released shortly, most likely in February, so within the next few weeks. Now, there are a few interesting robotics updates in this rapid-fire segment. The world's first half marathon that pits human runners against robots is going to happen in April in China this year. The requirements for the robots are a humanoid appearance and bipedal movement, so basically legs and not wheels. They can be between half a meter and two meters tall, so between two feet and somewhere between six and seven feet, and both remote-controlled and autonomous robots are allowed, and they can charge or change their batteries during the race. This comes after the Raibo 2 robot successfully completed a full marathon in South Korea in just over four hours. This is a remarkable achievement, and it shows how fast, literally, the progress is happening in the robotics field. In other robot news, another Chinese company, called UBTech, is planning mass production of its industrial humanoid robot. They're targeting 1,000 units to be manufactured by the end of this year.
All these robots are for industrial use, and some really large international corporations are already buying these robots from them. Now, combine this piece of news with other news that I've shared with you in the past about several other companies, such as Figure and Tesla, already having their models work in several different factories as test cases. It tells you that there's a question that nobody seems to be asking, which is safety: how do we control these robots and make sure that they're not damaging other machines, but more importantly, that they're not endangering the humans who are working in those facilities? The related news on that topic is that Figure AI, one of the leading robot development companies, is creating a Center for the Advancement of Humanoid Safety, and it's going to be led by an ex-Amazon robotics safety engineer, Rob Gruendel. What they're saying is that there are no specific instructions from OSHA, the Occupational Safety and Health Administration, nor any standards for humanoid robots, despite the fact, like I said, that they've already been deployed in Amazon, Mercedes, BMW, and other factories around the world. So their goal is to create transparency, publish reports, and define procedures for detecting hazardous scenarios, focusing on the stability of these robots as well as human and pet detection, AI behaviors, and navigation safety. Now, I've mentioned this in previous episodes, but I'm going to mention it again: A, because I'm a geek and I love it, but B, because I really think we're already there. As a teenager, I loved Isaac Asimov's books, and he has his Robot series. In the Robot series, he talks about the Three Laws of Robotics and how they're set in order to keep humans safe in a world where robots roam everywhere. The first law is that a robot cannot harm a human or allow a human to come to harm.
The second law is that a robot must obey a human's orders unless those orders conflict with the first law. And the third is that a robot must protect itself, as long as that doesn't conflict with the first or the second law. I think that something like this is actually becoming really required. Now, how do you actually program robots to follow these laws? That's a whole different question. But having some kind of prime directive for all robots around the world is something we will need to start thinking about, especially five to ten years from now, when they're going to be practically everywhere. Two more really interesting pieces of news: one is very practical and tactical, and the other is really geeky and somewhat scary. The first one is that Zapier introduced AI Agents in beta, accessible through agents.zapier.com. It's basically an expansion of Zapier Central, which they released months ago, but what it enables you to do is create agents that work quasi-autonomously while leveraging Zapier's connections to over 7,000 apps that are already integrated. The idea is that you can give it instructions and it will figure out how to carry them out across the multiple platforms it is connected to. It also has some enhanced capabilities compared with the core Zapier platform, such as web browsing and Chrome extension support, and obviously the connection to live data across all the different platforms it's connected to. I think that's a very promising approach by Zapier. A similar approach was taken by n8n, which is the open source, more geeky version of Zapier. And I definitely think that's a move in the right direction. It will allow users to create a lot more sophisticated things, with much higher levels of autonomy and significantly less expertise, while building on Zapier's existing infrastructure, as well as their existing distribution, to benefit from this.
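To illustrate the pattern these agent products are built on, here's a toy sketch of an agent dispatching steps to connected "app" tools. Every name here is hypothetical; this is not Zapier's actual API, and a real agent would have an LLM decide the steps on the fly rather than executing a fixed plan:

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry of connected "apps" -- in a real product this
# would be thousands of integrations, each wrapping a live API.
TOOLS: Dict[str, Callable[[str], str]] = {
    "calendar": lambda arg: f"created event: {arg}",
    "email": lambda arg: f"sent email: {arg}",
}

def run_agent(plan: List[Tuple[str, str]]) -> List[str]:
    """Execute a plan of (tool, argument) steps, routing each step
    to the matching connected app and collecting the results."""
    results = []
    for tool_name, arg in plan:
        tool = TOOLS.get(tool_name)
        if tool is None:
            results.append(f"unknown tool: {tool_name}")
        else:
            results.append(tool(arg))
    return results

print(run_agent([("calendar", "demo call Friday 10am"),
                 ("email", "confirmation to client")]))
# ['created event: demo call Friday 10am', 'sent email: confirmation to client']
```

The interesting part of the real products is everything this sketch omits: the model translating your plain-English instruction into the plan, and the live connections behind each tool.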
So I definitely see this as an interesting move by Zapier. And then there's an interesting piece of research, which again is somewhat troubling: Stanford researchers were able to create AI agents that mirror real human personalities with 85 percent accuracy. The way they did this is that they successfully simulated over a thousand individual personalities after having large language models interview those people. So you get interviewed by a large language model that can detect multiple aspects of your personality, and then it can more or less replicate your personality and more or less be your doppelganger across multiple aspects of what you can do. It will behave like you, think like you, solve problems like you, and so on. On one hand, this is really exciting. On the other hand, as I mentioned, this is really scary, like many other aspects of the AI world. That's it for this week. I know this was a longer episode, but the amount of stuff we had to cover was just enormous. It will be very interesting to keep following the race between China and the U.S., between open source and closed source, between investing billions and potentially just investing millions, and so on and so forth. I will obviously keep you updated as this moves forward. Enjoy the rest of your weekend, keep on exploring AI, and if you haven't done it yet, please rate this podcast and share it with other people who can benefit from it. That's your way to help educate more people about AI and where it's going. I will really thank you if you can do that. And until next time, have an awesome weekend.