Leveraging AI

75 | Claude takes the top spot in AI, Microsoft announces AI-PC and more AI news for March 30

March 30, 2024 · Isar Meitis · Season 1, Episode 75

In this episode of Leveraging AI, Isar Meitis covers the rapid evolution of AI technologies and their impact on various industries in this week's AI news.


About Leveraging AI

If you’ve enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!


Hello and welcome to a news weekend edition of the Leveraging AI podcast, the podcast that shares practical, ethical ways to improve your efficiency with AI, grow your business, and advance your career. This is Isar Meitis, your host. Before we dive into the news, I want to share a quick summary of thoughts that I've crystallized based on this week's news, and really all the news we've shared in the past few weeks.

The first thing is that we're getting more and more models, from more and more companies, that are getting bigger and bigger. The good news is that these models can do more and more things, and the fact that there is competition is going to drive costs down. That is happening in combination with the fact that adoption is still in its infancy, meaning adoption of these models is going to grow, and grow dramatically, whether by simple day-to-day users or through large-scale usage in corporations training models on their data and building new applications around them. As an example, this past week I drove down to Fort Lauderdale to lead a hackathon for an industry organization, and it was amazing. Two things stood out. One was the level of results and progress that groups were able to make in creating actual, useful, business-value-generating applications within a single day. The other was how few people actually know how to use tools like large language models, image generators, and so on. So as I mentioned, we are still in very early stages, meaning we are going to build bigger and bigger models and use them for more and more things, in a very dramatic way, in the immediate future.

The problem with that, or one of the problems, is that all of these things, including training models and using models, consume a huge amount of power. In an article this week, we're going to talk about the fact that AI is already consuming as much power as a small country, and that is only expected to grow. More on that in the actual news. But this is driving the development of new, more efficient ways to train and run models, which is actually very good news, and there are two examples of that in today's news.

The other thing that is becoming more and more evident is that it's very hard to fight the big behemoths, the Microsofts and OpenAIs of the world, in this race. That obviously drives consolidation on one end, but also a lot of movement in senior leadership, by the way, in both directions. We see examples of people leaving smaller companies for bigger companies, because the opportunity there is obvious, but there's also the flip side: people who are leaving and looking for ways to keep the open source world alive. So far, not successfully from a results perspective, and we're going to talk more about that in the news themselves.

Before we dive in, I will mention that if you're listening to this on either Saturday, March 30th, or Sunday, March 31st, you still have an opportunity to join our April 1st cohort of the AI Business Transformation course. It's a course we have been teaching to hundreds of people since April, and it literally transforms businesses.
So if you want to join, look for the course registration link in the show notes, click on it, and you can register and join us this Monday. If you're listening to this afterwards, I'm not sure when the next public course is going to be, because we're booked with private courses that we're teaching to organizations and companies for at least the next three months. And now let's dive into this week's news.

The first piece of news I want to dive into is actually something we mentioned last week, which is NVIDIA's conference. As I mentioned earlier, I had a long drive from Orlando to Fort Lauderdale, and on that drive I listened to the entire keynote by Jensen Huang, the CEO of NVIDIA, from their conference that happened just over a week ago. I highly recommend that anyone who's interested in seeing where our world is going listen to that keynote. It's a very long keynote, just over two hours, but he really shares everything they're doing right now, his thoughts about the future, what it will enable, and so on.

I must say, I had two reactions to what they announced. They announced some amazing things, like their Blackwell GPU architecture, which will drive all their products in the future and delivers incredible efficiencies and results, as well as connectivity between GPUs and AI platforms that we never had before, enabling things in AI that were just not possible before, at a much larger scale than ever. So from a technical perspective, I was blown away by what they're able to achieve and where they're going. But I couldn't stop thinking about Skynet, and I literally went and pulled up the Skynet commercial from the first Terminator movie. There are too many parallels: talking only about the good stuff, how this is amazing, how it's going to solve global warming and world hunger and clean energy and all these great things, without mentioning any of the potential negative implications. And I'm putting aside robots that may or may not destroy the planet; I'm talking about a lot of other negative implications these tools may have. I feel that the companies driving this train forward are driving it faster and faster, with huge resources that are promising even bigger returns, while either ignoring or at least downplaying the potential negative aspects. And that is not a good sign. But as I mentioned, don't take it from me, go listen or watch. If you're not driving, you can play the two-hour keynote at one-and-a-half speed, and then it goes a little faster. If you have any technical curiosity in you, it's definitely worth your time.

As I mentioned, the explosion in capabilities that NVIDIA is promising also obviously means that more data centers will be built. That being said, their new chips are significantly more energy efficient at what they do compared to the previous generation. I still don't think that's going to solve the problem on its own. The International Energy Agency, the IEA, estimates that data centers, cryptocurrency, and AI represented almost 2 percent of global energy demand in 2022, and that this is going to double to about 4 percent by 2026. I think they're underestimating the pace.
And I think the new report, whenever it comes out, will show that it's growing a lot faster. Just as an example of how much energy this requires: running generative AI data centers takes 30 to 40 times more energy than running traditional computing for similar tasks. And this is just the beginning, because so far most of these models have been language models; image generation is still nowhere near the same scale of usage. Following very quickly, and we've talked about this in several shows, is video generation, which is image generation on steroids, because you have to generate every single frame. So the requirements for these very energy-consuming data centers are going to grow exponentially as we start generating more and more video with AI, which will start this year. To make this even worse, these data centers require a reliable, consistent source of energy, meaning the vast majority of them run on non-renewable, carbon-based energy, with a lot of negative impact on the environment.

But as I mentioned in several different episodes, a lot of researchers are looking for ways to make both the training and the execution of these models more efficient. This week, researchers from MIT released a new way to create images that they're calling distribution matching distillation, or DMD for short. It's a technique that enables creating images like Midjourney, DALL-E, and Stable Diffusion do, but without using a diffusion model, which has been the common way to create images so far and is what the tools I just mentioned use. This new framework basically eliminates the multi-step process, usually around 100 steps, used to create an image, and generates the image in a single step. Not only that, it keeps the same quality of output as the existing diffusion models. This means two very important things. One, it is extremely fast compared to existing solutions: in their tests, generating the same image on a diffusion model took 2.59 seconds, while running it on their DMD model took just 90 milliseconds, roughly 28 times faster. Two, it obviously dramatically reduces the computation power required to generate images, and hence the energy. Combine that with what I said before, that we're going to be generating several orders of magnitude more AI-based video this year than ever before, and that it's going to keep exploding as this technology moves forward, and having the ability to generate images faster with less energy is a huge benefit, both to the creators and to the environment.
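To make the one-step-versus-multi-step difference concrete, here is a toy sketch in Python/PyTorch. This is not MIT's DMD implementation; the models are stand-in layers with hypothetical names and the sampling loop is purely illustrative. DMD's actual contribution is the distillation objective that trains the one-step generator to match the diffusion model's output distribution.

```python
# Toy sketch of why one-step generation is so much cheaper than iterative
# diffusion sampling. NOT MIT's actual DMD code: the networks are stand-in
# linear layers and the denoising update is illustrative only.
import torch
import torch.nn as nn

DIM = 64  # toy latent size

teacher = nn.Linear(DIM, DIM)   # stand-in for a trained diffusion denoiser
student = nn.Linear(DIM, DIM)   # stand-in for a DMD-distilled generator

def diffusion_sample(steps: int = 100) -> torch.Tensor:
    """Classic diffusion sampling: ~100 sequential network evaluations."""
    x = torch.randn(1, DIM)            # start from pure noise
    for _ in range(steps):
        x = x - 0.01 * teacher(x)      # toy denoising update per step
    return x

def one_step_sample() -> torch.Tensor:
    """Distilled sampling: a single forward pass, noise -> image."""
    return student(torch.randn(1, DIM))

# 100 forward passes versus 1 is the source of the roughly 28x wall-clock
# speedup (2.59 s vs. 90 ms) reported in the episode.
print(diffusion_sample().shape, one_step_sample().shape)
```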
And with that, let's jump as usual to some of the big giants. Microsoft made several announcements this week, some of them together with its suppliers. Microsoft is pushing very aggressively to create what they call an AI PC. What is an AI PC? An AI PC is a PC that can run AI capabilities on board, so basically run many of the Copilot functions on the computer itself rather than in the cloud, which is the way it is done right now. That requires a whole new set of chips. Currently, Microsoft PCs use chips mostly from Intel, AMD, and Qualcomm. This past week, Microsoft released its minimum requirements to run the Copilot tools locally on the PC. One of the requirements is an NPU that runs at 40 TOPS.

So what is an NPU, and what the hell is 40 TOPS? An NPU is a neural processing unit. Traditional computers have a CPU, a central processing unit. Many new computers also have a GPU, a graphics processing unit, which is used a lot for graphics processing but has also been the driver behind the explosion in AI capabilities, hence NVIDIA's crazy growth, because they make GPUs. Now the chipset will also include an NPU, which is, as I mentioned, a neural processing unit, specialized in running these models. The idea here is obviously not to take over CPU and GPU time when running AI capabilities, meaning it's not going to slow down everything else the computer is doing, or the graphics it's rendering; the AI will run on its own dedicated component. So what the hell are TOPS? TOPS is trillion operations per second, basically a measurement of how much computation these chips can do in a single second. So the requirement Microsoft has put to the chip manufacturers is an NPU that reaches at least 40 TOPS, meaning 40 trillion operations per second, as the minimum. Now, that doesn't tell us anything unless we look at current chips. Intel's Meteor Lake NPU, the chip they're selling right now, offers 10 TOPS. AMD's Ryzen provides 16 TOPS. Both fall well short of the 40 TOPS requirement. Qualcomm is the only company that already sells chips surpassing it, with their X Elite chips at 45 TOPS. Intel is already working on chips that will do the same, and there's no doubt in my mind that AMD will as well.
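As a quick back-of-the-envelope check, here are those chip figures in a tiny Python snippet. The numbers are exactly the ones just cited, treated as reported figures rather than a spec sheet.

```python
# NPU throughput figures cited above, checked against Microsoft's
# reported 40 TOPS (trillion operations per second) requirement.
REQUIRED_TOPS = 40

npu_tops = {
    "Intel Meteor Lake": 10,
    "AMD Ryzen": 16,
    "Qualcomm Snapdragon X Elite": 45,
}

for chip, tops in npu_tops.items():
    verdict = "meets" if tops >= REQUIRED_TOPS else "falls short of"
    print(f"{chip}: {tops} TOPS -> {verdict} the {REQUIRED_TOPS} TOPS bar")
```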
Now, why does this matter? It matters because it means the next generation of PCs will be able to run very sophisticated AI operations on board the computer itself. That provides three very important benefits. Benefit number one is speed, because the machine doesn't have to send information back and forth to the cloud in order to present results to you. Benefit number two is that it opens the opportunity for multiple software houses to develop applications that run locally, customized to very specific needs of very specific users, which will allow an explosion of capabilities even beyond what we're seeing today. And benefit number three is obviously privacy, because you're not sending any data to the cloud; you're running everything locally on your computer. All very good news. There is zero doubt in my mind that Apple will follow and Macs will do similar things, and probably Google with their Chromebooks as well. So the direction is very clear: we will be able to run more and more highly capable AI operations locally on our computers, most likely completely integrated with the operating system and all the applications we run.

Microsoft also announced new AI capabilities for Teams. Copilot for Teams will be able to combine inputs from multiple sources, so Microsoft chat, Microsoft Teams meetings, and even phone calls, into a single interface that lets you see everything related to a past meeting, or in preparation for a future meeting, because it will know how to aggregate all the transcribed and summarized information from basically every conversation you have. They even announced integrations with AT&T, Odido, Virgin Media O2, Vodafone, and several other international carriers, to let users get a Teams phone number they can use on their device, so that their calls integrate into that entire ecosystem. And they're adding speaker voice recognition and face recognition, so however you're running Teams, even in conference rooms, it will be able to identify who is speaking and do the summarization, and hence the data collection, more accurately. As with everything, this is amazing news as far as efficiency, preparation, and meeting results, but it's also really scary, because it means Microsoft will have access to a lot more of what we're doing in our meetings, and will literally know every business-related conversation we're having.

Now, speaking of Microsoft: I shared with you last week that Mustafa Suleyman, the CEO and co-founder of Inflection, is moving with some additional senior leaders from Inflection to Microsoft. This relates to what I said at the beginning of this episode about people moving from smaller organizations to bigger organizations, where they will have a lot more impact. But just a couple of days after that, Microsoft said that Mikhail Parakhin, who led Microsoft Bing's search engine development and advertising business, had decided to, quote unquote, explore other roles, and that Pavan Davuluri, who previously led Microsoft's Surface work, will now run both the Windows and Surface teams as part of a leadership change. So what does that tell you? It tells you that in order to achieve its goals, which is to become everything AI across everything we're doing in a more integrated way, Microsoft is shuffling not just the entire industry but also people within Microsoft. I expect this trend to continue. I think we're going to see a huge fight for senior talent, and even lower-level talent, in order to keep this AI train moving faster and faster. So while this is just from this week, don't expect it to be the end: I think we'll see more and more announcements of senior leaders and senior AI researchers jumping ship, either into or out of these large organizations.

And staying on the same topic, Emad Mostaque, who was until recently the CEO of Stability AI, the company behind Stable Diffusion, an open source image generator and now video generation as well, has left the company after a lot of quote unquote issues with leadership, performance, and finances. The company has generated amazing products; Stable Diffusion 3 is absolutely amazing, and it powers a lot of other applications, like Leonardo and several others. So from a business perspective, Stability AI is not in a good place, while from a technological perspective I think they're doing really well. What does that mean for their future? I don't really know. But why is this related to the previous news? Because this week Mostaque released a selfie of himself on a video chat with Satya Nadella, the CEO of Microsoft. He did not share what they talked about, but he did share that they talked. He did say that he's pursuing decentralized AI, which hints that he will potentially not join Microsoft.
But just the fact that these kinds of conversations are happening tells you what's going on behind the scenes in the politics of this gigantic and growing industry.

Another major player that made a big announcement this week is Databricks. Databricks is a company built on an open source community platform that has been providing database, data management, and AI tools to many companies for over 10 years. They've just released a model called DBRX, with 132 billion parameters, that runs on top of their existing architecture, allowing companies who use Databricks infrastructure to develop better, faster, more capable AI solutions on their existing platform. It currently outperforms the other leading open source models, such as Llama 2 70B and Mixtral, which are the two leading open source models right now, on benchmarks such as language understanding, programming, math, and several others. And they're claiming it's significantly more capable than GPT-3.5 at a fraction of the cost.

The interesting innovation here is that they've used the concept of mixture of experts in their main architecture, which uses 16 expert sub-models dynamically at every step of the token generation process, which means they can generate things a lot faster. So let me explain this for a second. GPT-4 and even Claude use the entire language model to generate everything, meaning it is a huge effort to generate every single token, every single word they put out, based on the instructions or prompts you give them. The new architectures being implemented use what's called mixture of experts, meaning there is a manager component that looks at what the answer needs to be and what topic it is, and then fires only the parts of the neural network that deal with that particular topic. This makes the process significantly more efficient and faster, and the new way Databricks has implemented it is, they say, even better and faster than the standard mixture-of-experts approach that has existed so far. This connects directly to what I mentioned earlier: everybody is looking for better ways to run these models in order to make them more compute efficient, which helps with speed, cost, scalability, and energy consumption.

Now, it's obviously not a surprise that Databricks has moved in this direction. They were the company to go to for hosting data and creating AI models on top of it, until the recent rise of AI models from everybody, and now basically every cloud storage provider, like Amazon, Microsoft, and Google, offers these kinds of capabilities on their platform. So Databricks had to step up their game. The fact that they're releasing their model as open source is obviously meant to attract additional development from the huge community of AI developers out there, but also to attract talent, by showing that they will keep sharing and that they're at the tip of the spear of the open source world. Now we just have to see how many Databricks users actually adopt this new capability and what kind of value they can drive from it, or whether they start leaving for other platforms such as Google Cloud, Azure, and AWS.
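To make the mixture-of-experts idea concrete, here is a minimal, hypothetical sketch in Python/PyTorch. This is not DBRX's actual code, and the dimensions are toy-sized (DBRX reportedly routes each token to 4 of its 16 experts, which the toy numbers here only mirror). It just shows the mechanic described above: a small gating network scores every expert for each token, and only the top-scoring few actually run.

```python
# Minimal, hypothetical mixture-of-experts layer. Not DBRX's implementation:
# the point is the routing mechanic, where a gating "manager" scores all
# experts per token but only the top-k experts execute, so most parameters
# stay idle on any given generation step.
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, dim: int = 64, n_experts: int = 16, top_k: int = 4):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts)  # the "manager" / router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (tokens, dim)
        scores = self.gate(x)                              # (tokens, n_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)  # pick top-k experts
        weights = weights.softmax(dim=-1)                  # normalize their mix
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for slot in range(self.top_k):
                mask = chosen[:, slot] == e                # tokens routed here
                if mask.any():                             # run expert only if used
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

moe = TinyMoE()
tokens = torch.randn(8, 64)   # a batch of 8 token embeddings
print(moe(tokens).shape)      # torch.Size([8, 64])
```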
Since we're speaking about models and how they rank against each other, let's continue by talking about Claude 3. Claude 3 was released a few weeks ago in three different tiers, Opus, Sonnet, and Haiku, with Opus being the biggest one, Sonnet the middle, and Haiku the smallest. And as of this week, for the first time, Claude 3 Opus, the largest model, has passed GPT-4 on Chatbot Arena. For those of you who don't know about Chatbot Arena, we've talked about it in several episodes before: it's run by LMSYS, the Large Model Systems Organization, and what they do is let you run your prompts as a blind test on two different models, and you rank which model performed the job better. This allows them to compare models without users being biased one way or another, and to really rank the models based on actual results across a gazillion variations of different tasks, because it's open to the public and anybody can go there, run these tests, and provide feedback.

So this was the first time since the release of GPT-4 that any model has actually passed it in the rankings, pushing GPT-4 to number two. That being said, it's by a very small margin, and it pits Claude 3, a brand new model, against a model that was released about a year ago by OpenAI, who are just about to release GPT-5, or GPT-4.5, or whatever they're going to call the next model, which is supposed to be significantly better than GPT-4. So I don't expect Claude 3 to stay at the top of the list for long. That being said, all three Claude 3 models rank in the top 10 on the leaderboard, which is very impressive, including for the smallest model. But the more interesting fact is that the top 20 large language models on Chatbot Arena are predominantly proprietary, suggesting the open source models are still far behind, because the top of the leaderboard is run by closed source, big players.

What does that tell us? It tells us that, at the end of the day, access to more capital, more compute, more talent, and more data wins in this field, and the chances of the open source companies winning the large language model game are not very high. That being said, I think we'll eventually get to a point of diminishing returns, meaning sometime in the next two years the open source models are going to be good enough for most tasks. And while they won't be better than the closed source models from Microsoft, Google, OpenAI, et cetera, they'll be good enough, and in many cases free or dramatically cheaper, which may then undermine the financial incentive of these huge giants to develop bigger and better models. Which means it actually puts the bigger players at greater risk than the smaller players. Again, this is just me making assumptions; I obviously don't know what the future will tell, but this is my analysis of what's currently happening in this field.
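For intuition on how those blind pairwise votes become a leaderboard, here is a minimal Elo-style update in Python. LMSYS has described Chatbot Arena's rankings in terms of Elo/Bradley-Terry-style ratings, but this sketch, with hypothetical model names and votes, is only an illustration of the idea, not their actual pipeline.

```python
# Minimal Elo-style rating update from blind pairwise votes: how
# "which answer was better?" clicks can become a ranking.
# Illustrative only; not LMSYS's actual methodology.
def elo_update(r_winner: float, r_loser: float, k: float = 32.0):
    # Expected win probability of the winner given current ratings.
    expected = 1.0 / (1.0 + 10 ** ((r_loser - r_winner) / 400.0))
    delta = k * (1.0 - expected)   # upsets move ratings more
    return r_winner + delta, r_loser - delta

ratings = {"model_a": 1000.0, "model_b": 1000.0}  # hypothetical models

# Hypothetical vote stream: (winner, loser) pairs from blind comparisons.
votes = [("model_a", "model_b")] * 3 + [("model_b", "model_a")]
for winner, loser in votes:
    ratings[winner], ratings[loser] = elo_update(ratings[winner], ratings[loser])

print(sorted(ratings.items(), key=lambda kv: -kv[1]))  # model_a ends on top
```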
Staying on the topic of large behemoths and their AI announcements, one player that has been surprisingly quiet is Apple. Apple finally announced the date for WWDC, the annual developer conference they hold every year, and they've started hinting at AI capabilities, which they haven't done so far. The current expectation is that the new iPhone 16, and the new operating system, iOS 18, will have a lot of AI built in. As always with Apple, they've released very little information about what will be possible. But from a lot of hints we're getting, it seems Apple is going to go in two separate directions: there will be a lot of on-device AI capabilities running on their own chips, in combination with cloud-based AI processing that, it seems, is not going to be done by Apple. Which means Apple will probably run everything that runs on the device themselves, and send the cloud processing to some other player. So in a weird twist of events, it looks like Apple is going to send AI cloud functionality to Google Gemini, at least everywhere other than China, where they have been in conversations with Baidu to do the same thing over there, because Google is not allowed to operate in China. If this is what actually happens, it's obviously a very interesting twist, because Apple and Google are the biggest competitors in the mobile world, and Apple handing some of its capabilities to Google is surprising, to say the least. From Google's perspective, if this actually happens, it's going to give a huge boost to Gemini, because tens of millions of iPhone users are going to start using Gemini on a daily basis. Is this actually what's going to happen? Time will tell, but we'll know very shortly, once WWDC actually happens.

On a completely different topic, the U.S. Department of the Treasury just released a report on managing AI-specific cybersecurity risks in the financial services sector. The report shows a widening gap between the current capabilities of the financial sector and the risks that AI brings to the table, and it discusses these gaps from several different angles. One gap is between large and small financial institutions in their ability to adapt to the current risks, and even to adopt AI technology at all, with the bigger players obviously having significantly more resources, meaning the smaller ones are going to be significantly more exposed. The report also says there is not enough data sharing among companies on how to deal with AI implementation and its risks, especially in the field of fraud prevention. And it says there is a serious need right now to develop new frameworks, best practices, and additional research and development across all aspects of financial services in order to deal with the advancement of AI. This is obviously not good news, and I expect this gap to widen, at least in the immediate future, because right now a lot more resources are going into the development of AI than into dealing with its negative implications. I really hope the government will step in at a certain point and put limitations or budgets, or both, in place to deal with the new situation, because otherwise a catastrophe is brewing, and really bad things can happen, because AI capabilities will drive things that the financial sector, and many other sectors, do not have the means to deal with.

And if you think we can go a full week of AI news without talking about OpenAI, it hasn't happened so far, and it's not going to happen this week either. TechCrunch released research about the GPT Store. For those of you who don't know what the GPT Store is: GPTs are mini applications that anybody can develop, including you, if you're paying the 20 dollars a month for the paid version of ChatGPT, and you can release them into the wild on the GPT Store.
What TechCrunch is saying is that the GPT Store is getting filled with many GPTs that actually break a lot of rules, across multiple areas. As an example, some GPTs claim the ability to defeat the AI content detectors currently used by schools and universities. Now, I'm not going to dive into the whole topic of whether it's okay or not okay to try to detect what students are generating with these tools. I personally think it's stupid, because if anything, universities and schools need to teach students how to use these tools, since their goal is to prepare students for life, and they're going to use these tools in life. But putting that aside for a second: developing these kinds of tools goes against OpenAI's own policy. Some other GPTs help users circumvent the limitations OpenAI has put on what content you can generate, basically jailbreaking ChatGPT to do things it's not supposed to do and that were presumably blocked by OpenAI, which obviously doesn't make a lot of sense. Other GPTs impersonate figures like Elon Musk, Donald Trump, Leonardo DiCaprio, and Barack Obama, which, again, goes against the regulations and rules that OpenAI themselves have defined. And there are a lot of other examples of, quote unquote, bad GPTs. Add to that the fact that a huge number of these GPTs are pure junk: they're not really helpful, they don't do anything useful, and you can't really use them for anything productive, but you may spend a lot of time trying to figure out what they do. So overall, it seems the GPT Store is currently the Wild West. Despite the fact that OpenAI has teams in place that are supposedly monitoring what's happening and approving different apps, that doesn't seem to be the situation right now.

That being said, there are some GPTs that are absolutely incredible in the results they can generate, and developing your own GPTs based on the needs of your business is extremely valuable. Some of this we did in the hackathon in the past few days, which, again, generated amazing value for the companies that are going to use those GPTs. When I say amazing value, I'll give an example: some of them help with proposal generation and reduce the effort from a few days to minutes, or at most an hour. So if saving a few days in a process you run every single week is not significant, I don't know what is. So GPTs in general are awesome, and some of the existing GPTs on the GPT Store are absolutely mind-blowing and amazing in their capabilities, but OpenAI definitely needs to do a better job monitoring and curating what goes on the GPT Store. Otherwise it will very quickly lose its value, because it's going to be very hard to find things that are actually useful in an ocean of junk and bad GPTs.

If you enjoyed this episode and you've been enjoying this podcast, I would really appreciate it if you do two things. One, open whatever platform you listen to this podcast on and give us a review. And two, click on share and share it with a few people you know. Yes, right now: pull out your phone, unless you're driving. Don't do this if you're driving. But if you can pull out your phone safely right now, click the share button next to this podcast and share it with a few people you think could benefit from it. We're not investing money in marketing.
We're investing all of our resources in providing you as much value as possible. So if you want to help your friends or colleagues learn more about AI, and you think this podcast is a good way to do that, we would really appreciate you sharing it with them. On Tuesday, we'll be back with an amazing episode, interviewing an entrepreneur who is transforming her entire business, a large-scale business with a lot of employees in multiple states, with AI, saving her a huge amount of money. And she's going to share exactly how she's doing it. So don't miss this Tuesday's episode. By the way, if you didn't listen to the previous Tuesday's episode with EZ, she shares exactly how she automated her business, cutting her workforce by 75 percent while tripling her revenue in a single day. Both of these episodes are highly recommended if you want to learn about actual implementations of generative AI in your business. And as I mentioned, if you're listening to this on March 30th or March 31st, you can still join our AI Business Transformation course that starts on April 1st. And until next time, have an amazing rest of your week.