Leveraging AI

69 | Claude 3 challenges GPT-4 for top dog LLM, Hugging Face developing humanoid robots, a Chinese employee stealing AI secrets from Google, and many more important AI news items from the week ending on March 8th

March 09, 2024 Isar Meitis Season 1 Episode 69

Who is the top LLM out there? It may not be GPT-4 anymore...

In this episode, we dive into the latest advancements and dramas in the world of artificial intelligence. Host Isar Meitis sheds light on new model releases, ethical dilemmas, and the intersection of AI with business innovation.

Topics we discussed:

  • The release of Claude 3 by Anthropic and its groundbreaking capabilities.
  • Amazon's integration of Claude 3 into Bedrock and its implications.
  • The whistleblower case at Microsoft regarding the Copilot Designer tool.
  • Developments in AI from Inflection, including Inflection 2.5, the new model powering Pi.
  • Salesforce's new AI tools for developers and their potential to transform business operations.
  • Zapier's introduction of a no-code AI workspace, enabling users to create custom AI bots.
  • Google's innovative feature in Gemini's mobile interface for refining model responses.
  • The arrest of a former Google employee for theft of trade secrets and its broader implications.

About Leveraging AI

If you’ve enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!

And welcome to a weekend news episode of Leveraging AI, the podcast that shares practical, ethical ways to improve efficiency, grow your business, and advance your career with AI. We have a packed episode today: all the big names as usual, but also really exciting news from Hugging Face, Salesforce, and Zapier, and some news about drama in OpenAI, so lots and lots to talk about.

Isar Meitis:

This episode is brought to you by the AI Business Transformation course. It's a course we've been running successfully since last year. We're running two courses in parallel every single month, and we're fully booked through the end of March, but the next cohort opens on April 1st and you can sign up on our website. There's going to be a link in the show notes so that you don't forget. If you're looking to advance your understanding of AI to advance your career or to drive growth in your business or your department, it's an incredible course that has been taken by hundreds of people at this point, all leaders in different businesses. So check out the link right now. And now, to this week's news.

There have been two releases of new large language models this week. The one that is more exciting for me is Claude 3. I've been using Claude for a very long time and I really like Claude, and Claude 3 from Anthropic has been released this week. It's a very powerful model. They released three different variations of it, Haiku, Sonnet and Opus, which are three different levels. The default context window stayed as before at 200,000 tokens, but they're claiming that it can be increased to a million tokens in specific use cases and work just as effectively. Their most powerful model, Opus, has been tested and exhibits really amazing results that surpass GPT-4, which has been the most powerful model so far. It is showing skills that pass or come close to human level in things such as undergraduate-level knowledge in multiple subjects, as well as multilingual capabilities and math. One of the things I've been using Claude for a very long time is summarization. It has always been very good at finding specific data and summarizing it, and Claude 3 Opus is achieving a 99 percent success and accuracy rate in what's called needle-in-a-haystack evaluations, which is finding one sentence or a specific topic inside a very large piece of content on various topics. So it's doing even better than it did before at things it was excelling at even before the release of Claude 3. Currently, Opus and Sonnet are available through the API, and Haiku is expected to be available shortly. Sonnet powers the free version of Claude.ai, while Opus is available only to paid subscribers.

A very interesting thing happened while Anthropic was training Claude 3: Opus was starting to show hints of self-awareness. They were asking it to find information about pizza toppings in one of the needle-in-a-haystack tests, so it was given a huge amount of data and had to find the pizza information inside it. Now, the rest of the information had nothing to do with pizza, and the model was questioning whether that sentence was put there on purpose in order to test it. It became aware of the fact that it was being evaluated, even though that was never told to it; it was just given a task. So while it's definitely not AGI yet, it's these little hints and sparks that we see every now and then from these models that are very surprising and show us that they are developing more capabilities than we're explicitly giving them, just by the fact that they're given more data and more training.

In parallel to this, Amazon announced that they're adding Claude 3 Sonnet, which is the mid-tier version, onto Bedrock, which is Amazon's infrastructure for everything AI. So anybody who has their data on AWS can use Bedrock to use their existing data with multiple models, and now with Anthropic's new model as well, and the Haiku and Opus versions are coming very soon. This is not surprising, A, because Amazon has been adding multiple models to Bedrock, but also because Amazon recently invested $4 billion in Anthropic, so this is one of the ways to obviously capitalize on that investment. Now, per-token costs through the API are actually lower than Claude 2, and they're also adding the capability to use hourly pricing, which is something I haven't seen from any other model. So overall, a very positive note: another large language model that provides better capabilities through either its own platform or through partnerships.
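For listeners who want to try the new models programmatically, here is a minimal sketch of calling Claude 3 Opus for the kind of summarization task mentioned above, using Anthropic's Python SDK. This is just an illustration of the Messages API as it was published at launch; the model ID may be updated over time, and the file name used here is a placeholder for your own content.

```python
# Minimal sketch: summarizing a long document with Claude 3 Opus via the
# Anthropic Messages API. Assumes the `anthropic` package is installed and
# ANTHROPIC_API_KEY is set in the environment.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Placeholder file; substitute whatever long content you want summarized.
long_document = open("report.txt", encoding="utf-8").read()

response = client.messages.create(
    model="claude-3-opus-20240229",  # launch-time model ID; may change later
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": (
                "Summarize the key points of the following document "
                "in five bullet points:\n\n" + long_document
            ),
        }
    ],
)

print(response.content[0].text)  # the model's summary
```

If your data already lives on AWS, the same Claude 3 models can alternatively be reached through Bedrock, using boto3's bedrock-runtime client with roughly the same message format.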
This partnership obviously comes less than a week after Microsoft announced that they're adding Mistral's large model to the Microsoft Azure platform, which is the only large language model that Microsoft hosts other than OpenAI's. Overall, as I mentioned, good news, because it provides us, the users, additional ways to use and integrate these large language models.

The second large language model release of the week came from Inflection. Inflection is the company behind Pi. Pi is now powered by Inflection 2.5, which outperforms Inflection 1, the model that was running Pi so far, by a big spread. So it's a huge improvement compared to what it had before, and it's coming close to GPT-4 levels, especially in STEM subjects. For those of you who don't know Pi, Pi is mostly powerful as a conversational chatbot that has a real personality and really high EQ, and a lot of people are using it just for conversational information, even though it can do a lot of other stuff. The interesting fact here is that Inflection 2.5 achieves 94 percent of GPT-4's level across multiple benchmarks with only 40 percent of the compute used to train the model. That's a positive development in the direction of using less compute, which is really taxing on our planet, while still achieving really capable, high-EQ results from these models.

Now, since I already mentioned Microsoft with regards to last week's news of adding Mistral to their platform, there's been interesting news from Microsoft this week. Shane Jones, a Microsoft principal software engineer, has sent a letter to US regulators and the company's board of directors, urging them to take action against the company's Copilot Designer tool, which is based on DALL-E 3. He is claiming that it easily generates offensive, harmful imagery, including things such as sexual images of women, violence, political bias, underage drinking, drug usage, misuse of corporate trademarks, conspiracy theories, and religion. So basically, almost any bad thing you can imagine, it will actually generate. Jones became a whistleblower after trying to solve the problems internally. He was actually trying to address this inside the company first, before going out to the U.S. Senate and so on, but he feels that he failed to do that because the people at Microsoft sent him to talk to OpenAI to try to get the problem solved. This situation, by the way, has already started being addressed by Microsoft. In an article from CNBC from less than 24 hours ago, Microsoft is claiming that they have made significant improvements in putting additional guardrails on its image creation, and it now actually prevents you from doing many of the things that Jones shared. It's obviously not bulletproof yet. This comes in the wake of the issue Google had with its image generation model producing, let's call it, highly diversified slash woke versions of historical images, which caused Google to remove Gemini's ability to generate images of people, something that is still being worked on by Google.

And if you want to add gasoline to the fire of being able to do stuff that the large language models don't want you to do, researchers from the University of Washington and the University of Chicago have developed what they call ArtPrompt, which is a new method that circumvents the safety measures of the large language models they were testing, which include GPT-3.5, GPT-4, Gemini, Claude, and Llama 2. So a really wide range of large language models. And the way they've done this is they have created ASCII art.
So using different characters and symbols on the keyboard, they draw a word as a picture instead of writing it out. The large language model perceives that drawing as the actual word it represents, but the safety filters do not block it, because the actual word is never written. So it's a workaround that enables them to get the large language models to do things such as offering advice on how to build bombs, counterfeiting money, and creating images they're not supposed to create.

If you combine all of these together, it's very clear that we're living in an era of complete uncertainty as far as what's true and what's fake, and this will actually get worse and worse. Right now, literally anybody can generate any imagery they want that will be highly realistic and will look credible, and in the very near future we'll be able to do this with videos. Those of you who haven't seen the demos of Sora from OpenAI may not understand what I mean, so go check it out. Google "Sora." We talked about this two weeks ago, and you'll see exactly what I mean. So do not believe anything from non-credible sources online. It could be your best friends, but they might be sharing something that is not real and they're just not aware of it. And I don't really know how to address that. I don't really know how to address that with my kids, with my family, with my company. But this is the era we're walking into. And this is while we're talking about the quote-unquote safer solutions that come from reputable companies that have invested a lot of time in red teaming and putting guardrails in place. Beyond that, there's the entire universe of open source models, some of which have no guardrails at all, and even those that do, you can branch out and change them to have no guardrails. The reason I'm sharing this with you is not to scare you, but to make you aware, and I hope you share this with other people so they are aware that we are in a situation where we cannot trust any digital communication that we see online. We have to research and check the facts behind it through other channels in order to verify them before we share them or use them as information.

Obviously, we cannot have a week of AI news without talking about OpenAI. In this case, it's more gossip than any useful information, but OpenAI just released emails from Elon Musk as part of Elon Musk's lawsuit against OpenAI. In those emails, it becomes very clear that Elon Musk was not only aware of the fact that OpenAI was going to include a for-profit component, but he was actually maybe the biggest supporter of it, pushing for it to happen. The only thing is, he wanted Tesla to buy out or be a major player within OpenAI and be its cash cow, as he mentioned in those emails, while the board members of OpenAI rejected that because they did not want to put so much power in Elon Musk's hands. As a result, Elon left the company, and OpenAI was left without a sponsor and without somebody to pay the bills, and so shortly after, they cut the deal with Microsoft to do exactly the thing that Elon Musk was pushing to do with Tesla. Now, does that change anything in court? I don't know. I'm not a lawyer, and I'm not sure exactly how this goes, but it definitely shows that, from an intent perspective, Elon cannot be complaining about what OpenAI has done, changing from completely open source to being a for-profit organization, because he was the one pushing for this more than anybody else.
Still staying on the topic of gossip from OpenAI, the law firm WilmerHale is expected to release a report on its independent investigation into the events that led to the ousting of Sam Altman as CEO at the end of last year, before he was obviously returned to that position. The New York Times reports that, as part of this report from the law firm, they have found that Mira Murati, the chief technology officer of OpenAI at the time, is one of the people who raised concerns about Sam Altman's leadership style to the board in October. The other person is Ilya Sutskever, one of the co-founders and the chief scientist of OpenAI, who also raised issues about Sam Altman with some of the board members. Now, the board, beyond the concerns that were raised, was really afraid that Sutskever and Murati might leave the company if they didn't take action, which is one of the reasons that Sam was let go. So now all we have to do is wait for the formal report to see how complete or accurate that information is, and then I'm sure we'll know more details, and I will update you as we learn them.

And the last piece of news from OpenAI: OpenAI just signed another letter promoting safe usage of AI. In addition to OpenAI, this letter was signed by Salesforce, Hugging Face, and a lot of other companies in the field. The goal behind this letter is to commit to safe usage of AI, to promote the positive impacts, and to try to fight the negative impacts of AI usage. The only problem with this letter, just like all the previous letters similar to it, is that it defines no specific actions for achieving responsible AI and gives no details on what the implications might be for companies that do not go down that path.

Huge piece of news from Hugging Face. Hugging Face is the biggest repository of open source AI models, and Hugging Face just announced that they are going to start developing an open source project for humanoid robots. That effort is going to be led by a former Tesla engineer named Remi Cadene. Cadene emphasized that they are working to create a truly open source project, wink wink to OpenAI, and that it's going to be not just open source, but also significantly more cost effective than some of the other companies that are developing similar solutions right now. Hugging Face is currently hiring robotics engineers at their Paris, France facility to develop these low-cost robot systems. This is obviously a huge move by Hugging Face, which so far has only played in open source software, mostly as a host, but also by developing some open source solutions themselves. So they're moving into the very hot field of developing humanoid robots, which is really the field that's going to move from putting only white-collar jobs at risk to putting more or less all jobs at risk. Which, I'm sure, has, like we spoke about before, positive impacts, but also huge social impacts that I don't think anybody has answers to. And yet the competition and the race become more and more fierce.

Another company that we don't talk about a lot, but that made a huge splash this week, is Salesforce. Salesforce just released Einstein 1 Studio. It's a set of low-code, no-code AI tools enabling developers to build gen AI applications, including three different builders they call Copilot Builder, Prompt Builder, and Model Builder.
In addition, they've launched a bunch of courses to help mobilize Salesforce's huge developer ecosystem to start using these tools. If you think about what it takes to launch a successful AI solution that will be widely accepted, you need the right tools, the right technical skills, the right resources, the right data, and the right customer base, and Salesforce definitely has all of the above. The benefit Salesforce has is that it already holds a huge amount of data for each company that uses Salesforce, data that models can now be grounded on without specialized training skills, and it has a gigantic ecosystem of developers across multiple companies around the world. So this is obviously a very interesting development. It's not surprising; I think every large platform out there this year will start releasing AI capabilities, if it hasn't already, and probably all the big players will also release development tools around that for their ecosystems, but it's a very interesting and important move by Salesforce in the right direction. Those of you who haven't used Salesforce before probably do not know that while it's a very capable and powerful system, it's not very user friendly. It requires professional developers to build customized solutions for each company in order to get the really big benefits, which is a big burden on companies that want to move faster with the Salesforce platform. Having the capability to basically just open a chat, ask any question, and get accurate results and answers based on the data in your CRM or any of the other Salesforce platforms is really powerful and will provide a lot of value to Salesforce users all around the world, which is a huge improvement compared to what they have right now.

Another big company moving in the same direction is Zapier. Zapier announced this week that they're launching Zapier Central. Zapier Central is a new AI workspace for customers to create custom AI bots without coding. So it's a similar approach to what we see from Salesforce, what we saw from Hugging Face, and what we saw from OpenAI with GPTs: the ability to create your own solutions. The cool thing here, similar to Salesforce as I said, is that they have the knowledge and they have the customer base. The idea is to allow people to develop really sophisticated, smart automations without knowing anything, not even how to use Zapier, just by chatting with the new Zapier interface, and it will obviously build on top of Zapier's existing automation expertise across over 6,000 different applications. I haven't tested it yet, but there's no doubt in my mind it will be extremely powerful and will allow Zapier and Zapier users to do a lot more than is possible today, with significantly less skill.

And from Zapier to Google. After last week's news, half of which was Google announcements, this week there's a small announcement, but it's very interesting. Now, in Gemini's mobile interface, you can select parts of an answer that you're getting from Gemini and ask it to refine just that part. I haven't seen any other platform that can do this, and I find it very useful. Sometimes you have an issue with just one component of a long, or not even a long, answer from a large language model to something you're trying to do.
So far, you had to go and explain what that segment is and ask it to change just that, but it would usually still change the rest of it, because it regenerates the whole output. Now Google has changed that, so you can literally highlight just specific segments, giving you a much more granular ability to steer the model towards the solution you're trying to get. So I think this is a great move in the right direction. I assume the rest of the large language model providers will move in the same direction, which will be awesome for all of us, because it will allow us to get better, more accurate results while working less.

And still staying with Google, a Chinese national was arrested this past week. His name is Linwei Ding, he's 38 years old, and he's a former Google employee. When I say former, he left only a few days before the arrest. He was arrested and charged with four counts of theft of trade secrets. While working at Google, Ding allegedly copied over 500 files containing confidential Google information, while starting his own companies in China without, obviously, sharing this with Google. The files and secrets he is alleged to have stolen relate to Google's hardware infrastructure and software platform that enables it to build supercomputing data centers to train AI models through machine learning. These are obviously very serious allegations, and the fact that he was caught is great, but I assume he's not the only one. There's a fierce battle happening in the background between China and the US, as the US is limiting the types of chips that can be sold to China and what kind of other AI technology can be shared with Chinese companies. And so there's little doubt in my mind that there are other people currently working in U.S. companies who are sadly sharing this kind of information with China. The good news is that the top leadership in the U.S. is obviously aware of that, so the Justice Department, CIA, and FBI are all involved in this investigation to try to understand how to prevent this in the future. The bad news is that this was really easy. All he did was copy files from Google's central database into his Apple Notes application, convert them into PDF files with different names, and then upload them to his personal Google Cloud account. Now, I assume that if he had copied them into something other than Google's own cloud, he may not have been caught. But the fact that he was able to do that so easily is a little bit alarming and concerning, and as I mentioned, this is going to go on, and I assume we'll hear more such stories in the future.

That's it for this weekend. On Tuesday, we have an incredible, amazing episode that is going to show you a really powerful and yet simple way to combine large language models with Google Sheets in order to do qualitative data analysis at scale. It's pure magic. It's extremely powerful. So don't miss our Tuesday show. I also remind you of the course that we are running: if you want to join our four-session course that has been transforming businesses' approach to AI, look for the link in the show notes right now, click on it, get some information, and then you can decide for yourself whether you want to do it or not. And until then, have a great weekend.