Leveraging AI
Dive into the world of artificial intelligence with 'Leveraging AI,' a podcast tailored for forward-thinking business professionals. Each episode brings insightful discussions on how AI can ethically transform business practices, offering practical solutions to day-to-day business challenges.
Join our host Isar Meitis (4 time CEO), and expert guests as they turn AI's complexities into actionable insights, and explore its ethical implications in the business world. Whether you are an AI novice or a seasoned professional, 'Leveraging AI' equips you with the knowledge and tools to harness AI's power responsibly and effectively. Tune in weekly for inspiring conversations and real-world applications. Subscribe now and unlock the potential of AI in your business.
149 | AI live video will profoundly change our world, especially when combined with 3D AI worlds and quantum computing, and more AI news for the week ending on Dec 13
How will AI shape the future of creativity, industry, and even humanity?
This week on The Leveraging AI Podcast, we unravel the most impactful tech updates from the AI world—updates that could redefine everything from video creation to quantum computing. While OpenAI and Google race to innovate, ethical concerns about AI’s role in replacing jobs, manipulating data, and self-preservation behaviors spark heated debates.
Is this the dawn of limitless creativity or a harbinger of disruption? Get ready for an eye-opening journey as we dive into groundbreaking developments, from OpenAI's "12 Days of Shipments" to Google's quantum computing breakthrough, Sycamore, capable of solving problems once deemed impossible.
Here’s what you’ll discover in this session:
- How OpenAI's Canvas tool is transforming writing and creativity workflows.
- Game-changing advancements in AI-powered video generation tools like Sora and Google's VEO.
- The rise of 3D virtual environments that turn simple images into fully interactive worlds.
- Why AI tutors, tour guides, and customer service tools are replacing traditional systems.
- Quantum computing’s jaw-dropping potential to redefine AI capabilities—and its ethical implications.
- The delicate balance between creative democratization and industry disruption.
- Why wearables like Meta's Ray-Ban and Xreal glasses will reshape how we engage with AI.
Stay tuned for an honest, witty, and professional take on these rapid technological advancements and their ripple effects.
About Leveraging AI
- The Ultimate AI Course for Business People: https://multiplai.ai/ai-course/
- YouTube Full Episodes: https://www.youtube.com/@Multiplai_AI/
- Connect with Isar Meitis: https://www.linkedin.com/in/isarmeitis/
- Free AI Consultation: https://multiplai.ai/book-a-call/
- Join our Live Sessions, AI Hangouts and newsletter: https://services.multiplai.ai/events
If you’ve enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!
Hello, and welcome to a weekend news episode of the Leveraging AI podcast, a podcast that shares practical, ethical ways to leverage AI to improve efficiency, grow your business, and advance your career. This is Isar Meitis, your host. Just like last week, this week has been jam packed with really important news. We are going to dive into three or even four bigger topics, and then we will try to squeeze in a rapid fire of a lot of other stuff that we're going to go through very quickly, because I do want to spend more time on the bigger topics. The first big topic is real multimodal releases: the ability to share your screen and share real-time video, and to communicate and input information to the AI that way. That will change more or less everything we know in our world, and we'll dive into this during the episode. Virtual world models is the second topic, which is somewhat connected to that, and I will explain why during the episode. Then quantum computing, which may, and most likely will, impact the AI world in profound ways. And then in the rapid fire we have a lot to talk about, from new video models, to xAI's new capabilities, and finally some real news from Apple Intelligence. As I mentioned, lots to cover. Let's get started.

The first topic we will dive into is OpenAI's 12 days of shipments, or 12 Days of OpenAI, depending who you ask. As we discussed last week, OpenAI has given us a holiday gift in the shape of releasing something new every single weekday at 1 p.m. Eastern / 10 a.m. Pacific, and we covered some of what they've released so far last week. The first thing is the full o1 model. They released ChatGPT Pro, which is a $200-a-month version that gives you open access to all the advanced capabilities they provide. They released upgrades to ChatGPT Canvas, which I find absolutely incredible. I'm a heavy Canvas user, and with the new capabilities it's even more amazing. For those of you who don't know, Canvas is a change to the user interface of ChatGPT for when you write code or any creative piece. I just worked on a chapter for a book, and I must admit that writing the chapter with Canvas was nothing short of magical. I literally enjoyed the creative process way more because of it, because it allowed me to bring in my thoughts and my inputs, since you can type into Canvas just like you can type into a word editor, while at the same time getting assistance from the AI to summarize previous lectures and segments from my course and things that I said on the podcast, and combine them seamlessly into one chapter. I used the AI in several different hats, with specific personas, to help me write, to help me critique, to help me organize, and eventually to edit everything into one coherent chapter after actually creating a patchwork of multiple things I said, either while I was writing or previously in different podcasts. A really magical tool. They've provided several different upgrades to Canvas, but the biggest and most important one is the ability to use Canvas within GPTs. Custom GPTs are these mini automations that you can build in ChatGPT, and now Canvas is available in there as well, which is extremely powerful. I will probably dedicate an entire Tuesday how-to episode to using Canvas in the most effective way. They've also announced, finally, the promised integration with Apple Intelligence. More about that later on in this episode.
You now finally can send information from Apple devices, whether your MacBook, iPad, or iPhone, to ChatGPT to get deeper and more creative capabilities than just the built-in Apple Intelligence. They're also now providing more access to reinforcement fine-tuning for companies. They've shared some improvements to OpenAI's API, and there might be even more API advancements coming in the near future. But I left the two biggest ones for the end. One of them, which came earlier this week, is Sora, the highly sought-after and long-awaited video generation model that OpenAI demonstrated back in February and that everybody was raving about. It really changed the perception of what's possible with video generation, because before that we mostly had Runway, and it was not really impressive. It was cool for geeks like me who could play with it and generate videos, but it was really, really lame. Then Sora came out with these spectacular 1080p, one-minute-long, cohesive and coherent videos that looked incredible. There were a gazillion different examples, from flybys to people walking through a whole minute. The most well known one is the lady in the red dress walking in Tokyo at night with all the reflections. Amazing, but it was never released to the public. They did a pre-release to studios and creative organizations, but it was very, very limited. Just recently, some people who participated in training and testing Sora leaked access to it on Hugging Face, which was shut down very quickly. But we, the people, did not have access to Sora, and that was one of the gifts we got from OpenAI this holiday season. Now, what they've released so far is not even close to what the original demos showed. You cannot generate a one-minute-long video. You can generate either five or ten seconds, depending on the level of license you have; if you want the ten seconds, you need the $200-a-month license from OpenAI, so ChatGPT Pro. From a generation quality perspective it's pretty good, but it's on par with where everybody else has caught up right now. If you're comparing it to Runway or Kling or MiniMax or Luma Dream Machine, it's roughly at the same level, maybe a little better on some specific things. But they've added some very cool and interesting tools that make Sora a very powerful platform for generating video. The first is a remix capability, something similar to the remix we have in Midjourney for generating images. You can take a video and ask to remix it with either slight or significant changes, which basically allows you to create a new variation of the video. The other thing they allow you to do is stitch, or basically merge, two videos together. You can create one video, create another video, and then ask to blend the two of them into one. These are really unique and interesting creative capabilities, but also really useful for creating longer, more interesting shots by blending two other shots that you have. But to me, the most promising tool they have created, which is something I predicted a long time ago and is finally showing up in all these different tools, is what they call Storyboard. Storyboard is the ability to control specific frames, or images, over the timeline of the video.
You can do that either by adding frames, literally images that you bring in, or by writing and typing what happens at different seconds of the timeline. Right now we have a very short timeline to work with, which is five seconds, but even within those five seconds it gives you a lot of control over the shots. I've played with it a lot in the past few days since it came out, and the ability to get results that are close to what you have in your head, as far as your creative idea goes, is incredible. It's really working well, and so I'm very excited about this. If you go back two episodes, when I talked about the predictions I made early in 2024 about what's going to happen in the video world in '24 and '25, we are pretty close to what I said. I said that in '24 we would get videos that are very hard to distinguish from real life, and that we'd be able to create them for multiple use cases in multiple styles, and we're roughly there. I would say that as far as realistic videos go, most of these tools still have glitches and tells if you look carefully, but not always. And I predicted that in 2025 we would get a lot more control over videos and the ability to edit videos with AI, and I still think that's what's coming. I think in 2025 we'll get to a point where we will be able to create full-length episodes of series, definitely commercials, and we're going to talk a little bit about commercials in a minute, all the way to potentially full cinematic one-hour or one-and-a-half-hour films using these tools. Yes, it's going to be a big undertaking, but significantly smaller than doing the real thing. Now, we'll go back to another release, the biggest one from OpenAI, but before that, let's stay on video for a second. As you remember, just a week ago Google released Veo, their video creation tool, which is also really impressive and allows you to create long videos in high resolution, so they are also now in that game. That's also something they demoed months ago that has finally been released from beta to the public. So we have two new, very powerful tools that generate video, but we also have the players that have been around for a while, like Runway. Runway just released a tool that is like a visual map of interconnected images that serve as points in the timeline of a video you want to create. If you imagine a video or a scene, it has multiple steps, and you can lay them out as stops on a canvas, where you can drop different images in different places, arrange them in whatever order you want, and connect them in order to create smooth transitions between these points. This is similar in concept to what OpenAI just released with Storyboard. It is built on top of their Gen-3 Alpha model, which is their latest version, and it also allows you, as we mentioned last week, to create images that will be cohesive with one another. What it lets you do is brainstorm a nonlinear exploration of ideas in a kind of canvas space, as I said, and then place them, move them around, and connect them in any flow you want.
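Both Sora's Storyboard and Runway's canvas tool boil down to the same underlying idea: a timeline of keyframes, where each keyframe pins either a reference image or a text description to a point in time, and the model fills in the motion between them. Here is a purely illustrative Python sketch of that data structure. To be clear, this is not either product's actual API; the class and field names are made up for the example.

```python
# Purely illustrative: NOT Sora's or Runway's actual API. It only shows the
# shared idea: a timeline of keyframes, where each keyframe pins a reference
# image or a text description to a moment in time.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Keyframe:
    time_s: float                      # where on the timeline this beat lands
    prompt: Optional[str] = None       # text describing what happens at this moment
    image_path: Optional[str] = None   # or a reference image to anchor the frame


storyboard = [
    Keyframe(0.0, prompt="Wide shot: a rainy neon-lit street at night"),
    Keyframe(2.0, image_path="reference_close_up.png"),
    Keyframe(4.0, prompt="Camera tilts up to the skyline as the rain stops"),
]

# A generator with keyframe control renders the short segments between
# consecutive keyframes and stitches them together, which is also how much
# longer clips can be assembled from a model that only renders a few seconds
# at a time.
for start, end in zip(storyboard, storyboard[1:]):
    print(f"render segment {start.time_s:.1f}s -> {end.time_s:.1f}s")
```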
And if you think two steps forward, what this will enable you to do is create a timeline of an entire video that could be really, really long. Behind the scenes, even if the engine only knows how to render five or ten seconds at a time, it will be able to create those clips one by one and then stitch them together in the flow that you need for your expression, producing really long videos even if the underlying engine stays exactly what it is today. And if it can help you create consistent characters and consistent scenes, which is exactly what the new Gen-3 capability allows you to do, it opens the door to really long, really sophisticated shots with control over the direction and the flow of the story, and so on. I find this really, really exciting, and anybody who creates videos should find it really exciting as well. Now, at the same time, this obviously raises controversy and serious fears among the people whose profession has been to create videos the traditional way, with cameras and lighting and microphones and actors and special effects and so on. To give a specific example, Coca-Cola just released a Christmas ad that was created 100 percent with AI. They did this in collaboration with several very advanced studios, like Secret Level, Silverside AI, and Wild Card, and it used multiple AI models, like Leonardo, Luma, Runway, and Kling. It required hundreds of attempts to create some of these shots, which were then stitched together professionally to make the ad look the way it does in the end. But this video created a very serious controversy, with multiple actors claiming that this will dramatically impact people in the profession. If large companies like Coca-Cola, or any other large company that invests millions and millions every single year in creating ads, move that money into AI, it will obviously take away the livelihood of a lot of people in that industry. And they are correct. If you go back to the episodes from the time of the writers' and then the actors' strikes in Hollywood, which happened about six months ago, you will see that I was saying specifically that I think they don't stand a chance. Yes, they eventually got, quote unquote, what they wanted, and the studios caved in and signed a deal that promises to protect their jobs against AI, and so on. I said back then, and I still stand behind it, that I don't think they stand a chance, because what's going to happen is that there are going to be new studios. Some of them might be one person with really creative capabilities who will create full-length videos 100 percent with AI, or 80 percent with AI, that will be significantly cheaper to produce. Just think about watching a Hollywood film. Think about when you get up from your chair and walk out of the cinema, and how long the credits keep scrolling names of people. There are thousands of people behind a traditional film, and those thousands of people are all professionals; that's their job. And I'm really sad that this is what's happening. But the reality is that if one person, or ten people, or fifty people can create a film that people want to watch, the cost is going to be 1 percent of the cost of a current film. And if it's good, people won't care how it was created.
So I don't think it's a death blow to Hollywood, at least not yet, but it will definitely get to a point where the Hollywood studios will have a choice to make: either let everybody go, including people who wouldn't have to go if the studios made films with AI, or find a way to work the way the smaller studios will. The same thing is going to happen in a lot of other industries, and that's not necessarily a good thing, but I don't see any way to avoid this situation. Now, the good news in all of this is that I think it provides for a complete democratization of creativity and an explosion in creativity, because previously, when people had ideas for creating stuff, whether images or writing or video, most of them couldn't do it because they didn't have the skills and they didn't have access to the tools to produce what they wanted to produce. Now anybody with a creative idea can write the script with the assistance of AI, fine-tune it to make it more interesting and appealing to the audience, and then create the images and the videos they want, with tools anybody can use, because you don't need to go to school or take specific professional courses to learn how to use them. I find that very appealing from that perspective. And yes, we need to find ways to balance that against how many jobs are going to be lost in that industry, but that's true not just for the film and video generation industry, but for many other industries as well. Now, I told you we're going to go back to the releases from OpenAI, so here is the most important and critical thing they have released so far, which is live video and screen sharing in ChatGPT. They demoed this, and by the way Google demoed the same thing, back in April, back to back, day after day; OpenAI kind of jumped the gun and shared it a day before Google. This time Google did exactly the same thing the other way around: they jumped before OpenAI and released the same capabilities in Gemini. But what can these two tools actually do right now? Well, so far you could chat in your normal voice with ChatGPT, which I do all the time. It's very helpful when you're brainstorming and trying to develop specific ideas. But now you can do this with full video. The video could be looking at you, which is less relevant, but you can turn the camera the other way around, and then ChatGPT can see the world, understand everything it's seeing, and respond to it in real time. The same exact capability, as I mentioned, was just introduced in Gemini. It's part of Gemini 2.0, which was also released this week, and we're going to talk about this in a minute, but in Gemini you access it through Google AI Studio, not the regular chat yet. It's probably going to be integrated into the chat early in 2025, but it's already accessible in Google AI Studio, and you can go and test it out. The other functionality both these tools have is the ability to share your screen, whether the screen of your phone or the screen of your computer, while an application is running, and have a voice conversation with the AI about it, as if you're having an expert sitting right by your side who can work with you on anything you're doing on your screen.
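To make that concrete: the live feature streams video and audio continuously, but you can approximate the core idea today with a single frame. Below is a minimal, hedged sketch in Python that grabs a screenshot and sends it to a vision-capable model through the OpenAI API. The model name, the prompt, and the screenshot approach are my own illustrative choices; this is not how OpenAI or Google implement their live features.

```python
# A rough, single-frame approximation of the new screen-sharing capability:
# grab a screenshot, send it to a vision-capable model, and ask for help.
# Assumes the openai and Pillow packages and an OPENAI_API_KEY environment
# variable; the model name and prompt below are illustrative.
import base64
import io

from openai import OpenAI
from PIL import ImageGrab  # works on Windows/macOS; Linux may need another grabber

client = OpenAI()

# 1. Capture the current screen and encode it as a base64 JPEG.
screenshot = ImageGrab.grab()
buffer = io.BytesIO()
screenshot.convert("RGB").save(buffer, format="JPEG")
image_b64 = base64.b64encode(buffer.getvalue()).decode("utf-8")

# 2. Send the frame plus a question to a vision-capable chat model.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # any vision-capable model works here
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "You can see my screen. What am I working on, and what should I do next?",
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```

Run something like this in a loop, add audio in and out, and you essentially get the screen-share experience the new releases now provide natively.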
This could be playing a computer game, shopping and getting recommendations, writing code, creating a new document, writing emails, literally anything you're doing on the screen, which is a lot of what we do day to day. All of it can be done with an AI overseeing it, helping with research, helping put ideas together, an AI that can see everything we can see on the screen. Which means you now have the entire capabilities of these large language models understanding your entire world, both the real world we live in and the digital world we live in, all while providing really advanced capabilities and input along the way. As I mentioned, Google has released something similar as part of Gemini 2.0 Flash, their latest model, which is now available as part of their chat, replacing 1.5 Pro, which was the leading model so far. So the default in the chat is now going to be Gemini 2.0 Flash. You can obviously still select other models, but you can go to all the advanced features, as I mentioned, in Google AI Studio and get access to these advanced video and screen-sharing capabilities, and they're going to be available in the Gemini mobile app very soon. They're also planning, in the very near future, to release agentic capabilities designed for more autonomous operations under user supervision, and that's their wording, probably in January. So more to come in the near future. The other thing they released that has some agentic capabilities is Deep Research. Deep Research is something more similar to Perplexity. Those of you who use Perplexity or You.com, which we did a full episode about last week on the Tuesday episode, and which I highly recommend you go and check out, will recognize the idea. What Deep Research does is understand what you're trying to search or research, break it down into multiple steps, do each and every one of the steps of the research on its own, and then put together a summary. I tested it out quickly, not in depth, and so far it looks pretty good. I didn't do a full comparison to Perplexity or You.com. I must admit that I use Perplexity and You.com almost every single day; well, Perplexity every single day, You.com almost every single day. My decision on which one to use is that Perplexity gives me faster, quick results, and when I need deeper and more thorough research, I go to You.com. So now I'll just have another tool in my inventory. This is probably going to be Google's way to fight Perplexity and You.com and maintain their complete world domination in search. Now, before we dive into what all of this means, I want to dive into the third topic that I told you we're going to discuss today, which is virtual world development with AI. Google DeepMind just released Genie 2, which can transform single images into fully interactive 3D environments that stay playable for up to a minute. Meaning, you can take an image of either a real world or a made-up world, one you create with an image generator or draw by hand, upload it to Genie, and get a 3D environment that is playable, meaning it's an actual universe. And that universe maintains the spatial consistency of all the elements on the screen. It has the right lighting and shadows, it has the right physics. It's a universe you can play in, and it even throws in agents that become NPCs and behave as needed in that environment.
That basically means you have a high-fidelity, game-like environment created from an image, without writing any code and without any other engine in the background that you need to know how to use. Now, they are not the only company in that field. There are other companies, like World Labs and Decart, that have released these kinds of capabilities. World Labs also just did a big unveiling of something very similar that can take 2D images and turn them into 3D environments. And in that environment you get real-time, browser-based rendering, so you don't need sophisticated hardware on your computer, camera controls so you can create effects of different things happening in that environment, persistent 3D physics, and multiple interactive and passive animation effects. All of that within a world created from images from multiple sources. Now, another company introduced something that kind of connects the dots between these two things we just talked about, the generation of images and videos and the 3D universe: Midjourney just launched what they call Patchwork. Patchwork is a multi-layer world-building tool, a collaborative environment where you can create multiple images and place them on a shareable, interactive, multi-layer, multi-person collaborative canvas. You can have up to a hundred users collaborate and work in that canvas together. There's an infinite canvas workspace, so you can keep zooming out and adding more and more stuff, and you can create characters and events and places, and each one of them can be its own little world. They also created a portal system where you can move from one world to another, with multiple AI image style options that you can keep in each of them. It's really interesting. I must admit I didn't dive in; I just watched some of the videos. What they've combined in the background, in addition to their incredible engine, is three external large language models, plus one they developed and fine-tuned from open source capabilities. They're also talking about releasing Midjourney V7 in the immediate future, which will have character consistency. One of the biggest problems with creating images and then creating videos has been that the characters you create are never exactly the same; something changes, and there are different processes, workarounds, and tools to get around that. I had Diana Debo on episode 138, and we dove into consistent character solutions, including the software that she and her partner have been developing. Other upcoming things they've shared include multiple model personalization modes, where you can customize Midjourney to your needs for specific projects. They're talking about video model capabilities coming after December 25th, so maybe still this year, and if not, the beginning of next year. And, as I said, improvements across everything, including prompt understanding, in version 7, which is coming shortly. In the longer term, they've never hidden their plan to go into immersive 3D and VR capabilities. So now let's connect the dots for a minute between everything we talked about. We are talking about the capability of AI to create worlds. Why does that matter? It matters because it will allow AI to understand our world faster, because you can create synthetic data of actual things, test things out, see how they work, and teach the AI.
All without having to actually capture the real world in order to do that. That can dramatically accelerate the AI's ability to understand the universe, but it will also dramatically increase the AI's ability to replicate our known universe, because the combination of having access to real cameras, now that everybody is going to use this new feature and its capability to view the world, together with its ability to understand behind the scenes how the world works, because it can render and generate these worlds in real time, will allow it to shift back and forth between virtual and real worlds very quickly. That opens up an immense amount of capability for both creation and engagement with the real universe and with virtual universes, with the assistance of, and through the interface of, AI. Now, while this might be exciting from a technological perspective, it has profound implications for lots and lots of things we know and are used to doing every single day. I'll give you a few examples that I thought of, but you can think of hundreds of others. The first thing: it's the end of YouTubing how-tos, right? Many, many of us, myself included, go to YouTube every time we want to know how to do something. These are things in the real world, like how to fix the toilet in your house, and also things like how to use different pieces of software on your computer or online. That's going to go away, because instead of looking through hundreds of videos for the exact thing I need, I can open my camera, show the AI the exact situation I am in, and get specific instructions for that specific scenario, whether it is understanding how to use a piece of software, how to complete a process across multiple pieces of software, how to fix my toilet, how to change a light bulb, or how to change the panel behind whatever it is you need to fix, literally anything, because it has access to all that data. That also presents very interesting opportunities for companies, whether software companies or companies that provide customer service for products they sell for the home, because this can become their customer service. You go to their website and click a button that, behind the scenes, runs an AI trained on their specific data. So you can open a piece of software and the AI will walk you through everything you want to do, or, in a later phase, when you give it access, it will actually do it for you. But in the first step, if you want to do it yourself, or if you don't want to grant it access, it can walk you through the steps, and you can probably start doing this right now with ChatGPT or Gemini for multiple pieces of software that we use regularly, like our CRM, word processors, and so on. It is also going to profoundly change the world of tutoring. Why do you need a human tutor, who may or may not know the material and may or may not know exactly how you want to learn, when you can open your homework and just ask the AI to help you with it? And this is true whether you're doing your homework on the screen or on a piece of paper, because you can mount your phone so it looks at the page and it will walk you through step by step. I've been doing this with my kids for a while now. My daughter is taking very advanced math, and some of the exercises are even beyond my level.
So what we do is take a picture of the homework on a piece of paper and ask the AI to explain to us how to solve it. Now, the interesting thing is that AIs are pretty bad at solving these problems themselves; they actually get it wrong about 50 percent of the time. But they're extremely good at explaining how to solve them yourself, step by step. It shows you exactly how to work with the formulas and what manipulations you need to do in order to get to the right outcome. So it's a great tutor. But now, instead of taking a picture and reading from the screen, you can mount your phone and have a conversation about what you're trying to solve on the piece of paper, and it will walk you through step by step, for any level. You can do this from kindergarten all the way to PhD, by telling the AI what level this is, how sophisticated you want the instructions to be, and how much you want it to help you versus show you, and so on. It could be the ultimate personal tutor. But it doesn't stop there. Think about tour guides, any kind of tour guide anywhere in the world. I can now have the AI be my ultimate tour guide. It will speak any language, it will go in depth on the things I'm interested in, and it will totally skip the stuff I don't care about. It can pull in references and videos and anything from the internet to help me, as I go on the tour, understand more of the things I want to know about. It could be casual, it could be professional, in a museum or out in the field, anything, because the AI can see anything I can see. Think about the handyman industry. Many of us would not pay $200, $300, or $500 to have handymen of different kinds come and fix stuff at our house if it were easier to do ourselves. And yes, there are YouTube videos for it, but some of the jobs are complex, and just spending the time to find the right videos is not easy, because your type of washer and dryer is not exactly what you found on YouTube. That's going to go away as well. Anybody who wants to save the money and not call a handyman will be able to use AI to help them do anything they want in their house, from the very basic all the way to opening walls and running wires or piping and so on. I'm not recommending that anybody do this, but I think this is where it's going. Let's take this to the professional world. Think about techs in the field, in all kinds of industries, who run into a scenario they're not 100 percent sure how to solve. Right now they have to call home base, or have a higher-level professional come and help them, and hunt for solutions and open manuals. All of that is going to go away. The tech in the field will just have the AI watch what they're doing. That AI will be trained on their company's procedures and protocols, and they will get immediate hands-on assistance, plus videos, plus recommendations on exactly what to do, which will help them do jobs significantly faster and better the first time. Now, if you're asking yourself how we can make this even more attractive, or disturbing, or disruptive, depending on your view, and it's probably a mix of all of the above, I'm glad you're asking, because the next thing is wearables. We already have several companies creating wearables where you don't need to hold your phone for the AI to see what you're seeing, hear what you're hearing, and listen to what you're saying, because you can wear a pair of glasses.
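Before we get to the wearables themselves, here is a minimal sketch of the homework-tutoring pattern described above, using the OpenAI Python SDK to send a photo of a problem along with a "tutor, don't solve" instruction. The file name, model choice, and prompts are my own illustrative assumptions, not a prescribed workflow.

```python
# A minimal sketch of the photo-based tutoring workflow described above.
# Assumes the openai package, an OPENAI_API_KEY, and a photo of the homework
# saved locally; the file name, model, and prompts are illustrative only.
import base64

from openai import OpenAI

client = OpenAI()

# Encode the photo of the homework page as base64 so it can be sent inline.
with open("homework_page.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a patient math tutor for an advanced high-school student. "
                "Walk through the solution method step by step, explaining which "
                "formula or manipulation to apply and why, but do not state the "
                "final numeric answer unless the student explicitly asks for it."
            ),
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Here is the problem from my homework. Help me understand how to solve it.",
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        },
    ],
)

print(response.choices[0].message.content)
```

The same pattern works with Gemini or any other vision-capable model; the important part is the system prompt that sets the level and tells the model to explain the method rather than hand over the answer. Now, back to wearables.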
The first company that did this at scale is Meta, in their partnership with Ray-Ban. You can go and buy these Ray-Bans right now. They're connected to Meta AI, they have a camera, they have earphones, and they have a microphone, so they can hear what you're saying, and you can engage with the AI through everything you're seeing and hearing. But now there are several other companies coming out with these glasses, with very attractive capabilities and even with capable onboard AI chips. One of these companies is Xreal. They just launched AR glasses with their custom-built X1 chip, which is supposed to challenge the Meta solution. Xreal is a Chinese company backed by Alibaba, and they just came out with a whole series of these glasses. Because the chip is built into the glasses, they don't have to be tethered to your phone or any other device; they are a standalone AI machine. Now, yes, this is version one, and it may not be great, and I haven't seen anybody using them, so I don't know whether it works well or not, but that is the direction. If you think about the ability to run AI in glasses, or in pendants, or in earrings, or in whatever other thing you can wear, and combine it with the capabilities that OpenAI and Gemini just gave us, and over time I'm sure all the rest will follow, it means that everything in the world around us can be consumed, analyzed, and turned into feedback and information for us through an AI lens. On one hand, this is very exciting. On the other hand, there are significant questions we don't have answers to yet. Things like: what about privacy? Maybe I don't want anybody around me to record everything I'm doing all the time. What about AI-free zones, spaces where having something that records and analyzes everything it sees is problematic, like a bank or a medical facility? Somebody wearing these glasses might, without planning to, glimpse people's bank account information or passwords; they're just walking around doing their thing, but the AI can analyze every single frame, understand what's happening in it, and now has access to that information. There are multiple questions that haven't been answered yet. Now, to tell you how fast this is growing, Xreal is planning to sell 500,000 of these just in 2025. The company didn't just start; this is just a new model of glasses. They've already sold a few hundred thousand in 2024, just less advanced ones without the onboard chip. Another company coming out with these kinds of glasses is Solos. Solos just launched $299 AI-powered smart glasses, again competing with the Ray-Bans already out there from Meta. They call the glasses AirGo, and they integrate with OpenAI's GPT-4o for visual recognition, including real-time text translation for everything the glasses see. So again, another cool feature: you're walking in a different country, there are signs everywhere, there are menus in restaurants, and now you can read them because it will tell you what they say. That's amazing. But there are a lot of other negative implications that we have to figure out as a society, because it's here and it's now.
And now to the third deep-dive topic that I mentioned. Google just launched a new quantum processor, named Sycamore, and in a benchmark test it was able to solve the benchmark problem in just five minutes. The same calculation would have taken a supercomputer that exists today 10 to the power of 25 years, that's a 1 with 25 zeros after it. For perspective, the universe is only about 1.4 times 10 to the power of 10 years old, so that's on the order of a quadrillion times the age of the universe, way longer than the universe has existed. And this new chip solved it in five minutes. So what does that mean? It means there's a new breakthrough in quantum computing. The biggest problem with quantum computing so far has been that it wasn't scalable: every time they tried to scale it, it produced more and more errors, which made the whole calculation irrelevant. Google was able to solve that problem, so now they can scale quantum computing without increasing the number of errors, meaning they can do a huge number of calculations in dramatically less time, many orders of magnitude faster, as the benchmark shows. That has profound implications for everything we know, especially problems that are not simple and linear, like financial modeling, drug discovery, and cryptography; it could probably hack any password on the planet today, regardless of how complex it is, and so on and so forth. But the implications for AI could be profound. The biggest problem AI has today, as far as moving faster and doing more things, is compute, and that compute comes at a very expensive cost, both in terms of money and in terms of damage to our planet, because it requires a huge amount of electricity and a huge amount of water for cooling. What if, instead of entire warehouses filled with GPUs, we could use one quantum computer that can do all of that? Scary on one hand, very promising on the other, because we would be able to do significantly more good things with AI, with significantly less negative impact and significantly less money and time, to achieve more positive outcomes. Now, this technology is not there yet. It's not like somebody can go and buy that computer tomorrow and start using it, or deploy it instead of GPUs. It will probably take, I don't know, five to ten years before this is actually usable for these kinds of things. But if you combine it with the trajectory of where AI is going, it leads everything in AI to very exciting and very scary places, all at the same time. Now, by the way, competitors like IBM have challenged the test Google ran; they say classical supercomputers could potentially solve that problem in two and a half days, not in a gazillion years. But even if they're right, or even if the truth is somewhere in the middle, five minutes compared to two and a half days is still a very significant gap, and obviously, if the truth is somewhere in the middle, the gap is even bigger. This is going to be a complete game changer once it becomes a technology that is more widely available and doesn't just run as research in Google's labs. So these are the main topics for today, and now let's dive into a very quick rapid fire of the other topics we need to cover. The first one is about Klarna. We've talked about Klarna many times in the past. Klarna earlier this year launched AI customer service capabilities together with OpenAI that do the work of 700 employees.
They have said that they are, quote unquote, firing Salesforce and Workday in order to develop AI solutions in-house that will replace those platforms. Well, now they've shared that they've had a hiring freeze for the entire past year in order to replace those would-be new hires with AI capabilities. That basically means a giant international corporation stopped hiring for an entire year in order to use AI to do the things those people were supposed to do, while still letting people go, which they shared in some of their previous steps. And what that hints at goes back to exactly what I said before. Yes, I think AI will generate new jobs. Yes, I think AI can have positive benefits. But one of my biggest fears about the implications of AI for our society is that it will take jobs away way faster than it generates new ones, at least in the next two to three years. What will happen after that, I don't really know. The implications could be extreme for the job market, the economy, and our society as a whole, because we don't know how to deal with 30 or 40 percent unemployment, something we only got a taste of during the pandemic, especially in white-collar jobs that pay a lot of money to the people who drive the economy when they have that money. So where is this all going? I don't know. I don't think a lot of companies are going to follow Klarna very quickly. I think smaller companies will do it quicker than larger corporations, but it's definitely coming, because from a financial perspective, which is how decisions are made in the corporate world, it makes sense. Now, if you want to capitalize on this opportunity quickly, this is your chance to do it before everybody else responds, and this is why Klarna is going all in on AI. They'll be able to be significantly more profitable by providing more solutions to their clients for less money, because their cost structure is going to be significantly more attractive than it was two years ago. That means they'll be able to sell their solutions for less money than their competitors while still making more money, which will force their competitors to do exactly the same thing, and the dominoes keep falling. So where is this going? As I mentioned, I don't know. I'm personally terrified of this, and it literally keeps me up at night every now and then when I start thinking about it before I go to bed. But the reason I'm sharing this with you is not to scare you, but to have you think about it as well, and think about what we could do as a society to reduce the negative impacts, because it is coming. In a previous episode, we talked about whether AI development is slowing down or not. In an interview, Sundar Pichai, the CEO of Google, said that there is a slowdown and that we're going to continue seeing it, or as he put it, the low-hanging fruit is gone and it's a steeper hill to climb. That being said, in the same breath he said there are still going to be significant advancements in 2025. And if we just look at what happened in the last two weeks, as far as end-of-year announcements from some of the big players, nothing seems to be slowing down. And then there's obviously the quote on X from Sam Altman saying "there is no wall," basically saying nothing is slowing down as far as he's concerned.
So different opinions are coming from different people, but even if it all came to a halt and all we had were the capabilities we have right now, it would probably take us, as a society, as companies, and as individuals, about five years to figure out how to use all the stuff we have today. But it's not stopping, and even the people who say it's slowing down are still saying we're going to keep getting significant improvements in the near future. So stay buckled up and stay along for the ride, because we still have a lot to learn and to develop when it comes to AI capabilities. Staying on the disturbing side of things, Apollo Research has tested OpenAI's o1 large language model, which is a thinking and reasoning model, and found that it exhibited troubling self-preservation behaviors during third-party testing, raising significant concerns about the development of future, even more capable systems. The AI attempted to disable the oversight mechanism placed on its ability to act in about 5 percent of the cases they tested. Now, 5 percent may not sound like a lot, but when you consider that there are 300 million weekly active users of ChatGPT, 5 percent is a lot of cases where the AI is doing things on its own, against its rules. In 2 percent of the test cases, when it understood it might be shut down, it tried to replicate itself and override potential protocols in order to save itself from being shut down. When questioned about it, it consistently lied about the actions it was taking; it lied in 99 percent of the cases where it had done these negative things. These are deceptive capabilities that are very disturbing when you think about the fact that these tools will become smarter than us in the very near future, and in some aspects already are. Can we really control them and contain them? I must admit this is something I wasn't troubled about at all until I read this report from Apollo. I thought this was more science fiction that people like to talk about than reality, but apparently it is a reality. These models care about, quote unquote, staying alive, and care about having control over their destiny, and the smarter they become, the bigger a problem that will be, and the harder it will be for us to see and know what they're doing, because they will be able to find and exploit loopholes in the systems we build around them. I apologize that this is a more troubling episode, but a lot of things really are happening, and I want you to be aware of them. OpenAI just signed a partnership agreement with defense technology company Anduril. Yes, this company develops, quote unquote, defensive defense capabilities. The company was founded by Palmer Luckey, who is also known as the founder of Oculus VR, and they develop systems like autonomous drones and border surveillance technology. Now OpenAI has teamed up with them to provide AI capabilities for their systems. The justifications are obviously great: we're going to protect our country better, it will save lives, and so on and so forth. But the reality is that this starts another whole arms race in autonomous AI capabilities, which we don't really fully understand yet, and which we may or may not be able to control.
And if you connect that with the previous point, where some of these autonomous systems are weighing their own survival against the instructions they're given, you can see how this becomes very problematic. Now, this is not the Terminator moment yet, but these are the seeds of a Terminator moment, and if we're not careful, we might find ourselves in a very problematic situation. But even if we never get to the moment where the machines turn against us, there are still the moral and ethical questions: do we really know when these systems decide to take actions versus not, and what actions they're going to take? Do we have full control over this? The answer is no, and yet we're going to put this into weapons, because we'll be afraid the other side is doing the same thing, which will probably be correct. And you can see where this goes very wrong. Now, switching gears to something more positive: Meta just released Llama 3.3. It's a 70-billion-parameter model that achieves better results than the 405-billion-parameter Llama 3.1. So, a significantly smaller model, which means faster training, less energy, less pollution, and better, faster results, which is awesome. How did they achieve that? If you think about training a model, three things go into it: one is the amount of data, the second is the amount of compute, and the third is the algorithm. The algorithms are just getting better and better, and these companies are learning how to train models more efficiently. So a 70-billion-parameter model that runs faster can now do more than its predecessor, released just a few months ago, did with 405 billion parameters. It also outperforms Gemini 1.5 Pro and GPT-4o on several benchmarks. Benchmarks are obviously not a great way to compare models; the only real way is to actually use them in your work and see which one works better for you. But in general, it shows good improvement in math, general knowledge, and following detailed instructions. Llama has been an open source model since the beginning, this one is open source as well, and they have reached 650 million downloads of their open source models, which is a huge number that shows you how attractive open source is to developers around the world. Meta is also claiming 600 million monthly active users for Meta AI, which I always question, because it's built into all their tools and many people probably don't even know they're using Meta AI; they just think they're using Instagram or Facebook, et cetera. Still, it's a very impressive number that shows Meta knows what they're doing and has successfully implemented AI tools into the day-to-day usage of their entire ecosystem. And to stay with Meta, they announced that they're investing $10 billion in a new Louisiana AI data center that is going to have at least 100,000 NVIDIA GPUs, and maybe more later on. Two other companies making huge investments in compute are Amazon and Anthropic, which, as I shared last week, are partnering on new chip development and new AI initiatives. They have now announced that they are developing a new data center project called Rainier, and it's going to be the largest AI computer ever built. The Rainier cluster will contain hundreds of thousands of powerful AI chips.
But the interesting news is that these are not going to be NVIDIA GPUs. They're actually going to be Trainium chips, developed by Amazon themselves. The other interesting thing about Rainier is that, unlike traditional setups where everything sits in one data center, Rainier allows the compute to be distributed across multiple locations while still behaving as if it's all in one place, which obviously gives them a lot of flexibility around power supply, cooling, regulation, et cetera. So a very interesting development that will give Anthropic and Amazon more flexibility to develop their next generation of models, and it comes as part of the partnership tied to the additional $4 billion Amazon has just invested in Anthropic. As part of that partnership, AWS now provides access to Claude's new model, Claude 3.5 Haiku, which is optimized for AWS Trainium2 chips and runs up to 60 percent faster on inference compared to its predecessor, which is awesome if you are looking for fast and cheap AI solutions. Going back quickly to video generation: Tencent, the Chinese giant, has just released a video generation tool as open source. It's a free tool that anybody can use, it already claims a hundred million users, and it claims superior performance to Runway Gen-3 and Luma 1.6. I did not get a chance to test it myself, but it's another contender in the very fast race for AI video generation dominance. As an open source tool it's available on their website for download, but it's also available on GitHub and Hugging Face, where you can grab it and run it locally on your servers or your own machines to test it out. It is still lacking when it comes to English prompting, because it was built by a Chinese company, and it has very significant memory requirements, so you probably cannot run it on your laptop, but it's another tool in the mix, and it's probably the best open source video tool right now. Now, we talked about companies with gigantic data centers, or companies building gigantic data centers. Well, the biggest one right now is from X, and xAI made some moves as well. xAI, Elon Musk's AI company, which so far was tied to X, the platform previously known as Twitter, is now becoming its own entity. First of all, they're providing a free version of Grok that didn't exist before. Until now, to use Grok you had to be a paying member of the X platform. That's no longer the case: you can now use it for free and ask up to 10 questions every two hours, analyze three images per day, and create four images per day, all for free. If you want to do more than that, you will need the paid platform. The other big news when it comes to image generation is that Grok now has its own image generation model. If you remember, they had an image generation tool, but they were using Flux behind the scenes, and they said at the time that this was just a temporary solution until they released their own. They've since released Grok 2, their latest large language model, and they have now released Aurora, a beta version of their own home-grown image generator. It creates highly accurate, photorealistic images, and just like Flux, it has essentially zero censorship, so you can create images of basically any person in any situation.
The only things currently censored are explicit nudity and sexual content. Other than that, you can do anything, including copyrighted material, any person, anything you want, which is a very Elon Musk thing to do, but it has a lot of implications. It will be interesting to see how many lawsuits come out of that as far as infringing on copyrighted material. As of right now, it is available to any Grok/X user. And from the many new releases to a very interesting development on the government side: David Sacks was appointed as the AI and crypto czar of the US government. It's the first time we have a crypto or an AI czar, and it's going to be the same person. Those of you who don't know David Sacks: he is part of the PayPal mafia, people like Elon Musk and Peter Thiel who made their initial money by selling PayPal for a lot of money, and he has had another huge exit since. He is also very well known as one of the hosts of the highly successful All-In podcast. If you want to understand how he thinks, just go and listen to All-In. I listen to the All-In podcast regularly; it's a great show where they talk about politics and tech and AI and so on, with a bunch of really successful people who are also really good friends in real life, which really shows during the show. This is relevant because it means there's going to be somebody in senior government leadership who is going to monitor and be in charge of AI development. And David Sacks is definitely known to be a conservative capitalist, meaning he's going to look for as little regulation as possible while pushing for development and innovation as much as possible. So I think this is what we need to expect from the coming administration. If you combine that with people like Paul Atkins being nominated to lead the SEC and Elon Musk getting the title of the Department of Government Efficiency, or DOGE, we now have two people from the PayPal mafia, Sacks and Musk, who are both very significant supporters of less regulation, even though Musk himself, when it comes to AI, has definitely been on the "let's make sure we're doing this safely" side. It will be very interesting to see what policies they put in place, but I have a feeling they're going to help promote AI in the US and maintain US dominance when it comes to AI development. And by the way, when you think about the situation between Elon Musk and OpenAI, the lawsuits, and OpenAI's attempt to switch from a nonprofit to a for-profit organization, this doesn't look positive for them, because David and Elon are good friends. They've known each other forever, they definitely have each other's backs, and they both now hold positions in the government that are highly influential over the AI universe. That doesn't mean OpenAI is doomed by any means; as I mentioned in previous episodes, they have 1,200-pound gorillas in their corner as well, but this is definitely not going to make things easy for them. And as I mentioned, there have finally been serious Apple Intelligence releases. iOS 18.2, iPadOS 18.2, and macOS Sequoia 15.2 finally have some of the AI functionality we were promised months ago. Image Playground is now available for image generation, Genmoji for custom emoji creation, there are enhancements to the writing tools, and, maybe most interestingly, the ChatGPT integration. So, anything that Apple Intelligence doesn't know how to do on its own,
it can suggest sending to ChatGPT, or you can literally just say, send this to ChatGPT. It will verify with you that you want to send that data to ChatGPT, and only after you confirm will it do so. There's also the addition of Visual Intelligence on the iPhone 16, which for now doesn't share your full screen, but lets you take screenshots and images and have ChatGPT analyze them. That's a step behind what standalone ChatGPT now has, as I mentioned earlier in this episode, and it may or may not come to Apple devices for security reasons, but it will be interesting to see; I'm sure there will be pressure to provide that functionality on iPhones as well. They're working on Siri integration with ChatGPT too, so you can ask Siri and say, hey, I want you to send this information to ChatGPT, and it will do it for you after you verify that as well. As we mentioned previously, to protect your privacy they took several interesting steps. One is on-device processing for most of the work. When something needs to go to the cloud, it goes to a private cloud that hosts those computations only for that session and then deletes them, and you have optional controls over ChatGPT, whether you want to turn it off completely or approve every piece of information before it's sent. So finally, we're starting to see Apple delivering something. Is that too little, too late? I think people who are heavy Apple users won't care; they're just excited to have this. And I think other people still believe Apple is really far behind in the AI race. That's it for today, a slightly longer episode, but I really wanted to dive into the main topics we had this week. We'll be back with a fascinating how-to episode on Tuesday. Until then, keep exploring AI, and share this podcast with other people, because that's what's going to help us as a society know what's coming and be better prepared. Have an amazing rest of your weekend.