Omar Shams: AI Founder and Google AI Agent Lead — #89

Omar Shams: The standards for software, even for a startup, are much higher now.

You just can't have shitty software. Even if you solve some unique novel pain points, customer expectations are very high and that's gonna pull the field forward.

Steve Hsu: Welcome to Manifold. My guest today is Omar Shams. He is an AI founder who recently sold his company to Google, where he is currently a lead on AI software agent development. And I'm really looking forward to talking to him because he's not only a leading thinker and researcher in AI, but he also has a background in theoretical physics.

So, Omar, welcome to the podcast.

Omar Shams: Thank you for having me, Steve.

Steve Hsu: And we are broadcasting from the Equinox Hotel in Hudson Yards in beautiful New York City. Omar splits his time between San Francisco and New York, and I'm here giving a talk on AI, so we're lucky to be able to get together.

Thank you.

Omar Shams: Thank you.

Steve Hsu: So Omar, let's start with your background. You went to Carnegie Mellon, studied math and physics, and then you went to grad school at Rutgers to study string theory. Talk a little bit about what the world looked like to you when you were in your early twenties, and when you actually started thinking about AI separately from physics.

Omar Shams: Yeah, great question. So I think, much like you, my first love was definitely physics. I was obsessed with it. I remember reading about the twin paradox in high school, actually, when I was 15. And I remember looking at this little cartoon in the back of my physics book. It had, you know, one twin go out and come back.

And then his twin was much older. And I remember thinking, this is made up, right? Like this is a joke or something. And then I went to my teacher and I was like, hey, you know, this is fake, right? This isn't real. And he's like, nope, it's real. And I was like, what? And it was like finding out that magic really exists.

That was my honest, subjective perception. And I was like, okay, I must learn magic. And so I did that for a good 10 years. My specialization at the time, I was a Tom Banks student, and I ended up, you know, dropping out ABD, all but dissertation, was holography. And Tom, I think he still works on this, was using noncommutative geometries to reconstruct things like, you know, spacetime.

Like, could you go from a noncommutative geometry where spacetime becomes actually this emergent thing, right? And the specific work I did, which I barely remember anymore, was on conformal Killing spinors. Spinors are kind of like the square root of vectors, right? And using that to construct the SUSY algebra. It was really SUSY, SUSY

Steve Hsu: SUSY means supersymmetry, for our listeners.

Yeah, sorry,

Omar Shams: Sorry, SUSY, meaning supersymmetry, which is an extra symmetry that these spinors, which again are kind of like the square root of vectors, furnish. And then toward the end, when I was about to drop out, I got really into biophysics, and I need to be careful, 'cause people mean a lot of things when they say biophysics.

Some mean lipid physics and the structure of that. I was actually more on the genomics side, so I did a little bit of genomics, and this is kind of not that surprising anymore, but at the time, one of the very first things I did was, okay, let me do a PCA on this human mitochondrial DNA.

And I think that was definitely a big part of it, seeing the power of machine learning techniques on these kinds of data sets. As an undergrad, I also did some lattice QCD with, I believe, Colin Morningstar.
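The PCA exercise Omar describes is easy to sketch. The following is a hedged illustration only: the data are synthetic and the dimensions and 0/1/2 coding are assumptions, not his actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a genotype matrix: rows are individuals,
# columns are variant sites coded 0/1/2. (Real mitochondrial DNA is
# haploid, so sites would be 0/1, but the workflow is identical.)
n_individuals, n_sites = 200, 500
genotypes = rng.integers(0, 3, size=(n_individuals, n_sites)).astype(float)

# Center each site, then compute PCA via the SVD.
centered = genotypes - genotypes.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

# Project every individual onto the top two principal components;
# on real data these axes typically separate major haplogroups.
coords = centered @ Vt[:2].T
print(coords.shape)  # (200, 2)
```

On real population data the scatter of `coords` is where the structure Omar found would appear.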

Mm,

Steve Hsu: Yeah, I know Colin.

Omar Shams: Oh, you know Colin. So did Tom, too; you know, everyone does. I did that for a summer, a nice summer with Colin, as an undergrad. So that was probably my introduction to the field, but really my main introduction to doing this was my first job, where I built a music recommendation engine as part of this small company called Hi-Fi, which was later acquired by Block.

Steve Hsu: Okay, before we let you leave physics, let's dwell on it just for a little bit. I'm really struck by the story that you just gave about special relativity in high school, because I've had the same thought: when you're a kid, if your brain is wired the right way, even just knowing a little bit of algebra, you can derive the Lorentz transformation of special relativity.

It doesn't require any more than simple algebra, Algebra II-level stuff. And I actually have an old blog post where I say something like, how can a kid be well educated in the United States and not have been exposed, at least briefly, to the ideas of special relativity? 'Cause it's such a glamorous thing, like Albert Einstein. And then, like,

how could you not suddenly get interested in physics when you realize that a very simple empirical input, like, wow, the speed of light looks the same to anybody, no matter what speed that observer is moving at, that simple observation, with a little bit of logic and some equations, leads to, for example, the twin paradox that you just mentioned?
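Concretely, the light-clock version of the argument: a clock ticks by bouncing light across a perpendicular arm of length $L$, and in a frame where the clock moves at speed $v$, the light traverses a hypotenuse instead, so the Pythagorean theorem alone gives time dilation.

```latex
% Rest frame: one tick takes \Delta t, with arm length L = c\,\Delta t.
% Moving frame: the light path is the hypotenuse of a right triangle.
(c\,\Delta t')^2 = (v\,\Delta t')^2 + (c\,\Delta t)^2
\quad\Longrightarrow\quad
\Delta t' = \frac{\Delta t}{\sqrt{1 - v^2/c^2}} \;>\; \Delta t .
```

The traveling twin's clock accumulates less elapsed time, which is the twin paradox the textbook cartoon illustrates.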

So I'm just always amazed when I meet another smart, highly educated person. I'm like, how come you aren't more interested in physics? Like, didn't you see that part of the textbook you were forced to read in 11th or 12th grade where they explained all this to you? Didn't that light a fire in your mind?

So you're an illustration of the type of brain that I'm thinking of, and I don't understand all those other brains. So what do you think, were you the only kid in your high school who cared about this? Yes.

Omar Shams: Yeah, I think so. For me, the pleasure of doing these math puzzles was one level of enjoyment or intellectual satisfaction, but there was something about physics that lit up this fire. For me it was very visual. There was this movie playing in my head when I did physics problems.

That was very fun. It was almost like an action movie or some kind of thriller. I don't wanna overdramatize it, but really, I get visuals when I'm doing this stuff. And I remember also in high school, I would catch clips, on PBS or whatever, of, you know, a train moving with light, someone flashing a light, and what are the properties of light versus matter, and so on.

And I just remember it was very compelling to me. Again, like discovering that magic is real.

Steve Hsu: Yeah. This is a big difference between physicists and mathematicians. Physicists talk about something called physical intuition, and I think a big part of that is that part of your brain, for evolutionary reasons, is wired to actually simulate, maybe visually, the real world.

And if you can tap into that, you can do very powerful things. Some mathematicians I know just laugh at Einstein's special relativity because the equations are so simple, and it's like, how can this be a big deal? But the depth of it is not those equations. The depth is the philosophical thinking, the physical thinking: wait a minute, Michelson and Morley did this experiment where they found that no matter what

frame they were in, they always measured the same value for the speed of light. That suddenly had incredible implications that you could actually get to with very simple math. And I think Einstein, from the thought experiments, the experiments that he always talked about, was very visual as well.

Yeah. So later in our conversation, we're gonna get to whether that kind of powerful thinking that physicists have, which is to integrate that kind of intuition, like, oh, what is a little ball rolling down a surface during gradient descent? What do we know about the way the ball rolls down the surface?

How does that actually become useful in modern AI research? So I think that's something I want to try to get out of you. So let's go back to when you decided that you wanted to get into technology. I think you just mentioned the first startup that you worked at, but I think you've worked at a succession of the top AI labs, right?

You worked at DeepMind. And were you in London, or?

Omar Shams: I actually was in London. Okay. And I sat, you know, just a few feet across from Demis and Shane.

Steve Hsu: What year was this?

Omar Shams: This was 2018, 2019.

Steve Hsu: Wow. Yeah. I wonder if you went to my talk, because I gave a talk in London on genomics and AI.

Omar Shams: I just missed it.

I was really excited and, for whatever reason, I just missed it.

Steve Hsu: I have to tell a funny story about that, because I went to give the talk, and I came to the building, you know, the building in London. Is it near King's Cross?

Omar Shams: It is, yeah, near King's Cross. So I went to 6 Pancras Square.

Steve Hsu: Great building, although I guess it later moved into an even cooler one. I go into the building, and they've got this whole itinerary for me. I'm meeting with all these people, and it looks like a job interview, the kind of set of meetings you have if you're interviewing for, like, a faculty job.

But it also could be just people who are interested in talking to me about research or something, right? So I wasn't really sure. But then the very first meeting was with a guy called Pushmeet (Oh, Pushmeet. Yeah.), who's now, like, the VP who runs the whole AI research effort.

And everyone that I met, in every meeting, kept asking me, are you joining our protein folding project? And I'm like, why do you think that? I'm here to convince you that you should all be working on genomics, not protein folding. So we had a little clash, not a clash, but friendly. I'm like, no, this is more interesting.

And they're like, no, this is more interesting. And they kept saying, well, we all thought you were a job candidate, because you're a physicist and you already work on DNA, so we thought you were joining the protein folding project. So that was a funny story. And now that I look back, I'm like, maybe I should have joined the protein folding project and been involved in the Nobel Prize or something.

So, but too bad we missed each other.

Omar Shams: Yeah. Yeah.

Steve Hsu: So what are your thoughts on DeepMind? Because, you know, OpenAI got started because, when Demis was involved in selling DeepMind to Google, maybe this isn't that well-known a story, but Elon and a guy called Luke Nosek, who's another PayPal mafia guy, he's the founder of Gigafund down in Austin.

Do you know him?

Omar Shams: I know.

Steve Hsu: They tried to buy, you may know the story, they tried to buy DeepMind. (Oh, I didn't know that.) Because they didn't want Google to have it. So they were at a party, and Luke and Elon actually hid in a closet to get away from the noise. And they called Demis, and they were like, whatever, you know, Google's offering 600 million.

We'll match that, right? And then Demis, according to the story, said to them something like, well, you can maybe raise that money, but you can't match the compute infrastructure that I'm gonna get. And so he ended up going to Google, and DeepMind became a subsidiary of Google. But then those guys were paranoid, because they thought DeepMind was gonna get way ahead and solve the AGI problem before everybody else.

And that's why Elon backed the founding of OpenAI at the beginning, to have an "open," I put it in air quotes, 'cause we all know what OpenAI is like now, but originally to have an open lab that was doing AI, so that Google couldn't just snatch the prize. Did you ever hear that story when you were working there?

Omar Shams: I never heard that story. That is very interesting. I should say that just because I'm currently part of the Alphabet family, I'm gonna deliberately avoid discussing...

Steve Hsu: Don't say anything that'll get you in trouble. Yeah, yeah,

Omar Shams: Yeah. But yeah, it's an interesting story.

I think AI races are interesting for many reasons. So many reasons, philosophical reasons. It's forcing us to ask questions that we kind of put off for a long time, even existential questions, I think.

Steve Hsu: Yep.

Omar Shams: But it's also interesting because I think there are legitimately two bottlenecks.

Usually there's only one bottleneck, but here there are two, vying for which is the bigger one. One of them, of course, as you mentioned, is chips. But the other one is increasingly actually just energy, just raw power. Like, how much power do you have, right? Yeah,

Steve Hsu: Yeah.

Well, when I talk about the US-China AI race, both of these issues come up. One is Nvidia versus Huawei for AI chips. But the other one is just, how are you gonna power these data centers? It's very tough to increase grid power supply in the US, whereas the first derivative of Chinese electricity production is just going gangbusters.

They actually add, I think the correct statement is, they add the equivalent of the power generation of England or France every year.

Omar Shams: Every year, yeah. And the US does that every seven years. Yeah,

Steve Hsu: exactly. Yeah. And they're at two X now. So it's like, how are you gonna match that? If, if that turns, if electricity turns out to be the fundamental component that gets things, that gets turned into intelligence, how are you gonna match that?

Omar Shams: Yeah. I mean, I don't want to go too far astray, but I wrote an essay titled The Moon Should Be a Computer, which I freely admit is speculative; there are a lot of assumptions I had to make. What's not as speculative is that if you increase the amount of energy consumption on Earth by a lot, I'm not saying a little bit, by a lot, by two orders of magnitude, you do start to have thermal effects on the atmosphere. And I kind of use that as an excuse. But the real reason to do this, honestly, is because you just can't add baseload power supply to the grid in the US fast enough, who knows why, regulations, or you just can't build it fast enough.

Or there just isn't the competency to do it. So I speculate, hey, can we do this in space? Or maybe, in fact, on the moon. This was an idea that came about through chatting with one of my friends in San Francisco who's now at Anthropic.

I thought, okay, this is a crazy idea. I knew people would say it's a crazy idea. But actually, some really important people, I'm not gonna name them, read it and thought it was a good idea. And I found out later that Eric Schmidt, of all people (Yep.),

through some special financing, is now the CEO of Relativity Space, which is another YC company; my company was a YC company. And one of his purported aims, or reasons for doing this, is that he wants to put computers in space. I don't know that he wants to put them on the moon, but he wants to put them in space.

And I think one of the reasons, again, is regulatory. Like, you just literally cannot get the energy, sorry, not on Earth generally, but in the US.

Steve Hsu: In his space project, would the energy actually be coming from solar panels, or is there a reactor orbiting in space?

Omar Shams: I believe... I don't know, I've tried to find out. Yeah, if someone knows, please reach out to me, actually.

'cause I'm very interested in this. Yeah.

Omar Shams: But I believe solar. I don't think you can do nuclear in space, because it violates so many treaties. If the rocket blew up, it would technically be a dirty bomb, and you don't want that, right? So I think it has to be solar.

Steve Hsu: But wouldn't you need something like a square kilometer of solar panels in space or something? (Yes. Yes.)

Omar Shams: It is pretty nuts. I tried doing the math on this, and I could be wrong, but I believe to get, like, a gigawatt, you need, is it like a square kilometer, or is it more like 10 square kilometers?

Steve Hsu: I'd have to check the math, but my intuition is that it's a lot of lift to put all that stuff in orbit. (It is.) Yeah.

Omar Shams: And you can't put it in low Earth orbit, 'cause if it's, let's say, 10 square kilometers (Yes.), it would literally... I did the cross section at some point.

Yeah.

Steve Hsu: Astronomers would be a little mad at you.

Omar Shams: Yeah, no, no.

Steve Hsu: You would be visible. Everyone would be mad. That's what I'm saying. So you'd have to

Omar Shams: put it at, like, a Lagrange point. Yes. Thankfully, there's a lot of space in space.

Steve Hsu: Yes. Yes. Okay.

Let's go back to your experience.

So, as a startup founder: your company was called Mutable (Mm-hmm.), and you ran it for three years as founder and CEO. I believe it was in the space of AI tools for coding. Is that the right category?

Omar Shams: That's right. Yeah. So I started the company in November 2021. I quit my job and two weeks later I was inducted into Y Combinator as a solo founder.

And that was a pretty brutal experience. Brutal, but amazing, I should say. Like, one of the best experiences of my life. I basically did nothing but work. I didn't sleep, because I had nothing. I had no code, nothing. I just coded all day, went through the YC curriculum, they have this curriculum, basically,

And then pitched investors. I managed to build something that was clean. It doesn't matter too much, but it was something that cleaned up your Jupyter Notebook code. with ai, I got one customer and I raised a seed round. A small seed round, right. So it was the kindest thing in 10, three months of my life. Right.

But yes, I was one of the first AI developer tool companies. We came out around the same time as Copilot.

Steve Hsu: Right.

Omar Shams: And yeah, I mean, the space is bananas now. Cursor, which is an AI development tool, is making over a hundred million ARR.

A bunch of these companies now are making over a hundred million ARR, and very quickly.

Steve Hsu: Yeah. Now, you're too modest to claim credit, but I know Mutable pioneered a few things that are kind of common now. And I think Karpathy gave this keynote recently that was on YouTube where he talked about some of the ideas.

I don't think he credited you, but I think you had these ideas: things like using context in a certain way for software, or actually creating much better documentation, Wikipedia-style documentation, from the code base at a company. Yeah. I think you guys did a lot of interesting things at Mutable.

Maybe you want to talk about that?

Omar Shams: Yeah, happy to. I'm not too modest to stand by the claim that we invented a lot of the ideas behind probably the most popular manifestation of this today, called DeepWiki, from Devin, AKA Cognition Labs. But basically I had an idea, and at the time I was in Austin, from looking at all these open source code bases.

'Cause I'm a very hyper-curious guy, I've always been that way, and I always come across papers and code that I'm interested in. And you can get pretty far, you can develop the skill and muscle memory to onboard a new code base quickly, but it's always kind of a slog, right?

So I was like, why doesn't the AI just help me with this? Why doesn't the AI, let's say, write something like a Wikipedia article to explain this code for me? And, oh, what's a good name for this? Auto Wiki. So that's what we did. We do this big recursive summarization to explain your code. But the backstory is actually a little bit more interesting than that,

and I think this might be useful for people who are founders or aspiring founders. YC always likes to say, you know, scratch your own itch. That might not be a direct quote, but something like that. So at least you know you have one user, which is you, right? So I built the initial version of Auto Wiki myself, and at the time my team was doing something else, it doesn't matter too much, but we had another kind of product with an active pilot.

And I looked at the first version, showed it to my team, and they were like, eh, you know, this is whatever. And I remember looking at it and thinking, okay, this is not that good. So I kind of tabled it for a month or two, and then I had the same problem again. Cool code base; I really would love to onboard this code base quickly.

So I was like, you know what, let me dust off my Auto Wiki code. I don't even know if I'd called it Auto Wiki at the time. And I made some improvements. I spent all day on it; I was just too curious, you know, I had to. And lo and behold, at the end I looked at the final wiki and I was like, you know what?

This actually is useful. It helped me understand the code. And if it's useful to me, it's probably useful to others. So I showed it around the team. The reception, I wouldn't say, was very strong, but it was much more positive. And I kept chipping away at it myself.

At some point later we decided, you know what, let's actually commit to this. And we launched it in January 2024. It hit the front page of Hacker News. There were people reaching out from college, from my past. It went pretty viral. And we decided to put the full focus of the company behind it, a hundred percent.

We actually kind of fired our customer. We were of course very nice about it, and they were gracious; we came to some kind of agreement. But we put our full focus behind that. So that's kind of the backstory of the development of the product, with its ups and downs.

Right? I hope that's an interesting intro for startup founders, or aspiring founders, like I said. But the technical part that's most interesting, actually, is what Karpathy mentioned in his talk at Startup School, I believe, which is that this turns out to be a very useful context builder, because

I've gotten so much mileage out of thinking that you should actually anthropomorphize the LLMs, because in a way they're trained after our image: they're trained on human data, right, and human experiences. So it turns out having these CliffsNotes, essentially, of your code helps LLMs, one, for retrieval.

So people talk about RAG, retrieval-augmented generation, but it also helps for the generation part. Because with these summaries, in a way we preceded, or predicted in some ways, the reasoning models, which do the chain of thought (Yep.), which reason first (Yep.), and then answer your question.

(Yep.) So having it write a book report first, and then answer your question, was very compelling. We got much better results out of codebase chat. So we built a codebase Q&A system that didn't require you to say, oh, these are the files I want as context. You could just put in the whole code base.

In fact, we scaled it up to the Linux code base and it could answer questions about the entire Linux code base.
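The recursive-summarization idea Omar describes can be sketched in a few lines. This is not Mutable's actual implementation; `summarize` below is a placeholder for an LLM call, so only the control flow (summaries of summaries, up to one top-level article) is faithful to the description.

```python
def summarize(texts: list[str]) -> str:
    """Placeholder for an LLM summarization call: in the real system,
    this would prompt a model to condense the given texts."""
    joined = " | ".join(t[:40] for t in texts)
    return f"summary({joined})"


def build_wiki(leaves: list[str], fan_in: int = 4) -> list[list[str]]:
    """Return every level of the summary tree, bottom-up: raw files,
    then group summaries, until a single top-level article remains."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        current = levels[-1]
        next_level = [
            summarize(current[i:i + fan_in])
            for i in range(0, len(current), fan_in)
        ]
        levels.append(next_level)
    return levels


files = [f"contents of file {i}" for i in range(10)]
levels = build_wiki(files)
print(len(levels), len(levels[-1]))  # 3 1
```

Every level of the tree can then be indexed for retrieval, which is the "CliffsNotes help both retrieval and generation" point: a question is matched against summaries at the right granularity before the model answers.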

Steve Hsu: Well, let me ask you a few questions about that. In the part where you build the Auto Wiki, does a human have to go in and correct any problems before you then use it productively in further generation of code?

Omar Shams: We had a feature, which wasn't much used, where people could modify it, but you didn't have to. Look, there were hallucinations.

Steve Hsu: Yes.

Omar Shams: And I think some of that is solved; there are techniques to get around that, actually. But it turns out that even with the hallucinations, it was better to have it than not to have it. It added more light than heat, is what I'm saying.

Steve Hsu: Right. So even as a fully automated process, you basically had the model look at the code base, think about it, and generate a persistent document. So in a way it's a little bit like reasoning, right? It's generating this stuff, and then as it does other things for you, it's able to consult that reasoning it did earlier, right?

Yes. So, yeah, so I think it's great.

Omar Shams: And there's actually a deeper, I'm glad you're a physicist, Steve, 'cause there's actually a deeper physics analogy. It's an imperfect analogy, but it was always in the back of my head: the renormalization group.

Steve Hsu: Yep.

Omar Shams: So the renormalization group, for those listeners who don't know, is this technology invented by Ken Wilson to explain critical phenomena in physical systems, where you have, say, a condensed matter system. And you're way more on top of the physics than I am, I'm far from my physics training, but what I remember from my training, from Peskin and Schroeder, you know, QFT, quantum field theory:

you have these systems, it could be a quantum field theory or it could be a condensed matter system, and you have the microscopic physics, but you want to get to the macroscopic physics. You want to predict a phase transition. So you do this successive coarse-graining, which in the Auto Wiki scenario is a successive summarization.

Steve Hsu: Yep.

Omar Shams: And then you get to the critical phenomenon, which is, hey, this is a code base about X, or this is a code base about Y, or here's the answer to your question, and so on. So that was always in the back of my mind as an inspiration. And I know that was one of the topics you wanted to discuss.

Steve Hsu: Yeah, absolutely. It's, you know, the idea that you can depend on the model to start from the actual code base, but then describe it in progressively more human-like terms and store that somewhere. And that's useful, right? That's an interesting idea. And it is like the renormalization group: different levels of description of the same stuff.

I think other people who have researched neural networks in general have often made this analogy as well: for example, the first layers maybe detect features (Mm-hmm.), and then the next layers combine those features. There's also a renormalization-group flavor to how neural networks actually process information.
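Schematically (an imperfect analogy, as Omar says), one Kadanoff-style coarse-graining step and its Auto Wiki counterpart look like this:

```latex
% Block spins: each block of microscopic spins is replaced by one
% coarse spin, and the Hamiltonian flows under the RG map R_b.
s^{(k+1)}_I = \operatorname{sign}\Bigl(\sum_{i \in \text{block } I} s^{(k)}_i\Bigr),
\qquad
H^{(k+1)} = R_b\bigl[H^{(k)}\bigr].

% Auto Wiki analogue: file summaries -> module summaries -> repo article.
w^{(k+1)}_I = \operatorname{Summarize}\bigl(\{\, w^{(k)}_i : i \in \text{block } I \,\}\bigr).
```

In both cases, repeated coarse-graining discards microscopic detail while preserving the large-scale description, a phase transition in one case, "what this code base is about" in the other.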

But I met you in Austin, and I think you did this, I don't necessarily want to call it a pivot, but I think when I first met you, you hadn't committed to the wiki thing.

And then later, when we became friends, you told me, hey, I made this pivot. So, very cool. Tell us, what's it like for an AI founder these days? What's the vibe like? At that time you were based in Austin, but then you moved to San Francisco, is that right?

(Mm-hmm.) Yeah. And so what's the vibe like? You're out of it now, because you're at Google (Mm-hmm.), but when you were a founder, a scrappy founder, living in SF, I don't know if you were living in Hayes Valley, what's the vibe like? How does it feel?

Omar Shams: Yeah, the vibe in SF, there's this tremendous energy, I think, from what I hear.

You know, some people differ. Some people say, oh, it still hasn't fully bounced back from COVID. But I think, at least in the AI field, there's this tremendous energy. I think SF is like the ancient Athens of the Western world. It's small, there are 700,000 people; I think ancient Athens was, whatever, maybe 250K people. But the smartest people are there, in my opinion. More importantly, the most generative people.

Steve Hsu: Agentic people.

Omar Shams: Agentic people, yeah. The agents, the human agents, are there.

Steve Hsu: The player characters are all there.

Omar Shams: Yes, exactly. And you know, I love New York, I love Austin, and I love SF.

So those are my three favorite cities in the US. But I have to say, I rarely learn new things in Austin or New York, whereas I feel like I'm always learning new things in San Francisco.

Steve Hsu: To the point where it's a little overwhelming or?

Omar Shams: I mean, I'm kind of a junkie, but yeah, it can be. I was there recently, and there were literally, you know, five events. There was this AI Engineer conference, which was an amazing industry conference.

Probably the best conference. It had all these people, like Greg Brockman and so on, but also just really good actual people doing the work. And there was a YC reunion, there was a private event on AGI futures, and so on.

Steve Hsu: Oh, you were at that one too?

Omar Shams: I was at that one.

Steve Hsu: That's great.

Omar Shams: I asked one of the people who spoke, and I love the person who answered this, personally, but I just don't know if I agree with the answer. I asked, hey, how have the empirical developments of AI changed

your views (Yep.) of AGI, or post-AGI futures? And he said, not at all. And I'm like, oh, come on.

Steve Hsu: Yeah. And you were just like, mentally, okay, discount this guy. (Yes.) Reduce his weight to close to zero. But the reason I'm asking about the SF vibe is 'cause, you know, for people who are not in AI specifically, or who don't live in that area,

it's hard to really understand what the scene is like, right? (Mm-hmm.) How intense it is. (Mm-hmm.) Everywhere you go, there are people talking about AI models, or post-training, RL, reasoning, chips. It's all-encompassing.

And on top of that, there's a philosophical layer where people are talking about AGI: what does this mean for society? What's this gonna mean for the existence of the human race? It's all super concentrated there, and I think, as you leave that area, people don't really get it.

People are like, why are you guys so into this AI thing? I like ChatGPT, you know, I use ChatGPT. But they don't really get the intensity of what's happening in the Bay Area.

Omar Shams: Yeah. I think there's almost this funnel where, you know, there's these public events hosted by different startups and VCs, often, who for obvious reasons are looking for companies to invest in and so on. Those events are good, but there's another layer, which is the founder dinners, which I probably get the most value out of, where people host, you know, private dinners.

Again, sometimes VCs, sometimes different companies, and it's kind of a friendly thing, but it's also sometimes a recruiting thing or a funnel thing for VCs. And those events are really good 'cause people kind of share their war stories and technical tips and tricks and so on. And then, it's also kind of vibrant, I wasn't as plugged into this, but there's almost a party scene, actually. Like SF house parties and just random, you know, billionaires. You get immune to billionaires.

I'm sure you already are, at this point. You really do get immune to it. It's like, so what? You're a billionaire. Everyone's a billionaire. Yeah. You know, but it's crazy, the level of people there and the kind of ideas.

I also think there's a lab scene as well. Like, if you're in one of the labs, Yep. you tend to socialize with people in the labs. I think there's a lot of cross pollination, and I'm not the first to make this observation, where people live in a house and there's literally, you know, one person from one lab living with another person from another lab, and you're like, hey, how do they expect that to stay contained?

I mean, you know, I'm sure everyone's careful about their, you know, their confidentiality. And I'm personally very serious about any agreements like that. But it's hard not to

Steve Hsu: diffusion.

Omar Shams: It's hard. It can't be contained.

Steve Hsu: Just,

Omar Shams: Yeah. I can't imagine, you know, that, that knowledge doesn't get out somehow.

Because, 'cause humans, even without people violating their confidentiality agreements, you know, just subtle intonations, or what have you. And also, people job hop. Right. And my impression, and again, for the purpose of our conversation, I am excluding Google, not because I don't have opinions about Google, but because they're my employer, and views here are my own. Sorry to make the incantation.

Yeah, no worries. No worries. Like the Orthodox priest, you know, yes, warding evil spirits away. But I'm very much of the opinion that all the labs, again, excluding Google, which I'm not sharing my opinion on, are basically doing the same things. I don't think there's too much novel IP, and even if there is, something you could write on a napkin, how valuable is that?

If it's something you can write on a napkin, it's gonna get out. Right.

Steve Hsu: Okay. I love where you're taking this conversation. Yeah. 'Cause obviously, yes, there's gonna be tons of diffusion. Like, imagine you're very curious, or stuck on some particular issue in your own work, and say you're at Lab A.

And you're at a party with your housemate, or some guy that you meet, and you meet your counterpart who's solving the same problem, or in the same domain, at Lab B. How can you resist just saying, like, are you guys finding X? Or, have you looked at Y? Like, that stuff has to happen, right?

Omar Shams: I actually think they really read them their rights. Yeah. And, I don't know, maybe I'm just projecting how I behave. Like, I'm a consummate rule follower. Yeah. I really am.

Steve Hsu: You stick to the rules.

Omar Shams: I stick to the rules.

Yeah. My impression of the people I talked to at the labs is they're also very careful, actually, 'cause they really read them their rights.

Steve Hsu: Okay.

Omar Shams: But so I don't think there's anything flagrant going on. Actually, I didn't mean flagrant,

Steve Hsu: But I meant, I meant like soft diffusion.

Omar Shams: Soft diffusion happens. Yeah. And it could be like,

Steve Hsu: Not directly from a guy employed by A to B. Like, A tells his friend he went to grad school with, Yes. and the friend tells somebody else, and it gets to, you know, somebody at the lab. But even

Omar Shams: Something as innocent as, oh, this is a cool paper or something. Right.

Steve Hsu: Oh, that doesn't work at scale.

Omar Shams: Yeah. Yeah.

Steve Hsu: You know, like, oh, how do you know that? Like, did you guys try it? Yeah, yeah, yeah. That doesn't work when you scale it up. But related to this, I think your thesis was primarily that there aren't big secrets. I don't think so. Okay.

So then what is Mark Zuckerberg doing, paying a hundred million bucks for an individual?

What is he getting from that individual? Not secrets. Are there differences in capability that manifest at that scale?

Omar Shams: Yeah. I can't speak for Mark, and I don't know how confirmed the hundred million stuff is.

Steve Hsu: I think some big numbers are being reported. He did steal a handful of really good people from OpenAI.

I think that is confirmed.

Omar Shams: Okay. Yeah. And, well, I do think that companies' outcomes are power-law distributed. You know, I like to joke from my own case, I won't go into too much detail for obvious reasons, but I always like bringing up the movie Glengarry Glen Ross, just 'cause it's a hilarious movie, right?

Yes. And it's like, first prize is a new car. Second prize is a set of steak knives. Third prize is you're fired, and so on. Right? I think there's something like that with people as well, actually. Like, I don't think the normal distribution is actually a good predictor of outcomes.

I think there's something about people's capabilities. It's kind of like an airplane, where maybe you have a good engine, but if you don't have the wings, or if you have the wings but you don't have the engine, you can't take off. So there's something like that with people. So maybe, to steelman Mark,

you know, maybe there's something like that with just paying top dollar for people at the tail ends.

Steve Hsu: You might be interested, there are some quasi-academic articles with titles like, how normal distributions are transformed into power law outcomes.

Omar Shams: Mm-hmm.

Mm-hmm.

Steve Hsu: So even though the individual people have normal distributions over their abilities, Yeah. the way they interact, or the nonlinear reinforcements in the system, means you end up with power law outcomes.

Omar Shams: Mm. Mm-hmm.

Steve Hsu: And so I think what you're saying has some theoretical basis actually.
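The mechanism Steve is pointing at can be sketched in a few lines of Python: if each person's outcome is the product of several independent, roughly normal traits, the outcomes come out log-normal, with a heavy right tail that behaves like a power law over a wide range. The trait count and spread below are made-up illustration values, not numbers from the articles he mentions.

```python
import random
import statistics

random.seed(0)

def multiplicative_outcome(n_traits: int = 6) -> float:
    # Each trait is ~ Normal(1.0, 0.4), floored so a factor can't go negative.
    # Traits multiply (engine * wings * ...), they don't add.
    out = 1.0
    for _ in range(n_traits):
        out *= max(0.05, random.gauss(1.0, 0.4))
    return out

outcomes = [multiplicative_outcome() for _ in range(100_000)]
mean = statistics.mean(outcomes)
median = statistics.median(outcomes)
# Share of the total taken by the top 1% of outcomes.
top_share = sum(sorted(outcomes)[-1000:]) / sum(outcomes)
print(f"mean={mean:.2f} median={median:.2f} top-1% share={top_share:.1%}")
```

Adding the same traits instead of multiplying them would give a normal distribution, where the mean and median coincide and the top 1% take roughly 1% of the total; the multiplicative version is skewed, with the mean well above the median and the top 1% taking a far larger share.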

Omar Shams: Yeah. We definitely see this with founders where you could be an amazing technical founder Right.

But if you don't have good communication skills, Yeah. that kind of kills you, honestly. Yeah. Especially, in my opinion, in the West.

Steve Hsu: Yep.

Omar Shams: It just kills you. 'cause like the VCs are not technical. No.

Steve Hsu: Usually. It's all vibes. But is it all vibes with Mark and building his super team for superintelligence?

Omar Shams: Again, I don't wanna opine. Look, I think Mark is a really strong founder. I think it's really gutsy, and it's something only a founder CEO, Yes. with the super voting shares, can do. I think the jury is still out. I'm optimistic, actually, I would say, but

Steve Hsu: I'm not questioning his making the bet, because, you know, his engine throws off so much cash. Like, the visors, you know, the metaverse.

This is not the dumbest bet he's made with his spare cash. Right. So just betting on ASI and just building the best team, and plus, I think he's just personally interested in it. Mm-hmm. Like, if I were Zuckerberg and I had his resources, I'd be like, well, why not assemble the best team

that we can with our spare cash flow? And, like, why not have it here? So I'm not questioning that strategic decision. I'm questioning, like, if you're gonna allocate a hundred million to get the best hires,

Omar Shams: Mm-hmm.

Steve Hsu: Is that the strategy? Yeah. Like, maybe you just have to do that.

You could argue there's only a limited number of people who really know their shit. Yeah. But the opposite thesis is, no, there are a lot of people who know their shit. Yeah. So,

Omar Shams: Yeah. So, 'cause I didn't really answer your question: if I'm saying there's really no secrets, then why pay top dollar for these people?

Yeah. I think they're not

Steve Hsu: completely contradictory, but there is a little tension there. There is a tension there. Yeah.

Omar Shams: It's very hard for me to say. But I will say that even if there are secrets, or at least really compelling ones, I don't think you're paying for the secrets.

You're probably paying for this tacit knowledge, where there are these subtleties that come up in building these things, and you maybe don't wanna wait for someone to ramp up. And maybe from his perspective he tried that, and, you know, Llama 4, I don't wanna speak ill of them because, oh, you're

Steve Hsu: So careful.

Well, no, you Alphabet dudes are so careful. But, well,

Omar Shams: It's not just because I'm so careful. It's because I know how hard it is to build something, and I don't like criticizing builders. Like, I think builders are heroes, and it's just easy to criticize from the sidelines. But yeah, I think even that team would say Llama 4 wasn't the best showing.

Right. So I think, you know, it is worth it. This is serious stuff. Like, look, AGI is happening. That's what I believe, at least: it's happening, and it's happening soon. So he can't miss out on this. Right. So he's better off overpaying than missing out on

Steve Hsu: it. Right. You could just say, like, look, so what if he overpaid?

He can afford to overpay. Yeah. My interpretation would be at the level of what is meant by, there are no secrets. It means Lab A kind of knows what Lab B is doing, or at least knows the range of things that B is doing. Mm-hmm. So there's no secrets in that sense, but some guy has exquisite taste, mm-hmm.

And he just has a nose for, like, well, we should try this first. Mm. Or, we've tried this for a while and it hasn't worked, but we should throw more resources at it. Those are all subtle, nuanced decisions, right? Mm-hmm. It's not a secret. It's a subtle judgment call that someone has to make.

Mm-hmm. And maybe one guy makes that call better than the other guy. Mm-hmm. A little bit like trading, too. Like, in finance you'd have a similar situation. But I think that's a possible justification for the hundred million dollar comp package. Yeah. So, I mean, you'll see the same thing in hedge funds, where there's somebody who gets an enormous comp package based on some past performance.

Mm-hmm. But, but who knows how predictive

Omar Shams: Or people say the same thing about CEO pay, right? Yes. Sometimes, in the more popular discourse. Right, right. Yeah.

Steve Hsu: Right. I guess on the one hand I feel a little FOMO, but on the other hand I feel like it's good that geeks are getting paid like that.

Like, shouldn't a geek get paid more than Ichiro Suzuki, or Tom Brady? They should, right? Yeah. Give 'em the money. So, yeah.

Omar Shams: It'll be interesting to see what people do with their money.

Steve Hsu: Yeah. Well, it all feeds back, 'cause a lot of, like, billionaire type people are very willing to fund science philanthropy.

Mm. Like for them it's cooler to say like, oh, I helped Caltech build this telescope.

Omar Shams: Mm-hmm.

Steve Hsu: Than, like, oh, I just bought another thousand square meters worth of house, you know, in Miami or something. Mm-hmm. They're not that into that stuff. They're more into putting something into something really cool.

So I think, just as we were saying, these people are agentic Mm-hmm. and smart. If you give them huge resources that they don't really need to live, I think they'll recycle them in interesting ways. They might fund biology research, and they might fund some research that's really valuable.

Omar Shams: I hope so. I am a little skeptical. I think it's true in some cases, and I certainly have friends, you and I have friends. I hate to psychoanalyze you on the spot, but I think you're gonna tend to attract the most interesting segment of people who make tech money,

Yeah. who do this kind of agentic and interesting stuff. But I actually think that's not the norm, unfortunately. I think people just kind of retreat into their shells. So, and I hope you don't mind, while I'm on your podcast I will kind of opine that more people who make tech money should do more interesting stuff with their money.

Like, for example, the Vesuvius Challenge. It doesn't have to be philanthropy, you could just do interesting stuff. Yeah. Just do something interesting. You know, the guy, I forget his name, who confirmed that the site of the Iliad really existed, that ancient Troy really existed, just had an idea in his head.

He's like, I'm gonna go to this part of Turkey, I'm gonna go dig out this site, and I think that's where Troy is, based on this evidence. Right? Like, more people should do stuff like that.

Steve Hsu: Well, I totally agree with you. I mean, also, being a professor, I'm the guy with the begging cup trying to get the wealthy guy to, like, fund some academic science, right?

So, a hundred percent agree with your thesis. I do think that the kind of guy Mark pays a hundred million for, to help build AGI or ASI, is probably the kind of guy who has these broader interests, like an interest in science and stuff like that. On average, compared to a guy who just runs a macro hedge fund here in New York.

Yeah. And what's he gonna do with his money? His wife is gonna collect a lot of art, or something. Right. So,

Omar Shams: It's almost like we've lost the ability, you know. Not to stay on this topic for too long, but people throw around all these buzzwords, and I'll do a little bit of buzzwording myself: aristocratic tutoring, or taste. You mentioned taste earlier.

We just don't develop this sense of, I don't know what it is, adventure, or culture, or agency in people. You can be very successful in one domain, and then once you collect your earnings, you just don't use it at all to do something.

Steve Hsu: It's very typical.

Omar Shams: Yeah. Yeah. Anyways, well enough about that, but yeah.

Steve Hsu: Well, I agree with you.

Okay, so let's come back to one topic we said we were gonna talk about, because this is part of your role now: agents. So if someone comes to you and says, you know, I'm on social media every day, I see clips of some guy's agent that can do everything for me, but nobody I know actually gets much value out of agents right now.

So where's the dividing line between hype and reality for what agents are good for?

Omar Shams: Yeah, good question. Look, this field is moving very fast, and I think it takes time for these developments to spread. I don't know if I'm a full Tyler Cowen-ist, you know. He has this whole thesis, which I wanna explain briefly, about how this is akin to electrification, and it's gonna take a hundred years for AGI to diffuse into the economy.

I think that's too slow. I don't fully believe that, though I get where he's coming from. There's regulatory hurdles, and people just take time to change their minds and change habits and so on. Who was that physicist who said, you know, physics advances one funeral at a time?

Steve Hsu: Maybe Heisenberg.

Omar Shams: Was it? Are you sure? Because I thought it was

Steve Hsu: Well, it's somebody in that group. Because the thing that historians never say is that all the old guys pre quantum revolution didn't believe in quantum mechanics. And, like, now we don't even think about that. But even that transition was a rough transition.

Omar Shams: Yeah. I think there's actually something similar happening with AI. You see this sometimes with some of the old school software engineers who just still don't believe in AI, and I'm like, really? Really? So I think there's probably something like that that's gonna happen with different industries.

But I wanna directly answer your question. I think undoubtedly, today, there are software agents, which I work on. And again, sorry, I am doing the Alphabet thing: I wanna be very careful when I comment on my actual work, but I can talk about the field freely. I think that's clearly a thing.

People are using Cursor, people are using Claude Code and so on, all these other tools, right? It is making a big difference in their day-to-day, you know, software development. And I saw this even in my time running Mutable: the standards for software, even for a startup, are much higher now.

You just can't have shitty software. Even if you solve some unique novel pain points, customer expectations are very high, and that's gonna pull the field forward. So I think that is happening in software. Other domains, I agree it hasn't fully happened, but we see a little bit in the legal domain, actually. Harvey, supposedly, and other companies like that, they're making a lot of money.

There's a few other domains like that. And I think slowly but surely, or, sorry, actually not even that slowly, we're gonna see, you know, white collar labor get these software agents. And I don't know what it's gonna do to people's jobs, but it is definitely gonna help with the work, if not take over the work.

Steve Hsu: Okay, of white collar. Sharp question. So I think it's reported that for the graduating class of 2025, people with computer science and software engineering training, the job market is poor. There's been a decline in offers, you know, an increase in the unemployment rate. How much of that is due to AI driven improvements in productivity?

How much of it's just like big tech not hiring as many people for some other reason?

Omar Shams: It's hard to say. I would bet currently it's more that, you know, tech companies are not hiring as much. And I think there was this very ZIRP era where you would, you know, make a job offer to

anyone, basically, who was even halfway plausible. And I think that was unsustainable, and probably it led to at least a short term belt tightening, because there was just an over hiring spree. And even if you have layoffs, people are reluctant, you know, for very human reasons and morale reasons, to do cuts that are too deep, right?

So companies in general probably over hired, and now they're not hiring as much. I do think that there's something deeper going on, though. I think that answer is a bit of a cop out. I think there's something going on with a combination of AI and the computer science curriculum, where the computer science curriculum is kind of weird.

I know they've changed it, from what I hear. But basically students are learning, you know, discrete math, they're learning algorithms. They're not learning actual software engineering. So often a green CS grad won't actually be that effective as a software engineer. He or she will just not be that useful to you.

And that's kind of why I personally didn't hire that many people who were newer. I did hire one, though, and I'll tell you the exception of who I hired. Maybe this is for the younger listeners, 'cause I am interested in helping out people, you know, other founders, and perhaps people who want their first job in the field.

I'll try to give a little bit of advice here. I did hire a 19-year-old, and that person didn't go to college, actually. They had just graduated from Princeton High School. And they did a bunch of robotics projects, they did a bunch of Rust projects. And, you know, when you talk to someone, you can tell their power level.

And, wow, this person's power level was very high. There was something about that person I hired that I could tell was amazing. I wasn't sure, of course, so I interviewed them, and they did really well in the interviews, and I made that person an offer. So that just goes to show you that employers, especially startups, especially YC startups, will hire you if you can show competence, if you can build things, if you have projects. Just go build things.

That's almost more important than the degree at this point. And it shows agency more, you know, and I think agency's gonna be more and more important.

Steve Hsu: Right. So that's advice to young people. But just to pin you down on the answer to my question: I think you landed in the safe spot, which is that there are multiple factors contributing to the decline of offers to software engineers in 2025.

Part of it is that in the post ZIRP era, companies overhired, realized they were bloated, and are shrinking in the higher interest rate environment. But part of it is also a recognition that there's some increase in productivity from AI tools. Is that fair?

Omar Shams: Yeah, I think so. I think there's definitely a part of that.

If I had to guess, that's a double digit effect at least. Because right now the AI can do a lot of the work of a junior engineer, and the job is kind of moving towards being a TL or a TLM, where you're managing these teams of agents.

Steve Hsu: Kind of the team leader.

Omar Shams: Yeah, team leader. Or technical lead sometimes.

Yeah. So, like, do you need a junior engineer anymore? Maybe not. And then, junior engineers were always probably actually a net negative in the short term. Yeah. And it was always something you did like, oh, I'm growing my pipeline. I'm hiring this person not because they're going to really help me that much.

And actually they're gonna maybe be a net negative in the beginning, maybe even in the first year or two. But I need to grow my pipeline, I need to keep up. So maybe there's a feeling that you just need to get by with much less. And I'll point out two people. One is more in agreement with what you're saying: Dario Amodei, who's saying, hey, there's gonna be massive job loss in two years, actually, because of AI.

I have an outstanding bet, actually, with a friend who works at Anthropic. We put it on both of our calendars, almost exactly two years from now. He predicts there's gonna be, you know, at least 30% layoffs at tech companies across the board, including companies that are leaner already, like Tesla, let's say.

So we'll see. I mean, I took the other side of the bet. I think 30% is too high. But even less dramatic than that are people like Tobi Lütke, the Shopify CEO. Okay. Yeah. He didn't really opine on layoffs as much, but he did mention, you know, that he expected all of his teams to use AI more, and to see how much they can do with AI without hiring more people.

And you hear about companies like Quora, where, you know, they're hiring someone just to automate as much as possible of what they do with AI. So there is a sense that AI will just make you more productive, and you can get by with fewer and fewer people, and therefore you don't need to hire as much.

But then that goes back to, what's that economic law, the Jevons paradox, that people throw around? Yes, yes. So who knows? Who knows. It's so hard to predict these things. I don't have a good answer.

Steve Hsu: Okay. Now coming back to the word agent.

Omar Shams: Mm-hmm.

Steve Hsu: So I think you mentioned a few cases where it is clear that there's a productivity gain from AI tools. But I wanna differentiate between AI tools where, you know, you send a query in to ChatGPT, or you ask ChatGPT to revise something or write a first draft of it.

I don't really think of that as an agent. I think of an agent as something that's a little more autonomous and takes multiple steps autonomously, without human supervision, as opposed to a one shot or a few shot tool where the human is looking carefully at everything that comes out.

Omar Shams: Mm-hmm.

Steve Hsu: So what's an example where, you know, I wanna write some function in my code base, and I just let the agent go hog wild, and it does a bunch of non-trivial stuff and returns it? Is that a real thing now?

Omar Shams: Oh, absolutely.

Yeah. All the examples I mentioned have true agents, in my opinion. So in the settings of all of these tools, honestly any of them, Lovable, whatever, you can set it so that you don't have to confirm all the actions. Yep, yep. And you can just let it go hog wild, and do way more than a function.

You can have it build a feature, you can have it build a web app, and I think it works decently well. And there's the argument people usually make about increasing nines of accuracy: the set of things it can build that require more steps to build grows as you multiply, you know, the 99.9s. That's gonna improve the horizon, the time horizon and the sequence horizon of actions it can take, and therefore it can build more sophisticated stuff.
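The increasing-nines point is just compounded probability, and rough numbers are easy to put on it. A minimal sketch, where the 50% end-to-end reliability target is an arbitrary illustration, not anyone's actual benchmark:

```python
import math

def max_horizon(step_success: float, target: float = 0.5) -> int:
    # Longest chain of independent steps whose end-to-end success
    # probability stays at or above `target`:
    # step_success ** n >= target  =>  n <= log(target) / log(step_success)
    return math.floor(math.log(target) / math.log(step_success))

for p in (0.90, 0.99, 0.999):
    print(f"per-step {p} -> about {max_horizon(p)} steps at 50% reliability")
```

Each extra nine of per-step reliability multiplies the feasible horizon by roughly 10x: about 6 steps at 90%, 68 at 99%, 692 at 99.9%, which is why small per-step reliability gains translate into much longer agent task horizons.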

And often that, by the way, is used as a justification for why people are spending so much on AI data centers, chips, energy: because of scaling laws. People often portray it, maybe it's just my perception, as some kind of free lunch. Like, oh my God, this is an amazing thing. But actually it's kind of terrible.

Right? It's a logarithm. Scaling laws are a logarithm.

Steve Hsu: Yeah.

Omar Shams: And I think the only way you can justify it is, you know, this increasing nines argument. But also, and maybe this is a nice physics connection, the emergent abilities argument. Maybe there's a scale at which, going back to the airplane analogy, it looks like an airplane,

it smells like an airplane, but it's not an airplane: it doesn't take off. And then you go up an order of magnitude and, oh wow, it takes off. It's an airplane. So I think there's gonna be things like that, and it's very hard to predict. And scaling laws, by the way, are one of those things where I have this hobby horse: scaling laws are like thermodynamics, where we discovered thermodynamics and steam engines way before we discovered statistical mechanics.

Yeah. So, as an open question to other AI researchers and people in the field: what is the statistical mechanics of scaling laws? 'Cause I've seen stuff, and I've asked you this question before, and I'm not quite satisfied with the answers. I think there's something deeper there. But I could be wrong.

Steve Hsu: Yeah. We've talked about this before. I think, for people who are in a little more relaxed situation, where they're not trying to ship the next version of Claude or something, and who have a little bit more of a theoretical bent, I think you're gonna see people coming up with more fundamental models that then explain the particular scaling laws we're seeing.

I think that's what you're searching for. I haven't yet seen anything quite of that nature, but I do see papers where people are trying. So I think eventually there will be some better understanding of these scaling laws.

Omar Shams: Yeah. Just to give maybe a more concrete example: Tim, I think Dettmers? I can't remember.

Steve Hsu: Dettmers.

Is he the quantization guy? Yeah, yeah, I had him on the podcast actually.

Omar Shams: Oh, nice, nice. Okay. So, just to review some of his work: I believe he showed, across many families of language models, that after about 7 billion parameters, these outliers emerge. There's a phase transition, and physicists are obsessed with phase transitions, right?

'Cause they happen all over physics. They happen in particle physics, they happen in condensed matter physics, and they even happen in galactic physics. So, you know, I heard an interesting rumor, this isn't Google, of course, that one of the other labs was not able to confirm his results, by the way.

Interesting. On quantization.

Steve Hsu: Interesting.

Omar Shams: Yeah.

Steve Hsu: Not able to replicate.

Omar Shams: Yes.

Steve Hsu: Interesting.

Omar Shams: So, who knows. But more things like that: come up with these quantitative measures, ideally not just based on benchmarks, right? Those are a bit too much like emergence, right? And then, like, could you predict those?

Is there, like, a stat mech for this? Is there a way to predict these?

Steve Hsu: Right.

Omar Shams: Or at least a good, relatively low parameter way of predicting it.

Steve Hsu: Yeah. One drum you'll always hear me beating on X is about open source models. Because someone like you, who has access to the whole Google Alphabet infrastructure, can, if you get interested in something, code it up and run experiments, you know, quite easily.

But for academics, the existence of these small open source models, like Qwen and DeepSeek, or distilled DeepSeek, is super important. When you look at the papers, someone has some theoretical idea, they do some theory, and then they do a bunch of calculations using open source models to verify that the behavior they theorize about is actually empirically observed.

And that would actually be impossible without these models, because sometimes they have to modify the models in some way; they need them to be really open source. So I actually think for this kind of theoretical investigation, it's really important that academics have access to open source models.

Omar Shams: Absolutely. Yeah. I've seen so much interesting work, especially recently, on RL, reinforcement learning, on these models, you know, like Qwen and so on. Yeah.

Steve Hsu: Okay. We're nearly an hour in, so what's a topic that I didn't ask you about that you'd like to opine on?

Omar Shams: Yeah. Maybe two: one we commented on briefly, and another that's one of those topics I keep going back to.

So one is: what is the role of physical intuition in AI?

Steve Hsu: Oh, great. Yes, we said we were gonna discuss this and I forgot.

Omar Shams: We did touch upon it though, yeah, renormalization groups and so on. So I think there's something to be said here. And before I go into the direct arguments, let me give some of the social proof, quote unquote, on this.

If people didn't know, a lot of the developments in the field have actually been made by ex-physicists. And you know, Steve and I have to, you know

Steve Hsu: we have to pimp

Omar Shams: physics.

Steve Hsu: Was it Ilya or was it Karpathy who recently tweeted out something like, theoretical physicists are like the stem cells, embryonic

Omar Shams: stem cells.

I've now seen them become everything. Yeah, yeah, yeah.

Steve Hsu: Who was that? It was Karpathy. Okay. Yeah. But it's totally true. I mean, historically it's true, right? You can build bombs, you can design microchips, you can solve problems in biology. Theoretical physicists have done all that stuff in the past, so it's not surprising they could make some contribution in AI.

But my question for you is: what is the special intuition or capability or advantage that theoretical physicists bring to this field in particular? And also, what are the blind spots? What are physicists weak at, where maybe the CS guys bring more strength?

Omar Shams: Yeah. Yeah. I think I can actually answer both. So, physics, unlike math: in physics, honestly, you get almost everything a mathematician gets. Mathematicians might not like to hear it, but it's true. And you also get this physical intuition, which some mathematicians have.

Like, I actually basically did a double major in math, with whatever coursework I had, of course. But I took a bunch of real analysis classes in undergrad, and there is something that feels like physical intuition with epsilon-delta proofs.

That's the closest I think mathematicians get. Sometimes geometers too, though there aren't that many geometers, in my opinion; a lot of mathematicians do algebra, and topology doesn't really give you physical intuition. But anyway, physical intuition: there's this movie that plays in your head, at least there's a movie that plays in my head, that's super satisfying when I do physics problems or read physics textbooks. That intuition carries over to AI research very directly, because loss curves are like an energy manifold, basically, where you have this ball rolling down the hill and you're trying to optimize over this manifold, right?

And there are so many analogies with information theory, like KL divergences, where it looks like, again, a literal Hamiltonian, right? You have partition functions; you have all these things in physics that are not only analogous but basically exactly the same.
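The analogy can be made concrete in a few lines of Python: a softmax over negative energies is literally a Boltzmann distribution, with the normalizer playing the role of the partition function, and KL divergence compares two such distributions. This is just a toy sketch; the energy values and temperatures are made up for illustration.

```python
import math

def boltzmann(energies, temperature=1.0):
    """Boltzmann distribution p_i = exp(-E_i/T) / Z, where Z is the partition function.

    Structurally identical to a softmax over -E_i/T."""
    weights = [math.exp(-e / temperature) for e in energies]
    z = sum(weights)  # the partition function: same role as a softmax normalizer
    return [w / z for w in weights]

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Same (made-up) energy levels at two different temperatures.
p = boltzmann([0.0, 1.0, 2.0], temperature=1.0)
q = boltzmann([0.0, 1.0, 2.0], temperature=2.0)
print(kl_divergence(p, q))  # positive: the two distributions differ
print(kl_divergence(p, p))  # 0.0: a distribution has zero divergence from itself
```

Raising the temperature flattens the distribution toward uniform, which is the same knob as the softmax temperature used when sampling from language models.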

Steve Hsu: Well, you know, after all, the word entropy does come from physics, right? Yes. So you open a theoretical AI paper, and the word entropy is gonna appear there at some point, right? So, yeah.

Omar Shams: Yeah. I think there are just so many places where something is happening, almost a physical process, where having a physical intuition for how it moves and unfolds is helpful.

I also think I'm familiar with this kind of math: path integrals, and, you know, I mentioned some of the other math, stat mech, partition functions, more continuous math. Dealing with continuous math, applied math, dealing with approximations, I think all of that is very helpful.

And that's why you have people like Jared Kaplan, whose string theory papers I was reading. And then, and I love to tell this story because it's so funny, I was foolishly reading another of Jared Kaplan's papers, in AI, and a year later I was like, wait, that's the same person.

So there's something there, I suppose. And in terms of the weaknesses, I experienced this firsthand, because I was really a physicist; I wasn't trained as a computer scientist. Some of the very algorithmic stuff: there are subtleties with some of these algorithms, with for loops, you put the bit here and there, and then, oh, actually you didn't account for the bit shifting this way.

And you know, you can learn that, but it's not your bread and butter. So that's definitely a weakness on some of the more traditional CS stuff.

Steve Hsu: Yeah.

Omar Shams: You're just not trained in it. Yeah,

Steve Hsu: But I agree with you completely. Discrete algorithmic stuff is not necessarily our strength.

But on the other hand, we're better at dealing with continuous systems. Mm-hmm. And I think when you get to enough parameters, and enough quasi-continuous values of those parameters, it becomes a continuous optimization problem, not a discrete optimization problem. So it shifts things a little more toward physics intuition.

Omar Shams: Yeah. Not to open another can of worms, but the tension between continuous and discrete math is super fascinating. I think it's a rich source of mathematics and physics. And in physics we come across it all the time: we have the wave-particle duality, and we learn later, oh, everything's really a quantum field, and so on.

But you have these manifestations where things are very discrete, like Ising models, you know, Ising systems, but there are ways of thinking about them, and limits, that become very continuous and so on. So I think it's a very, very rich tension. But physicists actually do deal with discrete phenomena.
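The discrete side of that tension fits in a few lines: a 1D Ising chain has a completely discrete state space (spins are just ±1), yet it is simulated with the same Boltzmann weights as any continuous system. A minimal sketch, with arbitrary coupling and temperature values chosen only for illustration:

```python
import math
import random

def ising_energy(spins, coupling=1.0):
    """Energy of a 1D Ising chain with periodic boundaries: E = -J * sum_i s_i * s_{i+1}."""
    n = len(spins)
    return -coupling * sum(spins[i] * spins[(i + 1) % n] for i in range(n))

def metropolis_step(spins, temperature, coupling=1.0):
    """One Metropolis update: flip a random spin, accepting with probability min(1, exp(-dE/T))."""
    n = len(spins)
    i = random.randrange(n)
    # The energy change from flipping spin i depends only on its two neighbors.
    delta = 2.0 * coupling * spins[i] * (spins[i - 1] + spins[(i + 1) % n])
    if delta <= 0 or random.random() < math.exp(-delta / temperature):
        spins[i] *= -1

# A fully aligned chain sits at the minimum energy, -J * n.
chain = [1, 1, 1, 1, 1, 1]
print(ising_energy(chain))  # -6.0
```

Coarse-graining chains like this (block-averaging spins and rescaling) is one standard route to the continuous limits mentioned above.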

Steve Hsu: Now not to put you on the spot here, but as advice to Zuck, should he throw some of these hundred million dollar packages at former theoretical physicists? What do you think?

Omar Shams: I think so, yes.

Steve Hsu: Can only help, right?

Omar Shams: Yes. Yes. I mean, there's a reason why at Anthropic, you know, so many people there are physicists. Even people like, I think, a mutual acquaintance or friend, John Schulman. I think he did an undergrad. He was an

Steve Hsu: undergrad physics major at Caltech.

Omar Shams: Yeah. Yeah. And was an undergrad physics major. There are tons and tons of physicists.

Steve Hsu: Yeah. The other thing about physics is that the math you learn is specific to the most useful, interesting problems humans have encountered. So it biases you toward not the math that's most interesting to pure mathematicians, but the math that's most directly related to the real-world systems people have gotten control over.

Right. And so it's the right curriculum for whatever other thing you're gonna do in your life.

Omar Shams: Yeah.

Steve Hsu: Great. Well, that's a wrap, I think, and I want to thank you again for being on the show. I'm sure there's some young person out there who was really stimulated by, or appreciated, your insights in this conversation.

What's the best way for people to get in touch with you?

Omar Shams: Yeah, so you can go to my website, omarshams.ai, and all my contact information is on there. You can follow me on X; my handle is Omar Shams backwards, right to left. And my email is on my website as well, so feel free to reach out.

Steve Hsu: Okay. We'll put some of that in the show notes.

Omar Shams: And thanks again for having me.

Steve Hsu: Yeah, we're good.

Creators and Guests

Stephen Hsu
Host
Steve Hsu is Professor of Theoretical Physics and of Computational Mathematics, Science, and Engineering at Michigan State University.
© Steve Hsu - All Rights Reserved