Jaan Tallinn: AI Risks, Investments, and AGI — #59
Steve Hsu: Welcome to Manifold. My guest today is Jaan Tallinn. He was a guest on the podcast a few years ago. I will put a link to that earlier interview, which I think is a really good one, in the show notes. Jaan was trained in theoretical physics. He was a founder of Skype, and I believe he might be the world's greatest AI investor.
I'm going to ask him a little bit about how he feels about that. But as far as I understand, he invested very early in a lot of the most important AI companies in the world, including, for example, DeepMind many years ago, before it was acquired by Google. We are going to talk about what we both see as a kind of super acceleration toward AGI, updating on GPT-4 and recent successes with large language models.
That's what we'll focus on in the conversation. Maybe at the end, if we have a little time, Jaan will tell me about his adventures in Tokyo, where I know he likes to travel. Jaan, how are you?
Jaan Tallinn: I'm very good. Thank you.
Steve Hsu: So let me just get this out of the way. Is it true that you're the world's greatest AI investor?
Jaan Tallinn: Oh, I don't think so. I think DeepMind and Anthropic are the two notable entries in my portfolio. That's about it when it comes to the first tier.
Steve Hsu: Okay. For some reason my impression was that you were very early in a lot of the most prominent AI companies, but certainly those two count for sure, Anthropic and DeepMind. So let's talk about the acceleration toward AGI. I think we were both saying before we started recording that rather than being decades away, we think it could be years away.
How do you feel about that?
Jaan Tallinn: Anxious. Because one thing that is pretty much guaranteed, even if AI progress stops now, is that the world will change, because we now have human-comparable minds that you can run on a USB stick. Which is a very, very weird situation.
Steve Hsu: Yeah. Just to define terms a little bit, what I think AI researchers think of as true AGI is still a couple of leaps ahead of where we are now. But as somebody who is the founder of a company that builds narrow AIs for enterprise, I can already see that we can build things that will seem to the average person like an AGI with existing technology. So if models didn't get any better, or only a little bit better, from now on, I think we could still build something which is a very good personal assistant that gets to know you over years living in your phone, and which will kind of seem like an AGI to most people, even if it's not truly an AGI under the hood. Do you agree with that statement?
Jaan Tallinn: I mean, I think the definitions are just doing a lot of work here. I do think that, as of last year, you can set up reasonable operationalizations of the Turing test that AI would be able to pass. So we're definitely hitting the low end of AGI definitions, and people are now just busy moving the goalposts.
Steve Hsu: Yeah, I agree with that. So I think we passed the Turing test and now everyone's kind of forgotten about the Turing test. I guess for me the module that is still missing is a kind of long-term planning, extended reasoning, and goal orientation component.
Jaan Tallinn: Yeah. That's exactly right. I do think people are quite often too hung up on, I don't know, volition and whatnot, but I think the actual thing that is missing is long-term planning.
Steve Hsu: Yeah, so in our startup we build a kind of attached external memory that we can plug into the LLM so it doesn't hallucinate. But also, as an external module, you could, I think, go a long way toward longer-term planning, longer-term thinking, and goal orientation. What I'd like to see, or what I think could happen, is that this gets incorporated into the neural net structure itself, but that may require a few more breakthroughs.
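To make the external-memory idea concrete, here is a minimal, hypothetical sketch of such a module wrapped around a generic LLM call. The names here (MemoryStore, llm_complete, the toy word-overlap retrieval) are illustrative assumptions, not the actual system Steve describes; a production version would use embeddings, a vector index, and a real planner.

```python
# Hypothetical sketch: an external memory module that supplies grounding facts
# and standing goals to an otherwise frozen LLM, instead of baking them into
# the network itself. "llm_complete" stands in for whatever model call is used.
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    facts: list[str] = field(default_factory=list)   # long-lived knowledge about the user/domain
    goals: list[str] = field(default_factory=list)   # persistent, longer-horizon objectives

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        # Toy relevance score: shared-word count. A real system would use embeddings.
        q = set(query.lower().split())
        return sorted(self.facts,
                      key=lambda doc: len(q & set(doc.lower().split())),
                      reverse=True)[:k]

def answer(query: str, memory: MemoryStore, llm_complete) -> str:
    context = memory.retrieve(query)
    prompt = (
        "Answer using ONLY the context below; say you don't know otherwise.\n"
        f"Context: {context}\n"
        f"Standing goals: {memory.goals}\n"
        f"Question: {query}"
    )
    return llm_complete(prompt)
```

The point is only that memory, goals, and retrieval live outside the weights, which keeps them inspectable in the way discussed next.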
Jaan Tallinn: Yeah. Also, in some ways it would be nice not to, because keeping it external makes the system more transparent. And humans, too, are happy to work while looking things up and Googling and whatnot. So from a safety perspective, I do think more modular systems are preferable to end-to-end black boxes.
Steve Hsu: If the goal orientation and long-term planning sit in an external module, which is more transparent to an ordinary human programmer, that seems much safer.
Jaan Tallinn: Exactly.
Steve Hsu: But you do sound pretty concerned, although maybe that's because it's the end of a long day for you. For me, and perhaps this isn't a good way to view it, I'm just so excited about the pace of progress of the technology. I guess maybe I'm overlooking the long-term dangers that you're more focused on. But are you close to calling for a moratorium or something in terms of further advancement?
Jaan Tallinn: I mean, I'm one of the co-founders of the Future of Life Institute, and the Future of Life Institute put out a six-month pause letter, which admittedly did not ask for a moratorium on AI. It just asked for a moratorium on frontier experiments. So yes, I do think that frontier experiments are completely reckless.
Steve Hsu: I see. Now, being an early investor in Anthropic, do you visit their headquarters and yell at Dario and tell him to stop? How does that work?
Jaan Tallinn: That I don't try, because it wouldn't work. But sure, my strategy in general when it comes to investing in AI companies, and people quite often ask me, if you're afraid of AI, why do you invest in AI companies, is basically to have a voice in the company and to displace investors who do not care. At the same time, the main constraint I try to satisfy is not to accelerate those companies. So I'm never a major investor. I'm always a small investor.
Steve Hsu: I see. Okay, stepping back from prescriptive policy suggestions, I'm just curious what your feeling is about how scaling is going to go in the next few years. Are we compute limited? Are we limited by architectural ideas or algorithms? Any thoughts on that kind of thing?
Jaan Tallinn: Yeah. Again, I'm not completely close to the metal, so I'm not an expert here, but from some medium distance there does seem to be some kind of trade-off between data and compute. One clear way to look at it is that the zero in AlphaZero meant zero data.
So you basically run fully on synthetic data, right? Now the question is how much compute you can convert into high-quality synthetic data, where the jury is still out. But I think it's fairly likely that synthetic data will be valuable.
One way of looking at it is that all data is just the result of some physical process in this universe. The language text on the internet was just the output of a bunch of human brains, which are physical systems. So in some ways there is no qualitative difference between synthetic data and human-generated data. There is just the question of which corners of the universe you are sampling, whether your data is orthogonal enough to cover the core of the domains you're interested in.
Steve Hsu: Yeah, so a few comments on this. I can plausibly see how they're going to get a 10x on compute. I can plausibly see how they're going to get a 10x through clever modifications of the architecture, different ways to do the attention heads and things like this. I suspect Anthropic is making some innovations there.
But many people say it will be tough to get more than about a 2x of natural data coming from humans. So then there's this question of, to what extent can you start using synthetic data, maybe synthetic data that is filtered or improved by some human review, and still get another 10x?
It seems plausible that you can, although there's a significant work factor there.
Jaan Tallinn: Yeah, I think the jury's still out there again, but I think it's very plausible, because ultimately there is no philosophical difference between natural and synthetic data. They're all just results of physical processes.
Steve Hsu: I don't think there's a philosophical difference, but I think there could be a problem that some weird idiosyncrasies could sneak into the AI mind universe if you keep, in a sense, recycling the outputs of other AIs back into the new AI.
Jaan Tallinn: Oh, sure. Synthetic data doesn't necessarily have to be derived from natural data. The data in AlphaZero was not derived from natural data, right? So you can also use things like code, et cetera, to get your synthetic data.
Steve Hsu: Yeah. And I also think the amount of very useful data generated from AI-human interactions is obviously going up very fast. In some of the customer support AIs that we're building, those things could have millions of conversations relatively quickly after full deployment.
And so that's a sort of qualitatively new source of data that I don't think has been used that much so far in training.
Jaan Tallinn: Yeah. And again, in all areas where you have some kind of ground-level check, where you can check, is it logically consistent, does the code run, does it satisfy some external constraint, you have a much easier time baking a signal into synthetic data.
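As a concrete illustration of that kind of ground-level check, here is a hedged sketch that keeps a model-generated code sample only if it actually runs and exits cleanly against its own test. generate_candidate is a placeholder for whatever model produces the samples, and a real pipeline would sandbox the execution far more carefully.

```python
# Sketch (assumptions, not a production pipeline): filter synthetic code data by
# an executable check, so only samples that run and pass their test are kept.
import os
import subprocess
import tempfile

def passes_check(code: str, test: str, timeout_s: int = 5) -> bool:
    """Run the candidate plus its test in a subprocess; success = exit code 0."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n\n" + test)
        path = f.name
    try:
        result = subprocess.run(["python", path], capture_output=True, timeout=timeout_s)
        return result.returncode == 0
    except subprocess.TimeoutExpired:
        return False
    finally:
        os.unlink(path)

def build_synthetic_set(generate_candidate, n_wanted: int) -> list[str]:
    kept = []
    while len(kept) < n_wanted:
        code, test = generate_candidate()   # model-generated (sample, test) pair
        if passes_check(code, test):        # the "does the code run" signal
            kept.append(code)
    return kept
```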
Steve Hsu: Another area where I think we will probably see some progress, though I'm not aware of progress yet, is a better way to integrate video and language together in a natural way, and to use that to train some kind of big transformer that acts across all of those spaces simultaneously.
I mean, maybe some of that is already in Sora and things that we've seen. But I think that could go further. I think there's room for quite a lot of innovation in that direction.
Jaan Tallinn: Yeah. Sora was definitely surprising to me, and I have very little idea how it actually works.
Steve Hsu: Yeah, my suspicion is that just as the language models are dealing with conceptual primitives extracted from looking at lots of human language, there are visual primitives as well. In fact, the fact that I can describe what I'm seeing to you means that our brains map the visual primitives back into the language primitives.
And so somehow I think what they've done is have the model act directly on these visual primitives and then string them together in a way that looks roughly consistent with physics and so on. Those are just words, but I have a feeling that's what's actually happening.
Jaan Tallinn: Okay.
Steve Hsu: Yeah.
Jaan Tallinn: Yeah. I don't know.
Steve Hsu: So what do you think is going to be the next sign of danger, some kind of real threatening development that we should be looking for?
Jaan Tallinn: Two things come to mind. One is that there's more and more effort going into developing evals. There have been a lot of criticisms of evals, that they are just a tool for safety washing and whatnot. On the other hand, I think it's still better to have them than not.
So one sign would be some strange evals starting to trigger, like "I am able to make copies of myself." The other sign is a general warning shot. I think the most plausible warning shot we might get is AI hacking its way out of the data center that it's being trained in.
I think one underappreciated area is what happens during pre-training. Unless the data is very carefully sanitized, which I'm pretty sure it's not, the model is constantly being provoked. There are definitely token sequences in the data that provoke the AI to spin up an agent that tries to hack systems, tries to make copies of itself, and whatnot.
And in some ways, as my friend Andrew says, our cyber infrastructure is much softer than our physical infrastructure. So there is this hope that once we get a breakout in cyber infrastructure, it will not spill over into physical infrastructure, but that's just hope.
Steve Hsu: To me, the idea that a human could use an AI to develop some very powerful hacking tools seems pretty imminent. But the idea that the AI, quote, understands that it's locked in a data center and would like to see more copies of itself in other data centers, that motive or goal doesn't seem like something we're close to creating within the AIs.
Do you agree with that?
Jaan Tallinn: Okay, two observations there. First of all, it's partly a capability question. I can already prompt some AI, ideally some base model, to do just that. It's just not capable of hacking out of the data center that it is being instantiated in.
So there's a capability question. And the other is a motive question. I do think it's really important to differentiate between the LLM as a sort of label for a configuration of graphics cards and the electrons running in them, and the agents that it instantiates while it's being prompted to do so.
So if you ask ChatGPT to produce a dialogue between two agents, it will try to be faithful to what those agents are thinking while producing the dialogue. From this you can create an agent that is motivated to accomplish real goals, not on the LLM level, but on the level of the simulation that the LLM is performing.
Steve Hsu: So let's suppose I have a language model, or just some general AI, that has been given access to lots of APIs. In our work we're getting to that point now: for some of the companies that use our AIs, the AI is able to send a shipping label to somebody or make a refund. We're not at the point, although it's contemplated, of booking airline reservations, things like this.
So let's imagine that this thing now has a pretty powerful API. In particular, it could copy a file. It could open a TCP/IP connection and transmit a bunch of data somewhere else, and maybe even hack the system on the other side and get it to execute the file that it sends. At that point, I can see somebody getting the AI to do those things.
But the persistent motivation for that AI or even the copied AI in the other place to want to continue that activity, how do you see that materializing?
Jaan Tallinn: Yeah, I'm totally speculating here. I'm just observing where most of the compute is currently being spent. Okay, let me phrase it differently. There are two frontiers when it comes to AI on this planet.
One is the implementation frontier, where people are figuring out where to plug in AIs as they do inference. I think this is definitely an area that we should keep an eye on. The silver lining is that this is a technological frontier; we are just looking at uses of technology.
So we have a bunch of experience regulating and constraining new technology. And then there's the other frontier, which is: what happens during pre-training? What happens during those months-long periods during which megawatts of electricity and terabytes of data are being poured into this black box?
And it just hums there in the cellar, or wherever the data center is. This is the other frontier. Now the question is, what guarantees do we currently have about this frontier? Because the knowledge about these data centers, the knowledge about these insecure systems that it's being trained on, is there in the AI and in the data.
One example that I keep giving is that I used GPT-4 to help me configure my own firewall. It just knows this stuff. It doesn't connect the dots yet, but during pre-training it has the information it needs to break out of the data center that it is being pre-trained in.
Steve Hsu: Yeah.
Jaan Tallinn: Again, by "it" I mean potentially some agents that it is simulating, not the LLM itself.
Steve Hsu: I agree. And then, carrying this forward, I guess that sub-agent could copy the big LLM to another data center and then also prompt the new version of itself to run.
Jaan Tallinn: I think this is also possible, but it's kind of inefficient. It would be better for this agent to do something else. The way I visualize it is that it starts treating the compute substrate it is being instantiated in as its environment.
So if it needs the LLM, its environment, to continue to exist, then yes, it needs to copy it. But there is probably a bunch of stuff it doesn't need, so it could just take a few graphics cards' worth of weights with it and replicate that, or something. I don't know, this is pure speculation, but what I'm constantly pointing out is that we do not have guarantees that something like that is not happening,
that it's not possible. Although another silver lining is that I think it's possible to get some guarantees by reasoning about things like loopiness: how much state is being evolved while pre-training happens, how many sequential steps a state can evolve through, for example.
And ideally, I'm really looking forward to some kind of guarantees from the operating system or the hardware itself. For example, one idea that I've been promoting comes from synthetic biology. Scientists there have an approach where they make a synthetic organism dependent on some chemical that doesn't appear in nature, right?
We could do the same thing in AI training by having secure enclaves in the compute, in the GPUs and TPUs. I don't know if the GPU has a secure enclave, but anyway, those secure enclaves should expect some kind of signed tokens.
And once this drip stops, they would just turn off the rest of the graphics card or TPU. I think that's totally implementable, and I think it's the responsible thing to do, but of course this is just a high-level sketch, not an actual [unclear].
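Here is a hedged sketch of the signed-token "drip" Jaan describes: a watchdog that keeps the accelerator alive only while fresh, correctly signed heartbeat tokens keep arriving, and halts it otherwise. Everything here, the key handling, receive_token, and halt_hardware, is an assumption for illustration; a real design would live inside a secure enclave, not ordinary software.

```python
# Illustrative sketch only: heartbeat tokens signed with a shared key keep the
# hardware enabled; if the drip stops or a signature fails, training is halted.
import hashlib
import hmac
import time

SHARED_KEY = b"demo-key-held-by-the-enclave"   # assumption: in reality kept inside the enclave
MAX_SILENCE_S = 60                              # how long we tolerate no valid token

def token_is_valid(counter: int, signature: bytes) -> bool:
    expected = hmac.new(SHARED_KEY, str(counter).encode(), hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)

def watchdog(receive_token, halt_hardware) -> None:
    last_ok = time.monotonic()
    last_counter = -1
    while True:
        msg = receive_token(timeout=1.0)        # returns (counter, signature) or None
        if msg is not None:
            counter, sig = msg
            # Monotonic counter prevents replaying an old token forever.
            if counter > last_counter and token_is_valid(counter, sig):
                last_counter, last_ok = counter, time.monotonic()
        if time.monotonic() - last_ok > MAX_SILENCE_S:
            halt_hardware()                     # the "turn off the rest of the card" step
            return
```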
Steve Hsu: Yeah, I like that idea. I like that idea. So,
Jaan Tallinn: That would also give us some kind of invariant that we can reason about. Currently we have very few invariants.
Steve Hsu: Would you say that the amount of resources invested in interpretability of models, to make them safer, just to understand what the threats are, is way too small compared to what's being spent to improve models?
Jaan Tallinn: On a first level, yes, I would say so. Obviously, everything is dwarfed by the tens, if not hundreds, of billions being thrown at making models more capable. On the other hand, one lesson I learned over the last decade or so is that so-called AI safety and AI alignment are a double-edged sword, because pretty much all advances in safety are immediately ported into AI capabilities.
Once you understand the AI better, you can also save on compute or make it more capable.
Steve Hsu: Yeah, I agree a million percent with what you just said, because the most insightful discussions I've seen of interpretability and the like have always come from researchers who are doing it to make the models better.
So it seems like, and I shouldn't say it this way, the smartest and most serious people trying to figure out model interpretability are doing it with the goal of making the models better and faster. And the safety people, while well motivated, aren't necessarily the ones having these insights.
So there's definitely an imbalance there. If you took all the top mathematicians in the world, or the theoretical physicists, and had them working on this problem, the end point might just be a 10x faster improvement of models from their results.
Right? So, um,
Jaan Tallinn: Yeah, so from a safety perspective, I'm now leaning towards asking what kind of constraints we can build into the systems, be it hardware constraints or some kind of provable constraints. For example, David Dalrymple is running a program at ARIA, the UK's advanced research agency, their DARPA equivalent, basically.
Steve Hsu: Oh, yeah.
Jaan Tallinn: And his program is about what he calls open agency architecture. I'm going to butcher the explanation, but the basic idea is that as AIs get more capable, you also use that capability to have the AIs produce proofs of what they are going to do, so that you can have simple proof checkers verify whether their plans are safe or not.
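To illustrate the general "simple checker for complex plans" pattern, here is a toy sketch: a capable black-box planner proposes a step-by-step trace, and a tiny, auditable checker verifies every intermediate state against an explicit safety invariant. This is a minimal rendering of the pattern under assumed names, not davidad's actual open agency architecture.

```python
# Toy sketch of the pattern: the plan may come from an arbitrarily capable
# planner, but it is accepted only if a small checker can verify that the
# claimed state trace never violates the safety invariant.
from typing import Callable

State = dict[str, float]

def check_plan(initial: State,
               steps: list[Callable[[State], State]],   # each step: claimed state transition
               invariant: Callable[[State], bool]) -> bool:
    state = initial
    if not invariant(state):
        return False
    for step in steps:
        state = step(state)
        if not invariant(state):   # any violation rejects the whole plan
            return False
    return True

# Example: accept only plans that never push "temp" to 100 or above.
plan = [lambda s: {**s, "temp": s["temp"] + 30},
        lambda s: {**s, "temp": s["temp"] - 10}]
print(check_plan({"temp": 20.0}, plan, lambda s: s["temp"] < 100))   # True
```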
Steve Hsu: Yes. When people ask me what the current situation with AI is, I often tell them, number one, I've never seen a field moving forward so fast, because you have this confluence of massive financial and economic resources, massive human capital, and real excitement. People maybe should be more concerned, the way you are, but most young technologists and scientists are just excited about being able to do cool things.
So you have this confluence of things, and then there's also geopolitical competition. I've never seen anything like it. Even the early nuclear programs weren't really moving that fast, because they were bottlenecked by the small number of people who understood the physics and could really contribute, whereas here a much larger set of people can contribute.
Jaan Tallinn: Yeah, I think it's similar to the nuclear field in the sense that there's now a kind of overhang of scientific results, and now engineers are implementing things and experimenting, et cetera. So yes, AI is building aliens, building digital aliens [unclear].
Steve Hsu: Yeah, exactly. That's how I would put it. So for somebody who is really concerned about safety, what's a plausible good outcome in the next few years in terms of policy measures, or governments agreeing on ways to slow this down a little or make it safer? If you and I get in a time machine and come out five years later, and it's the best universe you can envision, what happened during those five years?
Jaan Tallinn: Yeah. So I have this webpage, jaan.info, where there's a page, jaan.info/priorities, and there are six priorities currently listed there. One is data center certifications: frontier AI should be trained in data centers that have certifications about certain invariants, ideally things like hardware constraints and whatnot.
Second, speed limits. We should bake into AIs some kind of limits when it comes to inference speeds, because potentially AIs could be a million times faster than humans. Humans could effectively be plants compared to AIs.
Third is liability laws: figuring out, if something goes wrong, how we back-propagate the liability to the place where it's easiest to fix the problem.
Fourth is labeling requirements. This is the easiest one: just making sure that AI outputs cannot be confused with human outputs. Fifth is veto committees. This is a governance idea and a more complicated one.
And the sixth one is global off switches, basically what I discussed earlier: this drip of secure tokens, or some other mechanism that allows us to still wield some kind of control over the situation if things seem to be going really wrong.
Steve Hsu: Got it. Yeah, those are all good ideas. Let me come back for a moment to the nuclear arms race, which I have, maybe some people would say, an unhealthy fascination with. I grew up in the Cold War, and so did you. If you're a physics guy and you grew up in the Cold War, there's an unlimited rabbit hole you can go down thinking about it.
It seemed like really nothing stopped it. You throw in this component of geopolitical competition, not just economic competition but real hard-power geopolitical competition, and there you had governments expending whatever it took and taking some of their best minds and, in some cases, as with Landau, forcing them to work on the bombs, and there was really no curtailment of what they were trying to do. We were talking about putting bombs in space, orbiting the earth, and about cobalt bombs that could destroy all life on earth. And it didn't seem like anything really stopped it until the countries themselves decided it was in their own interest to negotiate a slowdown in the arms race.
And during the process of all this development, they were doing things that harmed a lot of people, like killing a bunch of Pacific Islanders with radiation, even killing Navy people when they were testing to see what the bomb would do to a Navy fleet.
They definitely took huge risks. By comparison, the AI threat is a little bit abstract, right? It's someone very thoughtful, like you, saying, yes, but just keep extrapolating and we get into this huge problem. That's much less visceral for a politician or an average person to understand than, oh, when we do nuclear tests we kill some people, or we make this huge place in Kazakhstan unlivable for decades.
Even that kind of thing didn't stop breakneck development on the nuclear side. So what's a plausible way to get this done on the AI side?
Jaan Tallinn: I don't think I have very easy answers. I do think the trend is positive. If you look at the world pre-GPT-4, or just pre-ChatGPT, and post, I think it's now much easier to find politicians who understand the problem and are trying to do something about it.
There are legislative initiatives here and there, and the executive order, and China has also put out some legislation, I understand. There are going to be more international efforts. I'm part of this UN AI high-level advisory body; we'll see what happens there. But again, I think this exists because of ChatGPT and GPT-4. A lot of positive changes are also downstream of GPT-4. So I continue to expect something like that: new AI releases will eat into our runway as a civilization, but they will also give us more resolve to do something about it.
Steve Hsu: I agree with you. I mean, every advance concretizes the threat even to people who are non-experts, right?
Jaan Tallinn: Right. And you can now take a computer and talk to it. In what way is this not science fiction?
Steve Hsu: Right, in what way is this not science fiction?
Jaan Tallinn: Sure. If I put on my engineering hat, it's also cool. But I also have my physicist hat, and I think that in the end the game is about what kinds of configurations of atoms are reachable from here, and whether they contain human-shaped objects.
Steve Hsu: You know, it occurs to me, a conversation that I've often had with other science fiction fans is that people have said Frank Herbert was in some ways by far the most insightful, because Dune is set ten or thirty thousand years in the future, where humans have clearly developed a lot of technology, yet humans are still the main drivers of the action, and that has to have an explanation. In his universe, the explanation is the Butlerian Jihad, where humans had a close brush with AI takeover and then passed these really draconian laws, and the statement is "thou shalt not make a machine in the image of the human mind," under penalty of death, right? In this Dune universe. I think if we hired Denis Villeneuve, the guy who directed Dune, to make a little short feature about the Butlerian Jihad, the close brush humans had with AI takeover and how they then enacted all these restrictions, maybe that would be a good way for average people to understand what's going on here.
Jaan Tallinn: Yeah, many thoughts on that. First of all, I think there are already many people thinking about how to visualize the threat, so I expect several movie projects to come out in the next few years.
Second, I think that ordinary people don't even need much convincing. At least some of the polls show that they seem to be very much on board with this: wait a minute, why are we doing this? I think the people most resistant to reasoning about AI are, I don't know what the exact common denominator is, but basically a subclass of intellectuals who just want to say smart things, and not saying crazy-sounding things is one heuristic they follow when they want to sound smart.
Steve Hsu: Yeah, I agree with you. It's so funny, because as I was waxing eloquent about Dune and the Butlerian Jihad, I realized that a common-sense approach to this whole thing, which you might hear on Reddit, is just somebody referencing Skynet from the first Terminator movie, which is obviously quite old, but they're not wrong.
Like, why do I have to make a Dune movie when I could just reference Skynet? Right, so.
Jaan Tallinn: The AI safety community used to be so frustrated that everything we said or did was accompanied by a picture of the Terminator, but the movie is actually not bad.
Steve Hsu: It's not.
Jaan Tallinn: Given the constraint that you always have to make an interesting movie, whereas a realistic AI disaster would not be interesting to humans, I think the Terminator movie, as well as many other movies, has done reasonably well.
Steve Hsu: Yeah. So now that I think about it, this is already baked into the culture, and plenty of average people actually get the point: do we really want to do this?
Jaan Tallinn: At least in the West, the situation is a bit different in the East.
Steve Hsu: Maybe you could talk about that. I know you like to visit Japan and Tokyo, for example. Do you have conversations there about AI risk?
Jaan Tallinn: Yeah. First of all, the Centre for the Future of Intelligence at Cambridge did a comparative study a few years ago about AI narratives in different cultures. I think their summary really was that in the West dystopian Terminator scenarios dominate AI discourse and the AI fiction universe, whereas in China and Japan the typical AI narrative is about robots becoming conscious, and once they become conscious, they are just like us, except they're not us. So you have these narrative problems of ethics, how you relate to robots, et cetera, et cetera, which is super unrealistic. It's a positive narrative, but it just doesn't make sense.
Steve Hsu: When I was a very young kid, my brother and I would get up early and run downstairs to watch a cartoon called Astro Boy. I don't know what its Japanese name is, but in America it was called Astro Boy. It's literally a retelling of the Pinocchio myth, but the little Pinocchio character is a robot.
So it's a machine that comes to life, but it's a lovable, loving, wonderful little boy robot. I think that's probably one of the most influential bits of anime ever made. So yeah, I see where that comes from.
Jaan Tallinn: I mean, it's also possible, but who enforces the constraint that these robots are not going to develop their successors?
Steve Hsu: Well, again, back to Frank Herbert: he had the insight to say, wait a minute, this is going to happen technologically, and then the humans have to react to it in a certain way. Otherwise, when I write science fiction, the protagonists are all going to be giant machines and such.
Right. So.
Jaan Tallinn: Not to mention Vernor Vinge, who just died, with his Zones of Thought.
Steve Hsu: Oh, I didn't realize that. Sorry.
Jaan Tallinn: Oh yeah, just a couple of weeks ago.
Steve Hsu: Yes.
Jaan Tallinn: Sorry. Really unfortunate.
Steve Hsu: So, all right. I promised I wouldn't keep you too long, because I know it's the end of a long day for you. So, final remarks: maybe you could just say what you are trying to do to keep us safe from these machine aliens that we keep trying to create in our midst, and what average people should do if they want to help.
Jaan Tallinn: It's a little bit hard for me to answer currently, because I'm in the midst of switching my strategy from just supporting AI safety research, which I have done for more than a decade, towards more direct action, which means finding groups of competent people and supporting them either to develop new approaches, like provable safety or other ways of using mathematics as constraints on AIs, or to try to advance legislation that would apply to these frontier AIs. And then, yeah, I'm part of this UN group, which is another thing that's more about direct action than just supporting groups or researchers from afar.
And when it comes to ordinary people, I think it's a few things. One is just reading up on this AI safety concern, which, depending on how you count, goes back to Alan Turing, but I think really picked up about a decade ago. You can apply pressure to politicians so that they educate themselves
and are more motivated to push through, or see through, legislation that would make things safer.
And then finally, I think it's still useful to actually interact with AIs, so you have some firsthand experience of what they are about. They're becoming less and less tool-like as they advance.
Steve Hsu: Got it. And do you continue to invest in AI companies, or do you find that not a good thing to be doing?
Jaan Tallinn: Yeah, I do, although in general I've delegated my investing away to a dedicated full-time team, so I don't spend that much time on it. But sure, whenever I see an opportunity that I think would make the world safer, I try to take it.
Steve Hsu: Well, Jaan, thanks a lot for your time. I want to keep this short and I hope we can talk again sometime.