Manifold | Transcript: Steve Hsu Q&A: Complex trait prediction in Genomics, and Genomic Prediction / Embryo Selection

February 3, 2022 • 70 Minutes

Steve answers questions about recent progress in AI/ML prediction of complex traits from DNA and applications in embryo selection.

Steve Hsu: Welcome to Manifold. Today, we're going to do something like an ask me anything episode.

But it's going to be based on questions that I receive typically through email. And typically from people who have read my blog or maybe seen video of an online talk that I've given.

I'm going to focus today on questions that have to do with genomics. The first topic is polygenic prediction of complex traits, which is an area of research that I work in and which has undergone very rapid advancement in the last few years.

The second is the company Genomic Prediction, of which I'm a founder, which has pushed forward the applications of this kind of polygenic prediction.

With me today is my friend Chase, who is something of an expert on both of these topics. He has been asked to be the audience ombudsman, and to ask me questions throughout this presentation or this episode.

And he's been asked for two types of questions. One is for the layperson, who really doesn't follow this area at all, just to try to keep me honest so that at least 80% or so of the content is comprehensible to the layperson. So if I just jump into something without defining terms, he's going to ask me to define the terms. He's going to keep me honest for the lay listener. And he's also going to inject a few questions that really are for the experts: someone who follows this area closely might need some clarification based on what I said, and he'll jump in and try to get that clarification. Are you with me today, Chase?

Chase: I am.

Steve Hsu: Okay, so we're excited to do this, and maybe in the future we'll do real live AMAs, but this is as close as we're going to get at the beginning. I've gotten so many questions on these topics over the last few years that I feel there's a fair amount of interest in these specific topics.

So, let me start with the status of polygenic prediction of complex traits. And I sometimes say this is the real decoding of the genome. Like in the early days, when they talked about genomics and how sequencing the human genome was going to lead to decoding of the genome or decoding of DNA, what they really meant at the end of the day was you give me the DNA of an organism and I tell you what that organism is like. I predict a person's eye color from their DNA sequence. I predict how tall they are. That's largely under genetic control. I predict whether they're high or low risk for heart disease, also largely under genetic control. I predict perhaps even what their IQ score is, their cognitive ability. That's also largely under genetic control.

So it's kind of amazing. This particular area of science has been mostly in the domain of science fiction for a long time, but in the last five to 10 years, there's been really tremendous progress in this area. And really in the past few years, since about 2017, I think there's been an explosion of papers published in this area. And I'll try to give the audience some idea of what has happened in this field of science.

Now I got into this about 10 years ago, and some of you may know that it's somewhat common, it's not uncommon, for people who are in math or theoretical physics to occasionally dabble in, or even invade some other subjects of science.

And of course there are many examples of this, even going back to the discovery of the structure of DNA. For example, Francis Crick and Max Delbrück and others were all trained in physics, but then they went into biology and, in a sense, created the whole field of molecular biology.

Now, 10 years ago, I was thinking a little bit about genomics because I had just read in the popular press, I think in the Economist, I had seen a graph that projected the cost curve for gene sequencing and it looked just like the Moore's law curve, where we were going to get exponential benefits over time. And it would become extremely cheap to read out people's genomes. And I could, just based on that curve, if I accepted the extrapolation of that curve, say 10 years into the future, I could rely on the existence of many, many, perhaps millions of human genomes that were available for things like machine learning and AI.

And so the very first thing that I got interested in was what would be the most efficient algorithms to try to predict, or learn to predict through the study of some training data, complex human trait values for an individual human based on their DNA. And so, you know, the first five or so years when I was working in this subject, only part-time, I wrote some papers with other collaborators in which we used real genetic data, but within simulation models to try to understand the performance of specific algorithms for machine learning, but specifically applied to genomes.

And in those papers, we identified something called a phase transition behavior. A phase transition in physics is when you change some external property of the environment, say the temperature, and that causes a dramatic qualitative change in some type of matter. For example, water. If you reduce its temperature below zero degrees Celsius it changes from a liquid to a solid, and that's called a phase transition. Similarly, you can have phase transitions in the behavior of mathematical algorithms. And so what we showed is that a certain well-known phase transition called the Donoho-Tanner phase transition, which was known for a certain set of learning algorithms, was actually observed when we used that class of learning algorithms on genomic data.

Although probably very few people who knew their way around human DNA understood what we said in that paper, it made us very confident that given enough data, we would pass through this phase transition and we would be able to build very accurate predictors of highly heritable complex traits. In particular, about 10 years ago, we predicted that if we had a few hundred thousand genomes and the heights of each of those individuals, we would be able to build a fairly accurate genomic predictor of human height. And lo and behold, in 2017, we got our hands on a dataset of roughly that size, and indeed within a month or two of getting hold of that data and using it at the Michigan State University supercomputing center, we were able to build a fairly accurate height predictor: a predictor which has a standard error of about three centimeters. And that predictor is a mathematical function of order 10,000 different locations in the genome.

So there are about 10,000 regions of the genome where I need to know what specific letter you have in that spot. And that is the information that is input into the algorithm. The algorithm then predicts your height, and the output could be: your predicted height is 182 centimeters, plus or minus a few centimeters.
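To make the idea of a predictor as "a mathematical function of order 10,000 locations" concrete, here is a minimal sketch. It assumes the predictor is a simple additive model: a baseline plus a weight for each copy of a particular letter at each location. That additive form, the function name `predict_trait`, and all of the toy numbers are assumptions for illustration, not the published predictor.

```python
def predict_trait(genotype, weights, baseline):
    """Additive predictor: baseline plus the sum of weight * allele count.

    genotype: list of 0/1/2 counts of the tracked letter at each location
    weights:  list of per-copy effects (hypothetical values, in cm)
    baseline: predicted value when every count is zero (hypothetical)
    """
    if len(genotype) != len(weights):
        raise ValueError("genotype and weights must have the same length")
    return baseline + sum(w * g for w, g in zip(weights, genotype))


# Toy example with 5 locations instead of roughly 10,000.
toy_weights = [0.5, -0.25, 0.125, 0.0, -0.5]  # hypothetical cm per copy
toy_genotype = [2, 0, 1, 1, 2]                # copies carried by one person
toy_baseline = 170.0                          # hypothetical baseline in cm

predicted_cm = predict_trait(toy_genotype, toy_weights, toy_baseline)
print(predicted_cm)
```

In a real predictor the weights would be learned from training data of genomes paired with measured heights; here they are simply made up.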

And when we published that paper in 2017, people were shocked at the time. People just didn't think what we had done was possible. Now, fast forward to 2022. A much larger study involving 5 million genomes, more than 10 times as many genomes as we had access to, has identified, at very high statistical significance, all of these 10,000 regions which affect individual height. And it has accounted for all of the heritability that was expected. Heritability is a technical term; you can estimate it from classical genetic methods, like looking at twins that have the same DNA but were adopted into different families. How alike are they, how different are they on the trait? That allows you to estimate the heritability of the trait.

All of the expected heritability from common genetic variations in height has now been accounted for in the study of 5 million people. And it's all very consistent with what we had found in 2017. So I would say that just to encapsulate it all, people were very surprised that, back in 2017, that with this amount of data, one could build a good height predictor. That result has now been verified in numerous papers that have been published since, and with 5 million heights available, the nail is kind of in the coffin in the sense that we now know with very high confidence where those roughly 10,000 places are in the genome that are controlling human height.

Chase: Wow.

Chase: I was just going to say that that's some pretty amazing progress. So did this paper that recently came out, did it add anything to what you had discovered in 2017, or was it more just adding statistical power to confirm the results you already found?

Steve Hsu: Yeah, that's a great question. So, you can phrase this multiple ways and I think I will, just to make it really clear. Our standard error was probably something like just under three centimeters or right about three centimeters. The standard error means that if we predict your height is 180 centimeters, roughly two-thirds of the people with that prediction are within plus or minus three centimeters of that central value.

Okay. With their results, they've reduced the standard error a little bit. Probably to more like two and a half centimeters, something like this. If you want to use this technical term called variance accounted for, our predictor got to about 40% variance accounted for, and their predictor has reached 50% variance accounted for.

For our predictor, the correlation between the predicted height and the actual height was about 0.65. For them, the correlation is more like 0.7. So they've improved the predictor quantitatively, although maybe not qualitatively, but it's definitely improved quantitatively. And now they can say with very high confidence where these 10,000 regions are in the genome that are controlling height. In our machine learning prediction, we could not be absolutely sure that all of the loci used by the predictor were absolutely correct because the predictor, although it's doing a good job, may have some inaccuracies in it. There might be some region that the predictor thinks is important for predicting the height, but it turns out that's due to some statistical fluctuation in our training data or something.
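The figures quoted here fit together if "variance accounted for" is read as the squared correlation between predicted and actual height, which is a common convention, though the conversation does not spell it out. A minimal sketch under that assumption (the function names are illustrative):

```python
import math


def variance_accounted_for(correlation):
    """Fraction of trait variance explained, taken as the squared correlation."""
    return correlation ** 2


def residual_sd_fraction(correlation):
    """Remaining spread around the prediction, as a fraction of the trait's SD."""
    return math.sqrt(1.0 - correlation ** 2)


for r in (0.65, 0.7):
    print(
        f"r = {r}: variance accounted for = {variance_accounted_for(r):.2f}, "
        f"residual SD fraction = {residual_sd_fraction(r):.2f}"
    )
```

A correlation of 0.65 squares to about 0.42 and 0.7 squares to 0.49, in line with the "about 40%" and "50%" mentioned above.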

Now, with more than 10 times as much data, they can make sure that they aren't susceptible to those kinds of statistical fluctuations.

So they just have a higher confidence determination of the genetic architecture.

Chase: Steve, I know that for many of these predictors, the datasets used to construct them often come from a highly European population. With these larger sample sizes, were they able to have any significant impact on predictive accuracy for non-European populations?

Steve Hsu: Well, yes, somewhat. Just to go back to the correlation metrics: their best prediction capability is for people of European ancestry, and the correlation I mentioned was about 0.7. I think that when they try it on more distant ancestry groups, a more typical correlation might be more like 0.4 or 0.5.

So it's still not as good. You've pointed to one of the biggest outstanding problems in this field, which is, in order to, for example, do a complex trait prediction say in Japanese, do you have to assemble a training cohort of Japanese, which is as large as what we've already done for say height in Europeans?

Or can you leverage what you already know from the European studies to accelerate, to get yourself to a strong predictor among Japanese, without having to assemble just as large a training cohort of Japanese? And the answer is probably yes, we think we can accelerate the process quite a bit. We can leverage what we already know from the European studies, but it's still not perfectly efficient. There still is a significant drop-off in prediction power when you try to use a predictor trained in Europeans on a more distant ancestry.

Now, I focused on height because it's a convenient phenotype for a number of reasons. It's super heritable. So two twins who have the same DNA, but are raised in different families, assuming they both had good nutrition, will tend to be within something like half an inch or an inch of each other in height. That would be the average difference in height.

And so it's obviously a super heritable trait, and furthermore it's a convenient trait to measure because it's typically in medical records or all kinds of information. You can even ask people how tall they are and you generally get a fairly accurate answer, even though, you know, men always inflate their height a little bit.

So it's a kind of phenotype that's super heritable and it's easy to get the data. And so that was the first thing we focused on in order to get the first big result of like, wow, we've pretty much captured the heritability in height. Now, of course, there are many more interesting phenotypes than height, and among them are many, many common but super impactful disease conditions.

So these are the main, for example, causes of mortality or loss of quality of life that people experience. They range from diabetes to hypertension, to heart disease, to various cancers. And they are to varying degrees heritable. So you can estimate their heritability in a number of ways. The simplest way is that, you know, conditional on your relative having disease X, how much does that elevate your chances of having disease X? And from statistics like that, we can estimate heritability for disease risks.

Chase: Steve, I'm curious. Can you maybe give an example of a disease that's highly heritable and then one that is perhaps not as heritable.

Steve Hsu: Yeah. Great question. So there is a type of diabetes called Type 1 diabetes that manifests very early in life. It manifests often when people are still children or adolescents. It's a very serious kind of diabetes. And that is super heritable. I think the heritability of that is something like 80% or something.

So 80% of the variance in Type 1 diabetes risk is genetic in nature. And we do, just as a side remark, now have very strong Type 1 diabetes risk predictors. And interestingly, they depend on the order of a hundred different locations in the genome. So the genetic architecture is much simpler than height, which depends on more like 10,000 regions of the genome.

So Type 1 diabetes is an example of a super heritable, common disease risk. Much less heritable would be, well, even just Type 2 diabetes. That's diabetes that happens to you much later in life. It is much more sensitive to environmental conditions. So, you know, whether you exercise and how you eat, I think, has much more impact on Type 2 diabetes.

However, it is still substantially heritable. For most diseases, the risk is typically somewhere between 30 and 60% heritable. So less heritable than height, but still heritable enough. I guess the practical way I would describe it is that for almost all common diseases now, we can identify outliers. So we can identify people, based on their polygenic score, who are either unusually low risk or unusually high risk. Those are the outliers. Or they're kind of in the middle, typical of the normal population in risk. Those are very crude buckets of people, but just from their DNA we can now figure out which bucket an individual is in.
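The "crude buckets" idea can be sketched as a percentile lookup against a reference population of scores. The cutoffs below (bottom 3% and top 3%) and the function name `risk_bucket` are hypothetical choices for illustration; the conversation does not give specific thresholds.

```python
def risk_bucket(score, population_scores, low_pct=0.03, high_pct=0.97):
    """Place a polygenic score into a crude bucket by its percentile rank.

    low_pct and high_pct are hypothetical cutoffs, not values from any study.
    """
    n = len(population_scores)
    if n == 0:
        raise ValueError("population_scores must not be empty")
    # Fraction of the reference population scoring at or below this person.
    rank = sum(1 for s in population_scores if s <= score) / n
    if rank <= low_pct:
        return "unusually low risk"
    if rank > high_pct:
        return "unusually high risk"
    return "typical risk"


# Toy reference population: scores 1 through 100.
reference = list(range(1, 101))
for person_score in (1, 50, 99):
    print(person_score, risk_bucket(person_score, reference))
```

Only the two tails are treated as informative here; everyone in between lands in the same "typical" bucket, which mirrors the point that the predictor mainly picks out outliers.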

Chase: You published a paper about breast cancer in which you looked at unusually high-risk patients and kind of examined the cost-effectiveness of screening for them and how that might improve treatment outcomes. Can you talk a bit about it?

Steve Hsu: That is one of my favorite examples because it is so clear-cut. So almost everybody is probably familiar with the idea that there are specific genetic mutations called BRCA mutations, which predispose women for breast cancer. Okay. And so, for example, you may remember Angelina Jolie when she discovered that she was, because of this kind of mutation, high risk for breast cancer, she had a radical mastectomy.

And so people are already kind of aware that there can be a genetic basis for high breast cancer risk. However, if you, once you look into the details, the most widely understood aspect of it is specific single-gene mutations that are not polygenic in nature. They're just a single mutation. Once we identified that it's in the BRCA gene, we might know, oh my gosh, people who have this, they have many times the normal risk for breast cancer and they should be on the lookout. They should be treated differently. They should get mammograms much earlier in life than ordinary risk women, et cetera.

Now you can ask what fraction of the female population are carriers of these risk-enhancing BRCA genetic mutations. And the answer is, depending on which ancestry group you're talking about, one in a thousand women, or maybe a few per thousand women. So it's a very small subset of the population.

In some sense, you could say Angelina Jolie, in the BRCA sense, was quite unlucky because she got a very impactful negative BRCA mutation. So it's a tiny portion of the female population, but it gets a lot of attention because it's a very dramatic situation and it's now been understood very well in a medical sense. And there's a standard way to deal with women who are diagnosed based on genotyping as BRCA carriers. So that's all very established stuff. That's, you know, now decades-old medical science.

The new advance, which we wrote about, is that you can ask, well, let's suppose I set aside these BRCA mutations. Let's look at the 99.9% or 99.7% of the female population who do not have any of these known BRCA mutations. And let's see if we can predict their breast cancer risk using common polygenic variants. So basically much more common mutations that are present in the population at, say, the 1%, 5%, 10% level. But we may have to add up the impact of thousands of these individual common genetic variants in order to estimate the total breast cancer risk. And of course that can only be done through machine learning. It can't be done by a human brain. It's gotta be done by machine learning on many tens of thousands of women who have had breast cancer and for whom we have their genomes. And then some controls, maybe hundreds of thousands of controls, who are women who did not have breast cancer in their lives and for whom we have their genomes. And the AI or the machine learning algorithm is looking at all this data to try to build a predictor for polygenic breast cancer risk.

And so now that's been successfully done and it's been published by multiple groups and that predictor is quite powerful and can also identify outliers for, say positive risk outliers, so women who are high risk for breast cancer. And interestingly, you can ask the following question, how big is the set of women who are high risk, say of equivalent risk to a BRCA carrier for breast cancer, but they don't have the BRCA variant. They're getting the risk from their polygenic distribution of genes. And it turns out that high-risk population of women is about 10 times, about an order of magnitude, larger than the BRCA carrier population. So it's more like a percent or a few percent of the female population that we can identify.

So there's this much bigger population which really should be treated the same way as the BRCA carriers. But because the technology that is used to identify them is so new. It's only a few years old now. Not even a few years, just maybe two years old. It still hasn't been incorporated into medical science.

Chase: Steve, can you maybe talk a bit about the absolute magnitudes of risk we're talking about here? What is the risk of breast cancer for an average woman, and then what is it for a woman who is unfortunate enough to have one of these BRCA mutations?

Steve Hsu: So, I hope I don't get this wrong, but I think that for a woman who, let's suppose she's negative for the BRCA variants, and we don't know anything else about her. So she's just random in the population-- other than we know she's not a BRCA carrier. Then I think her lifetime risk, I might get this wrong, but I think it's something like 10% lifetime risk. And they would tend to get it much later in life. Someone who has one of the more impactful BRCA variants could be over 50%. And they would tend to get it much younger in life. So you're talking.

Chase: Wow. So like five times higher risk.

Steve Hsu: Yeah, exactly. And similarly we can identify from polygenics. So, now take that population, which in aggregate maybe has ballpark kind of 10% lifetime risk for breast cancer. But now look for the ones in that population that have a high polygenic score, according to the algorithm. And those women can have 50% risk as well. This is a relatively recent result, but I think they can also have on average earlier onset.

So, it's an incredibly powerful thing. And I think, I didn't answer this part of your question earlier, which is what's the cost-benefit on all this? We did this very simple calculation where we just said, well, suppose that eventually, medical science figures out that this technology is, as I described it, it works as I just described in the past few minutes. And they say, well, wait a minute, why don't we just genotype all women and that will help us identify this larger group of women who are high risk for breast cancer. And then we can treat them the way we treat the BRCA carriers. We can give them early diagnosis, early monitoring, and we save a lot of money by doing that because you can treat these things, these cancers if you catch them early.

So what's the cost-benefit on that? And what we estimated is that the money you save by catching these non-BRCA, but nevertheless high polygenic risk women early, you can save as-- and this is using estimates that are not from us, but are in the published medical literature-- you can save enough money to pay for the genotyping of all the women in your society.
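The logic of that estimate can be written as a simple break-even check: genotyping everyone pays for itself if the expected savings per woman genotyped are at least the cost of one genotype. The sketch below only illustrates the arithmetic. The function names are made up, and every number passed in is a hypothetical placeholder rather than a figure from the paper or from the medical literature.

```python
def screening_pays_for_itself(high_risk_fraction,
                              savings_per_high_risk_woman,
                              cost_per_genotype):
    """True if expected savings per woman genotyped cover the genotyping cost."""
    expected_savings_per_woman = high_risk_fraction * savings_per_high_risk_woman
    return expected_savings_per_woman >= cost_per_genotype


def break_even_savings(high_risk_fraction, cost_per_genotype):
    """Savings per identified high-risk woman needed to cover genotyping everyone."""
    if high_risk_fraction <= 0:
        raise ValueError("high_risk_fraction must be positive")
    return cost_per_genotype / high_risk_fraction


# Hypothetical placeholders: 2% of women flagged as high risk, 50 currency
# units per genotype. Neither number comes from the conversation.
needed = break_even_savings(high_risk_fraction=0.02, cost_per_genotype=50.0)
print(f"Break-even savings per high-risk woman: {needed:.0f}")
print(screening_pays_for_itself(0.02, 3000.0, 50.0))
```

The smaller the flagged fraction, the more each early detection has to save for the program to break even, which is why a high-risk group that is about ten times larger than the BRCA carrier group matters for the economics.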

So it's a very favorable way to modify medical practice. And I think eventually you will see it becomes standard, standard of care, to genotype all women in order to figure out who are the ones at high risk. Not just for breast cancer, but of course there are many other diseases too, and then save a bunch of money by using that risk information to give them better treatment and diagnosis.

Chase: So, if we were to recommend this type of testing for, you know, all women or perhaps all Americans or all people, would you have to go back in to be re-genotyped each time you want to look at a new disease, or is it kind of a one and done?

Steve Hsu: It's one and done. That's the amazing part of this. If you did this and the breast cancer savings paid for that genotyping, then the additional information you get, because you're going to read out the whole genome now, could create, in the accounting sense, just pure profit for all the other diseases that you gain some advantage on by knowing the risks for those things.

And a similar story obviously applies for men, where you talk about testicular cancer and prostate cancer. So I think a serious public health analysis of the cost-benefit of basically universal genotyping would come back with a strongly positive number.

I should say that in the UK, the NHS is looking seriously into this. The NHS, as you know, is a national health care system. It's a single-payer system, so they have the proper incentive to want to invest money maybe early in life to save money in the system as a whole, because it's one unitary system, unlike our crazy patchwork of insurance companies and things like this.

So the NHS is actually studying what will be the cost-benefit impacts and treatment impacts of early genotyping of large chunks of the population. I think the first target population they're going to genotype is something like 5 million Brits.

Chase: Wow. Steve, if listeners are interested in this idea, perhaps simply for their own health, is this something that you can go and get done today and look at your predisposition to some of these diseases?

Steve Hsu: You know, the data sets that are used to build all these predictors so far are just using inexpensive gene array genotyping, the same kind of genotyping that is used by 23andMe or Ancestry. So consequently, if you go and get your Ancestry or 23andMe genotype, then based on that information, if you had the raw genotype, you could compute-- or someone, a bioinformaticist, could help you compute-- all your risk scores, and you could utilize all these predictors that we're describing. And I think there are a few companies that are springing up to do this for, you know, adults: not embryos, but people who are already living. It's a little bit of a wild west right now. There's no company that I know well enough that I would endorse them.
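As a rough illustration of what "computing a risk score from the raw genotype" could involve, the sketch below assumes the raw data is a tab-separated text export with `#` comment lines and four columns (variant ID, chromosome, position, two-letter genotype), similar to what consumer genotyping services provide. The variant IDs, effect alleles, and weights are entirely made up, and real pipelines also have to handle strand orientation, missing calls, and imputation, which this ignores.

```python
def count_effect_alleles(raw_lines, effect_alleles):
    """Return {variant_id: copies of the effect allele} for variants in the file.

    raw_lines:      iterable of text lines from a raw genotype export (assumed layout)
    effect_alleles: {variant_id: single-letter allele the weight refers to}
    """
    counts = {}
    for line in raw_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip malformed rows
        variant_id, genotype = fields[0], fields[3]
        if variant_id in effect_alleles:
            counts[variant_id] = genotype.count(effect_alleles[variant_id])
    return counts


def risk_score(counts, weights):
    """Weighted sum of effect-allele counts over the variants that were found."""
    return sum(weights[variant_id] * n for variant_id, n in counts.items())


# Hypothetical toy data; the IDs and weights are not real variants or effects.
toy_raw = [
    "# rsid\tchromosome\tposition\tgenotype",
    "rs_example_1\t1\t1000\tAG",
    "rs_example_2\t2\t2000\tCC",
    "rs_example_3\t3\t3000\tTT",
]
toy_effect_alleles = {"rs_example_1": "A", "rs_example_2": "C"}
toy_weights = {"rs_example_1": 0.5, "rs_example_2": 0.25}

toy_counts = count_effect_alleles(toy_raw, toy_effect_alleles)
toy_score = risk_score(toy_counts, toy_weights)
print(toy_counts, toy_score)
```

A raw score like this only becomes meaningful once it is compared against the distribution of scores in a reference population, which is part of the validation work described next.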

In a moment we're going to talk about embryo selection. And in that case, I know exactly what the Genomic Prediction bioinformatics team does and how they validate predictors that are in the published scientific literature and decide they're good enough to use in the embryo context. Something similar will have to be done in the adult health system context, something that the NHS, for example, is going to do in the UK, or is in the process of doing. So it will all mature, but there's this old saying, which is really a terrible saying, in biomedical research: a breakthrough in the lab takes a decade before it makes it into your GP's office or your surgeon's clinic or something like that. Unfortunately things don't move as fast as we would like them to move.

Chase: That's too bad. It's a good thing the UK is assembling all those data. It sounds like many of these predictors are actually trained on their data set.

Steve Hsu: Yeah. The UK is a real pioneer in all this. They have large-scale projects like the UK Biobank and this NHS project that I just described to you.

Let me turn to another topic, which I get a lot of email about, which is genomic prediction of cognitive ability. So obviously people are very concerned about this. It's kind of a hot button issue. It is true that if you look at two identical twins raised in different families, they tend to have very similar IQs.

And in some ways, if you just look at the heritability or the variance accounted for, it's very similar to height in terms of how similar the twins are, even though they were raised in different families, and subsequent studies have confirmed that cognitive ability is highly heritable. So that raises the question: well, can you predict it? And so in these early papers that we wrote 10 years ago, we considered what would happen if you could get about a million individuals where you had their genotype, say effectively their 23andMe- or Ancestry-level array genotype, what's called a SNP genotype.

If you had that data, and then you had a good cognitive score for each individual, like you have their military IQ test, or you had their SAT scores or something like this. From that data, we predicted 10 years ago that one would be able to build a reasonably accurate cognitive predictor. And by reasonably accurate, we meant something like the standard error might be like 10 points of IQ.

So you wouldn't necessarily be able to tell someone who's 110 from 100, but you would be able to tell someone who's 150 from 100 with some confidence.
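The 110-versus-100 and 150-versus-100 comparison can be made quantitative with a simple model. The sketch below assumes the true score is normally distributed around the predicted score with a standard deviation equal to the 10-point standard error; that normality assumption and the function names are illustrative choices, not something stated in the conversation.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_true_score_above(predicted, threshold, standard_error):
    """P(true score > threshold), assuming normal error around the prediction."""
    z = (predicted - threshold) / standard_error
    return normal_cdf(z)


for predicted_iq in (110, 150):
    p = prob_true_score_above(predicted_iq, threshold=100, standard_error=10)
    print(f"predicted {predicted_iq}: P(true score > 100) = {p:.7f}")
```

Under these assumptions, a prediction of 110 sits one standard error above 100, leaving roughly a one-in-six chance that the true score is actually below 100, while a prediction of 150 sits five standard errors above, where that chance is negligible.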

And, you know, we made that prediction at the same time, in the same set of papers, where we made the height prediction. And I get emails, fairly frequently, from people saying, well, you guys solved height, why have we not solved cognitive ability? And it's a decent question. I'm sorry to say... Well, I'm not sorry to say, but the answer is simply that we don't have a data set in which we have well-measured cognitive scores for a million genotyped people. If we did, I'm still very confident, and even more confident because our predictions were spot on for height, that we would be able to build a good cognitive predictor. We just don't have a data set of that type.

What we do have is some smaller data sets where we maybe have a hundred thousand people with cognitive scores. Or, alternatively, we have much larger data sets where you have millions of people and we have something which is correlated to their cognitive ability, but it's not actually their cognitive ability. So we might have number of years of education – how highly educated is the person.

From that kind of data, you can still build a cognitive predictor. It's not as good as what we expect to be able to do given much better data, but we can build a predictor which correlates with the actual cognitive score, with a correlation of about 0.4. So it's not nothing, but it's not super accurate prediction. It can identify people who are at unusual risk of having, say, a very low IQ. It could also identify individuals, embryos for example, that have an unusually high probability of having a super high IQ. So it can again pick out outliers. But it can't do the kind of detailed prediction that we can currently do with height.

Chase: Can you give me a sense of the difference between today's predictors and where we could get to if we had perhaps a million well-phenotyped genomes belonging to people who have taken the SAT or the military IQ tests? Where are we at today and where could we get to if we had that data set?

Steve Hsu: So I think the current best predictors have a correlation of about 0.4 between the predicted cognitive ability and the actual measured cognitive ability. And with say a million genotypes, I would predict you could get somewhere up to some correlation, like 0.6 or 0.7. So, significantly stronger. And maybe with a standard error, as small as 10 IQ points.

Chase: I think some of our listeners will probably be fairly familiar with the concept of embryo selection, where you pick among a number of embryos for some weighted combination of traits, often disease risk. If you were to say select among 10 embryos today, for an IQ score, how much of a gain would you expect to see?

Steve Hsu: Before I answer your question, Chase, let me just emphasize that most of my research in recent years has been focused on health risks in polygenic scores, not on cognitive ability. And the company Genomic Prediction does not provide IQ related or cognitive ability related scores in its embryo report. We focus entirely on improving health risks.

Now, to answer your question, this has been studied quite a bit in the literature. I think the earliest paper I've seen was by the Oxford philosopher Nick Bostrom and his collaborator Carl Shulman, who is an effective altruist. Maybe that's the best way to describe him. Or at least rationalist.

Then there was some followup work by this mysterious internet genius called Gwern who did even more extensive calculations. In the published literature, scientific literature, there's a paper in which the senior author is Shai Carmi, who's a professor of statistical genetics at Hebrew University in Israel.

And then even more recently, James Lee, who is a behavioral geneticist at the University of Minnesota and the guest on the last episode I recorded, also repeated these calculations for his Wall Street Journal editorial on embryo selection.

And the results are pretty consistent. So roughly speaking, if you're selecting best of 10 embryos with the current level of cognitive ability prediction, you might get three or four IQ points per generation.

If you reach this higher level of say a predictor that correlates perhaps 0.6 with actual phenotype, you might get something somewhat higher. I would guess maybe five to seven IQ points per generation. Of course, at that point, it's getting to be pretty significant, and in just a couple of generations of that kind of selection, you start to produce a population whose overlap with the existing populations of humans, in some sense, starts to become relatively small.

Now you mentioned embryo selection. So I guess let me shift gears, because the other topic that we were going to discuss is the company Genomic Prediction, which among other things allows parents who are going through IVF to do embryo selection.

So let me give a little bit of an update on that company. So, Genomic Prediction was founded, I guess, about four years ago. It's gone through several rounds of venture capital investment. It's a pretty successful company right now. It works with hundreds of IVF clinics on six continents.

So no matter where you are, you can find a clinic that works with Genomic Prediction. And I think it's fair to say, the company would say, and I think it's fair to say, that it is the world leader in advanced genetic testing of embryos.

Let me say a little bit about IVF for people who are not experts on this, or have not been through it themselves or know somebody who's been through it.

So IVF means in vitro fertilization. It's a procedure that typically is contemplated by families, parents that are having fertility problems. So usually the reason for that is that the mother is older and maybe her ovarian reserve is impacted, or, you know, for whatever reason, they're having trouble conceiving.

The way the process works in IVF is that the mother is given some hormone treatment, which causes her to overproduce eggs in her reproductive cycle. And then, using a very simple procedure, those eggs are extracted. Younger women who go through this stimulation could produce 10, 15, even more embryos per cycle. Older women, it really depends exactly on the situation, but might produce three or five or something. Much, much fewer. Now once you have those eggs, the eggs are fertilized, they become embryos, and each embryo is allowed to grow to a certain size. It's become standard to freeze the embryos in liquid nitrogen.

And that is found not to harm them. The thawed embryos, after they're taken out of liquid nitrogen, work fine. And I think one of the main reasons originally to freeze them, or the reason it's turned out to be good to freeze them, is that it gives the mother's body some time to recover from the hormone stimulation.

And it improves the success rates if they wait, say at least a month or something after the first harvesting of the eggs before they actually do the transfer of the embryos back into the mother.

So because there's that freezing process, and because there's typically, you know, a more than a month wait, it has become quite common now for a small biopsy of a few cells to be taken from the embryo. The embryo is typically, let's say, 50 to a hundred cells at the point where it's frozen; it's already been allowed to grow to that point. A small biopsy of cells is taken, actually not from the part of the embryo which is going to become the child, but from the part of the embryo that's going to become the placenta but has the same DNA as the child. Those cells can be biopsied, as far as we can tell, without harming the embryo. And it has become quite common now to do some level of genotyping using the DNA, which can be extracted from those few cells. So that's now a common practice. So in the United States, well over 60% of all families going through IVF will do some kind of genetic testing using this process that I just described.

And of course, during the period of time where they're waiting for the mother's body to recover, and they've got the embryo frozen, they can look at the results of the genetic testing. They have plenty of time to make a decision about which embryo they want to use. And so this is what we call the embryo selection problem: most couples going through IVF face this problem of selecting one of a number of embryos that they are going to transfer. And now that embryo selection problem can be approached in a much more sophisticated manner because you have the actual genotype of each of the embryos. So it's a totally different situation than prevailed just a few years ago. Now people know about Genomic Prediction mainly because we're the only company in the world right now that can give you a full-blown, whole-genome genotype of your embryo, from which one can compute all of these disease risks and polygenic trait predictors.

I wanted to say a word about another aspect of the genotyping, which is less well known, but in some sense at this moment is maybe by far the more impactful, and that's something called pre-implantation genetic testing, PGT, for aneuploidy. So it's also known as PGT-A, pre-implantation genetic testing for aneuploidy.

And that is by far the most common type of genetic testing that families go through. And aneuploidy just means abnormal chromosome structure. The most familiar kind is something called trisomy 21, where you have an extra copy of chromosome 21, and that leads to Down syndrome. But there are many other ways in which there can be a problem with the chromosomes.

And it turns out one of the major causes for an embryo not to successfully implant or to not have a successful pregnancy is a problem with the chromosome structure. So it's very common now for embryos to be tested, to undergo PGT-A. And the number of embryos that go through PGT-A worldwide is already in the millions per year. As I said, well over 60% of all families going through IVF in the United States will do PGT-A.

So now you might think that a company that can do a full-blown whole-genome genotype of the embryo might be able to do PGT-A more accurately than older methods, and that is indeed the case.

So one of the things that I think is most impactful about what Genomic Prediction has developed is a more accurate PGT-A screen. And there was a recent study, which was done in a clinic near Seattle. It's in Kirkland, Washington, and it's called Poma. This clinic has a very scientifically minded laboratory director named Dr. Klaus Wiemer. And he decided, without telling us, without telling Genomic Prediction, that he was going to take 3000 embryos resulting from families going through that clinic, and for the PGT-A screening, he allocated the embryos effectively randomly between our lab, which uses the Genomic Prediction PGT-A screen, and other entities, which use the older technology.

And these results were just announced at the annual American Society for Reproductive Medicine meeting, which happens every fall. So this was the 2021 ASRM meeting. Wiemer gave his talk and presented his results, and what he found, amazingly, was that the success rate for families that used our PGT-A screen, as opposed to the old technology, was something like 73% probability of successful pregnancy per transfer, versus a number more like 50 to 60% using the older technology.

And the reason for that is that we are much more accurate in determining whether there is actually aneuploidy in a particular embryo. So we have fewer false positives and fewer false negatives. A false positive is where the test suggests that there is a problem with the embryo, even though there isn't. And so you end up electing not to use a healthy embryo, which decreases your success rate. Or it could be a false negative. A false negative means that your test says there is not a chromosome problem with this embryo, but in fact there is, and then when you transfer it, it fails to implant.

So it turns out our test, according to his data, has a significantly lower false-positive rate and a lower false-negative rate, and consequently a much higher success rate per transfer.
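
The two error types just described can be made concrete with a small counting sketch. Everything here is illustrative: the function name `screen_error_rates` and the six toy embryos are invented for the example and are not data from the Poma study.

```python
def screen_error_rates(truth, calls):
    """Compare a screen's calls against the true status of each embryo.

    truth, calls: equal-length lists of booleans, True = aneuploid
    (a chromosome problem), False = chromosomally normal.
    """
    # False positive: the test flags a problem in a healthy embryo.
    false_pos = sum(1 for t, c in zip(truth, calls) if c and not t)
    # False negative: the test misses a real chromosome problem.
    false_neg = sum(1 for t, c in zip(truth, calls) if t and not c)
    healthy = sum(1 for t in truth if not t)
    affected = sum(1 for t in truth if t)
    return {
        "false_positive_rate": false_pos / healthy if healthy else 0.0,
        "false_negative_rate": false_neg / affected if affected else 0.0,
    }


# Made-up toy data: four healthy embryos and two aneuploid ones.
truth = [False, False, False, False, True, True]
calls = [False, True, False, False, True, False]
print(screen_error_rates(truth, calls))
```

In this toy batch one healthy embryo is wrongly discarded and one aneuploid embryo is wrongly cleared for transfer, which are the two ways a less accurate screen lowers the success rate per transfer.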

Chase: Just to clarify for the audience, success here is defined as a pregnancy that results in live birth?

Steve Hsu: Yes, I think that's right. So I should double-check and look at the paper for exactly how he defined a successful pregnancy. It certainly means that the pregnancy has proceeded past a certain point. I can't remember whether that's exactly the live birth number or it's something short of that. It could be just slightly short of that.

But nevertheless, the delta between our number, which was I think 73%, and the other numbers, which were in the fifties and sixties, is pretty significant. It's significant at like the 95% confidence level. But yes, I think if most families knew that they could get to a 73% success rate per transfer, I think it's called clinical pregnancy success rate, they would really want that. Because normally I think people are used to thinking of IVF as being much more risky than that, with a much lower success rate. And this is quite a good clinic. Obviously, it's a very scientifically minded clinic, Poma.

So the reason I mentioned all that is that, you know, this is one of these things where we didn't set out to build a better PGT-A screen. We just assumed PGT-A was a kind of solved problem. We couldn't imagine it was actually difficult to figure out which embryos have a chromosome abnormality and which ones don't. And so we set out to do something completely different, which was to get this whole genotype, which would then allow us to compute all these health risks that we've been discussing. But just as a side effect, because the genotyping platform we built is so accurate and so extensive, you know, it's measuring a million different regions of the genome, it's getting very precise information about a million different places in your genome, it becomes very easy for us to do PGT-A better than the old technology. And that might potentially have a larger impact in substance, at least right now, than polygenic screening. And the reason I say that is because, if you increase the success rate from like 55% to say 73%, if you did that worldwide, I think I estimated, back of the envelope, you would get hundreds of thousands more babies born every year through IVF.

Chase: And this, the study looked at, this is the per transfer number, right? So how many transfers did the typical IVF couple go through?

Steve Hsu: This varies a lot by, you know, the age of the couple and things like this. And for the eventual success rate difference, see, a lot of couples get discouraged, or because of financial pressures they just stop after a certain number of cycles. Right. And so, you know, you have to make a more detailed estimate to get a precise number. But I think the ballpark number is clear: if, say, our screen became universally used, I think you'd get on the order of hundreds of thousands of additional babies every year. If you changed nothing else, but just switched everybody who was using the old technology to the newer technology, my estimate is something like hundreds of thousands of additional babies per year.
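
The back-of-the-envelope estimate can be written out as follows. This is only a sketch of the arithmetic, not the actual calculation behind the quoted figure: the number of transfers per year is a placeholder chosen for illustration, the function name is invented, and each successful transfer is counted as one additional baby, which is a simplification.

```python
def additional_births_per_year(transfers_per_year, old_rate, new_rate):
    """Extra successful transfers per year if every transfer moved from
    the old per-transfer success rate to the new one."""
    return transfers_per_year * (new_rate - old_rate)


# Assumption: one million screened transfers per year worldwide. This is an
# illustrative placeholder, not a figure given in the conversation.
TRANSFERS_PER_YEAR = 1_000_000

# Success rates quoted in the conversation: roughly 55% versus 73% per transfer.
extra = additional_births_per_year(TRANSFERS_PER_YEAR, old_rate=0.55, new_rate=0.73)
print(f"about {extra:,.0f} additional successful transfers per year")
```

With that placeholder volume, an 18 percentage point improvement lands in the hundreds of thousands per year; a different assumed number of transfers scales the result proportionally.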

Chase: Wow. That's gigantic.

Steve Hsu: I think so. I think, if there are some effective altruists listening to this podcast, this is just a matter of making clinics aware that there's a better mousetrap out there that actually improves the key performance metric of any IVF clinic, which is the success rate of producing a pregnancy from each IVF cycle.

So, anyway, that was about PGT-A, which surprised everyone because we thought PGT-A is a solved problem. We were confident that our tests were going to be at least as good as the preexisting one. But we discovered thanks to this work by Klaus, Dr. Wiemer, to our surprise, our thing is qualitatively better. And the more people that know about that, obviously the better.

Now the exact same pipeline, the same biopsy, the same amplification of DNA, the same genotyping process that produces the PGT-A classification of the embryos, can then also be used to calculate things like breast cancer risk and heart attack risk, for 10 or 20 different common diseases, including even some psychiatric conditions like schizophrenia, which for example is highly heritable, major depression, and things like this.

Altogether, if you look at the set of conditions that can be predicted using these polygenic risk predictors, it covers almost everything, all the major diseases now, which impact not just life expectancy, but even the number of quality-adjusted years of life.

Chase: Steve, you mentioned a couple of those psychiatric conditions. How about things like heart disease or cancers?

Steve Hsu: So there are good polygenic risk predictors for multiple cancers for both genders, heart disease, Type 1 and Type 2 diabetes, hypertension, hypothyroidism. It's not surprising that every major disease would eventually get a decent polygenic predictor because, as I said, the heritability varies between say 30 and 60% for most common diseases. And then because the diseases are common, there are a fair number of people in any big data set that you assemble, there are some people that have the condition. They're called cases. And given enough cases and enough controls, then the AI training or the ML training, will be able to build a predictor for that phenotype. So it's not surprising.

Now, one thing about having so many different traits or disease risks that can be predicted is that the report that we can produce for each batch of embryos is so detailed. The doctors actually demanded of us that there be a kind of single goodness metric that would just make their life easier in advising each patient on which embryo to use, on how to prioritize the embryos for transfer. And so under their impetus, we produced an index, a kind of health score. And the logic behind the health score is very natural. So you basically say, let me compute for each of the major diseases, based on the genetic information for the embryo, what is the absolute risk that that embryo is going to have this condition?

And then we look up in the medical literature, in the public health literature, what is the lifespan impact or the quality-adjusted life span impact, life expectancy impact, of that disease condition. And that number is the weighting of each of the risks that go into our health index. So we sum over the set of disease conditions, we have an absolute risk for each disease.

And then the weighting is the sort of the badness of it, how much it shortens your life or worsens your life.

Chase: So just to clarify what this looks like. If I have one embryo that has a 10% risk of hypertension and then 10% risk of Type 1 diabetes, I know that the Type 1 diabetes risk is more important because that disease is more impactful to its health outcomes.

Steve Hsu: Yes. And so the relative weighting of the risk is different depending on what the public health literature says are the consequences of having that disease. So in logical terms, it's really the natural thing that you would do for ranking these embryos against each other. And it is what the doctors really wanted. That's the number that they want to emphasize in their conversations with the patients.
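
The weighted sum described above can be sketched as follows. This is a minimal illustration of the idea, not Genomic Prediction's actual index: the weights in `LIFE_YEARS_LOST`, the risk numbers, and all names are hypothetical placeholders, whereas the real weights would come from the public health literature.

```python
# Hypothetical weights: quality-adjusted life years lost if the disease occurs.
# Placeholder values for illustration only, not taken from any publication.
LIFE_YEARS_LOST = {
    "hypertension": 2.0,
    "type_1_diabetes": 10.0,
}


def expected_years_lost(absolute_risk):
    """Sum over diseases of (absolute risk) x (lifespan impact weight)."""
    return sum(risk * LIFE_YEARS_LOST[disease]
               for disease, risk in absolute_risk.items())


# Two made-up embryos with mirrored 10% and 5% risks for the two conditions.
embryos = {
    "A": {"hypertension": 0.10, "type_1_diabetes": 0.05},
    "B": {"hypertension": 0.05, "type_1_diabetes": 0.10},
}

# Rank embryos from lowest to highest expected loss.
ranked = sorted(embryos, key=lambda name: expected_years_lost(embryos[name]))
for name in ranked:
    print(name, round(expected_years_lost(embryos[name]), 2))
```

Because the placeholder weight on Type 1 diabetes is larger, embryo A, which carries the higher hypertension risk but the lower diabetes risk, ranks ahead of embryo B. That is the sense in which equal-sized risks do not count equally in the index.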

And so we built that index and it has a number of really interesting properties. So, I think the typical number, if you have a batch of maybe five or 10 embryos, this is something we're still doing more detailed research on, but I think very roughly, you're talking about something like a gain of like one year of life expectancy from that selection compared to say random selection of the embryos.

So if you think of the dollar value of that, that's quite significant. Like what would you pay to ensure that your child got an extra year of healthy life or something like this? You know, it's a pretty significant effect.

Now, many people expected negative, or competitive, pleiotropy. That means that trying to make you healthier on one disease risk makes you less healthy on another, right? So, oh, we try to reduce your diabetes risk, but that ends up increasing your hypertension risk or something.

It turns out it doesn't work that way. It turns out there is some mild, positive internal correlation of the disease risks within the index. That basically kind of gives you a free lunch, so that you don't have a zero-sum competition amongst these different health risks. You have a slightly reinforcing, positive-sum set of correlations. And that was a bit of a surprise for us.

Chase: I find that surprising as well. Do we have any idea of why evolution would have optimized us to have higher overall disease risk? Like, do we have any idea of what these genes that seem to increase risk of multiple diseases are doing?

Steve Hsu: Well, I think that the reason is, and again, this is a subject of ongoing research and we haven't published our results on this, so what I'm about to say is somewhat speculative. So don't hold me to it. But it seems there are some major systems that are affected by lots of different regions of your genome, and those major systems then affect multiple disease conditions.

So that you can have an embryo, which is low risk across multiple different disease conditions at the same time. And you're not having to trade-off risk. In other words, there's no inevitability that trying to make you low risk for diabetes, therefore makes you high risk for heart disease or something like that.

There's nothing like that going on. To first approximation, these risks are largely independent of each other. And then to the extent that they're correlated, we're generally finding mild, positive correlations, meaning that you can lower all the risk, or risk across an entire cluster of disease conditions simultaneously.

Does that help?

Chase: Yeah. Yeah, I mean, it's surprising, but that's pretty amazing.
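
A small simulation can show why a mild positive correlation between risks works in the parents' favor. This is a toy model under stated assumptions, not an analysis of real embryo data: the two risk scores are standardized, embryos are treated as independent draws (ignoring that siblings share parents), and the correlation value 0.2 is a hypothetical stand-in for "mild, positive".

```python
import random


def mean_second_risk_of_selected(rho, n_embryos=10, trials=20_000, seed=0):
    """Pick the embryo with the lowest first risk score in each batch and
    return the average second risk score of the picked embryos.

    rho is the assumed correlation between the two standardized scores.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        batch = []
        for _ in range(n_embryos):
            x = rng.gauss(0.0, 1.0)
            # Build y so that corr(x, y) = rho and var(y) = 1.
            y = rho * x + (1.0 - rho * rho) ** 0.5 * rng.gauss(0.0, 1.0)
            batch.append((x, y))
        best = min(batch, key=lambda pair: pair[0])  # lowest first risk
        total += best[1]
    return total / trials


for rho in (-0.2, 0.0, 0.2):
    avg = mean_second_risk_of_selected(rho)
    print(f"rho = {rho:+.1f}: selected embryos' second risk averages {avg:+.2f} SD")
```

With a positive correlation, selecting against the first risk also pulls the second risk below average; with zero correlation it leaves the second risk unchanged on average; only a negative correlation would force a trade-off.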

Steve Hsu: It's a little bit like the general factor of intelligence. So people at the beginning would have been surprised that, conditional on having high verbal ability, you also are more likely to have high math ability, as opposed to the opposite, right? People often say, oh, little Johnny is really good with words, so he won't be good with numbers. But actually, being good with words slightly raises the probability that you're also good with numbers.

And so just as there turns out to be a general factor of intelligence, there probably is some kind of, if not general factor of overall health, general factor of cardio, heart, lung system health, or general factor of metabolic health. You know what I'm saying? So again, this is preliminary stuff, but this is what we're seeing through detailed analysis of our indices, and some other researchers that study prediction of longevity using polygenic scores also, I think, see something similar.

Chase: Is there any thought to perhaps just simply select for longevity or, or quality-adjusted life span, instead of looking at all these individual diseases and then aggregating them together into a health index?

Steve Hsu: We are currently studying the relationship between a polygenic score which is entirely trained to predict longevity and the index that we built through different means. So it could turn out that there are important similarities between those two different kinds of predictors.

But maybe some differences as well. We're trying to understand that. This is the subject of ongoing research right now.

Chase: Wow, exciting stuff. You did mention earlier that when you select against these diseases, there is a lifespan impact. In other words, the embryo with the lowest weighted disease risk would also tend to live longer. So it sounds like there is probably some positive relationship there.

Steve Hsu: Yeah, well that we're certain of, we're certain that there is a significant overlap between our lifespan impact weighted sum of risks over many diseases. So that's what we currently use. We give that score to physicians. It's called the embryo health score. There clearly is overlap between that thing or correlation between that thing and the polygenic score, which is trained just to predict longevity.

Chase: You know, Steve this conversation reminds me of one of the common criticisms I've heard of this field. Many people are concerned, I think, reasonably so that we seem to have two problems, both of which seem to be pushing in the same direction of exacerbating inequality through this technology.

One is that these polygenic predictors are not quite as accurate in non-European populations. And the other one is simply that in order for people to access PGT-A or any other form of polygenic screening, they have to pay for not only the cost of the testing but the cost of IVF itself.

And as you were alluding to earlier, the cost of IVF is a serious barrier for many parents, even in wealthy countries like the United States. Can you talk about your vision for how we might be able to lower the barrier to entry for people who are interested in using this screening for their children?

Steve Hsu: Yeah, those are two great questions. On the different ancestry front. So as we discussed earlier, there's a lot more data from European ancestry samples. And so the prediction power, it's much further along for those ancestry groups. And I think that inequality is just a very clear inequality that needs to be ironed out through more research spending, focused on assembling large research cohorts in these other ancestry groups.

And so, for example, I know that for East Asians, like in Japan and Taiwan and China, there's really quite a lot of data that's going to come online. And I think that problem is going to be solved for that ancestry group. But for some other ancestry groups, it may take time and we really should prioritize this. I mean, it's a very clear instance of, you know, cross-ancestry, I don't want to say cross-racial, but cross-ancestry inequality in the focus of scientific research. So, we need to fix that, and I think we will fix it just by focusing on gathering samples from these less represented ancestry groups.

That one's very, you know, you can tackle that one in a very concrete way. You can literally throw money at it and solve that problem.

The second question of access, you know, to some extent it's not avoidable, because any new technology, when first introduced, tends to be first utilized by the rich. But on the other hand, the argument goes, if you're a techno-optimist or a free market optimist, as rich people get into it, and it becomes a big market, the whole thing gets cheaper and cheaper and eventually it becomes accessible to the whole population.

So the techno-optimist would say, oh, eventually this will become so cheap, it will become standard of care. And even in very nasty systems like the U.S. healthcare system, it will eventually become available to all Americans.

Now, I'm not quite that optimistic. I actually, you know, having some center-left political values in my past, I would say I would not mind if the government noticed that it actually saves money by making it free by paying for this kind of embryo selection and just makes it free for everybody.

And there are examples of countries where IVF is part of the public health care system, like in Denmark and Israel, various countries like that, that have a national healthcare system. Some of them have IVF as one of the benefits. And so you could go one step further. We haven't discussed it in detail, but I think it's true that embryo selection actually pays for itself in having healthier people in the population. If so, eventually you could have the government just say it's part of our healthcare system. It's free. We are not going to force you to do it. We can't force you to do it, but if you want to do it, we will pay for it.

And, it's probably a win-win if you look at the numbers. So I don't think we have to be super pessimistic about exacerbating inequality through these technologies, but I do think it's likely that in the short run we will exacerbate inequalities like for one generation or something, but then eventually in the long run, I think it could be solved.

Chase: It sounds like there's potentially two different questions here for government subsidies. One is, do you want to subsidize PGT-P testing for couples already doing IVF? And then the other one is, do you want to subsidize IVF and PGT-P for couples who don't have fertility issues or who don't otherwise need the IVF? And it sounds like there may be a different sort of cost calculus there.

Steve Hsu: Absolutely. So, that's another great question. So, you know, I think the barrier obviously is higher because, per cycle of IVF, you're talking significantly more money than just adding the testing or the screening on top of IVF. I think it'll be longer before any health system says, oh, you don't have a fertility problem, you were not going to use IVF, but we'll pay for you to use IVF. I think that's going to be a while in the making. It will also be a while in the making before rich people who don't need IVF say, oh, but I want to do it so I can do embryo selection. There might be some cases of that going on already now, but it's going to be pretty rare.

I mean, it's currently pretty rare. So, you're totally right that there are two separate problems here. I think in the long run they can both be solved. The other thing to keep in mind is IVF is not an intrinsically expensive process. It's super expensive in the U.S. just because of the way healthcare works in the United States. But if you look at the per cycle cost of IVF in countries like Taiwan and South Korea, which, you know, have very advanced medical systems, but are just not as, shall we say, profit-driven as the U.S. system, the cost per cycle of IVF is a fraction of what it is here.

So it really doesn't require major surgery or anything like that. It's some hormone treatments, monitoring of the mother-to-be, an extraction, which I think can be done by a medical technician or nurse, not actually a surgeon or anything like that. And even the transfer, although it's a high skill kind of operation, it's nothing like a surgery or anything like that. So I think actually the economics of it, of just IVF itself if it becomes more widely adopted, it will become much, much cheaper.

Chase: Wow. So the solution to reducing the inequality impacts of embryo selection is to fix the U.S. healthcare system.

Steve Hsu: Oh, this is a whole different topic, but yeah, the U.S. healthcare system is just totally broken. It's sort of like twice or maybe even three times as expensive, in terms of costs relative to GDP per capita, as other systems. And it doesn't really seem to deliver better outcomes.

Now, I think for people with really acute conditions near the end of life, our system is better. But for just general outcomes, life expectancy, and all that stuff, we don't really seem to deliver better outcomes. And we spend a lot more.

Chase: You mentioned earlier that South Korea and Taiwan have cheaper IVF options available to them. How much is a cycle of IVF in say South Korea versus the United States?

Steve Hsu: I could be way off on this, because the last time I looked into this was like, maybe when I saw some actual numbers, it was like a decade ago. But I seem to recall at the time it was easily three times cheaper there than here. And with similar success rates per cycle. So it wasn't lower quality.

Chase: Any reason people don't simply travel to other countries to do IVF?

Steve Hsu: There is a fair amount of IVF tourism. There's actually an increasing amount of medical tourism in general. So for example, some countries in Asia have become destinations for medical tourism because they have good healthcare systems, well-trained doctors and the cost structure's just lower.

I'll say, just to reveal something colorful to you: Laurent Tellier is the CEO and co-founder of Genomic Prediction. When he and I were doing some background research before starting the company, we visited a bunch of IVF clinics in Thailand and Kuala Lumpur in Malaysia. These are modern, you know, completely state-of-the-art facilities with huge waiting rooms, full of people.

The volumes that they do are big. And it seems like the quality is just as good as what you get in the United States. And they were enormously cheaper and people would travel there from, I think primarily people traveled there from China, but, you know, from all over the world, people would travel there to do IVF.

Chase: So it sounds like it's already happening.

Steve Hsu: It's already happening. And I think it's just gonna continue because these kinds of imbalances in costs tend to drive this kind of behavior.

So looking at my list of questions that I noted down from the emails and such that I received, I think we've covered pretty much everything. Do you have any last topic that you want to cover? We're over an hour so we could quit at any time, but if you have any last thing that you want to go over, we can do it.

Chase: Well, there's a million questions about this topic and I think endless alleys to go down. I'm curious, do you have any thoughts about where Genomic Prediction is heading in the future? Is PGT-P already seeing widespread adoption? You mentioned earlier that, you know, PGT-A is used in over half of all IVF cases, at least in the U.S. Is PGT-P seeing that kind of uptake, or where are we at?

Steve Hsu: We're still in early days for PGT-P, but we are seeing pretty rapid uptake and we've even done some surveying of couples because we obviously work with so many clinics. We can get data on what their customers think about certain things. And the attitudes are generally pretty positive about it. There doesn't seem to be any kind of squeamishness. All the squeamishness we encounter is typically with journalists and a few activists who don't like it. Journalists like to write sensationalist, clickbaity kind of articles. But when you talk to IVF doctors or genetic counselors or parents going through the process, and the parents are obviously the most important individuals in this whole thing, the attitudes are generally pretty positive. So I would state with high confidence that in some number of years, you know, maybe three to five years, I would say this is going to become completely common in the IVF world.

Chase: It sounds like the biggest, no-brainer ever. I mean, you have, you already have embryos to choose from. You have no other criteria on which to choose them. Why not choose the one that has the lowest odds of developing some terrible disease in their lifetime?

Steve Hsu: Yeah. So the way that you just expressed it, Chase, is the perfect, you know, simple explanation. It's well, parents are already making an embryo selection, but they're making it on basically zero information. They might be making it just on the look of what the embryo looks like under the microscope. The embryologist says, oh, number three, looks good. Number three should become your daughter. I mean, I mean, really that's the best we can do. And then somebody else says, well, you know, I can inexpensively get you the entire genotype of each of these embryos. And then we can do all these calculations for you and, and advise you, and you don't even have to take our advice. But at a very modest cost, which is a fraction of the amount that you're paying to do the IVF cycle, we can get you all this information. Seems like, as you just said, the biggest no-brainer ever.

So Chase, if we meet again in a few years, maybe five years and do this podcast, I think we'll say, well, as we predicted, this has been very, very widely adopted and people are scratching their heads about why there was any controversy over it in the first place.

Creators and Guests

Host

Stephen Hsu

Steve Hsu is Professor of Theoretical Physics and of Computational Mathematics, Science, and Engineering at Michigan State University.