Discussion about this post

Jasper Götting

Really nice post, Niko, and a pretty wild result. After the Evo 2 paper, I thought viable genome generation was still further out. However, I think you're glossing over the downsides a tad too quickly.

You correctly point out that this result has biosecurity folks worried since Evo 2 is an open-weights model and many viruses of concern are only a few kb longer than ΦX174.

But your bottom line from that is, to me, a little bit disconnected: "Although the risk of training on human viruses seems troubling, the real barriers to moving from phages to larger organisms are data and atoms."

It is no doubt correct that larger organisms are much, much more challenging to generate and synthesize, but that has little to do with the aforementioned worries about human viruses, and I do worry a lot at this point!

I think that a method that was able to de novo generate very sequence-divergent, viable viruses warrants a pretty thorough dual-use discussion, especially after we've seen what a little bit of genetic distance and a few kb more did in late 2019...

halvorz

"Of course, this paper will probably be alarming to folks in the biosecurity community. The authors point out that Evo 2 excludes human viruses from its pretraining data"

This is unfortunately not entirely true, at least based on my reading of the methods section of the Evo 2 paper: they *tried* to exclude human viruses, but I believe the training set in fact included a large amount of viral sequence. This new result is quite impressive, but I really wish they had consulted a bit more widely before releasing it. I don't think it is an immediate biosecurity threat at this point, but (as someone who is typically fairly dismissive of AI biosecurity concerns) this paper is genuinely concerning to me.
