> it appears to have solved the inverse folding problem for many proteins While ...

elcritch · on Dec 19, 2020

> While this is true, DeepFold’s algorithm is only applicable to extant proteins having enough evolutionary information,

Didn't the challenge include predicting previously unseen/unsolved proteins? Based on that, I would wager that what DeepFold learned is an evolutionary "language" that maps out a large useful subset of the entire possibility space. Natural evolution tends to build upon previous successes, so it seems probable evolution has mapped out a fairly useful "language" of patterns for useful protein shapes, especially given that only a portion of the shape is critical to function for many proteins.

But agreed, de novo protein design based on DeepFold's successes probably will outperform a naive approach by orders of magnitude. But why wait if even a naive approach is already orders of magnitude better than current methods?

Either way, I'm excited to see who comes up with the first custom protein(s) to catalyze industrial processes! Get some yeast/bacteria to mass-produce a protein-based alternative to platinum catalysts for fuel cells, using an active site with organically available (and cheap) metals. Design half of it to stick to a polymer so it coats nicely. Bam, no more trying to get some weird polymer/perovskite with the right properties. Not sure mRNA is needed at that point versus CRISPR, but maybe it's more effective.

flobosg · on Dec 19, 2020

> Didn't the challenge include predicting previously unseen/unsolved proteins?

All protein sequences in the competition lacked a published solved structure, but they had enough effective (remotely) homologous sequences to predict coevolution-derived interresidue distances and contacts. All those sequences were already present in databases and therefore within the known protein universe.
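To make "coevolution-derived contacts" a bit more concrete, here is a toy sketch (not what DeepFold or any real pipeline does): it scores pairs of alignment columns by mutual information, on the assumption that residues in contact tend to mutate together across homologs. The alignment `toy_msa` and the function names are made up for illustration; real methods use far larger alignments, sequence weighting, and much more sophisticated statistical models.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    count_i = Counter(seq[i] for seq in msa)
    count_j = Counter(seq[j] for seq in msa)
    count_ij = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        # Compare the observed pair frequency with what independence predicts.
        mi += p_ab * log2(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi


def top_coupled_pairs(msa, k=1):
    """Return the k column pairs with the highest mutual information."""
    length = len(msa[0])
    scores = {
        (i, j): column_mutual_information(msa, i, j)
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical alignment of six homologs, four positions each.
# Columns 1 and 3 always change together (K with E, R with D),
# as do columns 0 and 2 (A with L, G with V).
toy_msa = [
    "AKLE",
    "AKLE",
    "ARLD",
    "ARLD",
    "GKVE",
    "GRVD",
]

if __name__ == "__main__":
    for pair in top_coupled_pairs(toy_msa, k=2):
        print(pair, round(column_mutual_information(toy_msa, *pair), 3))
```

With only a handful of homologs the frequency estimates are too noisy to say anything, which is roughly why the amount of evolutionary information available for a target matters so much.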

> Based on that, I would wager that what DeepFold learned is an evolutionary "language" that maps out a large useful subset of the entire possibility space.

This might be true, depending on how “foldable” the unexplored space is. A reverse DeepFold would give some clues on that.

> Natural evolution tends to build upon previous successes, so it seems probable evolution has mapped out a fairly useful "language" of patterns for useful protein shapes

This is already known. The structural and functional diversity we see in existing proteins originated from a relatively limited repertoire of conserved protein domain folds, or even subdomain-sized fragments in some cases.

> Either way, I'm excited to see who comes up with the first custom protein(s) to catalyze industrial processes!

Same here! Especially since the de novo design of enzymes has been progressing slowly but steadily.

elcritch · on Dec 19, 2020

> but they had enough effective (remotely) homologous sequences to predict coevolution-derived interresidue distances and contacts.

Ah, I see what you're saying a bit better. It makes more sense now how the possible "design" space for novel proteins could be limited. Designing novel active sites could be especially tricky, more than it'd seem at first glance. A folded structure that effectively transfers electrons from one target species to another in a catalytic protein could well lie outside the explored "vocabulary" of extant proteins, as it'd require specialized pathways and precise positioning. Chlorophyll is pretty unchanged in evolution as I understand it. Thanks, interesting background! I'll keep an eye out for the de novo design algorithms.