AI Designs 100 Quadrillion Proteins in a Single Run
Authors: Elizabeth Wood
Imagine searching for a perfect key to a lock. The lock is a receptor on a cancer cell. The number of keys you need to test is larger than the number of atoms in the observable universe. That is protein design in a nutshell. And a team at JURA Bio in Boston just cracked it — not on a screen, but in a test tube.
Their method, called Variational Synthesis, produces 10^17 unique DNA sequences, each encoding a designed protein, in a single chemical reaction. That is one hundred quadrillion molecules. Everything humanity had ever synthesized in protein engineering labs before this amounts to a rounding error by comparison.
Why Proteins Are So Hard to Engineer
A protein is a molecular machine assembled from a chain of amino acids. Its three-dimensional shape dictates its function: some proteins digest food, others attack viruses, others catalyze chemical reactions.
Designing proteins is among the most ambitious challenges in biotechnology. You want an antibody that locks onto a tumor marker? An enzyme that operates at 95 °C? A peptide that trains the immune system to recognize a specific pathogen? In theory, the amino acid sequence determines everything. In practice, the space of possible combinations is so vast that brute-force search is physically impossible.
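To get a feel for that vastness, here is a minimal sketch in Python. It assumes the 20 standard amino acids and takes 10^80 as a commonly quoted order-of-magnitude estimate for the number of atoms in the observable universe. The 15-residue loop is a hypothetical example, not a figure from the paper.

```python
AMINO_ACIDS = 20            # the standard amino acid alphabet
ATOMS_IN_UNIVERSE = 10**80  # rough order-of-magnitude estimate (assumption)


def sequence_space(length: int) -> int:
    """Number of distinct amino acid chains of the given length."""
    return AMINO_ACIDS ** length


def min_length_exceeding(limit: int) -> int:
    """Shortest chain length whose sequence space is larger than `limit`."""
    length = 1
    while sequence_space(length) <= limit:
        length += 1
    return length


# Hypothetical short loop of 15 residues: already about 3.3e19 variants.
short_loop_variants = sequence_space(15)

# Chain length at which the variants outnumber the atoms in the universe.
crossover_length = min_length_exceeding(ATOMS_IN_UNIVERSE)

print(f"15-residue loop: {short_loop_variants:.2e} variants")
print(f"Sequence space passes 10^80 at length {crossover_length}")
```

Under these assumptions the crossover comes at 62 residues, well below the length of a typical protein, which is why exhaustive search is off the table.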
Before Variational Synthesis, two approaches dominated. The computational route: AI models like AlphaFold and ProGen generate promising sequences on a computer. The catch is that these designs exist only as files. Synthesizing and testing each one costs tens to hundreds of dollars per gene. A million variants? A billion? No budget on Earth suffices.
The laboratory route: build a library of random mutations and screen them. It works, but most random mutations produce junk — they rarely land in functional regions of protein space.
Randomness as a Design Tool
Elizabeth Wood and her team at JURA Bio found a third path. The core insight: DNA synthesis is inherently a stochastic process. At every step, a nucleotide attaches to the growing chain at random. Traditionally, this is treated as a limitation. Variational Synthesis turns it into an advantage.
Stochastic synthesis is a process in which each step is governed by probability rather than a fixed rule. A single synthesis run produces trillions of unique DNA molecules, even though every strand passes through the same reaction vessel.
Here is how it works. A generative AI model trains on real protein data — 290 million antibody sequences, for example. But instead of outputting a list of promising sequences, it outputs the parameters of a chemical reaction: precise nucleotide ratios at each synthesis step. These ratios are tuned so that the stochastic process produces molecules whose distribution matches the training data.
The result: one synthesis run yields 10^17 unique DNA molecules in a tube, each encoding an AI-designed protein. Not virtually. Physically.
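The idea of steering a random process with per-step nucleotide ratios can be illustrated with a toy simulation. This is a simplified sketch, not JURA Bio's implementation: it assumes each synthesis step draws one nucleotide independently, and the ratios in `STEP_RATIOS` are invented for illustration rather than produced by a trained model.

```python
import random
from collections import Counter

NUCLEOTIDES = "ACGT"

# Hypothetical mixing ratios for three synthesis steps (one codon).
# Each row holds the probabilities of A, C, G, T at that step.
STEP_RATIOS = [
    [0.10, 0.10, 0.70, 0.10],
    [0.25, 0.25, 0.25, 0.25],
    [0.00, 0.50, 0.00, 0.50],
]


def synthesize_strand(step_ratios, rng):
    """Grow one strand, drawing a nucleotide at each step from that step's ratios."""
    return "".join(
        rng.choices(NUCLEOTIDES, weights=weights, k=1)[0] for weights in step_ratios
    )


def synthesize_library(step_ratios, n_strands, seed=0):
    """Simulate one run: many strands, all grown under the same ratios."""
    rng = random.Random(seed)
    return [synthesize_strand(step_ratios, rng) for _ in range(n_strands)]


def strand_probability(strand, step_ratios):
    """Probability that a single strand comes out as exactly this sequence."""
    p = 1.0
    for base, weights in zip(strand, step_ratios):
        p *= weights[NUCLEOTIDES.index(base)]
    return p


library = synthesize_library(STEP_RATIOS, n_strands=10_000)
counts = Counter(library)

print("distinct strands:", len(counts))
print("expected share of GAC:", strand_probability("GAC", STEP_RATIOS))
print("observed share of GAC:", counts["GAC"] / len(library))
```

No strand is specified in advance, yet the observed frequencies track the probabilities implied by the ratios. In this toy version, changing the ratios is the only way to change what the library contains.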
Antibodies, Enzymes, Vaccines — One Pipeline
The team demonstrated Variational Synthesis on three protein classes. First, human antibodies. The model trained on 290 million CDR-H3 sequences — the hypervariable loop that determines what an antibody binds to — and generated a library that matched or outperformed IgLM, a leading protein language model, across quality metrics.
CDR-H3 (Complementarity-Determining Region H3) is the most variable segment of an antibody’s heavy chain, responsible for binding specificity. This is where "fitting the key to the lock" happens at the molecular level.
The method proved equally effective on DNA polymerases — enzymes critical for diagnostics and sequencing — and on HLA-presented peptides, the fragments displayed on cell surfaces that let the immune system spot infected cells. Same architecture, three very different protein families.
Cost? The authors estimate a trillion-fold reduction in synthesis expenses. A library that would have cost approximately $10^15 to synthesize conventionally (more than the combined GDP of every nation) now costs about as much as a single lab experiment.
Medicines in Months, Not Decades
Traditional protein-based drug development takes 10–15 years. Protein engineering is just one phase, but it is the bottleneck: finding a molecule that works among an astronomical number of candidates. Variational Synthesis attacks this bottleneck directly.
If one reaction can yield quadrillions of candidates, and high-throughput screening can sift out the best, the development cycle compresses dramatically. Meanwhile, in Shanghai, Hong Liang’s team at Jiao Tong University has built the Venus series of models trained on 9 billion protein sequences. Their approach already shortens R&D from 2–5 years to 6–12 months, even without the scale of Variational Synthesis.
The fields most likely to benefit first are oncology antibodies, industrial enzyme engineering, and vaccine components: in short, anything that requires searching an enormous space of variants.
The Fine Print
Scale impresses, but scale is not everything. The paper was published in Nature Biotechnology in 2026 after peer review; a preprint had been available since September 2024. Even with that stamp of rigor, hard questions remain.
The first question: of the 10^17 molecules in the tube, how many actually function? Variational Synthesis ensures that sequences resemble real proteins. Resemblance and function, however, are different things. A protein must fold correctly, bind its target, avoid triggering an immune response in the body, and remain stable during storage. None of these properties follows automatically from a plausible sequence.
A second limitation stems from the stochastic nature of the synthesis itself. You cannot specify which exact molecules end up in the tube. You define a distribution, not a roster. For screening tasks, this is a strength. For tasks requiring reproducibility of a specific molecule, it is a potential complication.
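The "distribution, not a roster" point can be made concrete with a little probability. The sketch below assumes every strand in the tube is an independent draw from the designed distribution, so a sequence with per-strand probability p shows up at least once among N strands with probability 1 - (1 - p)^N. The library size of 10^17 comes from the article; the two per-strand probabilities are hypothetical.

```python
import math

LIBRARY_SIZE = 1e17  # strands per run, as reported in the article


def expected_copies(p_sequence: float, library_size: float) -> float:
    """Average number of copies of one specific sequence in the library."""
    return p_sequence * library_size


def prob_present(p_sequence: float, library_size: float) -> float:
    """Chance that the sequence appears at least once: 1 - (1 - p)^N.

    log1p and expm1 keep the result accurate when p is tiny and N is huge.
    """
    return -math.expm1(library_size * math.log1p(-p_sequence))


# Hypothetical per-strand probabilities for two sequences of interest.
for p in (1e-15, 1e-20):
    print(
        f"p = {p:.0e}: expected copies = {expected_copies(p, LIBRARY_SIZE):g}, "
        f"P(present) = {prob_present(p, LIBRARY_SIZE):.6f}"
    )
```

Under this assumption, a sequence the distribution favors is almost certainly in the tube many times over, while a sequence it rarely produces may be absent from any given run. That is helpful for screening and awkward when one particular molecule is required.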
Third, JURA Bio is a commercial company. Their business model depends on the method proving useful in practice. Publication in Nature Biotechnology is a strong signal of scientific rigor, but long-term value will be determined by clinical trial outcomes, not library size.
From Bench to Pharmacy
The earliest practical applications will likely be diagnostic enzymes and research reagents — areas where rapid exploration matters and regulatory barriers are lower. Therapeutic antibodies will still require the standard clinical pipeline — Phase I, II, III — and the time savings there remain to be demonstrated.
Consider the progression. AlphaFold taught us to predict protein structure. ProGen taught us to generate new sequences on screen. Variational Synthesis completes the chain: from idea to physical molecule in one reaction. The gap between digital design and biological reality — the central obstacle in protein engineering — just narrowed.
The question is no longer whether we can design proteins at scale. The question is what to do with a hundred quadrillion possibilities.
Frequently Asked Questions
How does Variational Synthesis differ from AlphaFold?
AlphaFold predicts the three-dimensional structure of a protein from its sequence. Variational Synthesis works in the other direction: it generates new sequences and manufactures them physically in a single reaction. AlphaFold is an analysis tool; Variational Synthesis is a production tool.
Could this method create a cancer drug?
Potentially, yes. One reaction can yield quadrillions of antibody variants, from which candidates that bind specific tumor markers can be selected. But the path from an antibody library to an approved drug still includes clinical trials that take years.
Why is it cheaper than conventional synthesis?
Conventional gene synthesis is a one-at-a-time operation: each sequence must be assembled individually at a cost of tens to hundreds of dollars per gene. Variational Synthesis uses a single chemical reaction whose stochastic nature automatically generates trillions of unique variants. The authors estimate the cost reduction at roughly a trillion-fold.
What does "stochastic synthesis" mean in practice?
At every step of DNA synthesis, one of four nucleotides attaches to the chain at random. Normally, this randomness is considered a source of error. Variational Synthesis harnesses it: the AI model tunes the attachment probabilities so that the random molecules emerging from the reaction follow the distribution of proteins the model has learned, rather than a predetermined list of sequences.
When will drugs built with this method reach patients?
Diagnostic enzymes and research reagents could appear within 1–2 years. Therapeutic proteins will require standard clinical trials, putting them on a 5–10 year horizon even if the design phase is radically accelerated.