The classical approach to medicinal chemistry starts from something known. You take a hit, look at what structural features correlate with activity, then systematically modify the scaffold — a methyl group here, a fluorine there, a different heterocycle — until you've built a compound that satisfies enough criteria to move forward. It's iterative. It's systematic. And it's fundamentally limited by the imagination and chemical intuition of the people doing it.
Generative molecular design changes the starting premise. Instead of modifying what exists, you specify what properties you want and ask the model to construct a molecule that satisfies them. The question is whether the molecules it produces are actually useful, or whether they're just chemically valid strings with no path to a real vial.
What Generative Models Are Actually Doing
There are several technical approaches to generative molecular design — variational autoencoders operating on SMILES strings, graph neural networks that generate molecular graphs atom-by-atom, diffusion models that operate in 3D coordinate space — but they share a common structure. You encode a molecule into a latent representation, the model learns to map regions of that latent space to molecular structures, and then you can navigate that space to find structures with desired properties.
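The encode/navigate/decode loop can be sketched with a toy two-dimensional latent space. Everything here is a stand-in: `predict_property` replaces a real decoder plus trained property predictors, and the greedy random search replaces the gradient-based or Bayesian navigation a production system would use. The point is the search structure, not the chemistry.

```python
import random

def predict_property(z):
    # Stand-in for "decode z to a molecule, then score it with a property
    # predictor." Here the score simply peaks at the latent point (0.7, -0.2).
    return -((z[0] - 0.7) ** 2 + (z[1] + 0.2) ** 2)

def navigate_latent_space(z_start, steps=500, step_size=0.05, seed=0):
    """Greedy random search: keep latent perturbations that improve the score."""
    rng = random.Random(seed)
    z, best = list(z_start), predict_property(z_start)
    for _ in range(steps):
        candidate = [zi + rng.gauss(0.0, step_size) for zi in z]
        score = predict_property(candidate)
        if score > best:
            z, best = candidate, score
    return z, best

z_opt, best_score = navigate_latent_space([0.0, 0.0])
```

In a real pipeline each accepted latent point would be decoded back to a molecular structure before scoring, and the scorer would be an ensemble of trained predictors rather than a single analytic function.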
The multi-property optimization aspect is what makes this genuinely useful. In traditional SAR, optimizing for one property often degrades another — improving potency by increasing lipophilicity runs into ADMET problems, improving selectivity by adding steric bulk hurts permeability. With a generative model coupled to property predictors, you can define constraints across all properties simultaneously and ask the model to find the Pareto frontier — the set of compounds where improving one property isn't possible without degrading another.
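The Pareto-frontier idea reduces to a small dominance filter. This sketch assumes every property has already been oriented so that larger is better; the example tuples of (potency, microsomal stability) are illustrative numbers, not program data.

```python
def dominates(a, b):
    """True if a is at least as good as b on every property and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates (all properties maximized)."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates if other is not c)]

# Illustrative (potency, stability) pairs; (7.0, 30.0) is beaten on both axes
# by (7.9, 45.0), so it falls off the frontier.
mols = [(8.2, 12.0), (7.9, 45.0), (8.5, 9.0), (7.0, 30.0)]
front = pareto_front(mols)
```

The quadratic all-pairs check is fine at the scale of a few thousand generated candidates; the conceptual content is just the `dominates` predicate.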
For CAI-014, our FGF21 receptor agonist program, we ran a generative design campaign with five simultaneous constraints: predicted binding affinity above a threshold, molecular weight under 480 Da, cLogP between 1.0 and 3.0, predicted half-life above 30 minutes in rat liver microsomes, and no CYP2D6 inhibition flags. Starting from a known FGF21 mimetic peptide scaffold that we'd converted to a small-molecule seed, the model generated approximately 2,400 candidate structures over three days of compute.
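Those five constraints translate directly into a pass/fail filter over predicted properties. A minimal sketch: the dictionary keys are hypothetical, and the affinity cutoff of pKi 7.0 is a placeholder (the actual threshold isn't stated above); the remaining bounds come straight from the campaign description.

```python
def passes_design_constraints(c):
    """Check the five simultaneous CAI-014 design constraints.

    `c` maps property names to predicted values; the key names are hypothetical.
    """
    return (
        c["binding_affinity_pki"] >= 7.0     # placeholder threshold, not the real one
        and c["mol_weight"] < 480.0          # molecular weight under 480 Da
        and 1.0 <= c["clogp"] <= 3.0         # cLogP window
        and c["rlm_half_life_min"] > 30.0    # rat liver microsome half-life, minutes
        and not c["cyp2d6_flag"]             # no CYP2D6 inhibition flag
    )

candidate = {"binding_affinity_pki": 7.4, "mol_weight": 412.0,
             "clogp": 2.1, "rlm_half_life_min": 48.0, "cyp2d6_flag": False}
```

In practice each field would come from a separate predictor with its own error bars, which is why a hard boolean filter like this is usually the last step after ranked scoring, not the first.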
The Synthesis Filter
Of those 2,400 structures, a medicinal chemist reviewed all of them — which took about four days — and flagged 340 as synthetically accessible within a reasonable number of steps. The rest either were structurally unusual in ways that would require exotic chemistry, contained functionality that would be difficult to handle at scale, or had connectivity patterns that didn't correspond to anything you could reasonably make with standard transformations.
This is the part of generative chemistry that the papers underreport. The models generate structures that are valid SMILES. Valid SMILES does not mean synthesizable. It does not mean the compound is stably isolable. And it does not mean that, even if you can make it, you can make it in a form that's analytically pure enough to assay reliably.
Of the 340 flagged as feasible, 88 were prioritized for synthesis based on docking scores against the FGF21 receptor complex model and predicted selectivity versus related receptors. From those 88, we've now synthesized 62. Experimental binding data on those compounds has been fed back into the generative model for a second-round design cycle.
Where Generative Design Actually Adds Value
The honest application of generative design isn't replacing traditional SAR — it's exploring novel chemical space that traditional SAR wouldn't reach. If you're working a kinase target with 50 known inhibitors in the literature and a well-characterized binding pocket, finding scaffold-hopped analogs by virtually screening existing compound databases is probably more efficient than running a generative model.
Where generative design earns its place is targets where the known chemical space is thin, where existing scaffolds have IP problems, or where the multi-property optimization landscape is genuinely complex. Novel target classes where you don't have a natural starting series. Cases where you need significant structural novelty for IP reasons. Situations where three or four important properties are in genuine tension and you want the computer to find solutions humans might not think to look for.
The most realistic version of this technology is a tool that gives chemists better starting points, not one that eliminates the need for chemists. The models are good at efficiently searching chemical space. They are not good at the judgment calls — which synthetic route to propose, which assay artifact to suspect, when to break a design rule because the biology demands it. Those still require people who understand the full context of a drug discovery program.