This section covers models that generate novel molecular structures. Notes are organized into subsections by approach:
- Autoregressive Generation covers models that produce SMILES strings one token at a time, from early RNN/LSTM models through pre-trained transformers (Chemformer, GP-MoLFormer), state-space models (S4), and semi-supervised methods.
- RL-Tuned Generation covers reinforcement learning pipelines that optimize generative policies toward multi-parameter property objectives (REINVENT, DrugEx, Link-INVENT, ORGAN).
- Target-Aware Generation covers models conditioned on protein targets, binding pockets, or 3D structural constraints for structure-based drug design.
- Latent-Space Generation covers VAEs and gradient-based optimization in continuous molecular latent spaces (seminal VAE, Grammar VAE, LIMO, LatentGAN).
- Search-Based Generation covers genetic algorithms and training-free mutation strategies (STONED) that serve as baselines and alternatives to learned generative models.
- Evaluation, Benchmarks & Surveys covers benchmark suites (GuacaMol, MOSES, PMO), scoring frameworks (MolScore, FCD), docking benchmarks, failure analysis, and surveys of the molecular generation field.
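To make the autoregressive framing concrete, here is a minimal, purely schematic sketch: a SMILES string is sampled token by token, each token conditioned on the previous one, stopping at an end-of-sequence symbol. The hand-written transition table below is a stand-in assumption; real models (RNNs, transformers like Chemformer) learn these conditional distributions from data and condition on the full prefix, not just the last token.

```python
import random

# Toy autoregressive SMILES sampler. The transition probabilities are
# invented for illustration only -- a trained model would predict them.
TRANSITIONS = {
    "<bos>": {"C": 0.7, "c": 0.3},
    "C": {"C": 0.4, "O": 0.2, "N": 0.1, "<eos>": 0.3},
    "c": {"c": 0.6, "C": 0.1, "<eos>": 0.3},
    "O": {"C": 0.5, "<eos>": 0.5},
    "N": {"C": 0.5, "<eos>": 0.5},
}

def sample_smiles(max_len=20, rng=random):
    """Sample one token sequence, stopping at <eos> or max_len."""
    tokens = []
    prev = "<bos>"
    for _ in range(max_len):
        choices, probs = zip(*TRANSITIONS[prev].items())
        nxt = rng.choices(choices, weights=probs)[0]
        if nxt == "<eos>":
            break
        tokens.append(nxt)
        prev = nxt
    return "".join(tokens)

print(sample_smiles(rng=random.Random(0)))
```

The same sampling loop underlies the RL-tuned pipelines above: REINVENT-style methods keep this generator but reweight its token probabilities with a reward signal on the completed molecule.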