This group covers models that use reinforcement learning to steer molecular generators toward desired property profiles. The common thread is reward-shaped optimization of a base generative policy, whether RNN, Transformer, GAN, or graph-based.
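To make the "reward-shaped optimization" concrete, here is a minimal toy sketch of the REINVENT-style augmented-likelihood objective: the agent's sequence log-likelihood is pulled toward the frozen prior's log-likelihood plus a scaled reward. The logits tables, vocabulary, and `sigma` value are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = ["C", "N", "O", "=", "(", ")"]  # toy SMILES-like alphabet (illustrative)

def log_likelihood(logits, seq):
    """Sum of per-step token log-probabilities of `seq` under a logits table."""
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return sum(logp[t, tok] for t, tok in enumerate(seq))

# Frozen prior and trainable agent: one logits row per generation step.
seq_len = 5
prior_logits = rng.normal(size=(seq_len, len(VOCAB)))
agent_logits = prior_logits.copy()  # agent is initialized as a copy of the prior

def augmented_likelihood_loss(seq, reward, sigma=10.0):
    """REINVENT-style loss: squared gap between the agent's likelihood and
    the augmented target log P_prior(seq) + sigma * reward."""
    target = log_likelihood(prior_logits, seq) + sigma * reward
    return (target - log_likelihood(agent_logits, seq)) ** 2

# Since the agent starts identical to the prior, the loss reduces to
# (sigma * reward)^2, signalling the agent should raise this sequence's
# probability in proportion to its reward.
seq = [0, 1, 0, 2, 0]  # token indices into VOCAB
loss = augmented_likelihood_loss(seq, reward=0.8)
print(round(loss, 3))  # → 64.0
```

Minimizing this loss over sampled batches moves the agent toward high-reward sequences while the prior term keeps it anchored to chemically plausible SMILES.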

| Paper | Year | Base Model | Key Idea |
|---|---|---|---|
| REINVENT | 2017 | RNN | Augmented episodic likelihood for goal-directed SMILES generation |
| ORGAN | 2017 | SeqGAN | RL reward functions for tunable drug-likeness, solubility, and synthesizability |
| DrugEx v2 | 2019 | RNN | Pareto ranking and evolutionary exploration for multi-objective design |
| Link-INVENT | 2019 | RNN | Molecular linker design with flexible multi-parameter RL scoring |
| MolecularRNN | 2019 | Graph RNN | Atom-by-atom generation with valency rejection sampling and policy gradients |
| Curriculum Learning | 2019 | REINVENT | Sequential task decomposition accelerating convergence on complex objectives |
| Memory-Assisted RL | 2020 | REINVENT | Scaffold memory penalizes repeated solutions, increasing diversity fourfold |
| DrugEx v3 | 2021 | Graph Transformer | Scaffold-constrained generation with 100% validity via adjacency-based encoding |
| Augmented Hill-Climb | 2022 | RNN | Hybrid RL strategy improving sample efficiency ~45x over REINVENT |
| REINVENT 4 | 2024 | RNN + Transformer | Open-source framework unifying RL, transfer learning, and curriculum learning |
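The scaffold-memory idea behind memory-assisted RL can be sketched as a diversity filter that attenuates rewards once a scaffold has been generated too often. The class name, bucket size, and string-keyed scaffolds below are illustrative assumptions; real pipelines extract e.g. Murcko scaffolds with RDKit rather than passing strings directly.

```python
from collections import Counter

class ScaffoldMemory:
    """Toy diversity filter in the spirit of memory-assisted RL: once a
    scaffold's bucket is full, further occurrences receive zero reward,
    pushing the agent toward unexplored chemistry. (Hypothetical sketch.)"""

    def __init__(self, bucket_size=3):
        self.counts = Counter()      # how often each scaffold has appeared
        self.bucket_size = bucket_size

    def adjust(self, scaffold, reward):
        self.counts[scaffold] += 1
        if self.counts[scaffold] > self.bucket_size:
            return 0.0               # bucket full: zero out the reward
        return reward

mem = ScaffoldMemory(bucket_size=2)
rewards = [mem.adjust("benzene", 1.0) for _ in range(4)]
print(rewards)  # → [1.0, 1.0, 0.0, 0.0]
```

The agent still receives full reward for the first few members of a scaffold family, so exploitation is preserved while repetition is penalized.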
