Computational Chemistry
Taxonomy of transformer-based chemical language models organized by architecture type

Transformer CLMs for SMILES: Literature Review 2024

A comprehensive review of transformer-based chemical language models operating on SMILES, categorizing encoder-only (BERT variants), decoder-only (GPT variants), and encoder-decoder models with analysis of tokenization strategies, pre-training approaches, and future directions.
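Tokenization strategy is one of the design axes this review categorizes. A minimal sketch of the widely used atom-level regex tokenizer for SMILES (the pattern below follows the common convention in transformer CLM work and is an illustrative assumption, not drawn from any single paper reviewed here):

```python
import re

# Atom-level SMILES tokenization: bracket atoms, two-letter halogens,
# organic-subset atoms (aromatic lowercase included), two-digit ring
# closures, digits, and structural symbols each become one token.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+]|Br|Cl|[BCNOSPFI]|[bcnops]|%\d{2}|\d|\(|\)|\.|=|#|-|\+|/|\\|:|~|@|\?|>|\*|\$)"
)

def tokenize(smiles: str) -> list[str]:
    """Split a SMILES string into atom-level tokens."""
    tokens = SMILES_TOKEN.findall(smiles)
    # Sanity check: tokens must reassemble the input exactly.
    assert "".join(tokens) == smiles, "untokenizable characters in input"
    return tokens

print(tokenize("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin
```

Note the alternation order: `Cl` and `Br` are tried before single-letter atoms, so chlorine is not split into a carbon token plus a stray character.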

Computational Chemistry
Diagram showing sequence-to-sequence translation from chemical names to SMILES with atom count constraints

Transformer Name-to-SMILES with Atom Count Losses

This paper applies a Transformer sequence-to-sequence model to predict SMILES strings from chemical compound names (synonyms). Two enhancements, an atom-count constraint loss and SMILES/InChI multi-task learning, improve F-measure over rule-based and vanilla Transformer baselines.
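The idea behind the atom-count constraint can be illustrated with a toy scoring function: count heavy atoms in a generated SMILES and penalize deviation from the count implied by the source name. This is an illustrative sketch only; the paper's actual loss operates on model outputs during training, and `heavy_atom_count` and the linear penalty are assumptions.

```python
import re

# One match per heavy atom: bracket atoms, two-letter halogens, and
# organic-subset atoms (aromatic lowercase included). Hydrogens ignored.
ATOM = re.compile(r"\[[^\]]+]|Br|Cl|[BCNOSPFI]|[bcnops]")

def heavy_atom_count(smiles: str) -> int:
    return len(ATOM.findall(smiles))

def atom_count_penalty(pred_smiles: str, expected_atoms: int,
                       weight: float = 1.0) -> float:
    # Toy analogue of an atom-count constraint loss: linear penalty on the
    # mismatch between generated and expected heavy-atom counts.
    return weight * abs(heavy_atom_count(pred_smiles) - expected_atoms)
```

For example, acetic acid `CC(=O)O` has four heavy atoms, so its penalty against an expected count of four is zero.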

Computational Chemistry
Bar chart comparing Transformer-CNN RMSE against RF, SVM, CNN, and CDDD baselines

Transformer-CNN: SMILES Embeddings for QSAR Modeling

Transformer-CNN extracts dynamic SMILES embeddings from a Transformer trained on SMILES canonicalization and feeds them to a TextCNN for QSAR modeling, achieving strong results across 18 benchmarks with built-in interpretability via layer-wise relevance propagation (LRP).

Computational Chemistry
Bar chart comparing Char-RNN and Molecular VAE on validity and novelty metrics

VAE for Automatic Chemical Design (2018 Seminal)

This foundational paper introduces a variational autoencoder (VAE) that encodes SMILES strings into a continuous latent space, allowing gradient-based optimization of molecular properties. Joint training with a property predictor organizes the latent space by chemical properties, and Bayesian optimization over the latent surface discovers drug-like molecules with improved QED and synthetic accessibility.
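The gradient-based latent optimization described above can be sketched abstractly: given a differentiable property predictor over latent vectors, take gradient-ascent steps and then decode the optimized point back to a molecule. The quadratic toy property and every name below are illustrative assumptions standing in for the trained predictor, its autodiff gradient, and the decoder.

```python
def optimize_latent(z, grad_fn, lr=0.1, steps=200):
    """Gradient ascent on a property predictor over a VAE latent space.

    z       -- starting latent vector (list of floats)
    grad_fn -- gradient of the property w.r.t. z (stand-in for autodiff)
    """
    for _ in range(steps):
        g = grad_fn(z)
        z = [zi + lr * gi for zi, gi in zip(z, g)]
    return z

# Toy property peaked at z* = (1, -2); in the paper, the optimized latent
# point would be decoded back into a SMILES string.
target = [1.0, -2.0]
grad = lambda z: [-2.0 * (zi - ti) for zi, ti in zip(z, target)]
z_opt = optimize_latent([0.0, 0.0], grad)
```

The paper uses Bayesian optimization rather than plain gradient ascent for the final searches, but the continuous latent space is what makes either possible.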

Computational Chemistry
Horizontal bar chart showing X-MOL achieves best performance across five molecular tasks

X-MOL: Pre-training on 1.1B Molecules for SMILES

X-MOL applies large-scale Transformer pre-training on 1.1 billion molecules with a generative SMILES-to-SMILES strategy, then fine-tunes for five molecular analysis tasks including property prediction, reaction analysis, and de novo generation.

Computational Chemistry
Bar chart showing retrieval accuracy of chemical language models across four SMILES augmentation types

AMORE: Testing ChemLLM Robustness to SMILES Variants

This work introduces AMORE, an embedding-based retrieval framework that evaluates whether chemical language models can recognize the same molecule across different SMILES representations. Results show that current models are not robust to identity-preserving augmentations.
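AMORE's core measurement can be sketched as nearest-neighbor retrieval: embed a canonical SMILES and an augmented variant of each molecule, then check whether each variant's nearest canonical embedding belongs to the same molecule. The cosine metric and toy vectors below are illustrative; in AMORE the embeddings come from the evaluated chemical language models.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieval_accuracy(canonical, augmented):
    """Fraction of augmented-SMILES embeddings whose nearest canonical
    embedding (by cosine similarity) belongs to their own molecule."""
    hits = 0
    for i, query in enumerate(augmented):
        nearest = max(range(len(canonical)),
                      key=lambda j: cosine(query, canonical[j]))
        hits += nearest == i
    return hits / len(augmented)
```

A robust model would score near 1.0 under identity-preserving augmentations; the paper's finding is that current models often do not.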

Computational Chemistry
Diagram showing back translation workflow with forward and reverse models mapping between source and target molecular domains, augmented by unlabeled ZINC molecules

Back Translation for Semi-Supervised Molecule Generation

Adapts back translation from NLP to molecular generation, using unlabeled molecules from ZINC to create synthetic training pairs that improve property optimization and retrosynthesis prediction across Transformer and graph-based architectures.
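The back-translation loop adapted here can be sketched in outline: a reverse model maps unlabeled target-domain molecules (e.g. from ZINC) to synthetic sources, and the resulting pairs augment the supervised data for the forward model. The function names and the string-reversal stand-in for a trained reverse model are illustrative assumptions.

```python
def back_translate(unlabeled_targets, reverse_model):
    """Create synthetic (source, target) training pairs from unlabeled
    target-domain molecules, as in NLP back translation."""
    return [(reverse_model(t), t) for t in unlabeled_targets]

def augment(labeled_pairs, unlabeled_targets, reverse_model):
    # The forward model then trains on real pairs plus synthetic ones.
    return labeled_pairs + back_translate(unlabeled_targets, reverse_model)
```

The synthetic source side may be noisy, but the target side is always a real molecule, which is what makes the augmentation useful for the forward direction.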

Computational Chemistry
Bar chart comparing LLM, DeBERTa, GCN, and GIN performance on three OGB molecular classification benchmarks

Benchmarking LLMs for Molecular Property Prediction

This study benchmarks large language models on six molecular property prediction datasets, finding that LLMs lag behind GNNs but can augment conventional ML models when used collaboratively.

Computational Chemistry
Bar chart comparing fixed molecular representations (RF, SVM, XGBoost) against learned representations (MolBERT, GROVER) across six property prediction benchmarks under scaffold split

Benchmarking Molecular Property Prediction at Scale

This study trains over 62,000 models to systematically evaluate molecular representations and models for property prediction, finding that traditional ML on fixed descriptors often outperforms deep learning approaches.

Computational Chemistry
Hierarchical pyramid showing ChemEval's four evaluation levels from basic knowledge QA to scientific knowledge deduction

ChemEval: Fine-Grained LLM Evaluation for Chemistry

ChemEval is a four-level, 62-task benchmark for evaluating LLMs across chemical knowledge, literature understanding, molecular reasoning, and scientific deduction, revealing that general LLMs excel at comprehension while chemistry-specific models perform better on domain tasks.

Computational Chemistry
Stylized visualization of protein-ligand docking and benchmark performance bars across five drug targets

DOCKSTRING: Docking-Based Benchmarks for Drug Design

DOCKSTRING bundles an AutoDock Vina wrapper, a 260K-molecule docking dataset across 58 protein targets, and pharmaceutically relevant benchmarks for regression, virtual screening, and de novo design.

Computational Chemistry
Scatter plot showing molecules ranked by perplexity score with color coding for task-relevant (positive delta) versus pretraining-biased (negative delta) generations

Perplexity for Molecule Ranking and CLM Bias Detection

This study applies perplexity, a model-intrinsic metric from NLP, to rank de novo molecular designs generated by SMILES-based chemical language models and introduces a delta score to detect pretraining bias in transfer-learned CLMs.
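The ranking metric is straightforward to compute from per-token log-probabilities. A minimal sketch follows; the delta below, pretraining perplexity minus fine-tuned perplexity, is one plausible form of the paper's bias score, not its exact definition.

```python
import math

def perplexity(token_logprobs):
    """Exp of the mean negative log-likelihood over a SMILES token
    sequence; lower means the CLM finds the molecule more plausible."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def delta_score(logprobs_finetuned, logprobs_pretrained):
    # Positive: the molecule is better explained by the fine-tuned (task)
    # model, i.e. task-relevant; negative suggests pretraining bias.
    # The exact form of the paper's delta is assumed here.
    return perplexity(logprobs_pretrained) - perplexity(logprobs_finetuned)
```

For instance, a sequence whose tokens each have probability 0.5 has perplexity exactly 2, regardless of length.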