Molecular Representations
Bar chart showing CLM architecture publication trends from 2020 to 2024, with transformers overtaking RNNs

Systematic Review of Deep Learning CLMs (2020-2024)

PRISMA-based systematic review of 72 papers on chemical language models for molecular generation, comparing architectures and biased methods using MOSES metrics.

Molecular Representations
Diagram showing the t-SMILES pipeline from molecular graph fragmentation to binary tree traversal producing a string representation

t-SMILES: Tree-Based Fragment Molecular Encoding

t-SMILES represents molecules by fragmenting them into substructures, building full binary trees, and traversing them breadth-first to produce SMILES-type strings that reduce nesting depth and outperform SMILES, DeepSMILES, and SELFIES on generation benchmarks.
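
The breadth-first step can be illustrated with a toy sketch. It assumes the fragments have already been produced and arranged into a binary tree. The `Node` class, the `^` separator, and the `&` padding symbol for a missing child are illustrative assumptions here, not the paper's reference implementation.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical tree node holding one fragment as a SMILES substring."""
    fragment: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def bfs_string(root: Node, sep: str = "^", pad: str = "&") -> str:
    """Traverse the tree breadth-first, padding missing children so that
    every internal node contributes exactly two slots (a full binary tree)."""
    tokens = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node is None:
            tokens.append(pad)  # placeholder for an absent child
            continue
        tokens.append(node.fragment)
        # Leaves enqueue nothing; internal nodes always enqueue both slots.
        if node.left is not None or node.right is not None:
            queue.append(node.left)
            queue.append(node.right)
    return sep.join(tokens)


# Toy tree: a ring fragment with two children, one of which has only a right child.
tree = Node("c1ccccc1", left=Node("CC", right=Node("O")), right=Node("N"))
encoded = bfs_string(tree)
```

Because fragments are emitted level by level, the output is a flat sequence of short fragment strings, and branch nesting is confined to whatever occurs inside each fragment.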

Molecular Representations
Taxonomy of transformer-based chemical language models organized by architecture type

Transformer CLMs for SMILES: Literature Review 2024

A comprehensive review of transformer-based chemical language models operating on SMILES, categorizing encoder-only (BERT variants), decoder-only (GPT variants), and encoder-decoder models with analysis of tokenization strategies, pre-training approaches, and future directions.

Molecular Representations
Diagram showing sequence-to-sequence translation from chemical names to SMILES with atom count constraints

Transformer Name-to-SMILES with Atom Count Losses

This paper applies a Transformer sequence-to-sequence model to predict SMILES strings from chemical compound names (Synonyms). Two enhancements, an atom-count constraint loss and SMILES/InChI multi-task learning, improve F-measure over rule-based and vanilla Transformer baselines.

Predictive Chemistry
Bar chart comparing Transformer-CNN RMSE against RF, SVM, CNN, and CDDD baselines

Transformer-CNN: SMILES Embeddings for QSAR Modeling

Transformer-CNN extracts dynamic SMILES embeddings from a Transformer trained on SMILES canonicalization and feeds them to a TextCNN for QSAR modeling, achieving strong results across 18 benchmarks with built-in LRP interpretability.

Molecular Generation
Bar chart comparing Char-RNN and Molecular VAE on validity and novelty metrics

VAE for Automatic Chemical Design (2018 Seminal)

This foundational paper introduces a variational autoencoder (VAE) that encodes SMILES strings into a continuous latent space, allowing gradient-based optimization of molecular properties. Joint training with a property predictor organizes the latent space by chemical properties, and Bayesian optimization over the latent space discovers drug-like molecules with improved QED and synthetic accessibility.

Molecular Representations
Horizontal bar chart showing X-MOL achieves best performance across five molecular tasks

X-MOL: Pre-training on 1.1B Molecules for SMILES

X-MOL applies large-scale Transformer pre-training on 1.1 billion molecules with a generative SMILES-to-SMILES strategy, then fine-tunes for five molecular analysis tasks including property prediction, reaction analysis, and de novo generation.

Molecular Representations
Bar chart showing retrieval accuracy of chemical language models across four SMILES augmentation types

AMORE: Testing ChemLLM Robustness to SMILES Variants

Introduces AMORE, an embedding-based retrieval framework that evaluates whether chemical language models can recognize the same molecule across different SMILES representations. Results show current models are not robust to identity-preserving augmentations.
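
An embedding-based retrieval check of this kind can be sketched as follows. This is a minimal illustration, not AMORE's actual protocol: the function names, the cosine similarity measure, the top-1 criterion, and the toy 2-D vectors are all assumptions, and a real evaluation would use embeddings produced by the model under test.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top1_retrieval_accuracy(original, augmented):
    """original[i] and augmented[i] are embeddings of the same molecule
    written as two different SMILES strings. A query counts as a hit when
    its nearest original embedding has the same index."""
    hits = 0
    for i, query in enumerate(augmented):
        best = max(range(len(original)), key=lambda j: cosine(query, original[j]))
        hits += int(best == i)
    return hits / len(augmented)


# Toy embeddings (hypothetical): the second augmented vector drifts toward
# the first molecule, so it is retrieved incorrectly.
original = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
augmented = [[0.9, 0.1], [0.8, 0.3], [0.7, 0.8]]
accuracy = top1_retrieval_accuracy(original, augmented)
```

A model that is robust to identity-preserving augmentations would score close to 1.0 on such a check; lower scores indicate that the embedding changes with the string form rather than with the molecule.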

Molecular Generation
Diagram showing back translation workflow with forward and reverse models mapping between source and target molecular domains, augmented by unlabeled ZINC molecules

Back Translation for Semi-Supervised Molecule Generation

Adapts back translation from NLP to molecular generation, using unlabeled molecules from ZINC to create synthetic training pairs that improve property optimization and retrosynthesis prediction across Transformer and graph-based architectures.

Computational Chemistry
Bar chart comparing LLM, DeBERTa, GCN, and GIN performance on three OGB molecular classification benchmarks

Benchmarking LLMs for Molecular Property Prediction

Benchmarks large language models on six molecular property prediction datasets, finding that LLMs lag behind GNNs but can augment ML models when used collaboratively.

Predictive Chemistry
Bar chart comparing fixed molecular representations (RF, SVM, XGBoost) against learned representations (MolBERT, GROVER) across six property prediction benchmarks under scaffold split

Benchmarking Molecular Property Prediction at Scale

This study trains over 62,000 models to systematically evaluate molecular representations and models for property prediction, finding that traditional ML on fixed descriptors often outperforms deep learning approaches.

Computational Chemistry
Hierarchical pyramid showing ChemEval's four evaluation levels from basic knowledge QA to scientific knowledge deduction

ChemEval: Fine-Grained LLM Evaluation for Chemistry

ChemEval is a four-level, 62-task benchmark for evaluating LLMs across chemical knowledge, literature understanding, molecular reasoning, and scientific deduction, revealing that general LLMs excel at comprehension while chemistry-specific models perform better on domain tasks.