Computational Chemistry
DrugChat architecture showing GNN encoder, linear adaptor, and Vicuna LLM for conversational drug analysis

DrugChat: Conversational QA on Drug Molecule Graphs

DrugChat is a prototype system that bridges molecular graph neural networks with large language models for interactive, multi-turn question answering about drug compounds. It trains only a lightweight linear adaptor between a frozen GNN encoder and Vicuna-13B using 143K curated QA pairs from ChEMBL and PubChem.
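
As a rough illustration of the adaptor idea, the sketch below projects graph-derived vectors into a language model's embedding space with a single linear map, the only trainable piece in this setup. The dimensions, the random initialization, and the helper names `make_linear_adaptor` and `apply_adaptor` are assumptions made for illustration, not details of the DrugChat implementation.

```python
import random

def make_linear_adaptor(in_dim, out_dim, seed=0):
    """Create a weight matrix (out_dim x in_dim) and a zero bias vector.

    Hypothetical helper: these would be the only trainable parameters,
    while the GNN encoder and the LLM stay frozen.
    """
    rng = random.Random(seed)
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
              for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias

def apply_adaptor(embeddings, weight, bias):
    """Apply y = W x + b to each input vector."""
    return [
        [sum(w * x for w, x in zip(row, vec)) + b
         for row, b in zip(weight, bias)]
        for vec in embeddings
    ]

# Toy sizes chosen for readability (assumed, far smaller than real models).
GNN_DIM = 4   # width of the frozen GNN's output vectors
LLM_DIM = 6   # width of the LLM's token embeddings

# Stand-in for the frozen GNN's output on a three-node molecular graph.
graph_embeddings = [
    [0.1, 0.2, 0.3, 0.4],
    [0.0, 1.0, 0.0, -1.0],
    [0.5, 0.5, 0.5, 0.5],
]

weight, bias = make_linear_adaptor(GNN_DIM, LLM_DIM)
soft_tokens = apply_adaptor(graph_embeddings, weight, bias)

# Parameter count of the adaptor alone: one weight per (out, in) pair plus biases.
trainable_params = LLM_DIM * GNN_DIM + LLM_DIM
```

The projected vectors would then be placed alongside the text prompt's token embeddings before being passed to the language model.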

Pipeline diagram showing natural language chemistry questions flowing through fine-tuned GPT-3 to chemical predictions across molecules, materials, and reactions

Fine-Tuning GPT-3 for Predictive Chemistry Tasks

Jablonka et al. show that GPT-3 fine-tuned on natural language chemistry questions performs competitively with, or better than, dedicated ML models across 15 benchmarks, with particular strength in low-data settings and inverse molecular design.
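
To make the "natural language chemistry questions" framing concrete, here is a minimal sketch of turning labeled molecules into prompt/completion records and serializing them as JSON Lines. The question template, the `to_record` helper, and the example labels are illustrative placeholders, not the templates or data used by Jablonka et al.

```python
import json

def to_record(smiles, property_name, label):
    """Phrase one labeled molecule as a question/answer pair.

    Hypothetical template: the exact wording used in the paper may differ.
    """
    prompt = f"What is the {property_name} of {smiles}?"
    # Leading space in the completion is a common convention for
    # prompt/completion fine-tuning data; treat it as an assumption here.
    return {"prompt": prompt, "completion": f" {label}"}

# Placeholder examples for illustration only, not measured values.
examples = [
    ("CCO", "solubility class", "high"),
    ("c1ccccc1", "solubility class", "low"),
]

records = [to_record(smiles, prop, label) for smiles, prop, label in examples]

# One JSON object per line, the usual layout for fine-tuning datasets.
jsonl = "\n".join(json.dumps(record) for record in records)
```

Because the task is expressed purely as text, the same template could be reused for molecules, materials, or reactions by changing the representation string and the property name.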

Visualization of Galactica corpus composition and benchmark performance comparing Galactica 120B against baselines

Galactica: A Curated Scientific LLM from Meta AI

Galactica is a decoder-only Transformer trained on a curated 106B-token scientific corpus spanning papers, proteins, and molecules, achieving strong results on scientific QA, mathematical reasoning, and citation prediction.

SMolInstruct dataset feeding into four base models for chemistry instruction tuning

LlaSMol: Instruction-Tuned LLMs for Chemistry Tasks

LlaSMol fine-tunes Mistral, Llama 2, and other open-source LLMs on SMolInstruct, a 3.3M-sample instruction tuning dataset covering 14 chemistry tasks. The Mistral-based model outperforms GPT-4 and Claude 3 Opus across all tasks.

PharmaGPT two-stage training from domain continued pretraining to weighted supervised fine-tuning with RLHF

PharmaGPT: Domain-Specific LLMs for Pharma and Chem

PharmaGPT is a suite of domain-specific LLMs (13B and 70B parameters) built on LLaMA with continued pretraining on biopharmaceutical and chemical data, achieving strong results on NAPLEX and Chinese pharmacist exams.

Three-stage progression from task-specific transformers through multimodal models to LLM chemistry agents

Transformers and LLMs for Chemistry and Drug Discovery

A review chapter tracing three stages of transformer adoption in chemistry: task-specific single-modality models (reaction prediction, retrosynthesis), multimodal approaches bridging spectra and text, and LLM-powered agents like ChemCrow for general chemical reasoning.

Bar chart showing GPT-4 relative performance across eight chemistry tasks grouped by understanding, reasoning, and explaining capabilities

ChemLLMBench: Benchmarking LLMs on Chemistry Tasks

A comprehensive benchmark evaluating GPT-4, GPT-3.5, Davinci-003, Llama, and Galactica on eight practical chemistry tasks, revealing that LLMs are competitive on classification and text tasks but struggle with SMILES-dependent generation.

Bar chart comparing GPT-3 ada and GNN accuracy across molecular classification tasks

Fine-Tuning GPT-3 for Molecular Property Prediction

This paper fine-tunes GPT-3’s ada model on SMILES strings to classify electronic properties (HOMO, LUMO) of organic semiconductor molecules, finding accuracy competitive with graph neural networks and probing robustness through ablation studies.

Bar chart comparing small and big foundation models surveyed across property prediction, MLIPs, inverse design, and multi-domain chemistry applications

Foundation Models in Chemistry: A 2025 Perspective

This perspective from Choi et al. reviews foundation models in chemistry, categorizing them as ‘small’ (domain-specific, e.g., property prediction, MLIPs, inverse design) and ‘big’ (multi-domain, e.g., multimodal and LLM-based). It surveys pretraining strategies and key architectures (GNNs and language models) and outlines future directions for scaling, efficiency, and interpretability.

Diagram showing the CaR pipeline from SMILES to ChatGPT-generated captions to fine-tuned RoBERTa predictions

LLM4Mol: ChatGPT Captions as Molecular Representations

LLM4Mol proposes Captions as Representations (CaR): ChatGPT generates textual explanations for SMILES strings, which are then used to fine-tune small language models for molecular property prediction.

Bar chart showing vision language model performance across chemistry tasks including equipment identification, molecule matching, spectroscopy, and laboratory safety

MaCBench: Multimodal Chemistry and Materials Benchmark

MaCBench evaluates frontier vision language models across 1,153 chemistry and materials science tasks spanning data extraction, experimental execution, and data interpretation, uncovering fundamental limitations in spatial reasoning and cross-modal integration.

Conceptual diagram showing natural language prompts flowing into code generation for chemistry tasks

NLP Models That Automate Programming for Chemistry

Hocky and White argue that NLP models capable of generating code from natural language prompts will fundamentally alter how chemists interact with scientific software, reducing barriers to computational research and reshaping programming pedagogy.