This section covers large language models (LLMs) and vision-language models (VLMs) applied to chemistry. Unlike chemical language models (ChemBERTa, MoLFormer, etc.), which learn representations directly from molecular string notations such as SMILES, these models build on general-purpose LLM or VLM backbones.
Foundation Models & Domain-Specific LLMs
Models built or fine-tuned specifically for chemical reasoning and molecular understanding.
| Year | Paper | Focus |
|---|---|---|
| 2022 | Galactica | Large-scale scientific LLM from Meta AI trained on curated scientific corpora |
| 2024 | ChemLLM | Framework for building chemistry-focused LLMs with structured chemical instruction data |
| 2024 | LlaSMol | Instruction-tuned LLMs (Llama-based) for core chemistry tasks |
| 2024 | Fine-Tuning GPT-3 for Molecular Properties | GPT-3 fine-tuning to predict electronic and functional properties of organic molecules |
| 2024 | Fine-Tuning GPT-3 for Predictive Chemistry | GPT-3 fine-tuning for predictive chemistry tasks (yield, selectivity) |
| 2024 | PharmaGPT | Domain-specific LLMs for pharmaceutical and chemical applications |
| 2025 | ChemDFM-R | Chemical reasoning LLM with atomized step-by-step knowledge decomposition |
Multimodal Models
Models that integrate molecular graphs, images, spectra, or documents with text.
| Year | Paper | Focus |
|---|---|---|
| 2024 | ChemDFM-X | Multimodal foundation model aligning molecular graphs, images, and spectra with text |
| 2025 | ChemVLM | Vision-language model for chemical image understanding |
| 2025 | InstructMol | Multi-modal molecular LLM bridging graphs, SMILES, and text for drug discovery |
| 2025 | MERMaid | Multimodal extraction of chemical reactions from scientific PDFs |
| 2025 | Multimodal Search in Chemical Documents | Cross-modal retrieval across chemical documents and reaction diagrams |
Agentic & Tool-Augmented Systems
LLM agents that autonomously plan and execute chemistry workflows using external tools.
| Year | Paper | Focus |
|---|---|---|
| 2023 | Coscientist | Autonomous multi-agent system for chemical research with robotic lab integration |
| 2024 | ChemCrow | LLM augmented with 18 chemistry tools for synthesis planning and safety |
Drug Discovery & Molecular Optimization
LLM-based approaches for drug editing, molecule optimization, and compound QA.
| Year | Paper | Focus |
|---|---|---|
| 2023 | DrugChat | Conversational QA over drug molecule graphs |
| 2023 | LLM4Mol | ChatGPT-generated text captions as molecular representations for property prediction |
| 2024 | ChatDrug | Conversational drug editing with retrieval-augmented generation |
| 2024 | DrugAssist | Interactive LLM-guided molecule optimization |
Benchmarks & Evaluation
Datasets and evaluation frameworks for assessing LLM performance on chemistry tasks.
| Year | Paper | Focus |
|---|---|---|
| 2023 | ChemLLMBench | Eight-task benchmark for LLM chemistry capabilities |
| 2023 | Code-Gen Chemistry Assessment | Evaluating chemistry knowledge in code-generation LLMs |
| 2024 | Benchmarking LLMs for Molecular Prediction | Systematic comparison of LLMs on molecular property prediction |
| 2024 | ChemEval | Fine-grained, multi-level evaluation framework for chemistry LLMs |
| 2024 | ChemSafetyBench | Safety-focused benchmark for chemistry LLMs |
| 2025 | ChemBench | Large-scale evaluation comparing LLMs against human chemistry experts |
| 2025 | MaCBench | Multimodal benchmark for chemistry and materials science |
Surveys & Perspectives
Broad reviews and position papers on the role of LLMs in chemistry.
| Year | Paper | Focus |
|---|---|---|
| 2022 | NLP Models That Automate Programming for Chemistry | Early perspective on NLP and code generation for chemical workflows |
| 2024 | Survey of Scientific LLMs in Bio and Chem | Comprehensive survey of LLM applications across biology and chemistry |