This section covers large language models and vision-language models applied to chemistry. These differ from chemical language models (ChemBERTa, MoLFormer, etc.) in that they build on general-purpose LLM or VLM backbones rather than learning representations directly from molecular string notations.

Foundation Models & Domain-Specific LLMs

Models built or fine-tuned specifically for chemical reasoning and molecular understanding.

| Year | Paper | Focus |
|------|-------|-------|
| 2022 | Galactica | Large-scale scientific LLM from Meta AI trained on curated scientific corpora |
| 2024 | ChemLLM | Framework for building chemistry-focused LLMs with structured chemical instruction data |
| 2024 | LlaSMol | Instruction-tuned LLMs (Llama-based) for core chemistry tasks |
| 2024 | Fine-Tuning GPT-3 for Molecular Properties | GPT-3 fine-tuning for molecular property prediction |
| 2024 | Fine-Tuning GPT-3 for Predictive Chemistry | GPT-3 fine-tuning for predictive chemistry tasks (yield, selectivity) |
| 2024 | PharmaGPT | Domain-specific LLMs for pharmaceutical and chemical applications |
| 2025 | ChemDFM-R | Chemical reasoning LLM with atomized step-by-step knowledge decomposition |

Multimodal Models

Models that integrate molecular graphs, images, spectra, or documents with text.

| Year | Paper | Focus |
|------|-------|-------|
| 2024 | ChemDFM-X | Multimodal foundation model aligning molecular graphs and text |
| 2025 | ChemVLM | Vision-language model for chemical image understanding |
| 2025 | InstructMol | Multi-modal molecular LLM bridging graphs, SMILES, and text for drug discovery |
| 2025 | MERMaid | Multimodal extraction of chemical reactions from scientific PDFs |
| 2025 | Multimodal Search in Chemical Documents | Cross-modal retrieval across chemical documents and reaction diagrams |

Agentic & Tool-Augmented Systems

LLM agents that autonomously plan and execute chemistry workflows using external tools.

| Year | Paper | Focus |
|------|-------|-------|
| 2023 | Coscientist | Autonomous multi-agent system for chemical research with robotic lab integration |
| 2024 | ChemCrow | LLM augmented with 18 chemistry tools for synthesis planning and safety |

Drug Discovery & Molecular Optimization

LLM-based approaches for drug editing, molecule optimization, and compound QA.

| Year | Paper | Focus |
|------|-------|-------|
| 2023 | DrugChat | Conversational QA over drug molecule graphs |
| 2023 | LLM4Mol | Using ChatGPT-generated captions as molecular representations |
| 2024 | ChatDrug | Conversational drug editing with retrieval-augmented generation |
| 2024 | DrugAssist | Interactive LLM-guided molecule optimization |

Benchmarks & Evaluation

Datasets and evaluation frameworks for assessing LLM performance on chemistry tasks.

| Year | Paper | Focus |
|------|-------|-------|
| 2023 | ChemLLMBench | Eight-task benchmark for LLM chemistry capabilities |
| 2023 | Code-Gen Chemistry Assessment | Evaluating chemistry knowledge in code-generation LLMs |
| 2024 | Benchmarking LLMs for Molecular Prediction | Systematic comparison of LLMs on molecular property prediction |
| 2024 | ChemEval | Fine-grained, multi-level evaluation framework for chemistry LLMs |
| 2024 | ChemSafetyBench | Safety-focused benchmark for chemistry LLMs |
| 2025 | ChemBench | Large-scale evaluation comparing LLMs against human chemistry experts |
| 2025 | MaCBench | Multimodal benchmark for chemistry and materials science |

Surveys & Perspectives

Broad reviews and position papers on the role of LLMs in chemistry.

| Year | Paper | Focus |
|------|-------|-------|
| 2022 | NLP Models That Automate Programming for Chemistry | Early perspective on NLP and code generation for chemical workflows |
| 2024 | Survey of Scientific LLMs in Bio and Chem | Comprehensive survey of LLM applications across biology and chemistry |