Computational Biology
Graph-grammar expansion of a carbon fixation reaction network with ILP flow queries selecting short autocatalytic cycles

Graph Grammar and ILP for Carbon Fixation Pathways

A graph-grammar cheminformatics workflow expands a carbon fixation reaction network, then uses integer linear programming (ILP) flow queries to surface short autocatalytic pathways producing acetyl-CoA and malate with efficiencies approaching the CETCH cycle.
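The core query — the smallest reaction subset whose net stoichiometry produces extra seed while balancing every intermediate — can be illustrated on a hypothetical toy network. Brute-force enumeration stands in for the ILP solver here; the reaction set and species names are invented for illustration:

```python
from itertools import combinations

# Hypothetical toy reactions; each maps species -> net stoichiometric coefficient.
REACTIONS = {
    "r1": {"A": -1, "B": +1},
    "r2": {"B": -1, "C": +1},
    "r3": {"C": -1, "A": +2},   # closes a 3-step autocatalytic cycle
    "r4": {"A": -1, "D": +1},
    "r5": {"D": -1, "A": +2},   # closes a 2-step autocatalytic cycle
}

def shortest_autocatalytic_cycle(reactions, seed="A"):
    """Brute-force stand-in for the ILP flow query: find the smallest
    reaction subset (unit flux each) whose summed stoichiometry nets
    a surplus of the seed and balances every other species."""
    names = sorted(reactions)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            net = {}
            for r in subset:
                for sp, coeff in reactions[r].items():
                    net[sp] = net.get(sp, 0) + coeff
            if net.get(seed, 0) > 0 and all(
                v == 0 for sp, v in net.items() if sp != seed
            ):
                return set(subset)
    return None

cycle = shortest_autocatalytic_cycle(REACTIONS)  # the 2-step cycle {r4, r5}
```

The real workflow replaces the enumeration with integer flux variables and linear mass-balance constraints, which scales to grammar-expanded networks far beyond brute force.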

Machine Learning
Three-panel diagram showing DGCNN point cloud processing: input space k-NN graph, EdgeConv operation, and semantic feature space clustering

DGCNN: Dynamic Graph CNN for Point Cloud Learning

DGCNN introduces the EdgeConv operator, which constructs k-nearest neighbor graphs dynamically in feature space at each network layer. This enables the model to capture both local geometry and long-range semantic relationships for point cloud classification and segmentation.
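A minimal numpy sketch of one EdgeConv layer (weights and point counts are arbitrary; the paper's version is a learned MLP with max aggregation):

```python
import numpy as np

def edge_conv(X, W_self, W_rel, k=3):
    """One EdgeConv layer: build a k-NN graph in the *current* feature
    space, form edge features from (x_i, x_j - x_i), apply a shared
    linear map + ReLU, and max-aggregate over each point's neighbors."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
    np.fill_diagonal(d2, np.inf)                          # exclude self
    nbrs = np.argsort(d2, axis=1)[:, :k]                  # k nearest per point
    out = []
    for i in range(len(X)):
        rel = X[nbrs[i]] - X[i]                           # x_j - x_i
        e = np.maximum(rel @ W_rel.T + X[i] @ W_self.T, 0.0)
        out.append(e.max(axis=0))                         # max over neighbors
    return np.stack(out)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))  # 8 points with 4-d features
Y = edge_conv(X, rng.normal(size=(16, 4)), rng.normal(size=(16, 4)))
# Y has shape (8, 16); stacking layers recomputes k-NN in each new
# feature space, which is what makes the graph "dynamic".
```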

Scientific Computing
Side-by-side search tree diagrams comparing nauty depth-first and Traces breadth-first traversal strategies for graph isomorphism

nauty and Traces: Graph Isomorphism Algorithms

An updated description of nauty and introduction of Traces, two programs for graph isomorphism testing and canonical labeling using the individualization-refinement paradigm.
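The refinement half of that paradigm can be sketched as classic color refinement — repeatedly splitting vertex color classes by the multiset of neighbor colors until the partition is equitable (nauty and Traces add individualization, target-cell selection, and pruning on top of this):

```python
def color_refine(adj, colors=None):
    """Equitable-partition refinement: split color classes by the
    multiset of neighbor colors until a fixpoint is reached."""
    n = len(adj)
    colors = list(colors) if colors else [0] * n
    while True:
        sig = [(colors[v], tuple(sorted(colors[u] for u in adj[v])))
               for v in range(n)]
        ids = {s: i for i, s in enumerate(sorted(set(sig)))}  # dense relabel
        new = [ids[s] for s in sig]
        if new == colors:
            return colors
        colors = new

# Path on 4 vertices: refinement separates endpoints from inner vertices.
path4 = [[1], [0, 2], [1, 3], [2]]
partition = color_refine(path4)  # endpoints get one color, inner another
```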

Predictive Chemistry
Three-stage canonical generation pipeline (geng, vcolg, multig) alongside a log-scale speed comparison showing Surge outperforming MOLGEN 5.0 by 42-161x across natural product molecular formulas

Surge: Fastest Open-Source Chemical Graph Generator

Surge is a constitutional isomer generator based on the canonical generation path method, using nauty for graph automorphism computation. Its three-stage pipeline (simple graph generation, vertex coloring for atom assignment, edge multiplicity for bond orders) generates 7-22 million molecules per second, outperforming MOLGEN 5.0 by 42-161x on natural product molecular formulas.

Molecular Simulation
Diagram showing the Ewald decomposition of long-range interactions into short-range and Fourier-space components for molecular graph neural networks

Ewald Message Passing for Molecular Graphs

Proposes Ewald message passing, a Fourier-space scheme inspired by Ewald summation that captures long-range interactions in molecular graphs. The method is architecture-agnostic and improves energy MAEs by 10% on OC20 and 16% on OE62 across four baseline GNN models.
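The underlying idea is the classical Ewald split of the 1/r kernel into a rapidly decaying real-space part and a smooth part that converges quickly in Fourier space; the smooth part is what the Fourier-space messages model. A one-line check of the decomposition:

```python
import math

def ewald_split(r, beta=1.0):
    """Ewald-style decomposition of the 1/r Coulomb kernel: a short-range
    term that decays fast (handled by local message passing) and a smooth
    long-range term (handled by Fourier-space terms)."""
    short = math.erfc(beta * r) / r
    long_ = math.erf(beta * r) / r
    return short, long_

s, l = ewald_split(2.5)
assert abs(s + l - 1 / 2.5) < 1e-12   # the split is exact: erfc + erf = 1
```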

Machine Learning
Diagram showing the Lagrangian Neural Network pipeline from coordinates through a learned Lagrangian to energy-conserving dynamics

Lagrangian Neural Networks for Physics

Lagrangian Neural Networks (LNNs) use neural networks to parameterize arbitrary Lagrangians, enabling energy-conserving learned dynamics without canonical coordinates. Unlike Hamiltonian approaches, LNNs handle relativistic systems and extend to graphs via Lagrangian Graph Networks.
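The dynamics come from solving the Euler-Lagrange equation for the acceleration. A sketch with a known harmonic-oscillator Lagrangian standing in for the learned network, and central finite differences standing in for the autodiff the paper uses:

```python
def lagrangian(q, qdot, m=1.0, k=4.0):
    """Stand-in for a learned Lagrangian: a 1-D harmonic oscillator."""
    return 0.5 * m * qdot**2 - 0.5 * k * q**2

def accel(L, q, qdot, h=1e-4):
    """Acceleration from the Euler-Lagrange equation,
    qdd = (d2L/dqdot2)^(-1) * (dL/dq - (d2L/dq dqdot) * qdot),
    with finite differences in place of automatic differentiation."""
    dL_dq = (L(q + h, qdot) - L(q - h, qdot)) / (2 * h)
    d2L_dqdot2 = (L(q, qdot + h) - 2 * L(q, qdot) + L(q, qdot - h)) / h**2
    d2L_dq_dqdot = (L(q + h, qdot + h) - L(q + h, qdot - h)
                    - L(q - h, qdot + h) + L(q - h, qdot - h)) / (4 * h**2)
    return (dL_dq - d2L_dq_dqdot * qdot) / d2L_dqdot2

# For this Lagrangian the exact answer is qdd = -(k/m) q = -1.2 at q = 0.3.
qdd = accel(lagrangian, q=0.3, qdot=1.2)
```

Because the acceleration is derived from L rather than predicted directly, the learned dynamics conserve the associated energy by construction.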

Computational Chemistry
DrugChat architecture showing GNN encoder, linear adaptor, and Vicuna LLM for conversational drug analysis

DrugChat: Conversational QA on Drug Molecule Graphs

DrugChat is a prototype system that bridges molecular graph neural networks with large language models for interactive, multi-turn question answering about drug compounds. It trains only a lightweight linear adaptor between a frozen GNN encoder and Vicuna-13B using 143K curated QA pairs from ChEMBL and PubChem.
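The trainable part is tiny: one linear map projecting the frozen GNN's graph embedding into the LLM's input-embedding space as a soft prompt token. A shape-level sketch (the 300/5120 dimensions are illustrative; 5120 matches the LLaMA/Vicuna-13B hidden size):

```python
import numpy as np

rng = np.random.default_rng(0)
GNN_DIM, LLM_DIM = 300, 5120

graph_embedding = rng.normal(size=(GNN_DIM,))    # frozen GNN output (fixed)

# The only trainable parameters in this scheme: one linear adaptor.
W = rng.normal(size=(LLM_DIM, GNN_DIM)) * 0.01
b = np.zeros(LLM_DIM)

soft_token = W @ graph_embedding + b  # prepended to the LLM's prompt tokens
```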

Molecular Representations
MolFM trimodal architecture fusing 2D graph, knowledge graph, and biomedical text via cross-modal attention

MolFM: Trimodal Molecular Foundation Pre-training

MolFM pre-trains a multimodal encoder that fuses 2D molecular graphs, biomedical text, and knowledge graph entities through fine-grained cross-modal attention, achieving strong gains on cross-modal retrieval, molecule captioning, text-based generation, and property prediction.

Molecular Representations
MoMu architecture showing contrastive alignment between molecular graph and scientific text modalities

MoMu: Bridging Molecular Graphs and Natural Language

MoMu pre-trains dual graph and text encoders on 15K molecule graph-text pairs using contrastive learning, enabling cross-modal retrieval, molecule captioning, zero-shot text-to-graph generation, and improved molecular property prediction.
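The contrastive objective is typically a symmetric InfoNCE loss over matched graph-text pairs in a batch; a numpy sketch (MoMu's exact loss may differ in details such as temperature and augmentations):

```python
import numpy as np

def info_nce(graph_emb, text_emb, tau=0.07):
    """Symmetric InfoNCE over paired embeddings: row i of each matrix is
    one graph-text pair; matched pairs are pulled together, all other
    in-batch pairs pushed apart."""
    g = graph_emb / np.linalg.norm(graph_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = g @ t.T / tau                    # cosine similarity / temperature
    labels = np.arange(len(g))
    def xent(lg):                             # cross-entropy, diagonal targets
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
g, t = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
loss = info_nce(g, t)   # scalar; minimized during pre-training
```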

Molecular Representations
Log-log plots showing power-law scaling of ChemGPT validation loss versus model size and GNN force field loss versus dataset size

Neural Scaling of Deep Chemical Models

Frey et al. discover empirical power-law scaling relations for both chemical language models (ChemGPT, up to 1B parameters) and equivariant GNN interatomic potentials, finding that neither domain has saturated with respect to model size, data, or compute.
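A power law loss = a * N^(-alpha) is a straight line in log-log space, so the scaling exponent falls out of a linear fit, as in this synthetic example (alpha = 0.3 chosen arbitrarily, not taken from the paper):

```python
import numpy as np

# Synthetic scaling data following loss = a * N^(-alpha) with alpha = 0.3.
N = np.array([1e6, 1e7, 1e8, 1e9])      # e.g. parameter counts
loss = 5.0 * N ** -0.3

# Fit in log-log space; the slope recovers -alpha.
slope, intercept = np.polyfit(np.log(N), np.log(loss), 1)
alpha = -slope
```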

Optical Chemical Structure Recognition
GraSP feed-forward architecture showing GNN, FiLM-conditioned CNN, and MLP classification head

GraSP: Graph Recognition via Subgraph Prediction (2026)

GraSP introduces a general framework for recognizing graphs in images, framing recognition as sequential subgraph prediction with a binary classifier. A GNN conditions a CNN via FiLM layers to predict whether a candidate graph is a subgraph of the target. Applied to OCSR on QM9, GraSP achieves 67.5% accuracy with no domain-specific modifications.
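The FiLM conditioning itself is just a per-channel scale and shift of the CNN features, with the scale/shift predicted from the GNN's summary of the candidate graph. A sketch with hypothetical dimensions and linear prediction heads:

```python
import numpy as np

def film(feature_map, gamma, beta):
    """FiLM (feature-wise linear modulation): scale and shift each CNN
    channel with parameters derived from the conditioning input."""
    # feature_map: (C, H, W); gamma, beta: (C,)
    return gamma[:, None, None] * feature_map + beta[:, None, None]

rng = np.random.default_rng(0)
fmap = rng.normal(size=(16, 8, 8))     # one CNN feature map (C=16)
gnn_summary = rng.normal(size=(32,))   # candidate-graph embedding
# Hypothetical heads mapping the GNN summary to per-channel gamma/beta.
Wg, Wb = rng.normal(size=(16, 32)), rng.normal(size=(16, 32))
out = film(fmap, Wg @ gnn_summary, Wb @ gnn_summary)  # same shape as fmap
```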

Machine Learning
Graph network block diagram showing input graph transformed through edge, node, and global update steps to produce an updated graph

Relational Inductive Biases in Deep Learning (2018)

Battaglia et al. argue that combinatorial generalization requires structured representations, systematically analyze the relational inductive biases in standard deep learning architectures (MLPs, CNNs, RNNs), and present the graph network as a unifying framework that generalizes and extends prior graph neural network approaches.
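The graph network block's three update steps can be sketched directly, with simple additive functions standing in for the learned update networks:

```python
import numpy as np

def gn_block(nodes, edges, senders, receivers, u):
    """One full graph network block: edge update, per-receiver edge
    aggregation, node update, then a global update. The phi functions
    here are trivial sums standing in for learned networks."""
    # 1. edge update: phi_e(e_k, v_sender, v_receiver, u)
    new_edges = edges + nodes[senders] + nodes[receivers] + u
    # 2. aggregate incoming edges per node, then phi_v(agg, v, u)
    agg = np.zeros_like(nodes)
    for e, r in enumerate(receivers):
        agg[r] += new_edges[e]
    new_nodes = nodes + agg + u
    # 3. global update phi_u from aggregated node and edge states
    new_u = u + new_nodes.mean(axis=0) + new_edges.mean(axis=0)
    return new_nodes, new_edges, new_u

# Tiny graph: 3 nodes, 2 directed edges (0->1, 1->2), all features 2-d.
nodes, edges, u = np.ones((3, 2)), np.zeros((2, 2)), np.zeros(2)
nv, ev, uv = gn_block(nodes, edges, np.array([0, 1]), np.array([1, 2]), u)
```

Special cases of this block recover message-passing GNNs (drop the global state) and deep sets (drop the edges), which is the sense in which it unifies prior approaches.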