Computational Chemistry
SELFIES strings guarantee 100% valid molecules, even when generated randomly

Converting SELFIES Strings to 2D Molecular Images

Visualize SELFIES molecular representations and test their 100% robustness through random sampling experiments.

Aspirin molecular structure generated from SMILES string

Converting SMILES Strings to 2D Molecular Images

Learn how to create 2D molecular images from SMILES strings using RDKit and PIL, with proper formatting and legends.

SELFIES representation of 2-Fluoroethenimine molecule

SELFIES (Self-Referencing Embedded Strings)

SELFIES is a 100% robust molecular string representation for ML, implemented in the open-source selfies Python library.

MARCEL dataset Kraken ligand example in 3D conformation

MARCEL: Molecular Representation & Conformers

The MARCEL dataset provides 722K+ conformers across 76K+ molecules for drug discovery, catalysis, and molecular …

Benzene molecule with SMILES notation

SMILES: Compact Notation for Chemical Structures

SMILES (Simplified Molecular Input Line Entry System) represents chemical structures using compact ASCII strings.

GEOM dataset example molecule: N-(4-pyrimidin-2-yloxyphenyl)acetamide

GEOM: Energy-Annotated Molecular Conformations

Dataset card for GEOM, providing energy-annotated molecular conformations generated via CREST/xTB and refined with DFT …

Potential energy surface showing molecular conformation space with equilibrium and low energy conformations

DenoiseVAE: Adaptive Noise for Molecular Pre-training

Liu et al.'s ICLR 2025 paper introducing DenoiseVAE, which learns adaptive, atom-specific noise for better molecular …

Adaptive grid merging visualization for benzene molecule showing multi-resolution spatial discretization

Beyond Atoms: 3D Space Modeling for Molecular Pretraining

Lu et al. introduce SpaceFormer, a Transformer that models entire 3D molecular space (not just atoms) for superior …

Comparison of 2D molecular graph versus 3D conformer ensemble showing latanoprost molecule in multiple conformations

GEOM Dataset: 3D Molecular Conformer Generation

Learn how GEOM transforms 2D molecular graphs into dynamic 3D conformer ensembles for molecular machine learning …

SELFIES robustness demonstration

Invalid SMILES Benefit Chemical Language Models: A Study

Skinnider (2024) shows that generating invalid SMILES actually improves chemical language model performance through …

3D ball-and-stick model of butane molecule representing the structural isomer generation process

Synthetic Isomer Data Generation Pipeline

An end-to-end cheminformatics pipeline transforming 1D chemical formulas into 3D conformer datasets using graph …

Comparison chart showing k-NN significantly outperforming logistic regression for molecular classification across different alkane sizes

Can You Hear the Shape of a Molecule? (Part Three)

Supervised learning reveals hidden eigenvalue patterns that clustering missed, testing k-NN and logistic regression on …