Algorithms

Unified framework converts handwritten chemical expressions to structured graph representations

Unified Framework for Handwritten Chemical Expressions

This paper proposes a unified statistical framework for recognizing both inorganic and organic handwritten chemical expressions. It introduces the Chemical Expression Structure Graph (CESG) and uses a weighted direction graph search for structural analysis, achieving 83.1% top-5 accuracy on a large proprietary dataset.

Computational Chemistry

Chemical Structure Reconstruction with chemoCR

This paper describes chemoCR, a system that converts bitmap chemical diagrams into connection tables using a pipeline of texture-based vectorization, OCR, and a rule-based expert system. It achieves 65.6% perfect recall on the TREC 2011 task.

Computational Chemistry

ChemInk: Real-Time Recognition for Chemical Drawings

This paper introduces ChemInk, a sketch recognition system for chemical diagrams that combines multi-level visual features via a joint Conditional Random Field (CRF). It achieves 97.4% accuracy, and users drew diagrams faster with it than with CAD tools.

Computational Chemistry

CLiDE Pro: Optical Chemical Structure Recognition Tool

This paper introduces CLiDE Pro, an advanced OCSR system that segments document images and reconstructs chemical connection tables. It features novel handling for crossing bonds and generic structures, and its performance is validated on a publicly released benchmark of 454 scanned images.

Computational Chemistry

Imago: Structure Recognition at TREC-CHEM 2011

Imago is an open-source, cross-platform C++ toolkit designed to recognize 2D chemical structure images from scientific papers and convert them into machine-readable molecule formats using a rule-based pipeline.

Computational Chemistry

Structural Analysis of Handwritten Chemical Formulas

This paper proposes a strategy for interpreting handwritten chemical formulas by converting bitmap images into a dynamic structural graph of quadrilaterals. It achieves ~97% recognition on graphical elements by using recursive ‘specialists’ to identify chemical bonds and rings.

Computational Social Science

NOMINATE spatial plot showing Senate vote on Balanced Budget Amendment (1995) with legislators positioned on liberal-conservative dimension

A Spatial Model for Legislative Roll Call Analysis

This paper introduces NOMINATE, a probabilistic spatial model that recovers metric coordinates for legislators and roll calls from nominal voting data, demonstrating that a single liberal-conservative dimension explains the vast majority of Congressional voting behavior.
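The idea of a probabilistic spatial voting model can be sketched in one dimension: a legislator's utility for each outcome falls off with distance from their ideal point, and the probability of a Yea vote follows from the utility difference. The Gaussian-shaped utility, the parameter names `beta` and `w`, and the logistic choice rule below are illustrative assumptions, not the paper's exact specification.

```python
import math

def utility(ideal: float, outcome: float, beta: float = 5.0, w: float = 1.0) -> float:
    """Hypothetical utility: peaks at the ideal point and decays with distance."""
    d = ideal - outcome
    return beta * math.exp(-0.5 * (w * d) ** 2)

def prob_yea(ideal: float, yea: float, nay: float,
             beta: float = 5.0, w: float = 1.0) -> float:
    """Probability of voting Yea under an assumed logistic choice rule."""
    diff = utility(ideal, yea, beta, w) - utility(ideal, nay, beta, w)
    return 1.0 / (1.0 + math.exp(-diff))

# A legislator at 0.5 facing a Yea outcome at 0.5 and a Nay outcome at -0.5
# is more likely than not to vote Yea; one at 0.0 is indifferent.
p_near = prob_yea(0.5, yea=0.5, nay=-0.5)
p_mid = prob_yea(0.0, yea=0.5, nay=-0.5)
```

In a sketch like this, estimating the model means searching for the ideal points and outcome locations that make the observed votes most probable.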

Computational Chemistry

Automatic chemical image recognition pipeline from raster image to structured file

Automatic Recognition of Chemical Images

This methodological paper presents a system for digitizing chemical images into SDF files. It utilizes a custom vectorization algorithm and chemical rule validation, achieving 94% accuracy on benchmark datasets compared to 50% for commercial tools.

Planetary Science

Orbital diagram showing chaotic planetary trajectories

Chaotic Evolution of the Solar System (1992)

Sussman and Wisdom’s 1992 study used the Supercomputer Toolkit and symplectic mapping to integrate the entire Solar System for 100 million years. The integration confirmed chaotic behavior with an exponential divergence timescale of ~4 million years, demonstrating that long-term planetary motion is fundamentally unpredictable.
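To give a sense of what a ~4 million year divergence timescale implies, the sketch below assumes idealized exponential growth of an initial separation, d(t) = d0 * exp(t / tau). The function name and the pure-exponential growth law are illustrative assumptions, not details taken from the study.

```python
import math

def divergence_factor(elapsed_myr: float, timescale_myr: float = 4.0) -> float:
    """Growth factor of an initial separation, assuming d(t) = d0 * exp(t / tau)."""
    return math.exp(elapsed_myr / timescale_myr)

# Over the 100 million year integration span, t / tau = 25.
factor = divergence_factor(100.0)
print(f"{factor:.2e}")  # prints 7.20e+10
```

Under this assumption, an initial uncertainty grows by roughly eleven orders of magnitude over 100 million years, which is why small errors in the initial conditions rule out long-term prediction.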

Computational Chemistry

Chemical Literature Data Extraction: The CLiDE Project

The CLiDE project presents a foundational architecture for Optical Chemical Structure Recognition (OCSR). It details a three-phase pipeline to convert bitmapped journal pages into chemically significant connection tables, handling complex features like stereochemistry.

Computational Chemistry

Visualization of Gabor wavelets and Kohonen networks for chemical image classification

Chemical Machine Vision

This 2003 paper introduces a machine vision approach for extracting chemical metadata from raster images. By using Gabor wavelets for feature extraction and Kohonen networks for classification, it distinguishes between chemical and non-chemical images, as well as ring and non-ring systems, without requiring high-resolution inputs.

Computational Chemistry

Graph of the Lennard-Jones 12-6 potential showing the characteristic attractive and repulsive forces

Dynamical Corrections to TST for Surface Diffusion

This paper bridges Molecular Dynamics and Transition State Theory by applying a dynamical corrections formalism to surface diffusion, identifying a low-temperature bounce-back mechanism causing non-Arrhenius behavior.
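The Lennard-Jones 12-6 potential pictured for this entry has the standard form V(r) = 4ε[(σ/r)^12 − (σ/r)^6]. The sketch below evaluates it in reduced units (ε = σ = 1, an assumption made for illustration); the function name is hypothetical and the parameters are not those used in the paper.

```python
def lennard_jones(r: float, epsilon: float = 1.0, sigma: float = 1.0) -> float:
    """12-6 potential: V(r) = 4 * eps * ((sigma / r)**12 - (sigma / r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

# The potential crosses zero at r = sigma and reaches its minimum,
# of depth -epsilon, at r = 2**(1/6) * sigma.
r_min = 2.0 ** (1.0 / 6.0)
v_min = lennard_jones(r_min)
```

Below r = σ the r^-12 term dominates and the interaction is repulsive; beyond the minimum the r^-6 term gives the attractive tail.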