Paper Summary

Citation: Henze, H. R., & Blair, C. M. (1931). The number of isomeric hydrocarbons of the methane series. Journal of the American Chemical Society, 53(8), 3077-3085. https://doi.org/10.1021/ja01359a034

Publication: Journal of the American Chemical Society (JACS) 1931

What kind of paper is this?

This is a foundational theoretical paper in mathematical chemistry and chemical graph theory. Rather than proposing an approximation or empirical formula, it derives exact recursive relations governing molecular topology. The paper also serves as a benchmark resource: its validated isomer counts corrected errors in the earlier literature and are still used as reference values for molecular enumeration.

What is the motivation?

The primary motivation was the lack of a rigorous mathematical relationship between carbon content ($N$) and isomer count.

  • Previous failures: Earlier attempts by Cayley and Schiff used “centric” symmetry tree methods that failed for $N > 13$.
  • The theoretical gap: Existing formulas were empirical or limited (e.g., adding correction terms for each unit increase in $N$), meaning they could not reliably predict counts for larger molecules like $C_{40}$.

This work aimed to develop a theoretically sound, generalizable method that could be extended to any number of carbons.

What is the novelty here?

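The core novelty is the recognition that the isomer count does not appear to follow any simple, direct function $f(N)$. Instead, the number of hydrocarbons with $N$ carbons is computed recursively from the numbers of alkyl radicals (equivalently, of isomeric alcohols) with roughly half as many carbons: $N/2$ or fewer for even $N$, and $(N+1)/2$ or fewer for odd $N$.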

The Symmetry Constraints: The paper rigorously divides the problem space to prevent double-counting:

  • Group A (Centrosymmetric): Hydrocarbons that can be bisected into two smaller alkyl radicals.
    • Even $N$: Split into two radicals of size $N/2$.
    • Odd $N$: Split into sizes $(N+1)/2$ and $(N-1)/2$.
  • Group B (Asymmetric): Hydrocarbons that cannot be bisected.
    • Defined by a central carbon with 3 or 4 branches, where no branch holds more than $N/2 - 1$ carbons for even $N$, or $(N-3)/2$ carbons for odd $N$.

Figure: The five isomers of hexane ($C_6$) classified by Henze and Blair’s symmetry scheme. Group A molecules (top row) can be bisected along a bond (highlighted in red) into two $C_3$ alkyl radicals. Group B molecules (bottom row) have a central carbon atom (red circle) with 3-4 branches, preventing symmetric bisection.

This classification is the key insight that enables the recursive formulas. By exhaustively partitioning hydrocarbons into these mutually exclusive groups, the authors could derive separate combinatorial expressions for each and sum them without double-counting.
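
As a worked check of this partition for hexane, assume $T_3 = 2$ (n-propyl and isopropyl) and that Group A for even $N$ counts unordered pairs of radicals with $N/2$ carbons. The symbols $A_6$ and $B_6$ are shorthand introduced here for the two group sizes, not notation taken from the paper:

$$
A_6 = \frac{T_3\,(T_3 + 1)}{2} = \frac{2 \cdot 3}{2} = 3, \qquad B_6 = 2, \qquad A_6 + B_6 = 5 .
$$

The three Group A structures are n-hexane, 2-methylpentane, and 2,3-dimethylbutane; the two Group B structures are 3-methylpentane and 2,2-dimethylbutane.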

For each structural class, combinatorial formulas are derived that depend on the number of isomeric alcohols ($T_k$) where $k < N$. This transforms the problem of counting large molecular graphs into a recurrence relation based on the counts of smaller, simpler sub-graphs.
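
The recurrence can be sketched in Python. This is a minimal illustration under stated assumptions, not the paper's own set of formulas: it uses the closely related centroid form of the decomposition, in which every alkane either has a central carbon whose four branches (hydrogen counted as an empty branch) each hold fewer than $N/2$ carbons, or, for even $N$, consists of two radicals of $N/2$ carbons joined by a bond. For odd $N$ this folds the Group A cases into the central-carbon case. The function names are hypothetical.

```python
from math import comb


def count_multisets(T, slots, total, cap):
    """Count multisets of `slots` radicals holding `total` carbons overall,
    each radical with at most `cap` carbons. T[s] is the number of distinct
    radicals with s carbons (T[0] = 1 stands for a hydrogen atom)."""
    # ways[k][m]: multisets of k radicals totalling m carbons,
    # built only from the radical sizes processed so far.
    ways = [[0] * (total + 1) for _ in range(slots + 1)]
    ways[0][0] = 1
    for size in range(min(cap, total) + 1):
        updated = [row[:] for row in ways]
        for k in range(slots):
            for m in range(total + 1):
                if ways[k][m] == 0:
                    continue
                for j in range(1, slots - k + 1):
                    if m + j * size > total:
                        break
                    # pick j radicals of this size, repetition allowed
                    updated[k + j][m + j * size] += (
                        ways[k][m] * comb(T[size] + j - 1, j)
                    )
        ways = updated
    return ways[slots][total]


def alkyl_counts(max_n):
    """T[n] for n = 0..max_n: alkyl radicals (alcohols) with n carbons."""
    T = [1] + [0] * max_n
    for n in range(1, max_n + 1):
        # the attachment carbon carries three sub-radicals, n - 1 carbons in all
        T[n] = count_multisets(T, slots=3, total=n - 1, cap=n - 1)
    return T


def alkane_count(n, T):
    """Structural isomers of the alkane with n carbons (assumed centroid split)."""
    # central carbon with four branches, each holding fewer than n/2 carbons
    central = count_multisets(T, slots=4, total=n - 1, cap=(n - 1) // 2)
    # even n only: two radicals of n/2 carbons joined by a bond
    half = n // 2
    bisected = T[half] * (T[half] + 1) // 2 if n % 2 == 0 else 0
    return central + bisected


T = alkyl_counts(20)  # radicals of up to 20 carbons suffice for C40
for n in (6, 10, 14, 40):
    print(n, alkane_count(n, T))
```

Only radical counts up to $N/2$ are needed to reach $C_N$, which is what makes the calculation tractable by hand for $N = 40$. The sketch should reproduce the values quoted in this summary: 5 for $C_6$, 75 for $C_{10}$, 1,858 for $C_{14}$, and a value above $6.2 \times 10^{13}$ for $C_{40}$.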

What experiments were performed?

The work is theoretical, so the “experiments” were computational and enumerative:

  1. Derivation of the recursion formulas: The main effort was the mathematical derivation of the set of equations for each structural class of hydrocarbon.
  2. Calculation: They applied their formulas to calculate the number of isomers for alkanes up to $N=40$, reaching over $6.2 \times 10^{13}$ isomers. This was far beyond what was previously possible.
  3. Validation by exhaustive enumeration: To prove the correctness of their theory, the authors manually drew and counted all possible structural formulas for the undecanes ($C_{11}$), dodecanes ($C_{12}$), tridecanes ($C_{13}$), and tetradecanes ($C_{14}$). This brute-force check confirmed their calculated numbers and corrected long-standing errors in the literature.
    • Key correction: The manual enumeration proved that the count for tetradecane ($C_{14}$) was 1,858, not 1,855 as previously cited by Losanitsch.

What were the outcomes and conclusions drawn?

  • Theoretical outcome: The paper concludes that no simple, direct formula appears to relate the number of isomers to $N$. The count has to be built up recursively from the counts of smaller alkyl radicals.
  • Benchmark resource: The authors published a table of validated isomer counts up to $C_{40}$ (Table II), establishing the definitive ground truth for molecular isomers and correcting historical errors.

Figure: Log-scale plot of alkane isomer counts from $C_1$ to $C_{40}$. The number of structural isomers grows roughly exponentially with carbon content, reaching over 62 trillion for $C_{40}$. This plot, derived from Henze and Blair’s Table II, illustrates the combinatorial explosion that makes direct enumeration intractable for larger molecules.

The plot above illustrates the staggering growth rate. While methane ($C_1$) through propane ($C_3$) each have exactly one isomer, the count accelerates rapidly: 75 isomers at $C_{10}$, nearly 37 million at $C_{25}$, and over 4 billion at $C_{30}$. By $C_{40}$, the count exceeds $6.2 \times 10^{13}$. This roughly exponential growth shows why enumeration by hand quickly becomes impractical and why the recursive approach was essential.

  • Foundational impact: This work established the mathematical framework that would later evolve into modern chemical graph theory and computational chemistry approaches for molecular enumeration. In the context of AI for molecular generation, this is an early form of expressivity analysis, defining the size of the chemical space that generative models must learn to cover.