Overview
This taxonomy provides a systematic framework for understanding and classifying research papers at the intersection of AI and the physical sciences. It uses a superposition model where each paper is viewed as a linear combination of six fundamental contribution types (basis vectors).
The framework helps answer: “What is this paper’s primary contribution?” by identifying rhetorical patterns and structural elements that signal different research paradigms.
Core Principle: The Superposition Model
Every paper in this domain can be viewed as a superposition of fundamental contribution vectors.
Concept: Most papers have a mixed profile across the six basis vectors, blending multiple contribution types (e.g., Method + Theory). One vector usually carries the primary narrative (the “Why”), while secondary vectors supply the supporting evidence (the “How”).
Goal: Classify a paper by identifying its Primary Projection (the dominant contribution) and its Secondary Projections (supporting work).
The Six Independent Basis Vectors ($\Psi$)
| Basis Vector | Alias/Focus | Core Question | Primary Output |
|---|---|---|---|
| 1. $\Psi_{\text{Method}}$ | The Methodological Basis (Architecture/Algorithm) | How well does this work? | New algorithm, architecture, or approximation |
| 2. $\Psi_{\text{Theory}}$ | The Theoretical Basis (Formal Analysis) | Why does this work? | Formal proof, generalization bound, or physical derivation |
| 3. $\Psi_{\text{Resource}}$ | The Infrastructure Basis (Data/Software) | What resources are available? | Dataset, benchmark, or open-source software ecosystem |
| 4. $\Psi_{\text{Systematization}}$ | The Review Basis (Synthesis) | What do we know? | Comprehensive survey or new organizing taxonomy (SoK) |
| 5. $\Psi_{\text{Position}}$ | The Sociological Basis (Perspective) | Where should the field go? | Opinion piece, perspective, or critique of community practice |
| 6. $\Psi_{\text{Discovery}}$ | The Translational Basis (Application) | What new thing did we find? | Experimentally validated material, molecule, or physical law |
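With the six basis vectors in the table above, the superposition model can be written down as a simple weighted sum. This is an illustrative formalization, not something the framework itself specifies: the paper vector $P$, the weights $c_k$, and the choice to make them non-negative and sum to one are all assumptions made for the sake of the example.

$$
P = \sum_{k=1}^{6} c_k \, \Psi_k, \qquad c_k \ge 0, \qquad \sum_{k=1}^{6} c_k = 1
$$

$$
k^{*} = \arg\max_{k} \, c_k
$$

Under these assumptions, the Primary Projection is $\Psi_{k^{*}}$ and the Secondary Projections are the remaining vectors with non-negligible $c_k$. For example, a hypothetical paper with $P = 0.6\,\Psi_{\text{Method}} + 0.3\,\Psi_{\text{Resource}} + 0.1\,\Psi_{\text{Theory}}$ would be read as Primary: Method, Secondary: Resource.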
Assessment Guide: Rhetorical Indicators
To identify the primary basis vector, look for these specific rhetorical elements, structural features, and claims in the paper:
1. $\Psi_{\text{Method}}$: The Methodological Paper
Focuses on proposing a novel mechanism, architecture, or approximation (e.g., a new Transformer variant, a GNN with symmetry, a new DFT functional).
Rhetorical Indicators:
- Ablation Study: Authors systematically remove components of their system to prove their specific innovation drives the performance gain
- Baseline Comparison: A prominent table comparing the new method against the State-of-the-Art (SOTA)
- Pseudo-code: An explicit block detailing the algorithmic steps (e.g., for training, sampling, or inference)
2. $\Psi_{\text{Theory}}$: The Theoretical Paper
Focuses on mathematical guarantees, proofs, or derivations from first principles.
Rhetorical Indicators:
- Mathematical Proof Sections: Sections titled “Theorem 1,” “Proof of Equivariance,” or “Formal Bounds”
- Analysis of Limits/Capacity: Investigates the expressivity (e.g., comparing a GNN to the Weisfeiler-Lehman Test) or analyzes the geometry of the optimization landscape
- Generalization/OOD: Derives generalization bounds on test error or formally defines “chemical space coverage” for out-of-distribution (OOD) behavior
- Exact Constraints: Derives exact conditions that true physical functions (like the universal Density Functional) must satisfy
3. $\Psi_{\text{Resource}}$: The Infrastructure Paper
Focuses on creating and sharing foundational tools for the community.
Rhetorical Indicators:
- Curation Description: Detailed steps on how data was generated, filtered, or curated (e.g., describing millions of CPU-hours of DFT calculations for a dataset like QM9)
- “Datasheets” and “Data Cards”: Inclusion of formal documentation detailing provenance, copyright, and potential biases in the data
- Benchmark Definition: Argues that “Metric X on Dataset Y” is the correct proxy for progress in a specific scientific task
4. $\Psi_{\text{Systematization}}$: The Review Paper
Focuses on organizing and synthesizing existing literature.
Rhetorical Indicators:
- Survey Structure: Follows a linear, often chronological, progression or is grouped by architecture (e.g., VAEs, GANs, Diffusion)
- Systematization of Knowledge (SoK): A higher-order contribution that proposes a new taxonomy or a unified framework to connect disparate concepts
5. $\Psi_{\text{Position}}$: The Sociological Paper
Focuses on meta-science: arguing for a change in community norms or critiquing systemic issues.
Rhetorical Indicators:
- Venue/Track: Often appears in a dedicated “Position” track, or is labeled a “Blue Sky” or “Forward Looking” paper
- Argumentative Tone: Uses qualitative or quantitative analysis (e.g., a meta-analysis) to argue for a shift in how research is conducted or funded (e.g., a paper arguing that AI narrows the focus of science)
- Goal: To highlight a systemic issue and make the case for changing it
6. $\Psi_{\text{Discovery}}$: The Translational Paper
Focuses on the discovery of novel scientific artifacts using AI/ML tools.
Rhetorical Indicators:
- Structure: Follows a two-stage workflow: (1) Computational Screening (AI selects candidates), then (2) Experimental Validation (wet-lab synthesis, physical characterization)
- Core Claim: The primary contribution is a new material, molecule, or measurement, with the AI/ML part serving as the necessary first step
- Key Question: Does the AI’s prediction hold true in reality (e.g., in a physical experiment)?
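The indicators listed for the six basis vectors can be turned into a rough profile by tallying which ones appear in a paper. The following is a minimal sketch under stated assumptions, not part of the framework itself: the shorthand indicator labels, the `INDICATORS` mapping, the `profile_from_indicators` helper, and the choice to weight every indicator equally are all hypothetical.

```python
from collections import Counter

# Hypothetical shorthand labels for indicators from the guide above,
# each mapped to the basis vector it signals.
INDICATORS = {
    "ablation_study": "Method",
    "baseline_comparison": "Method",
    "pseudo_code": "Method",
    "proof_section": "Theory",
    "generalization_bound": "Theory",
    "exact_constraints": "Theory",
    "curation_description": "Resource",
    "datasheet": "Resource",
    "benchmark_definition": "Resource",
    "survey_structure": "Systematization",
    "new_taxonomy": "Systematization",
    "position_track": "Position",
    "argumentative_tone": "Position",
    "experimental_validation": "Discovery",
    "new_artifact_claim": "Discovery",
}


def profile_from_indicators(observed):
    """Return normalized weights per basis vector from observed indicators.

    Assumes every indicator counts equally, which is a simplification.
    Labels that are not in INDICATORS are ignored.
    """
    counts = Counter(INDICATORS[name] for name in observed if name in INDICATORS)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {vector: n / total for vector, n in counts.items()}


# A paper with an ablation study, a SOTA table, pseudo-code, and a datasheet.
observed = ["ablation_study", "baseline_comparison", "pseudo_code", "datasheet"]
profile = profile_from_indicators(observed)
print(profile)
```

For this input the profile is 0.75 Method and 0.25 Resource. In practice a reader would likely weight some indicators more heavily than others (a full wet-lab validation section says more than a single pseudo-code block), so equal counting should be treated only as a starting point.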
Usage Notes
When assessing a paper:
- Look for the rhetorical indicators to identify the Primary basis vector (the main claim/narrative).
- Identify any Secondary basis vectors (heavy supporting work).
- Use this “fingerprint” (e.g., Primary: Method, Secondary: Resource) to accurately map the paper’s contribution to the broader field.
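The steps above can be sketched in a few lines of code. This is only an illustration: the `fingerprint` helper, the numeric weights, and the `secondary_threshold` cutoff are hypothetical, since the framework does not prescribe how to quantify a paper's profile.

```python
BASIS = ("Method", "Theory", "Resource", "Systematization", "Position", "Discovery")


def fingerprint(weights, secondary_threshold=0.2):
    """Split a weight profile into a primary vector and secondary vectors.

    `weights` maps basis-vector names to non-negative numbers.
    `secondary_threshold` is a hypothetical cutoff, expressed as a
    fraction of the total weight, for counting a vector as secondary.
    """
    unknown = set(weights) - set(BASIS)
    if unknown:
        raise ValueError(f"unknown basis vectors: {sorted(unknown)}")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("profile has no positive weight")

    # Rank vectors from largest to smallest weight; ties keep input order.
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    primary = ranked[0][0]
    secondary = [name for name, w in ranked[1:] if w / total >= secondary_threshold]
    return {"primary": primary, "secondary": secondary}


# Hypothetical profile for a paper that proposes a model and releases a dataset.
paper = {"Method": 0.6, "Resource": 0.3, "Theory": 0.1}
print(fingerprint(paper))
```

Here the result is Primary: Method with Resource as the only Secondary vector, because the Theory weight falls below the assumed threshold. Changing `secondary_threshold` changes how much supporting work counts as "heavy", which is a judgment call the framework leaves to the reader.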
This framework is particularly useful for:
- Organizing literature reviews
- Understanding conference/journal acceptance criteria
- Identifying gaps in research portfolios
- Recognizing different types of scientific contributions
