AI & Physical Sciences Taxonomy: A Six-Vector Framework

Overview

This taxonomy provides a systematic framework for understanding and classifying research papers at the intersection of AI and the physical sciences. It uses a superposition model where each paper is viewed as a linear combination of six fundamental contribution types (basis vectors).

The framework helps answer: “What is this paper’s primary contribution?” by identifying rhetorical patterns and structural elements that signal different research paradigms.

Core Principle: The Superposition Model

All papers in this domain can be viewed as a superposition of fundamental contribution vectors.

Concept: Most papers exhibit a complex profile across six basis vectors, blending multiple contribution types (e.g., Method + Theory). While one vector usually provides the primary narrative thrust (the “Why”), secondary vectors provide the necessary supporting evidence (the “How”).

Goal: Classify a paper by identifying its Primary Projection (the dominant contribution) and its Secondary Projections (supporting work).
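One way to write the superposition model down is sketched below. The coefficients $c_i$, their normalization, and the cutoff $\tau$ are assumptions introduced here for illustration; the taxonomy itself does not prescribe numerical weights.

$$
P = \sum_{i=1}^{6} c_i \, \Psi_i, \qquad c_i \ge 0, \qquad \sum_{i=1}^{6} c_i = 1
$$

$$
\text{Primary Projection} = \arg\max_i \, c_i, \qquad \text{Secondary Projections} = \{\, j \ne \arg\max_i c_i \;:\; c_j \ge \tau \,\}
$$

For instance, a hypothetical paper with $c_{\text{Method}} = 0.6$, $c_{\text{Resource}} = 0.3$, $c_{\text{Theory}} = 0.1$ and $\tau = 0.25$ would have Primary: Method and Secondary: Resource.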

The Six Independent Basis Vectors ($\Psi$)

Basis Vector	Alias/Focus	Core Question	Primary Output
1. $\Psi_{\text{Method}}$	The Methodological Basis (Architecture/Algorithm)	How well does this work?	New algorithm, architecture, or approximation
2. $\Psi_{\text{Theory}}$	The Theoretical Basis (Formal Analysis)	Why does this work?	Formal proof, generalization bound, or physical derivation
3. $\Psi_{\text{Resource}}$	The Infrastructure Basis (Data/Software)	What resources are available?	Dataset, benchmark, or open-source software ecosystem
4. $\Psi_{\text{Systematization}}$	The Review Basis (Synthesis)	What do we know?	Comprehensive survey or new organizing taxonomy (SoK)
5. $\Psi_{\text{Position}}$	The Sociological Basis (Perspective)	Where should the field go?	Opinion piece, perspective, or critique of community practice
6. $\Psi_{\text{Discovery}}$	The Translational Basis (Application)	What new thing did we find?	Experimentally validated material, molecule, or physical law

Assessment Guide: Rhetorical Indicators

To identify the primary basis vector, look for these specific rhetorical elements, structural features, and claims in the paper:

1. $\Psi_{\text{Method}}$: The Methodological Paper

Focuses on proposing a novel mechanism, architecture, or approximation (e.g., a new Transformer variant, a GNN with symmetry, a new DFT functional).

Rhetorical Indicators:

Ablation Study: Authors systematically remove components of their system to show that their specific innovation drives the performance gain
Baseline Comparison: A prominent table comparing the new method against the State-of-the-Art (SOTA)
Pseudo-code: An explicit block detailing the algorithmic steps (e.g., for training, sampling, or inference)

2. $\Psi_{\text{Theory}}$: The Theoretical Paper

Focuses on mathematical guarantees, proofs, or derivations from first principles.

Rhetorical Indicators:

Mathematical Proof Sections: Sections titled “Theorem 1,” “Proof of Equivariance,” or “Formal Bounds”
Analysis of Limits/Capacity: Investigates expressivity (e.g., comparing a GNN to the Weisfeiler-Lehman test) or analyzes the geometry of the optimization landscape
Generalization/OOD: Derives generalization bounds on test error or formally defines “chemical space coverage” for out-of-distribution (OOD) behavior
Exact Constraints: Derives exact conditions that true physical functions (like the universal density functional) must satisfy

3. $\Psi_{\text{Resource}}$: The Infrastructure Paper

Focuses on creating and sharing foundational tools for the community.

Rhetorical Indicators:

Curation Description: Detailed steps on how data was generated, filtered, or curated (e.g., describing millions of CPU-hours of DFT calculations for a dataset like QM9)
“Datasheets” and “Data Cards”: Inclusion of formal documentation detailing provenance, copyright, and potential biases in the data
Benchmark Definition: Argues that “Metric X on Dataset Y” is the correct proxy for progress in a specific scientific task

4. $\Psi_{\text{Systematization}}$: The Review Paper

Focuses on organizing and synthesizing existing literature.

Rhetorical Indicators:

Survey Structure: Proceeds linearly, often chronologically, or groups prior work by architecture (e.g., VAEs, GANs, diffusion models)
Systematization of Knowledge (SoK): A higher-order contribution that proposes a new taxonomy or a unified framework to connect disparate concepts

5. $\Psi_{\text{Position}}$: The Sociological Paper

Focuses on meta-science, arguing for a change in community norms, or critiquing systemic issues.

Rhetorical Indicators:

Venue/Track: Often appears in a “Position Track” or is labeled a “Blue Sky” or “Forward Looking” paper
Argumentative Tone: Uses qualitative or quantitative analysis (meta-analysis) to argue for a shift in how research is conducted or funded (e.g., a paper arguing that AI contracts the focus of science)
Goal: To highlight a systemic issue and argue for a change in community norms

6. $\Psi_{\text{Discovery}}$: The Translational Paper

Focuses on the discovery of novel scientific artifacts using AI/ML tools.

Rhetorical Indicators:

Structure: Follows a workflow: (1) Computational Screening (AI selects candidates), (2) Experimental Validation (wet-lab synthesis, physical characterization)
Core Claim: The primary contribution is a new material, molecule, or measurement, with the AI/ML part serving as the necessary first step
Key Question: Does the AI’s prediction hold true in reality (e.g., in a physical experiment)?
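The indicator lists above can be treated as a checklist. The following is a minimal sketch, assuming each observed indicator counts equally toward its basis vector; the `INDICATORS` table and the `tally` function are hypothetical names introduced here, and equal weighting is an illustrative choice, not part of the taxonomy.

```python
from collections import Counter

# Hypothetical lookup from indicator names (taken from the lists above)
# to the basis vector they signal.
INDICATORS = {
    "ablation study": "Method",
    "baseline comparison": "Method",
    "pseudo-code": "Method",
    "mathematical proof sections": "Theory",
    "analysis of limits/capacity": "Theory",
    "generalization/ood": "Theory",
    "exact constraints": "Theory",
    "curation description": "Resource",
    "datasheets and data cards": "Resource",
    "benchmark definition": "Resource",
    "survey structure": "Systematization",
    "systematization of knowledge": "Systematization",
    "venue/track": "Position",
    "argumentative tone": "Position",
    "computational screening": "Discovery",
    "experimental validation": "Discovery",
}


def tally(observed):
    """Count how many observed indicators point at each basis vector."""
    counts = Counter()
    for name in observed:
        key = name.strip().lower()
        if key not in INDICATORS:
            raise KeyError(f"unknown indicator: {name!r}")
        counts[INDICATORS[key]] += 1
    return counts


# A paper with an ablation study, a SOTA table, and a data-curation section.
counts = tally(["Ablation Study", "Baseline Comparison", "Curation Description"])
```

Because the vectors list different numbers of indicators, raw counts are only a rough starting point; a reader's judgment about the paper's main claim should still decide the primary vector.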

Usage Notes

When assessing a paper:

Look for the rhetorical indicators to identify the Primary basis vector (the main claim/narrative).
Identify any Secondary basis vectors (heavy supporting work).
Use this “fingerprint” (e.g., Primary: Method, Secondary: Resource) to accurately map the paper’s contribution to the broader field.
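The three steps above can be sketched in code. This is a minimal illustration, assuming a reader has already assigned a rough non-negative weight to each basis vector; the `Fingerprint` class, the `fingerprint` function, and the 0.25 threshold for "heavy supporting work" are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

VECTORS = ("Method", "Theory", "Resource", "Systematization", "Position", "Discovery")


@dataclass(frozen=True)
class Fingerprint:
    primary: str
    secondary: Tuple[str, ...]


def fingerprint(weights: Dict[str, float], threshold: float = 0.25) -> Fingerprint:
    """Normalize weights, pick the largest as primary, and keep the rest above threshold."""
    unknown = set(weights) - set(VECTORS)
    if unknown:
        raise ValueError(f"unknown basis vectors: {sorted(unknown)}")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # Missing vectors get weight 0; shares sum to 1 after normalization.
    share = {v: weights.get(v, 0.0) / total for v in VECTORS}
    primary = max(VECTORS, key=lambda v: share[v])
    secondary = tuple(v for v in VECTORS if v != primary and share[v] >= threshold)
    return Fingerprint(primary, secondary)


# Hypothetical weights for a paper that is mostly a new method with a released dataset.
example = fingerprint({"Method": 6, "Resource": 3, "Theory": 1})
```

Here `example.primary` is `"Method"` and `example.secondary` is `("Resource",)`, matching the fingerprint format used above. Ties for the primary vector are broken by the order of `VECTORS`, which is an arbitrary convention of this sketch.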

This framework is particularly useful for:

Organizing literature reviews
Understanding conference/journal acceptance criteria
Identifying gaps in research portfolios
Recognizing different types of scientific contributions
