Computational Chemistry
Comparison chart showing k-NN significantly outperforming logistic regression for molecular classification across different alkane sizes

Can You Hear the Shape of a Molecule? (Part Three)

Supervised learning reveals hidden eigenvalue patterns that clustering missed, testing k-NN and logistic regression on …

Computational Chemistry
Charts showing Dunn Index, distance metrics, and computation time analysis revealing clustering performance degradation with molecular size

Can You Hear the Shape of a Molecule? (Part Two)

Clustering analysis reveals why Coulomb matrix eigenvalues struggle with larger alkanes, using Dunn Index and silhouette …

Natural Language Processing
Word vector illustration showing text classification and NLP concepts

Sarcasm Detection with Transformers: A Cautionary Tale

Learn how dataset bias can lead to misleading results in NLP: a sarcasm detection model that actually learned to …

Computational Chemistry
3D ball-and-stick model of butane molecule showing linear carbon chain structure

Can You Hear the Shape of a Molecule?

Explore molecular shape recognition using Coulomb matrix eigenvalues. Analysis of alkane isomers from data generation to …

Computational Social Science
Top features for Armed Forces and National Security policy classification showing veterans, defense, and military keywords

Classifying Congressional Bills with Machine Learning

Testing ML classification of congressional bills by policy area. Comparing Naive Bayes, Logistic Regression, and XGBoost …

Computational Chemistry
Coulomb matrix heatmap visualization showing molecular structure encoding on logarithmic scale

Coulomb Matrices for Molecular Machine Learning

Learn how Coulomb matrices encode 3D molecular structure for machine learning, from basic theory to Python implementation …

Computational Social Science
Top features for Social Welfare policy classification showing social, poverty, and benefits keywords

Congressional Knowledge Graph & Policy Classification

A 47,000+ bill knowledge graph from Congress.gov with sponsor networks and 87% policy classification accuracy.

Computational Chemistry
SELFIES robustness demonstration

SELFIES and the Future of Molecular String Representations

Perspective on SELFIES as a 100% robust SMILES alternative, with 16 future research directions for molecular AI.

Natural Language Processing
Information Quality Ratio plot showing statistical dependencies decay as window size increases

Analytical Solution to Word2Vec Softmax & Bias Probing

Analytical derivation of Word2Vec's softmax objective factorization and a new framework for detecting semantic bias in …

Natural Language Processing
Heatmap visualization of the EigenNoise analytical co-occurrence prior matrix showing word rank relationships

EigenNoise: Data-Free Word Vector Initialization

Investigation into EigenNoise, a data-free initialization scheme for word vectors that approaches pre-trained model …

Computational Social Science
Diagram of the Universal Message schema showing fields like ID, Text, Author, and Reply Sets that normalize data across platforms

PyConversations: Social Media Conversational Analysis

Undergraduate thesis exploring representation learning for social media text and developing tools for cross-platform …

AI Safety
A nonsensical trigger sequence 'WTC theoriesclimate Flat Hubbard Principle' is fed into GPT-2, which then generates Flat Earth conspiracy text

GPT-2 Susceptibility to Universal Adversarial Triggers

Investigation into whether universal adversarial triggers can control both topic and stance of GPT-2's generated text …