
InstructMol: Multi-Modal Molecular Assistant
A multi-modal LLM aligning 2D molecular graphs with text via two-stage instruction tuning for drug discovery tasks.

A method for improving legislative vote prediction across sessions by augmenting bill text embeddings with sponsor …

Production-grade Word2Vec in PyTorch with vectorized Hierarchical Softmax, Negative Sampling, and torch.compile support.

Analytical derivation of Word2Vec's softmax objective factorization and a new framework for detecting semantic bias in …

Investigation into EigenNoise, a data-free initialization scheme for word vectors that approaches pre-trained model …

We introduce an unsupervised algorithm for inducing semantic networks from noisy, crowd-sourced data, producing a …

Learn count vectorization in Python: convert text to numerical vectors using scikit-learn's CountVectorizer with …

Learn about word embeddings in NLP: from basic one-hot encoding to contextual models like ELMo. Guide with examples.