MolParser Datasets
MolParser-7M and WildMol Datasets for Robust Chemical Structure Recognition
| Dataset Details | |
| --- | --- |
| Authors | Xi Fang, Jiankun Wang, Xiaochen Cai, Shangqian Chen, Shuwen Yang, Haoyi Tao, Nan Wang, Lin Yao, Linfeng Zhang, Guolin Ke |
| Paper Title | MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild |
| Institution | DP Technology |
| Published In | arXiv |
| Category | Document Processing |
| Format | Molecule images (PNG); Extended SMILES (E-SMILES) strings |
| Size | Test molecules: 20,000; training pairs: 7,740,871 |
| Date | October 2025 |
| Year | 2025 |
| Links | 📊 Dataset • 📄 Paper |
