
Image2SMILES: Transformer OCSR with Synthetic Data Pipeline
A Transformer-based system for optical chemical structure recognition introducing a comprehensive data generation pipeline (FG-SMILES, Markush structures, visual contamination) achieving 79% accuracy on real-world images, outperforming rule-based systems like OSRA.






