April/May 2025 MMLI Spotlights

Scott Denmark

Ayush Shah

Chiyoung Ahn


Faculty Spotlight: Scott Denmark

Scott Denmark is the Reynold C. Fuson Professor of Chemistry at the University of Illinois at Urbana-Champaign. He leads the AI-enabled Catalyst Discovery thrust and collaborates with several groups in MMLI to accelerate small-molecule discovery and development.

What is your background/what did you do before your current role?
I have been a Chemistry faculty member since 1980. You can read more about Scott’s background on his website!

What is your current position, and what are you working on right now with MMLI?
I am the lead for Thrust 2, Catalysis, in the MMLI. Currently, my group is developing and applying chemoinformatics and machine learning workflows to accelerate the discovery and optimization of small-molecule catalysts. Over the past 4+ years, we have reported successful optimization of bisoxazoline•copper complexes for enantioselective aldol addition reactions, have developed a substrate-adaptive predictor for palladium-catalyzed C–N coupling reactions, have created “molli,” a Python package that automates our workflow, and have recently developed a predictive model for osmium-catalyzed asymmetric dihydroxylation of alkenes.

In my role as a PI in Thrust 3, Manufacturing, we have recently reported the discovery and optimization of a highly catalytic addition of allyl groups to native carbohydrates in aqueous solution. In collaboration with Klavs Jensen’s group at MIT, we developed and patented a continuous flow system for the synthesis of 1-allylsorbitol, which is a key building block for a high-volume polypropylene clarifying agent.

Finally, in my role as a PI in Thrust 1, AI, we collaborated with Prof. Richard Zanibbi at RIT to create “ChemScraper,” which extracts chemical structures and text from dead PDF documents and converts them into a machine-readable format. This tool is critical for mining the chemical literature for data and curating that data for modeling efforts!

What drew you to MMLI (or your lab)? 
My group has been active in the merger of artificial intelligence and chemical synthesis since 2008, long before it was even considered possible. We published our first efforts in 2011. When the NSF first announced the creation of AI/physical science institutes, it was a natural fit and also a validation of our vision.

What has been your favorite part of being a part of MMLI?
First and foremost, it has been the opportunity to have my graduate students and postdocs interact with computer scientists to expand their skill sets, and also for me to learn about the amazing advances in AI in real time within the institute. One of the most exciting aspects will be the collaboration with Computer Science faculty in developing generative AI for the de novo identification of high-performance catalysts, something I could only dream about when we started the MMLI.

How do you like to spend your free time? 
I love to read (non-fiction), cook, work out, and ride each of my five high-performance motorcycles (three here and two in San Diego). I raced Porsches for over 20 years and am now waiting for the new Corvette ZR-1 for track days.

Fun fact (or extremely average fact) about yourself you would like to share.
I have been a fountain pen collector for five decades and have amassed a stable of over 60 pens from all over the world, ranging in price from $50 up to many thousands of dollars, along with a drawer of more than 80 different inks. I keep 10 inked and use them according to my mood.



Student Spotlight: Ayush Kumar Shah

Ayush Kumar Shah is a graduate student in Richard Zanibbi’s group in the Computer Science department at Rochester Institute of Technology. He is currently working on ChemScraper and ReactionMiner Search to enhance chemical document understanding and retrieval. ChemScraper is a fast molecule diagram parser that extracts characters and graphics from born-digital PDFs without OCR, generating training data for visual parsing of molecular diagrams in raster formats using a segmentation-aware CNN. Multimodal Chemical Search (ReactionMiner Search) integrates text, SMILES, and reaction-based queries to extract and link chemical reactions, molecular diagrams, and textual descriptions, enabling structured retrieval for chemists. These projects leverage deep learning, graph-based methods, and multimodal integration to improve accuracy and usability in chemical research.

What is your background and describe your current work/role/lab and the project you are most excited to be working on right now.
Before my current role, I completed my bachelor’s degree in Computer Engineering in Nepal and worked for a year as a Machine Learning Engineer at Fusemachines, where I worked on handwritten OCR for bank documents (English & Nepali) and AI education initiatives. I then pursued my Ph.D. at RIT, focusing on math and chemical formula recognition using graph-based methods. I also interned at Amazon Alexa, where I worked on speaker identification using semi-supervised learning, contributing to a model that was approved for production.

What drew you to your project and/or MMLI?
I was drawn to MMLI (or my lab, DPRL) because of its focus on document intelligence, pattern recognition, and machine learning. My research interests align with the lab’s emphasis on mathematical and chemical diagram recognition, where I could explore graph-based methods, deep learning, and multimodal learning to improve document parsing and retrieval. The opportunity to work on real-world challenges in scientific literature analysis and contribute to open-source projects like ChemScraper, MathDeck and Multimodal Chemical Search made DPRL/MMLI an ideal environment for me to advance my expertise in computer vision, NLP, and AI-driven document understanding. Additionally, the lab’s collaborative and interdisciplinary approach provided a platform to explore novel solutions in structured document processing and multimodal AI.

What has been your favorite part of being a part of MMLI?
My favorite part of being a part of MMLI has been the opportunity to collaborate with multiple teams from different organizations, bringing together diverse expertise in machine learning, document intelligence, and scientific literature analysis. Working on projects like ChemScraper and Multimodal Chemical Search, I have engaged with researchers across institutions, combining knowledge from computer vision, NLP, and chemistry to develop more effective solutions for document parsing and retrieval. These collaborations have not only expanded my technical skills but also provided valuable insights into interdisciplinary problem-solving, allowing me to contribute to research with real-world impact.
 
How do you like to spend your free time? (or what would you do for fun if you had more free time!)
In my free time, I enjoy playing football (soccer), which helps me stay active and unwind. I also love traveling and exploring new places, experiencing different cultures. When I have the chance, I like to play the guitar.
 
Fun fact (or extremely average fact) about yourself you would like to share.
I can speak four languages.


Alumni Spotlight: Chiyoung Ahn

Chiyoung Ahn was a graduate student in M. Christina White’s group at the Department of Chemistry, University of Illinois. In MMLI, Chiyoung developed a C-H steric descriptor that uses a C-H oxidation catalyst (PDP)-like probe, and designed a neural network-based PDP C-H oxidation site prediction model using this steric probe along with DFT-based features. The model demonstrated very high validation performance in both k-fold and out-of-bag validation settings. He graduated with a degree in pharmacy from Seoul National University and worked for a few years in the pharmaceutical industry prior to joining the White group and MMLI. Upon completion of his PhD at Illinois, he moved to Berkeley for a postdoc. Let’s congratulate Chiyoung for completing his PhD!

What is your background and describe your current work/role/lab and the project you are most excited to be working on right now.
After I graduated from pharmacy school (Seoul National University), I pursued a Master’s degree in pharmaceutical chemistry focusing on the total synthesis of an iridoid natural product. I then spent 3.5 years in the pharmaceutical industry before joining the White group (UIUC), where I worked on improving the chemoselectivity of manganese-catalyzed C-H oxidation before and during my work on the MMLI project.

What drew you to your project and/or MMLI?
Predicting the most oxidizable Csp3-H bond in a molecule is a very challenging problem, due to the non-linearity between C-H descriptors and reactivity, as well as the complexity of capturing 3D interactions between the catalyst and C-H bonds across all C-H bonds throughout the whole training set. I gravitated towards these challenges, with a view to leveraging the resources and expertise that MMLI can provide to solve this problem.

What has been your favorite part of being a part of MMLI?
Interacting with computational scientists who are deeply passionate about solving chemistry problems. I think there is always much to learn from diverse perspectives, especially when they come with a high level of technical expertise.
 
How do you like to spend your free time? (or what would you do for fun if you had more free time!)
I would enjoy watching well-made movies.
 
Fun fact (or extremely average fact) about yourself you would like to share.
A fun fact is that I am leaving UIUC for a postdoc at Berkeley in 3 weeks. I would say it was very good timing that I was chosen to write this MMLI highlight!

Note: Cindy contacted Chiyoung in March 2025 to feature him as part of our community spotlight just before his move to Berkeley.