Data-Driven Prediction of Metal-Organic Dissolution and Adhesion Using Machine Learning

Authors

  • Devin Wanchoo, Sid and Reva Dewberry Department of Civil, Environmental, and Infrastructure Engineering, George Mason University, Fairfax, VA
  • Neeraj Dandamudi, Sid and Reva Dewberry Department of Civil, Environmental, and Infrastructure Engineering, George Mason University, Fairfax, VA
  • Shawn Li, Sid and Reva Dewberry Department of Civil, Environmental, and Infrastructure Engineering, George Mason University, Fairfax, VA
  • Junyi Wang, Sid and Reva Dewberry Department of Civil, Environmental, and Infrastructure Engineering, George Mason University, Fairfax, VA
  • Xijin Zhang, Sid and Reva Dewberry Department of Civil, Environmental, and Infrastructure Engineering, George Mason University, Fairfax, VA

Abstract

Natural biomolecules such as phenolic compounds play an important role in regulating metal ion dissolution and adhesion in a variety of chemical and biological systems. However, there is little predictive understanding of how environmental variables, organic ligands, and metal properties interact to influence these processes. This gap limits the development of accurate models of metal-organic interactions, which matter in fields ranging from sustainable agriculture to biomaterials engineering.

This study addresses this gap by compiling a dataset of 71 experimentally validated metal-organic-environmental interactions (45 adhesion and 26 dissolution reactions) and training machine learning models to predict both outcomes. Properties of the metal ions, including ionic radius, polarizability, electron affinity, atomic number, and electron orbital configuration, were compiled from public databases such as PubChem. The 41 organic compounds were processed with Mordred to generate 1,614 structural and topological molecular descriptors per compound. These data were then merged with environmental variables such as pH, compound concentration, reaction temperature, and reaction speed into a single unified dataset for model development.
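The merge step described above can be pictured with a minimal sketch. It is not the authors' pipeline: the lookup tables, column names, units, descriptor names, and the two example reactions are all hypothetical placeholders, and the descriptor values are zeros standing in for real Mordred output.

```python
from typing import Dict, List

# Hypothetical lookup tables. Keys, column names, and values are illustrative
# placeholders, not data from the study.
METAL_PROPS: Dict[str, Dict[str, float]] = {
    "Fe3+": {"atomic_number": 26, "ionic_radius_pm": 64.5},
    "Cu2+": {"atomic_number": 29, "ionic_radius_pm": 73.0},
}

# Stand-in for per-compound descriptor vectors (the study reports 1,614 each).
LIGAND_DESCRIPTORS: Dict[str, Dict[str, float]] = {
    "ligand_A": {"desc_1": 0.0, "desc_2": 0.0},
    "ligand_B": {"desc_1": 0.0, "desc_2": 0.0},
}

# Assumed names for the environmental variables listed in the abstract.
ENV_COLUMNS = ["pH", "concentration", "temperature", "reaction_speed"]


def build_feature_rows(
    reactions: List[dict],
    metal_props: Dict[str, Dict[str, float]],
    ligand_descriptors: Dict[str, Dict[str, float]],
) -> List[dict]:
    """Join metal, ligand, and environmental features into one row per reaction."""
    rows = []
    for rxn in reactions:
        row = {}
        # Prefix each block so columns from different sources cannot collide.
        for name, value in metal_props[rxn["metal"]].items():
            row[f"metal_{name}"] = value
        for name, value in ligand_descriptors[rxn["ligand"]].items():
            row[f"ligand_{name}"] = value
        for name in ENV_COLUMNS:
            row[f"env_{name}"] = rxn[name]
        row["outcome"] = rxn["outcome"]  # "adhesion" or "dissolution"
        rows.append(row)
    return rows


# Two made-up reactions, used only to exercise the function.
EXAMPLE_REACTIONS = [
    {"metal": "Fe3+", "ligand": "ligand_A", "pH": 5.0, "concentration": 1.0,
     "temperature": 25.0, "reaction_speed": 200.0, "outcome": "adhesion"},
    {"metal": "Cu2+", "ligand": "ligand_B", "pH": 3.0, "concentration": 0.5,
     "temperature": 40.0, "reaction_speed": 100.0, "outcome": "dissolution"},
]

ROWS = build_feature_rows(EXAMPLE_REACTIONS, METAL_PROPS, LIGAND_DESCRIPTORS)
```

Prefixing each feature block keeps the origin of every column visible, which helps later when feature importances are traced back to metal, ligand, or environmental properties.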

Random Forest and Support Vector Machine (SVM) models were trained with two main goals: to identify the environmental conditions and molecules that maximize reactivity and produce favorable results, and to identify the key molecular features and functional groups that drive dissolution or adhesion. By learning from known reactions, the models reduce the need for trial-and-error experimentation and allow for improved compound selection. This work establishes a scalable, data-driven platform for simulating and guiding metal-organic reactivity in both natural and engineered environments.
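As a rough illustration of the modeling step, the sketch below trains a Random Forest and an SVM with scikit-learn and ranks features by Random Forest importance. It assumes a binary adhesion-versus-dissolution label and uses synthetic data from `make_classification` in place of the real dataset; the hyperparameters, the 5-fold split, and the feature count are arbitrary choices, not those of the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the merged dataset: 71 rows, 20 made-up features.
# The labels are random classes, not real adhesion/dissolution outcomes.
X, y = make_classification(
    n_samples=71, n_features=20, n_informative=5, random_state=0
)

# Stratified folds keep the class ratio similar in every split,
# which matters for a dataset this small.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0)
# SVMs are sensitive to feature scale, so standardize inside the pipeline
# to avoid leaking statistics from the validation folds.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))

rf_scores = cross_val_score(rf, X, y, cv=cv)
svm_scores = cross_val_score(svm, X, y, cv=cv)

# Fit on all rows to inspect which features the forest relies on most.
rf.fit(X, y)
importances = rf.feature_importances_
top_features = np.argsort(importances)[::-1][:5]

print("Random Forest mean accuracy:", rf_scores.mean())
print("SVM mean accuracy:", svm_scores.mean())
print("Top feature indices:", top_features)
```

With thousands of descriptors and only 71 reactions, some form of feature selection or dimensionality reduction would likely be needed before a ranking like this becomes interpretable.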

Published

2025-09-25

Section

College of Engineering and Computing: Department of Civil, Environmental and Infrastructure Engineering