CAREER: Probing pathway complexity of biomolecular assemblies through coarse-grained deep learning models
National Science Foundation
Description
Living systems build complex structures, such as cellular scaffolds or protein-based compartments, by assembling many small pieces in dynamic and often unpredictable ways. These processes do not always follow a single path. Instead, they can proceed through many possible routes depending on environmental conditions such as temperature and chemical signals. This project seeks to determine how and why certain assembly pathways are preferred over others, especially when systems are driven away from stable states by external influences. To address this challenge, the project will develop new computational tools that combine physics-based simulations with machine learning to track how structures form over time and to quantify the “irreversibility” of different pathways. By identifying which pathways are most likely to occur, this work will enable new strategies to design biomolecular materials that respond to their environment, with applications in biotechnology, medicine, and sustainable manufacturing. These advances will contribute to national priorities in health, energy, and advanced materials by enabling predictive design of complex molecular assemblies. In parallel, the project will create interactive learning tools, including hands-on simulations and visual modules, to introduce students to computational biology and data-driven science. These educational activities will help prepare the next generation of scientists to work at the interface of biology, physics, engineering, and artificial intelligence.
This project develops a data-driven, multiscale computational framework to quantify pathway complexity during stochastic, out-of-equilibrium biomolecular assembly. The central question is whether path entropy production can serve as a unifying metric to distinguish thermodynamic versus kinetic control and to predict preferential assembly pathways under nonequilibrium conditions.
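As a rough illustration of what "path entropy production" measures, the sketch below uses a toy three-state Markov chain, which is an assumption made here for clarity and not the project's model or estimator. It sums the log ratio of forward to reverse transition probabilities along a trajectory, a standard stochastic-thermodynamics quantity (boundary terms are ignored). The names `path_entropy_production`, `sample_path`, `P_driven`, and `P_balanced` are hypothetical.

```python
import math
import random


def path_entropy_production(path, P):
    """Sum ln(P[i][j] / P[j][i]) over the observed transitions i -> j.

    Boundary (initial/final state) terms are ignored in this toy version.
    """
    total = 0.0
    for i, j in zip(path, path[1:]):
        if i != j:  # self-transitions contribute ln(1) = 0
            total += math.log(P[i][j] / P[j][i])
    return total


def sample_path(P, start, n_steps, rng):
    """Draw a trajectory from the row-stochastic transition matrix P."""
    path = [start]
    for _ in range(n_steps):
        state = path[-1]
        path.append(rng.choices(range(len(P)), weights=P[state])[0])
    return path


# Hypothetical transition matrices (assumptions, not data from the project).
# P_driven favors the cycle 0 -> 1 -> 2 -> 0; P_balanced is symmetric.
P_driven = [[0.1, 0.7, 0.2],
            [0.2, 0.1, 0.7],
            [0.7, 0.2, 0.1]]
P_balanced = [[0.10, 0.45, 0.45],
              [0.45, 0.10, 0.45],
              [0.45, 0.45, 0.10]]

rng = random.Random(0)
driven = path_entropy_production(sample_path(P_driven, 0, 1000, rng), P_driven)
balanced = path_entropy_production(sample_path(P_balanced, 0, 1000, rng), P_balanced)
print(f"driven: {driven:.1f}, balanced: {balanced:.1f}")
```

In this toy setting the symmetric chain accumulates no entropy production, while the driven cycle accumulates it steadily, which is the sense in which the quantity tracks irreversibility. High-dimensional assembly trajectories do not come with known transition probabilities, which is why estimating path probabilities there calls for learned models.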
To probe this question, the project integrates coarse-grained molecular simulations with deep learning-based probabilistic forecasting models to efficiently generate trajectory ensembles and estimate path probabilities in high-dimensional systems. Transformer-based architectures will be used to learn effective coarse-grained dynamics, including non-Markovian memory effects, enabling high-throughput simulation of biomolecular systems. Complementary entropy production estimator models will be developed to quantify irreversibility along trajectories under time-varying environmental conditions, such as changes in temperature and ion concentration. These methods will be applied to representative protein assembly systems, including bacterial microcompartments and coiled-coil assemblies, where morphology is highly sensitive to external stimuli. By connecting stochastic thermodynamics with machine learning-enabled coarse-grained modeling, this work establishes a generalizable framework for mapping high-dimensional assembly landscapes onto predictive, physically motivated metrics of pathway selection.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
NSF Award ID: 2543548
Program: 01003031DB NSF RESEARCH & RELATED ACTIVIT, 01002627DB NSF RESEARCH & RELATED ACTIVIT
Principal Investigator: Alexander Pak
Institution: Colorado School of Mines, GOLDEN, CO
Award Amount: $672,000
View on NSF Award Search: https://www.nsf.gov/awardsearch/show-award/?AWD_ID=2543548
View on Research.gov: https://www.research.gov/awardapi-service/v1/awards/2543548.html
Grant Details
$672,000 - $672,000
May 31, 2031
GOLDEN, CO