Hypothesis-based interpretability for high-dimensional models of T cell signaling
National Science Foundation
Description
Living cells receive signals from other cells and process this information to make decisions and take actions. One example is immune cells, which process information about other cells to decide whether to mount an immune response. Decades of mathematical work have revealed how parts of this information processing system operate, but the insights were limited by the mathematical technology of the time. Recent artificial intelligence and machine learning methods can powerfully predict how cells respond to signals, but they have no straightforward way to harness the insights from previous decades. This project develops a method that combines those insights with modern machine learning. In doing so, the method achieves higher-accuracy predictions, even in the face of complex signals. The method is applied to immune cells receiving complex signals of different frequencies, and complex combinations of primary signals with secondary signals (so-called accessory receptors or co-signaling receptors). The secondary signals were previously particularly challenging to understand because multiple signals act simultaneously, creating a large number of combinations. The project will train graduate students in machine learning, immunology, and applied mathematics, and will develop a course on coding practices for reliably using modern tools such as artificial intelligence coding assistants.
A central goal of mathematical biology is to build quantitative, predictive models of how cells respond to signals. The need is especially acute for T cells, given their role in cell-based immunotherapies. Recent high-dimensional models fit data better but raise three concerns: computation and data needs, overfitting, and interpretability. The first two have seen progress, but the third has remained challenging.
This project adopts the view that interpretability is the ability to explore, reject, and use hypotheses expressible in plain language, including hypotheses from previous decades of mathematical biology research. From that perspective, more model flexibility is not better if the model can no longer reject a false hypothesis. This project develops classes of models whose flexibility is tunable to the hypothesis being tested. To do so, the project develops families of functions with adjustable flexibility for use in trainable models. The method is applied to understand and predict the response of T cells stimulated with temporal pulses at varying frequencies, a technique borrowed from classical control theory, and of T cells exposed to combinatorial mixtures of accessory ligands. The mathematical novelty lies in working with intermediate-flexibility functions, which are not amenable to either gradient descent or Monte Carlo training algorithms. Flexibility is measured by a model's ability to fail to fit data, quantified by a design-specific Rademacher complexity metric. The project also extends the NSF-funded "DevOps for Mathematical Biologists" program, shifting toward widely accessible resources in the era of AI-assisted scientific computing.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
NSF Award ID: 2602179
Program: 01002627DB NSF RESEARCH & RELATED ACTIVIT
Principal Investigator: Jun Allard
Institution: University of California-Irvine, IRVINE, CA
Award Amount: $465,777
View on NSF Award Search: https://www.nsf.gov/awardsearch/show-award/?AWD_ID=2602179
View on Research.gov: https://www.research.gov/awardapi-service/v1/awards/2602179.html
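For readers unfamiliar with the flexibility measure named in the description, the following is a minimal sketch of the standard, textbook empirical Rademacher complexity. It is not the project's design-specific metric, whose definition is not given here. The function class (norm-bounded linear models on polynomial features), the function names, and all parameter values are assumptions chosen for illustration only. The idea is that a more flexible class correlates better with random sign labels, so it is less able to "fail to fit" data.

```python
import numpy as np


def poly_features(x, degree):
    """Feature matrix with columns 1, x, x^2, ..., x^degree (illustrative choice)."""
    return np.vander(x, degree + 1, increasing=True)


def empirical_rademacher(features, norm_bound=1.0, n_draws=2000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity of the
    class {x -> w . phi(x) : ||w||_2 <= norm_bound} on a fixed sample.

    For this class the supremum over w has a closed form:
        sup_w (1/n) sum_i sigma_i w . phi(x_i) = (norm_bound / n) * || sum_i sigma_i phi(x_i) ||_2
    so only the expectation over the random signs sigma is estimated by sampling.
    """
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    # Each row is one draw of independent +/-1 labels for the n sample points.
    sigma = rng.choice([-1.0, 1.0], size=(n_draws, n))
    sups = norm_bound * np.linalg.norm(sigma @ features, axis=1) / n
    return float(sups.mean())


if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 50)  # hypothetical stimulus values
    for degree in (1, 3, 8):
        value = empirical_rademacher(poly_features(x, degree))
        print(f"degree {degree}: estimated complexity {value:.4f}")
```

In this sketch, raising the polynomial degree enlarges the function class, so the estimated complexity cannot decrease. A tunable-flexibility model family, in the sense of the description, would presumably expose a comparable knob, though the actual construction used in the project is not specified in this listing.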
Grant Details
$465,777 - $465,777
June 30, 2029
IRVINE, CA