CAREER: Foundations of Investigative Intelligence
Description
To discover new molecules or materials, scientists typically identify potential candidates, calculate their properties using complex simulations, and determine the "recipes" needed to create them in a lab. Today, artificial intelligence (AI) is used at every stage of this journey. Specialized models suggest new molecules or materials, AI-based simulations attempt to predict how those molecules and materials will behave, and large language models (LLMs) search through millions of scientific papers to find synthesis instructions. However, all these AI methods suffer from a shared flaw known as "mode collapse." This occurs when the AI becomes overly focused on a narrow range of familiar options when generating potential candidates, failing to intelligently navigate the many regions of chemical space where true breakthroughs are hidden. Mode collapse also occurs in simulations, where only narrow regions of energy landscapes are explored, leading to an incomplete and inaccurate understanding of the behaviors of molecules and materials. And in LLM-based retrieval, mode collapse manifests as the tendency of AI to retrieve redundant answers from the scientific literature. Because of this tendency to play it safe, current AI is not yet fully equipped for genuine scientific discovery. This award addresses this important limitation by equipping AI with "investigative intelligence," a new paradigm centered on discovery. Investigative intelligence allows machines to navigate complex search spaces intelligently, enabling the discovery of high-performing molecules and materials currently beyond the reach of automated systems. The objective of this project is to establish the methodological foundations of "investigative intelligence" to mitigate the problem of mode collapse in automated discovery. 
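The mode collapse described above can be made concrete with a similarity-aware diversity measure. The sketch below is only an illustration, not the project's entropy functionals: it computes a standard order-2 effective number of distinct samples from a Gaussian similarity matrix. The function names, the Gaussian kernel, and the `bandwidth` parameter are all assumptions made for this example.

```python
import math


def rbf_similarity(x, y, bandwidth=1.0):
    """Gaussian similarity in (0, 1]; equals 1 when x == y (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def effective_count(samples, bandwidth=1.0):
    """Order-2 similarity-sensitive effective number of distinct samples.

    Let K be the n x n similarity matrix with K_ii = 1. The eigenvalues of
    K / n sum to 1, and the order-2 effective number is
    1 / sum(eigenvalue^2) = n^2 / sum_ij K_ij^2, so no eigendecomposition
    is needed and the score is a smooth function of the samples.
    """
    n = len(samples)
    total = sum(
        rbf_similarity(a, b, bandwidth) ** 2 for a in samples for b in samples
    )
    return n * n / total


# Four identical candidates (collapsed) versus four well-separated ones.
collapsed = [(0.0, 0.0)] * 4
spread = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]

print(effective_count(collapsed))  # close to 1: one effective candidate
print(effective_count(spread))     # close to 4: four effective candidates
```

A generator that keeps proposing near-duplicates scores near 1 however many samples it draws, while a set of mutually dissimilar candidates scores near the sample count. This score ignores utility, which the project's proposed functionals are meant to incorporate alongside similarity.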
The research is organized into three primary thrusts: (1) the development of novel differentiable entropy functionals that integrate both sample similarity and utility to provide a mathematical basis for intelligent exploration; (2) the application of these functionals to create new methods for interacting with data and evaluating the success of discovery, specifically by identifying redundancy, memorization, novelty, and rarity; and (3) the integration of these entropy functionals into the design of algorithms for generative modeling, sampling, experimental design, and information retrieval to ensure these methods do not suffer from mode collapse. The three thrusts touch on all parts of the machine learning pipeline for the purpose of discovery, from data to algorithms and evaluation. The project will produce open-source tools for the broader scientific community to accelerate the identification of novel, stable materials. Furthermore, the education activities of the project will endow students with the interdisciplinary skills required to lead the future of AI-driven scientific research. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
NSF Award ID: 2541965
Program: 01003031DB NSF RESEARCH & RELATED ACTIVIT, 01002930DB NSF RESEARCH & RELATED ACTIVIT, 01002627DB NSF RESEARCH & RELATED ACTIVIT
Principal Investigator: Adji Bousso Dieng
Institution: Princeton University, PRINCETON, NJ
Award Amount: $330,208
View on NSF Award Search: https://www.nsf.gov/awardsearch/show-award/?AWD_ID=2541965
View on Research.gov: https://www.research.gov/awardapi-service/v1/awards/2541965.html
Grant Details
$330,208 - $330,208
April 30, 2031
PRINCETON, NJ