Artificial Intelligence (AI) is becoming increasingly ubiquitous, but with that ubiquity comes a sharp increase in data center energy consumption. This project will develop new algorithms that enable highly efficient training and deployment of AI models, contributing to the United States' strategic advantage in AI capabilities. On the training side, the new algorithms will make smarter decisions about how to explore the vast design space of possible AI models, discovering higher-quality models while spending less time and energy training them. On the deployment side, also called "AI inference," the new algorithms will make smarter decisions about the order in which to schedule incoming streams of AI tasks, enabling AI systems to run with higher throughput (more tasks done per second) and lower latency (less time waiting for task results).

These algorithms' reach will extend beyond AI: the exploration algorithms could help with engineering design problems like drug discovery and fusion reactor design, and the task-scheduling algorithms could help reduce waiting times not just in other computer systems, but also in services familiar from everyday life, like food delivery and urgent care clinics.

The project will approach the two core problems, namely AI training and AI inference, with a classical but under-used theoretical tool: the Gittins index. Roughly speaking, the Gittins index can act as a "universal prioritizer," summarizing all the information one has about a complicated task as a single numerical priority. The research will develop new versions of the Gittins index that make it practical for AI training (specifically, by developing Gittins index acquisition functions for exotic Bayesian optimization problems) and AI inference (specifically, by developing Gittins index schedulers to optimize tail latency in queues).
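To make the "universal prioritizer" idea concrete, here is a minimal sketch of the classical Gittins rank for a single-server queue in which job sizes are unknown but drawn from a known distribution. It is not one of the project's new algorithms: the discrete size distribution, the function name `gittins_rank`, and the example numbers are all assumptions chosen for illustration. The rank of a job that has already received `age` units of service is the smallest achievable ratio of expected further work to probability of completion, taken over possible service targets `b`; a lower rank means higher priority.

```python
def gittins_rank(sizes, probs, age):
    """Classical Gittins rank of a job with attained service `age`.

    `sizes` and `probs` describe a hypothetical discrete job-size
    distribution. Lower rank means the job should be served sooner.
    """
    # Condition on the job not having finished yet (size > age).
    remaining = [(s, p) for s, p in zip(sizes, probs) if s > age]
    if not remaining:
        raise ValueError("a job of this age must already be complete")

    best = float("inf")
    # For a discrete distribution it suffices to try targets b at the
    # remaining support points.
    for b in sorted({s for s, _ in remaining}):
        # Expected service spent if we serve the job until it finishes
        # or reaches age b (unnormalized; the normalizer cancels).
        expected_work = sum(p * (min(s, b) - age) for s, p in remaining)
        # Probability (unnormalized) that the job finishes by age b.
        completion_prob = sum(p for s, p in remaining if s <= b)
        best = min(best, expected_work / completion_prob)
    return best


# Hypothetical workload: half the jobs have size 1, half have size 10.
fresh = gittins_rank([1, 10], [0.5, 0.5], age=0)
aged = gittins_rank([1, 10], [0.5, 0.5], age=1)
print(fresh, aged)
```

In this made-up workload, a fresh job gets a lower rank than a job that has already survived one unit of service, because the survivor is now known to be a long job. The single number captures everything the scheduler has learned about the job so far.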
Outcomes of the research will include: (a) new theoretical descriptions of Gittins index algorithms; (b) theoretical proofs that the new algorithms are optimal or near-optimal under certain assumptions; (c) open-source prototype implementations of the new algorithms; and (d) pilot studies demonstrating the effectiveness of the new algorithms. In addition to the new algorithms, the project will develop teaching materials for performance modeling, the discipline that underlies the task-scheduling side of the project. These materials will make the advanced math required to understand today's complex computer systems accessible to a wide range of audiences, from a course curriculum for undergraduates to a video tutorial series for working engineers.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

NSF Award ID: 2544452
Program: 01002627DB NSF RESEARCH & RELATED ACTIVIT, 01003031DB NSF RESEARCH & RELATED ACTIVIT, 01002930DB NSF RESEARCH & RELATED ACTIVIT
Principal Investigator: Ziv Scully
Institution: Cornell University, Ithaca, NY
Award Amount: $451,594
View on NSF Award Search: https://www.nsf.gov/awardsearch/show-award/?AWD_ID=2544452
View on Research.gov: https://www.research.gov/awardapi-service/v1/awards/2544452.html

CAREER: Efficient Scheduling for Machine Learning Training and Inference via the Gittins Index



