Modern cloud services rely on expensive and power-hungry hardware, making efficient use of computing resources essential for controlling cost and energy consumption. This project focuses on maximizing how much useful work each server can perform without becoming overloaded or unresponsive. The central idea is to have cloud systems determine, within a few microseconds, how much work a server can safely accept, allocate resources to individual tasks accordingly, and then distribute incoming requests across servers based on these allocations. Today, resource allocation and load distribution are handled independently, which leads to inefficient resource use and slow reactions to rapid changes in workload. By combining these operations into a coordinated framework, the project makes these capabilities easier for users to adopt. The overall goal is to improve cloud services without continuously adding more hardware.

The project aims to redesign load and resource management in a coordinated manner across software and hardware layers. This problem is fundamentally challenging because resource demands vary widely across requests, bottlenecks shift over time, and independent control mechanisms often operate at similar timescales and interfere with each other. Addressing these challenges requires fine-grained visibility into application behavior and new control abstractions that coordinate decisions across layers without introducing excessive overhead. To achieve this, the work is organized around three technical thrusts. The first thrust aims to develop unified and transparent mechanisms that track resource usage for each application request and enforce admission decisions across multiple shared bottlenecks. The second thrust aims to integrate these decisions with operating system scheduling, jointly managing application load and the resources allocated to handle it. The third thrust aims to extend these ideas to clusters of servers, redesigning load balancing, backpressure, and scaling mechanisms for applications built as chains of microservices.

The broader impacts of this project include improved performance, lower cost, and reduced energy use for cloud services that support the modern economy. By reducing the need for overprovisioned computing resources, the project contributes to more sustainable and environmentally responsible infrastructure. The resulting tools will help application developers build faster and more predictable systems without deep expertise in low-level resource management. Educational activities will integrate the research into courses, seminars, and online programs, preparing students for careers in computing systems while encouraging innovation through hands-on projects and research experiences. The project will release software, data, and experimental results through publicly accessible repositories and the accompanying website: https://saeed.github.io/career/

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

NSF Award ID: 2542973
Program: 01002627DB NSF RESEARCH & RELATED ACTIVIT, 01002930DB NSF RESEARCH & RELATED ACTIVIT, 01003031DB NSF RESEARCH & RELATED ACTIVIT
Principal Investigator: Ahmed Saeed
Institution: Georgia Tech Research Corporation, Atlanta, GA
Award Amount: $348,081
NSF Award Search: https://www.nsf.gov/awardsearch/show-award/?AWD_ID=2542973
Research.gov: https://www.research.gov/awardapi-service/v1/awards/2542973.html

CAREER: Integrated Load and Resource Management for High-Utilization Datacenters
