Category II: Transitioning the National Science Data Fabric Pilot into a National Operational Cyberinfrastructure and Service for Democratized, AI-Driven Scientific Discovery
Description
Scientific research today generates data at a scale and pace that far exceed what most research institutions can manage on their own. Petabytes of measurements from telescopes, particle accelerators, weather sensors, and medical imaging systems sit in disconnected storage systems across the country, out of reach of many scientists who could use them to drive scientific discoveries and AI innovation. Universities and colleges with limited computing resources are seldom able to participate in large-scale, data-driven research, which limits who contributes to scientific progress, workforce training, and AI readiness. This project operationalizes the National Science Data Fabric (NSDF), a national data infrastructure service funded through the NSF Integrated Data Systems and Services (IDSS) program. NSDF connects researchers at institutions of all sizes to scientific data wherever it resides (at national laboratories, experimental facilities, cloud platforms, leadership-class computing centers, campus clusters, or laboratory instruments) while reducing the need for costly and time-consuming data movement. By removing infrastructure gaps that limit participation in national-scale research, NSDF advances the NSF's mission to promote the progress of science and advance national health, prosperity, and welfare. The project also builds the next generation of data- and AI-capable scientists through a national Fellows Program, summer training institutes, and hands-on workshops open to students and early-career researchers at universities and colleges across the country. This Category II IDSS project transitions the successful NSDF pilot (NSF Award #2138811) to a full production-grade national cyberinfrastructure and service.
The core technical approach replaces the costly data-to-compute model with a federated architecture that connects computation directly to data across varied environments through standardized NSDF Entry Points deployed at campuses, national laboratories, computing centers, and cloud systems. Entry Points standardize identity management, data ingestion, metadata capture, and workflow integration across independent sites while preserving local data governance and access policies. Five integrated technical innovations drive the system: (1) interoperable federation through open APIs, schemas aligned with the FAIR (Findable, Accessible, Interoperable, and Reusable) principles, persistent identifiers, and federated identity services (CILogon, InCommon); (2) AI-ready workflows and standardized AI benchmarking through MLCommons/MLPerf-integrated pipelines and containerized execution environments; (3) automated data quality, provenance, and lineage tracking to support reproducible and verifiable AI outputs; (4) a national user support system built on a structured five-phase user lifecycle model, a dedicated helpdesk, and AI-assisted documentation services; and (5) continuous community co-design through domain liaisons, advisory boards, and an annual All-Hands Meeting. The NSDF-Catalog indexes datasets across scientific repositories using FAIR Digital Object standards and Croissant metadata schemas, facilitating the creation of data cohorts for AI training. As the federation grows, the project deepens integration with the National AI Research Resource (NAIRR) ecosystem and advances long-term sustainability through consortial, academic, and industry partnership models. The project is led by the University of Tennessee Knoxville in partnership with the University of Utah, Purdue University, the Texas Advanced Computing Center (TACC), and MLCommons, with collaborating institutions across academia, national laboratories, and industry.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
NSF Award ID: 2609465
Program: 01002627DB NSF RESEARCH & RELATED ACTIVIT
Principal Investigator: Michela Taufer
Institution: University of Tennessee Knoxville, KNOXVILLE, TN
Award Amount: $9,000,000
View on NSF Award Search: https://www.nsf.gov/awardsearch/show-award/?AWD_ID=2609465
View on Research.gov: https://www.research.gov/awardapi-service/v1/awards/2609465.html
Grant Details
$9,000,000 - $9,000,000
August 31, 2029
KNOXVILLE, TN