Large Language Model (LLM)-assisted coding, in which an LLM automatically generates code from a developer-specified prompt, is already popular and projected to grow, promising to boost coding productivity while reducing software development time and effort. However, LLM-generated code can be insecure for a variety of reasons, for example by omitting critical security checks or by containing mistakes that adversaries can exploit. When this insecure code eludes scrutiny and makes its way into production systems, our software infrastructure is at risk. This project advances the state of research and practice in LLM-assisted coding by using program analysis and verification to generate secure code. Its novelty lies in introducing guardrails, called contexts, that define security properties and guide LLMs toward producing code that meets those security guarantees. The project's broader significance lies in empowering programmers to understand the security implications of LLM-assisted coding and, in turn, to write more secure code efficiently, helping fuel economic prosperity and increase national security.

The project consists of three thrusts. The first thrust uses program analysis to bridge the gap between security properties and LLM code generation via contexts and LLM prompts. The second thrust constructs an iterative, LLM-centered code generation approach with criteria designed to improve security in each iteration. The third thrust develops a minimization approach that produces minimal code examples in situations where LLM generation fails or the LLM cannot converge toward producing secure code.
These approaches apply in a variety of other settings, including scenarios where LLM-generated code needs to be rigorously or formally verified and settings where LLM-produced code must meet certain specifications from the start. The technique of generating a minimal example when LLM generation fails is also widely applicable, since it lets programmers easily understand and mitigate code generation failures.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

NSF Award ID: 2453331
Program: 01002627DB NSF RESEARCH & RELATED ACTIVIT
Principal Investigator: Zhihao Yao
Institution: New Jersey Institute of Technology, NEWARK, NJ
Award Amount: $450,000
View on NSF Award Search: https://www.nsf.gov/awardsearch/show-award/?AWD_ID=2453331
View on Research.gov: https://www.research.gov/awardapi-service/v1/awards/2453331.html

SaTC: CORE: Small: Improving the Security of Large Language Model-Assisted Coding
