The project investigates how to make intelligent robots more reliable, understandable, and safe when performing complex tasks over long periods of time. Modern learning-based systems can achieve impressive performance, but they often lack guarantees about correctness and safety, especially when tasks involve sequencing, repetition, and conditional behavior. The project addresses these challenges by combining robot learning with rigorous reasoning techniques. The project's novelties are a formulation that embeds temporal logic task constraints directly into the reinforcement learning objective to translate temporal progress into informative reward signals, a mechanism for automatically inferring temporal task specifications, and compositional methods for verifying long-horizon agent behaviors. The project's impacts are the development of trustworthy robotic systems capable of operating in uncertain environments, improved accessibility of formal methods for non-expert users, and new foundations for reliable AI/ML systems. The project will also benefit AI/ML tools and toolchains by introducing verification components that can be integrated into existing learning pipelines. In addition, the resulting algorithms and software provide reusable capabilities for specification inference and compositional verification in broader AI systems. The project will introduce formal methods into K–12 and undergraduate classrooms through the development of an interactive robot simulation tool that connects formal task specifications with robot behaviors, supporting hands-on coursework and mentoring activities that prepare a workforce trained in formal methods for building trustworthy systems. The project develops a principled framework for synthesizing and verifying robot control policies for Linear Temporal Logic (LTL) objectives in continuous, stochastic environments. The framework leverages Limit-Deterministic Buchi Automata (LDBAs) as an abstract representation to guide both learning and verification. First, it introduces LDBA-guided policy learning that generates temporally aligned reward signals from the agent's own exploratory trajectories, enabling efficient learning under sparse supervision. Second, it proposes methods for inferring temporal objectives and event detectors directly from demonstrations, reducing reliance on manually specified LTL formulas and enabling intuitive task specification. Third, it develops a compositional verification methodology that decomposes infinite-horizon temporal properties into shorter-horizon subproblems, enabling scalable reasoning through inductive invariants with probabilistic guarantees in black-box settings. Together, these contributions will significantly improve the reliability, scalability, and safety of robot learning systems, advancing the scientific foundations of formal methods for AI and supporting trustworthy autonomous behavior in safety-critical applications. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria. NSF Award ID: 2544018 | Program: 01003031DB NSF RESEARCH & RELATED ACTIVIT,01002627DB NSF RESEARCH & RELATED ACTIVIT,01002930DB NSF RESEARCH & RELATED ACTIVIT | Principal Investigator: He Zhu | Institution: Rutgers University New Brunswick, NEW BRUNSWICK, NJ | Award Amount: $395,821 View on NSF Award Search: https://www.nsf.gov/awardsearch/show-award/?AWD_ID=2544018 View on Research.gov: https://www.research.gov/awardapi-service/v1/awards/2544018.html

CAREER: Structured Learning and Verification of Control Policies for Linear Temporal Logic Objectives

Description

Interested in this grant?

Grant Details

External Links

Get personalized grant matches