VERIFAI: Traceability and Verification of Natural-Language Requirements
(2024-2026)
VERIFAI is led by Professor Rosemary Monahan and Dr. Diarmuid O'Donoghue in the Dept. of Computer Science and the Hamilton Institute at Maynooth University. The project is funded via the ADAPT Research Centre, which supports a postdoctoral researcher, Dr. Arshad Beg, from December 2024 until January 2026, who collaborates with members of the Principles of Programming Research Group.
The overall goal of the project is to support the automatic generation of formal specifications and the traceability of requirements from the initial software design stage through to system implementation and verification. In this way we can ensure that when a system is implemented, we not only build the correct system, but also build the system correctly. In addition, if the system deviates from its expected behaviour, we can trace where and when the violation originated. Approaches explored as part of this project include natural language processing, the use of ontologies to describe the software system's domain, reuse of existing software artefacts from similar systems (e.g. through similarity-based matching), large language models to identify and declare specifications, and artificial intelligence to guide the process.
The research delivered the following items (most recent first) during the course of the project:
- Learning-Infused Formal Reasoning: Contract Synthesis, Artefact Reuse and Semantic Foundations (short version of the VERIFAI 2026 paper). Accepted at the ADAPT Annual Scientific Conference (AASC-2026), to be held on 14 May 2026 at DCU, Ireland.
Abstract: Artificial intelligence systems have achieved remarkable capability in natural language processing, perception and decision-making tasks. However, their behaviour often remains opaque and difficult to verify, limiting their applicability in safety-critical systems. Formal methods provide mathematically rigorous mechanisms for specifying and verifying system behaviour, yet the creation and maintenance of formal specifications remains labour intensive and difficult to scale. This paper outlines a research vision called Learning-Infused Formal Reasoning (LIFR), which integrates machine learning techniques with formal verification workflows. The framework focuses on three complementary research directions: automated contract synthesis from natural language requirements, semantic reuse of verification artifacts using graph matching and learning-based embeddings, and mathematically grounded semantic foundations based on the Unifying Theories of Programming (UTP) and the Theory of Institutions. Together these research threads aim to transform verification from isolated correctness proofs into a cumulative knowledge-driven process where specifications, contracts and proofs can be synthesised, aligned and reused across systems.
- Evaluating LLM-Generated ACSL Annotations for Formal Verification. Formal Techniques for Judicious Programming (FTfJP 2026, at ECOOP). Status: conditionally accepted (1 April 2026).
Abstract: Formal specifications are crucial for building verifiable and dependable software systems, yet generating accurate and verifiable specifications for real-world C programs remains challenging. This paper presents an empirical evaluation of automated ACSL annotation generation strategies for C programs, comparing a rule-based Python script, Frama-C’s RTE plugin, and three large language models (DeepSeek-V3.2, GPT-5.2, and OLMo 3.1 32B Instruct). The study focuses on one-shot annotation generation, assessing how these approaches perform when directly applied to verification tasks. Using a filtered subset of the CASP benchmark, we evaluate generated annotations through Frama-C’s WP plugin with multiple SMT solvers, analyzing proof success rates, solver timeouts, and internal processing time. Our results show that rule-based approaches remain more reliable for verification success, while LLM-based methods exhibit more variable performance. These findings highlight both the current limitations and the potential of LLMs as complementary tools for automated specification generation. Code and data are available from GitHub Repo.
- Learning-Infused Formal Reasoning: From Contract Synthesis to Artifact Reuse and Formal Semantics. VERIFAI-2026: The Interplay between Artificial Intelligence and Software Verification, LASER center, Villebrumier, France, 8-11 March 2026.
The paper presents a long-term vision of the research space.
Abstract: This vision paper articulates a long-term research agenda for formal methods at the intersection with artificial intelligence, outlining multiple conceptual and technical dimensions and reporting on our ongoing work toward realising this agenda. It advances a forward-looking perspective on the next generation of formal methods based on the integration of automated contract synthesis, semantic artifact reuse, and refinement-based theory. We argue that future verification systems must move beyond isolated correctness proofs toward a cumulative, knowledge-driven paradigm in which specifications, contracts, and proofs are continuously synthesised and transferred across systems. To support this shift, we outline a hybrid framework combining large language models with graph-based representations to enable scalable semantic matching and principled reuse of verification artifacts. Learning-based components provide semantic guidance across heterogeneous notations and abstraction levels, while symbolic matching ensures formal soundness. Grounded in compositional reasoning, this vision points toward verification ecosystems that evolve systematically, leveraging past verification efforts to accelerate future assurance.
- Traceable and verifiable software requirements: A synthesis of AI-enabled formal methods. Submitted to the Journal of Software Testing, Verification and Reliability, November 2025; under review (v1).
Abstract: Drawing on an analysis of more than one hundred studies, the paper identifies state-of-the-art methods for translating natural language requirements into formal specifications across various programming languages and verification tools, including model checkers and theorem provers. Beyond technical methodologies, the work highlights key challenges in requirements traceability, tool integration, and specification reuse.
Complementing this survey, the paper reports an empirical evaluation of Frama-C plugins, including the Runtime Error (RTE) and Value Analysis (EVA) analysers, as well as the PathCrawler automatic test generation tool. These experiments assess the capability of formal analysis tools to infer assertions, generate ANSI/ISO C Specification Language (ACSL) annotations, detect runtime anomalies, and produce test cases for representative C programs. The study identifies practical barriers such as solver instability, path coverage constraints, and tool-specific restrictions, proposing strategies like dataset adaptation, selective path exploration, and local configuration to overcome them. The findings demonstrate that automated formalisation and verification workflows can enhance program correctness while emphasising the continued importance of human-guided intervention and coordinated multi-tool use.
Code and data are available from GitHub Repo.
- Leveraging LLMs for Formal Software Requirements: Challenges and Prospects. Presented at OVERLAY at ECAI 2025 (28th European Conference on Artificial Intelligence). It reported:
(1) a focused literature survey and analysis of forty (40) research articles published in 2021-2025;
(2) the outcomes of the experiments described below;
(3) challenges and prospects of the field.
The methodology of the initial experiments combines an LLM with symbolic analysis tools in the Frama-C ecosystem. The workflow integrates path-based I/O examples and verification outputs to guide the generation of context-aware ACSL specifications. As a first step, we selected 36 C programs from the iFM 2024 tutorial on the Frama-C ecosystem and applied this pipeline to them.
Conclusions drawn from the verifiers' output: (1) Of the 36 evaluated C programs, 28 were verified successfully; approximately 94% of all verification goals were proven, demonstrating the high reliability of the verification toolchain. (2) Failures were primarily due to timeouts or partial prover coverage on complex arithmetic and array-manipulation examples. Overall, the combined prover strategy offers robust verification performance, with only marginal cases requiring further proof engineering or model refinement. Supporting documents are available from GitHub Repo.
- A Short Survey on Formalising Software Requirements using Large Language Models. Submitted to the Symposium on AI Verification (SAIV 2025).
Abstract: This paper presents a focused literature survey on the use of large language models (LLMs) to assist in writing formal specifications for software. A summary of thirty-five key papers is presented, including examples of specifying programs written in Dafny, C and Java. Our methodology employed multiple academic databases to identify relevant research. The AI-assisted tool Elicit facilitated the initial paper selection; the papers were then manually screened for final selection. The survey provides valuable insights and future directions for utilising LLMs when formalising software requirements.
- Formalising Software Requirements using Large Language Models. A two-page short paper defining the scope of the project, presented as a poster at the ADAPT Annual Scientific Conference (AASC'25) at DCU, Ireland.
