VERIFAI: Traceability and Verification of Natural-Language Requirements
(2024-2026)
VERIFAI is led by Professor Rosemary Monahan and Dr. Diarmuid O'Donoghue in the Department of Computer Science and the Hamilton Institute at Maynooth University. The project is funded via the ADAPT Research Centre, which supported a postdoctoral researcher, Dr. Arshad Beg, from December 2024 until January 2026; he collaborated with members of the Principles of Programming Research Group.
The overall goal of the project is to support the automatic generation of formal specifications and the traceability of requirements from the initial software design stage through system implementation and verification. In this way we can ensure that when a system is implemented, we not only build the correct system, but also build the system correctly. In addition, if the system deviates from its expected behaviour, we can trace where and when the source of the violation occurs. Approaches explored as part of this project include natural language processing, the use of ontologies to describe the software system's domain, similarity-based reuse of existing software artefacts from comparable systems, large language models to identify and declare the specification, and artificial intelligence to guide the process.
The project delivered the following outputs during its course:
- Formalising Software Requirements using Large Language Models: a two-page short paper defining the scope of the project, presented as a poster at the ADAPT Annual Scientific Conference (AASC'25) at DCU, Ireland.
- A Short Survey on Formalising Software Requirements using Large Language Models: submitted to SAIV 2025. Abstract: This paper presents a focused literature survey on the use of large language models (LLMs) to assist in writing formal specifications for software. A summary of thirty-five key papers is presented, including examples of specifying programs written in Dafny, C and Java. Our methodology employed multiple academic databases to identify relevant research. The AI-assisted tool Elicit facilitated the initial paper selection; candidates were then manually screened for final selection. The survey provides valuable insights and future directions for utilising LLMs in formalising software requirements.
- Leveraging LLMs for Formal Software Requirements: Challenges and Prospects: presented at OVERLAY at ECAI 2025 (28th European Conference on Artificial Intelligence). It reported:
(1) a focused literature survey and analysis of forty research articles published between 2021 and 2025;
(2) the outcome of the experiments described below;
(3) challenges and prospects of the field.
The methodology of the initial experiments is to combine LLMs with symbolic analysis tools in the Frama-C ecosystem. The workflow integrates path-based I/O examples and verification outputs to guide the generation of context-aware ACSL specifications. As a first step, we selected 36 C programs from the iFM 2024 tutorial on the Frama-C ecosystem and applied the above pipeline to them. Conclusions drawn from the verifier output: (1) of the 36 C programs evaluated, 28 were processed successfully, and within these approximately 94% of all verification goals were proven, demonstrating the high reliability of the verification toolchain. (2) Failures were primarily due to timeouts or partial prover coverage on complex arithmetic and array-manipulation examples. Overall, the combined prover strategy offers robust verification performance, with only marginal cases requiring further proof engineering or model refinement.
- Traceable and verifiable software requirements: A synthesis of AI-enabled formal methods: submitted to the Journal of Software Testing, Verification and Reliability, Nov. 2025; under review (v1).
Drawing on an analysis of more than one hundred studies, it identifies state-of-the-art methods for translating natural-language requirements into formal specifications across various programming languages and verification tools, including model checkers and theorem provers. Beyond technical methodologies, the work highlights key challenges in requirements traceability, tool integration, and specification reuse. Complementing this survey, the paper reports an empirical evaluation of Frama-C plugins, including the Runtime Error (RTE) and Value Analysis (EVA) analysers, as well as the PathCrawler automatic test-generation tool. These experiments assess the capability of formal analysis tools to infer assertions, generate ANSI/ISO C Specification Language (ACSL) annotations, detect runtime anomalies, and produce test cases for representative C programs. The study identifies practical barriers such as solver instability, path-coverage constraints, and tool-specific restrictions, and proposes strategies such as dataset adaptation, selective path exploration, and local configuration to overcome them. The findings demonstrate that automated formalisation and verification workflows can enhance program correctness, while emphasising the continued importance of human-guided intervention and coordinated multi-tool use.
- Learning-Infused Formal Reasoning: From Contract Synthesis to Artifact Reuse and Formal Semantics: accepted at VERIFAI-2026: The Interplay between Artificial Intelligence and Software Verification, LASER center, Villebrumier, France, March 8-11, 2026.
It presented a long-term vision of the research space. Abstract: This vision paper articulates a long-term research agenda for formal methods at the intersection with artificial intelligence, outlining multiple conceptual and technical dimensions and reporting on our ongoing work toward realising this agenda. It advances a forward-looking perspective on the next generation of formal methods based on the integration of automated contract synthesis, semantic artifact reuse, and refinement-based theory. We argue that future verification systems must move beyond isolated correctness proofs toward a cumulative, knowledge-driven paradigm in which specifications, contracts, and proofs are continuously synthesised and transferred across systems. To support this shift, we outline a hybrid framework combining large language models with graph-based representations to enable scalable semantic matching and principled reuse of verification artifacts. Learning-based components provide semantic guidance across heterogeneous notations and abstraction levels, while symbolic matching ensures formal soundness. Grounded in compositional reasoning, this vision points toward verification ecosystems that evolve systematically, leveraging past verification efforts to accelerate future assurance.
