Proving correctness and worst-case time complexity of bisimulation algorithms

Context

Efficient algorithms for behavioural equivalences such as strong bisimulation and branching bisimulation rely on refinable partition data structures [1]. We recently developed an O(m log n) algorithm for computing branching bisimulation equivalence of labelled transition systems [2], and implemented it in the mCRL2 toolset. Since the proofs of correctness and of the time complexity of the algorithm are highly non-trivial, it is desired to formally verify these proofs.

The data structures used in the branching bisimulation algorithm are, in essence, based on the refinable partition data structures by Valmari [1]. Previously, students have made initial formalizations of this data structure in Rocq and in TLA+. Recently, Emilia Huynen formalized a version of the Kanellakis-Smolka algorithm for strong bisimulation [3] (which is much simpler than the branching bisimulation algorithm) in Rocq using its embedded functional programming language, and established the correctness and worst-case running time complexity of the algorithm.

Objective

Based on this earlier work, there are several directions that can be explored to work towards full correctness and time complexity proofs of the branching bisimulation algorithm. Two examples of research problems that could be addressed are:

Using the approach by Huynen, formalize the Paige-Tarjan algorithm [4] using Gallina. As this algorithm uses advanced techniques to achieve an O(m log n) running time, this will require investigating how amortized running time analysis can be done for this formalization using Rocq.
An alternative to using Gallina is to use an embedded language approach to formalize bisimulation algorithms in Rocq. In this case, reasoning about the correctness of the program requires separation logic, and the first challenge would again be to prove correctness and worst-case running time of the Kanellakis-Smolka algorithm.

How the worst-case running time can be formalized in this setting needs to be investigated. It would be interesting to compare this approach to the one used by Huynen, and establish which approach is more promising for the formalization of the more complex (branching) bisimulation algorithms.

Supervision

This assignment will be supervised by Jeroen Keiren and Herman Geuvers

References

[1] Valmari, A., Lehtinen, P.: Efficient Minimization of DFAs with Partial Transition Functions. In: Albers, S. and Weil, P. (eds.) 25th International Symposium on Theoretical Aspects of Computer Science. pp. 645-656 Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2008). https://doi.org/10.4230/LIPIcs.STACS.2008.1328.

[2] Jansen D.N., Groote J.F., Keiren J.J.A., Wijs A. (2020) An O(m log n) algorithm for branching bisimilarity on labelled transition systems. In: Biere A., Parker D. (eds) Tools and Algorithms for the Construction and Analysis of Systems. TACAS 2020. Lecture Notes in Computer Science, vol 12079. Springer, Cham https://doi.org/10.1007/978-3-030-45237-7_1

[3] Kanellakis, P.C., Smolka, S.A.: CCS expressions, finite state processes, and three problems of equivalence. Information and Computation. 86, 43–68 (1990). https://doi.org/10.1016/0890-5401(90)90025-D.

[4] Paige, R., Tarjan, R.: Three partition refinement algorithms. SIAM J. Comput. 16, 973–989 (1987). https://doi.org/10.1137/0216062.