CHAPTER 1
Proving Termination Using Dependent Types: the Case of Xor-Terms
Jean-François Monin, Judicaël Courant
Abstract: We study a normalization function in an algebra of terms quotiented by an associative, commutative and involutive operator (logical xor). This study is motivated by the formal verification of cryptographic systems, relying on a normalization function for xor-terms. Such a function is easy to define using general recursion. However, as it is to be used in a type theoretic proof assistant, we also need a proof of its termination. Instead of using a mixture of various rewriting orderings, we follow an approach involving the power of Type Theory with dependent types. The results are to be applied in the proof of the security API described in.
1.1 INTRODUCTION
This work originates in the verification of a cryptographic system in the Coq proof assistant. In the course of this verification, we modelized plaintext and encrypted data as first-order sorted terms. For instance, a typical term could be {K1 [direct sum] K2}H(KM,H(EXP,KP)) where {x}y denotes encryption of x using a key y, H(x,y) denotes the fingerprint (cryptographic checksum) of the pair (x, y), x [direct sum] y denotes the bitwise exclusive or of x and y. The cryptographic primitives {_}_ and H are commonly supposed to be perfect, and can thus be treated as free constructors. On the contrary, we need to deal with the equational theory of the exclusive or function as many (potential or effective) attacks are based on its algebraic properties.
Deduction in presence of a non-trivial equational theory in the general case is notoriously difficult: shows that the matching problem in the presence of an associative-commutative (AC) function symbol is NP-complete, even in very restricted cases, and AC matching is still an active research topic. In order to solve our particular problem, we only needed a way to define a canonical form for closed terms and a proof of existence of a canonical form for each term. More precisely, we needed a canonicalization function putting any term into its canonical form. On paper, we would use results from term rewriting theory in order to prove the existence of a normalization function with respect to some rewriting system. Unfortunately, the formalization of these results in Coq is only partial yet. Therefore we tried to apply the methods of terms rewriting theory to our problem but stopped when we realized the amount of work needed to complete the definition of the normalization function (section 1.2).
Then we observed that the existence of such a normalization function could be reduced to the existence of a normalization function for terms built over [direct sum] and 0 only. In other words, we can layer our terms into alternating levels of free and non-free constructors, and normalization of the overall term can be deduced from the normalization of all layers (section 1.3).
We formalized such a proof in Coq (section 1.4). Our results are to be applied in our proof of security of an API. However, the approach investigated here is not tied to the specific term algebra of this case study. We reflected this at several places in our formal development, by stating some key definitions and lemmas in an abstract setting, independently of any specific application.
1.2 BACKGROUND
1.2.1 The Need for a Canonicalization Function
In order to reason of our cryptographic system, we need to define some functions or properties over terms. In order to be consistent, we must ensure that these definitions are compatible with the equational theory E induced by commutativity, associativity and cancelation laws of [direct sum] and 0. We can define such a property P using one of the following ways:
• We define P on the whole set of terms (for instance by structural induction). We check afterwards that P is indeed compatible with equality, that is, we check that [MATHEMATICAL EXPRESSION OMITTED]
• We give a canonicalization function N for terms, i.e. a function N from terms to terms such that [MATHEMATICAL EXPRESSION OMITTED]. Then we define P over canonical forms only and extend it in the only compatible way. More formally, we first define some auxiliary predicate P' over terms (for instance by structural induction) and P(t) is defined as P'(N(t)). Thus P is compatible with E by construction.
In our case, the latter approach looks the only reasonable one. For instance, we needed to check whether some secret constant appears in a given term. Let us consider an example: it is clear that KM does not appear in {K1}H(EXP,KEK). But checking that it does not appear in KM [direct sum] {K1} H(EXP,KEK) [direct sum] KM is a bit more difficult, as you have to notice that, although KM syntactically is a subterm of this latter term, the occurences of KM have no semantic significance here because of the self-cancelation law. We therefore do not see how to give a definition for this notion by structural induction on the whole set of terms. On the other hand, once we applied all possible simplication rules (self-cancelation and neutral element cancelation, modulo associativity and commutativity), we just need to check a syntactical occurence of KM.
Therefore, we need to define a canonicalization function N over terms which is also a simplification function, i.e. it ensures that for all term t, no simplification rule applies to N(t).
1.2.2 What Term Rewriting Theory Says
The equational theory of [direct sum] and 0 can be described by the following equations:
[MATHEMATICAL EXPRESSION OMITTED] (1.1)
[MATHEMATICAL EXPRESSION OMITTED] (1.2)
Neutral element x [direct sum] 0 [??] x (1.3)
Involutivity x [direct sum] x [??] 0 (1.4)
Term rewriting theory modulo associativity and commutativity (AC), as described in would say that this equational theory is generated by the following rewrite system,...