Fun with Fixpoints
I was recently reading a paper on higher-order program verification. The paper uses higher-order modal fixpoint logic (HFL), an extension of the $\mu$-calculus with higher-order predicates. The authors also extended the logic with basic integer predicates and operations; however, they did not include explicit support for quantifiers over integers, noting that the expression $\exists x.\phi(x)$ can be encoded by the following HFL formula:
\[ \big(\mu X.\lambda x.\phi(x) \vee X(x - 1) \vee X(x + 1)\big) (0) \]
and similarly, $\forall x.\phi(x)$ can be encoded as follows:
\[ \big(\nu X.\lambda x.\phi(x) \wedge X(x - 1) \wedge X(x + 1)\big) (0) \]
(In the above, $\mu X.e$ denotes the least-fixed point of the function $F(X) = e$ where $e$ has $X$ free, and $\nu X.e$ denotes the greatest-fixed point of a similarly defined function.)
This construction makes a lot of intuitive sense! Let’s think about how we might write a computer program to verify each of these expressions. For the exists expression, we would start at zero and search outward through the positive and negative numbers until we found some number that satisfies $\phi$. For the $\forall$ expression, we would enumerate all negative and positive numbers (starting at 0) and verify that the property holds for each one. Of course, neither of these programs is guaranteed to terminate, but they provide a good intuition behind these constructions. Let’s try proving that these constructions actually do what we say.
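To make that intuition concrete, here is a minimal Haskell sketch of both search procedures (the names existsCheck and forallCheck, and the assumption that $\phi$ is available as a decidable predicate, are mine, not the paper's):

    -- Enumerate the integers outward from 0: 0, 1, -1, 2, -2, ...
    fromZero :: [Integer]
    fromZero = 0 : concatMap (\n -> [n, -n]) [1 ..]

    -- exists x. phi(x): search outward from 0 for a witness.
    -- Terminates (with True) exactly when a witness exists.
    existsCheck :: (Integer -> Bool) -> Bool
    existsCheck phi = any phi fromZero

    -- forall x. phi(x): check every integer outward from 0.
    -- Can only terminate (with False) by finding a counterexample.
    forallCheck :: (Integer -> Bool) -> Bool
    forallCheck phi = all phi fromZero

For instance, existsCheck (\n -> n * n == 49) returns True after reaching the witness 7, while forallCheck (\n -> n * n >= 0) never returns, which is exactly the non-termination caveat above.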
Note that in the above and in what follows, I will be deviating from the standard definitions of HFL. In particular, I will completely ignore the modal fragment of the logic. Thus, our semantic domain for propositions is $\{ \top, \bot \}$, abbreviated as $\textbf{B}$. We order it by $\bot \sqsubseteq \top$, so that $\textbf{B}$ is a complete lattice; functions into $\textbf{B}$, ordered pointwise, then form a complete lattice as well.
Exists as a Least Fixed-Point
Let’s first prove the $\exists$ case. We will show that, for any $\phi$:
\[ \def\meta#1{{\color{blue}#1}} \def\denote#1{\llbracket#1\rrbracket} \denote{\big(\mu X.\lambda x.\phi(x) \vee X(x - 1) \vee X(x + 1)\big) (0)} \mathrel{\meta{\Leftrightarrow}} \meta{\exists x}.\denote{\phi}(x) \]
Note that the $\Leftrightarrow$ and $\exists$ symbols are in our metalanguage, so I have colored them blue.
It is instructive to first consider why we chose a least-fixed point characterization for this encoding. Let’s switch the $\mu$ to $\nu$, and see what happens. Then we are considering the greatest-fixed point of (the denotation of) the following function:
\[ M : (int \rightarrow \bullet) \rightarrow (int \rightarrow \bullet) = \lambda X: (int \rightarrow \bullet). \lambda x: int.\phi(x) \vee X(x - 1) \vee X(x + 1) \]
(I have added type annotations here for clarity, although I will often drop them in the rest of this post. Here $\bullet$ is the type of propositions.)
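For concreteness, here is the same function as a small Haskell sketch, representing $int \rightarrow \bullet$ as Integer -> Bool and passing $\phi$ as an explicit first argument (the names Pred and bigM are mine):

    type Pred = Integer -> Bool

    -- bigM phi takes a predicate X and returns the predicate
    -- \x -> phi(x) \/ X(x - 1) \/ X(x + 1).
    bigM :: Pred -> Pred -> Pred
    bigM phi bigX x = phi x || bigX (x - 1) || bigX (x + 1)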
Unfortunately for us, the greatest fixed point of this function is actually $\lambda y.\top$, i.e., the function that trivially returns true! Let’s see how. First, let’s verify that $\lambda y.\top$ is the greatest fixed point of the denotation of $M$:
\[ \begin{align} \denote{M}(\lambda y.\top) & = \lambda x.\denote{\phi}(x) \mathrel{\meta{\vee}} (\lambda y.\top) (x - 1) \mathrel{\meta{\vee}} (\lambda y.\top) (x + 1) \\ & = \lambda x.\denote{\phi}(x) \mathrel{\meta{\vee}} \top \mathrel{\meta{\vee}} \top \\ & = \lambda x.\top \end{align} \]
Note that in the above, $\meta{\vee}$ is disjunction in our metalanguage. Alpha-renaming gives that $\denote{M}(\lambda y.\top) = \lambda y.\top$. Further, as $\lambda y.\top$ is the largest element of the functions of type $int \rightarrow \textbf{B}$, it must be the greatest such fixpoint.
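We can mirror this calculation in the Haskell sketch: even with a predicate that is nowhere true, one application of bigM to the top element const True is True at every argument, because the two const True disjuncts already decide the result.

    -- bigM (const False) (const True) x == False || True || True == True
    stillTop :: Pred
    stillTop = bigM (const False) (const True)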
With this greatest fixed point, the forward case of the bi-implication becomes:
\[ (\lambda y.\top) (0) \mathrel{\meta{\Rightarrow}} \meta{\exists x.}\denote{\phi}(x) \]
which, after simplification, asks us to prove that every $\phi$ is true for some $x$; this is clearly nonsense.
Let us return then to the least-fixed point characterization. We will first prove the forward direction via the contrapositive, i.e.:
\[ \meta{\big(}\meta{\forall x.\neg}\denote{\phi}(x)\meta{\big)} \mathrel{\meta{\Rightarrow}} \meta{\neg}\denote{\big(\mu X.\lambda x.\phi(x) \vee X(x - 1) \vee X(x + 1)\big)(0)} \]
Given that the $\denote{\phi}(x)$ term must always be $\bot$, the denotation of the $\mu$ term is equivalent to:
\[ \denote{\mu X.\lambda x.X(x - 1) \vee X(x + 1)} \]
which is just $\lambda y.\bot$: it is a fixed point, since $\bot \mathrel{\meta{\vee}} \bot = \bot$ at every argument, and it is the least element of $int \rightarrow \textbf{B}$, so it must be the least fixed point. Then $(\lambda y.\bot) (0) = \bot$, which completes the proof of the forward direction.
Let us now prove the backward direction. Assume the right-hand side holds, i.e., that $\denote{\phi}(y)$ holds for some $y$. Let us denote by $m$ the least fixed point of the denotation of $M$; $m$ is a predicate on integers, i.e., a function from $int$ to $\textbf{B}$. Then we must have $m(y)$: as $m$ is a fixpoint of $\denote{M}$, we have $m = \denote{M}(m)$, and expanding we have:
\[ \begin{align} m(y) = \denote{M}(m)(y) & = (\lambda x.\denote{\phi}(x) \mathrel{\meta{\vee}} m(x - 1) \mathrel{\meta{\vee}} m(x + 1))(y) \\ & = \denote{\phi}(y) \mathrel{\meta{\vee}} m(y - 1) \mathrel{\meta{\vee}} m(y + 1) = \top \end{align} \]
If $y$ is 0, we’re done! But if it’s not, without loss of generality let’s assume $y$ is positive (the negative case is symmetric). We can now prove by induction that $\denote{M}^n(m)(y - n)$ holds for every natural number $n$ (where $\denote{M}^n(m)$ denotes $n$ successive applications of the denotation of $M$ to $m$). The base case ($n = 0$) is just $m(y)$, which we established above. In the inductive step we have:
\[ \begin{align} \denote{M}^{n}(m)(y - n) & = \denote{\phi}(y - n) \mathrel{\meta{\vee}} \denote{M}^{n - 1}(m)((y - n) - 1) \mathrel{\meta{\vee}} \denote{M}^{n - 1}(m)((y - n) + 1) \end{align} \]
Simple algebra on the last term gives us that \[ \denote{M}^{n - 1}(m)((y - n) + 1) = \denote{M}^{n - 1}(m)(y - (n - 1)) \] which holds by the inductive hypothesis, so the whole disjunction is $\top$ and the induction goes through.
With this proven, we have that:
\[ \denote{M}^y(m)(y - y) = m(y - y) = m(0) = \denote{\big(\mu X.\lambda x.\phi(x) \vee X(x - 1) \vee X(x + 1)\big)(0)} \]
where the first equality follows from the fact that $m$ is a fixed point of $\denote{M}$, so that $\denote{M}^y(m) = m$. Since the induction at $n = y$ gives us that the left-hand side holds, we conclude that the encoded formula holds when applied to $0$, which completes the backward direction.
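This induction also has a computational reading, which we can sketch in Haskell (the function approx and its bound k are my own, not from the paper): applying $\denote{M}$ to the bottom element $\lambda y.\bot$ exactly $k$ times yields an approximation of the least fixed point that looks for a witness of $\phi$ within distance $k - 1$ of its argument. Since $\denote{M}$ is continuous, the least fixed point is the supremum of these approximants, so $m(0)$ holds exactly when some finite approximant finds a witness.

    -- k-th Kleene approximant of the least fixed point: k applications of
    -- bigM phi, starting from the bottom element (const False). This is a
    -- naive, exponential-time unfolding, purely to illustrate the idea.
    approx :: Pred -> Int -> Pred
    approx _   0 _ = False
    approx phi k x = phi x || approx phi (k - 1) (x - 1) || approx phi (k - 1) (x + 1)

For example, approx (== 3) 4 0 is True, because the witness 3 lies within distance 3 of 0, while approx (== 3) 3 0 is False; this mirrors the induction, where a witness at $y$ takes $y$ unfoldings to propagate back to $0$.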
Forall as a Greatest Fixed-Point
We can repeat this exercise with the characterization of a $\forall$ statement as a greatest fixed-point. We wish to prove that:
\[ \denote{\big(\nu X.\lambda x.\phi(x) \wedge X(x - 1) \wedge X(x + 1)\big) (0)} \mathrel{\meta{\Leftrightarrow}} \meta{\forall x.}\denote{\phi}(x) \]
Again, let us first consider why we use a greatest fixed-point instead of a least fixed-point. Here, we are considering the least fixed-point of the denotation of the following function:
\[ M = \lambda X.\lambda x.\phi(x) \wedge X(x - 1) \wedge X(x + 1) \]
In pleasing duality with the exists case above, the least fixed point of $\denote{M}$ is $\lambda y.\bot$. Plugging this least fixed point into our bi-implication, the backward direction would require us to prove, under the assumption that $\denote{\phi}$ holds at all $x$, that $(\lambda y.\bot) (0)$ holds, which is of course impossible, since $(\lambda y.\bot) (0) = \bot$.
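In the Haskell sketch, this $M$ is the conjunctive dual of bigM (again, the name bigMAll is mine), and a single application to the bottom element already shows why its least fixed point is $\lambda y.\bot$:

    -- The dual of bigM: conjunction in place of disjunction.
    bigMAll :: Pred -> Pred -> Pred
    bigMAll phi bigX x = phi x && bigX (x - 1) && bigX (x + 1)

    -- One application to the bottom element (const False) is False at every
    -- argument regardless of phi, so const False is a fixed point; being the
    -- bottom element, it is the least fixed point.
    stillBottom :: Pred -> Pred
    stillBottom phi = bigMAll phi (const False)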
Let us then prove that our greatest fixed point characterization is correct. Let the greatest fixed-point be called $g$, which is again a predicate on integers (equivalently, a function from $int$ to $\textbf{B}$). We will first prove the forward direction by showing that:
\[ \forall x.g(x) \Rightarrow g(x + 1) \wedge g(x - 1) \]
Let us prove that the implication holds for the first conjunct; the proof for the second follows by similar reasoning. As $g$ is a fixpoint of $\denote{M}$, we have that $g = \denote{M}(g)$, whence:
\[ g(x) = \denote{M}(g)(x) = \denote{\phi}(x) \mathrel{\meta{\wedge}} g(x - 1) \mathrel{\meta{\wedge}} g(x + 1) \Rightarrow g(x + 1) \]
Starting from our assumption that $g(0)$ holds (this is the left-hand side of the bi-implication), a simple inductive argument using the implication above shows that $g(x)$ holds at all $x$.
Next, again exploiting the fact that $g = \denote{M}(g)$, we can show that $g(x) \Rightarrow \denote{\phi}(x)$, which, combined with the fact that $g(x)$ holds at all $x$, gives us that $\denote{\phi}(x)$ must hold at all $x$.
We will now prove the backward direction. Assuming that ${\color{blue}\forall x.}\denote{\phi}(x)$, we will show that $\lambda y.\top \sqsubseteq g$. Tarski’s fixed-point theorem tells us that the greatest fixed point $g$ is the least upper bound of all post-fixed points of $\denote{M}$; in other words, for any $g'$ such that $g' \sqsubseteq \denote{M}(g')$, we have $g' \sqsubseteq g$. Thus, it suffices to show that $\lambda y.\top \sqsubseteq \denote{M}(\lambda y.\top)$. Using our assumption that ${\color{blue}\forall x.}\denote{\phi}(x)$, and expanding $\denote{M}$, we have that:
\[ \begin{align} \denote{M}(\lambda y.\top) & = \lambda x.\denote{\phi}(x) \mathrel{\meta{\wedge}} (\lambda y.\top) (x - 1) \mathrel{\meta{\wedge}} (\lambda y.\top) (x + 1) \\ & = \lambda x.\top \mathrel{\meta{\wedge}} \top \mathrel{\meta{\wedge}} \top = \lambda x.\top \end{align} \]
Alpha-renaming shows that $\lambda y.\top$ is not only a post-fixed point but in fact a fixed point of $\denote{M}$, and, because it is the top element of the domain $int \rightarrow \textbf{B}$, the greatest such fixed point! However, suppose we didn’t already know that $\lambda y.\top$ was our greatest fixed point. Tarski’s fixed-point theorem would still give us that $\lambda y.\top \sqsubseteq g$, which implies that $(\lambda y.\top) (0) \sqsubseteq g(0)$, which in turn implies $g(0)$, as required.
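We can mirror this last computation in the Haskell sketch as well: one application of bigMAll phi to const True collapses to phi itself, so whenever phi holds at every integer, const True maps to itself and is therefore a fixed point (and, as the top element, the greatest one).

    -- A concrete, everywhere-true predicate.
    tautology :: Pred
    tautology n = n * n >= 0

    -- bigMAll tautology (const True) x == tautology x && True && True == True
    fromTop :: Pred
    fromTop = bigMAll tautology (const True)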
Thanks to James Wilcox and Nate Yazdani for their feedback on early versions of this post.