Hillel Wayne recently published a blog post titled "Modeling Adversaries with TLA+". The post does a good job showing how adversaries and environmental effects can be modeled in TLA^{+}. The post is also subtly incorrect when it comes to two of the liveness properties. In other words, "Modeling Adversaries with TLA+" not only shows how to model adversaries but also comes with a spec to discuss the intricacies of temporal logic. :-)

Before we get started, let me make clear that the spec indeed satisfies all of the properties described in prose. The error is that some of the specified properties do not match their description but are instead trivially true. This is why Lamport - in e.g. the TLA^{+} video lecture #5 - cautions us to "Always be suspicious of success". Reasoning about liveness is delicate business and easy to get wrong.

To quickly recap, the spec models a system that converges to a set of goal states (`x \in Goal`

) and an environment `World`

that perturbs the system by knocking it out of the goal states. Below is a slightly simplified excerpt of the original spec:

```
VARIABLES x
TotalInterval == 0..10
Goal == 2..4
TypeInvariant == x \in TotalInterval
Init == x \in TotalInterval
Machine == IF x < 3 THEN x' = x + 1 ELSE x' = x - 1
World == x' \in TotalInterval
Next == Machine \/ World
Spec == Init /\ [][Next]_x /\ WF_x(Machine)
```

Several invariants and liveness properties are discussed that are satisfied by `Spec`

. Two properties however are incorrect; `FiniteWorldStable`

and `RareWorldResilient`

.

## FiniteWorldStable

The comment for the `FiniteWorldStable`

property says:

[...] If the world is only kicking things out of alignment a finite number of times, say one million, then it should still be stable. My argument is that after the millionth kick, we're now somewhere in TotalInterval and the spec is equivalent to one without the World. We can represent “finite World actions” by saying “it's not always eventually the case that World happens”, which we'd write in TLA+ as

`~[]<><<World>>_x`

. Here the <<>> means “an action that changes x”, not sequence.While Stable still doesn't hold, FiniteWorldStable does.

This is translated into the following TLA^{+} property:

```
Safe == x \in Goal
Stable == <>[]Safe
FiniteWorldStable == ~[]<><<World>>_x => Stable
```

The reasoning in the comment is correct. Checking `FiniteWorldStable`

with TLC confirms that the property holds. Being suspicious of success though, let us alter `FiniteWorldStable`

to `~[]<><<World>>_x => ~<>[](x \in Goal)`

; we negated the implication's consequent `Stable`

. Unfortunately, this also holds! Even substituting the consequent with `23=42 /\ 23#42`

holds. The reason is that `Spec => FiniteWorldStable`

is vacuously true. We will discuss why it is vacuously true later. First we will shift our attention to `RareWorldResilient`

:

## RareWorldResilient

[...] to get resilience we don't need to require World to only happen finite times. Instead, we only need to guarantee it happens finite times while we're out of equilibrium. If eventually the world only kicks x out of Goal when it's already in Goal, then we're giving our machine enough time to return x to Goal and we have resilience.

Another way of looking at it: if World happens rarely enough, say one-tenth as often as Machine, then we'll return to Goal before the next World action pushes us out again.

This property holds.

This is translated into `RareWorldResilient`

:

```
Resilient == []<>Safe
RareWorldResilient == <>[][World => Safe]_x => Resilient
```

As with `FiniteWorldStable`

above, the prose is convincing and TLC once again confirms that `RareWorldResilient`

holds. But as we have now learned, implication can be tricky. Let us try and replace the consequent of the inner implication: `<>[][World => 23 # 42]_x => Resilient`

. Checking the modified property with TLC should make us suspicious.

## What is the problem?

At this point we will zoom out and discuss the underlying problem. For that we will first need Lamport's (see "The Temporal Logic of Actions" (section 4.1)) definition of what an *action-step* is. Let `s,t`

be a pair of states (a state `s`

and its successor state `t`

such as `x=8,x=7`

). The pair `s,t`

is a `World`

step iff `s[[World]]t`

equals true. We obtain `s[[World]]t`

by replacing each unprimed variable `x`

in `World`

with the value of `x`

in state `s`

and each primed occurrence of the variable `x`

with the value of `x`

in the successor state `t`

. For example, the pair `x=8,x=7`

is a `World`

step because `7 \in TotalInterval`

equals true. However, the same argument applies to `s[[Machine]]t`

because `7 = 8 - 1`

. Thus, the pair `x=8,x=7`

is both a `World`

and a `Machine`

step. The pair `x=8,x=7`

is not the only pair which is a `World`

and `Machine`

step. Due to the definitions of the `World`

and `Machine`

actions, the spec implies that any `Machine`

step is also a `World`

step (but not vice versa)! The only behaviors satisfying the spec and the property `~[]<><<World>>_x`

are ones with a (finite) prefix of `World`

steps followed by infinite stuttering. The spec's *fairness* conjunct `WF_x(Machine)`

however, asserts that there are infinitely many (non-stuttering) `Machine`

steps. Thus `~[]<><<World>>_x`

never holds and `Spec => FiniteWorldStable`

is vacuously true.

To approach this from a different angle, we will substitute for the identifier `World`

its definition `x' \in TotalInterval`

in `FiniteWorldStable`

to obtain `~[]<><<x' \in TotalInterval>>_x => Stable`

. This immediately looks funny. But can we verify it too? We can use TLC to verify that `[][x' \in TotalInterval]_x`

(always) holds for all behaviors of spec (the type invariant `TypeInvariant`

confirms it too) and therefore `~[]<><<x' \in TotalInterval>>_x`

has to be always false. To repeat, all steps of all behaviors satisfying `Spec`

are always `World`

steps regardless of whether a step "occurred" by taking a `Machine`

or `World`

*action*.

The property `RareWorldResilient`

suffers from the same flaw. By replacing `World`

with its definition we obtain `<>[](x' \in TotalInterval => Safe]_x => []<>Safe`

. By what we discussed in the previous paragraph, `RareWorldResilient`

thus equals `<>[][TRUE => Safe]_x => []<>Safe`

which can be reduced to `<>[][Safe]_x => []<>Safe`

. The formula `<>[]P => []<>P`

is a tautology in temporal logic. Semantically, if `Safe`

is eventually always true - in any behavior of `Spec`

- then it follows that `Safe`

is also repeatedly true.

How can we address this problem? We somehow have to make sure that `Spec`

no longer implies that `Machine`

steps are also `World`

steps.

## How do we fix this?

One approach would be to modify the definition of `World`

to `x' \in (TotalInterval \ Goal)`

. This causes `Machine`

steps in the `Goal`

set to not be `World`

steps. For example `x=2,x=3`

would be a `Machine`

but not a `World`

step. We can verify this by checking that `[][World]_x`

is violated.

Unfortunately, this approach has drawbacks: The original definition of `World`

allows for behaviors (satisfying the spec) where the variable `x`

non-deterministically changes in the `Goal`

interval. Especially it allows behaviors with e.g. a `World`

step such as `x=3,x=2`

(all `Machine`

steps are `World`

steps but not vice versa). Semantically this models an adversary/environment that perturbs our `Machine`

but that we define to "play by the rules" in certain states. This is upside down because we want to make as few assumptions about the adversary as possible. On a tangent, we could even argue that `x' \in TotalInterval`

should be broadened to `x' \in Nat`

to model all (type-correct) values and - for model checking - be redefined with a *Definition Override* to a *finite* set of e.g. `0..6`

.

There is a more subtly problem with changing the `World`

action though. Let us use TLC to generate all steps of `Spec`

that are not `World`

steps. We again check the property `[][World]_x`

but in order for TLC to print *all* violations (which correspond to the steps we are interested in), we use the -continue parameter[^1]. TLC reports the following five violations:

- x=2,x=3
- x=3,x=2
- x=4,x=3
- x=1,x=2
- x=5,x=4

As we can see, this is only a subset of the possible `Machine`

steps of the specification. In other words, by modifying the `World`

action to `x' \in (TotalWorld \ Goal)`

, the `FiniteWorldStable`

property *only* asserts that `Spec`

is guaranteed to reach a `Goal`

state if it already is in a `Goal`

state (1,2,3) or "on the brink" of it (4,5). The property still does not assert that - given no `World`

steps from some point forward - `Spec`

will eventually converge to stability when starting from *any* state in `TotalInterval`

. If we for example are in a state `x=6`

, the property `FiniteWorldStable`

does not assert that the `Machine`

will converge (we just happen to know it will because this is a toy example).

To make this more explicit, we will temporarily introduce a "bug" in our specification by changing the `Machine`

action to not converge to a stable state if the value of the variable `x`

is greater than 6 (the upper bound `x < 10`

prevents the model from being infinite).

```
ChangeX ==
/\ IF x < 3 THEN x' = x + 1 ELSE
IF x > 6 /\ x < 10 THEN x' = x + 1
ELSE x' = x - 1
```

The property `FiniteWorldStable`

still holds. Our "programming bug" above will go unnoticed!

One might also try and remove `Machine`

from the next-state relation to show that `FiniteWorldState`

does not reveal violations even with the `TotalInteval \ Goal`

modification. After all, if no `Machine`

steps are possible and `World`

no longer gets us into a stable state, `FiniteWorldState`

should be violated. This however does not work for reasons beyond what this post can cover (Lamport calls this a "weird spec").

## Another fix?

In "Auxiliary Variables in TLA+" (section 3.4) Lamport and Merz discuss how *history variables* serve as a "helper" to state properties that we want to show are satisfied by a spec. While being primarily used in the scope of refinement mappings, "misusing" an auxiliary (history) variable will help us distinguish `World`

from `Machine`

steps. For `Spec`

we simply amend the invariant `TypeInvariant`

, the *state predicate* `Init`

and the two `World`

and `Machine`

actions:

```
VARIABLES x,h
TypeInvariant ==
/\ x \in TotalInterval
/\ h \in {"i","m","w"}
Machine ==
/\ IF x < 3 THEN x' = x + 1 ELSE x' = x - 1
/\ h' = "m"
Init ==
/\ x \in TotalInterval
/\ h = "i"
World ==
/\ x' \in TotalInterval
/\ h' = "w"
```

With this modification e.g. one behavior allowed by `Spec`

will be `<x=7,h="i">, <x=6,h="m">, <x=0,h="w">, <x=1,h="m">, <x=0,h="w">, ...`

where `....`

represents infinite more steps. The first step of the behavior has to be an `Init`

step because `<x=7,h="i">[[Init]]<x=6,h="m">`

equals true and false for `World`

and `Machine`

. Likewise the pair `<x=0,h="m">, <x=1,h="m">`

is a `Machine`

and not a `World`

step and so on. With this change in place `FiniteWorldStable`

and `RareWorldResilient`

finally assert what is stated in their comments.

```
FiniteWorldStable == ~[]<><<(x' \in TotalInterval /\ h' = "w")>>_x => Stable
RareWorldResilient == <>[][(x' \in TotalInterval /\ h' = "w")) => Safe]_x => Resilient
```

Having learned from past mistakes, we now experiment with the properties by e.g. negating the antecedent and/or the consequent to gain some confidence in their correctness.

In general a history variables comes with an increase in the size of the state space. For the given spec the introduction of the history variable `h`

almost triples the size from 11 to 31 (distinct) states. While this is not a problem here, it can be for real-world specs. Since we are checking liveness properties, we are barred from using the symmetry reduction technique; liveness checking and symmetry reduction are - for now - incompatible (TLC and the Toolbox will emit a warning if we try this)[^2].

One might also argue that the introduction of a history variable has a second undesired consequence: How can we possibly expect an adversary to be so nice and properly "mark" his steps? Instead of assuming the adversary to mark steps, we can instead consider the history variable to represent message authentication codes i.e. `Machine`

"messages" include an (unforgeable) signature (modeled above as `"m"`

). We will stop here and ignore this problem because I wanted to constrain this post to leave the original properties untouched. The goal was to make the properties assert what the comments says. We might revisit modeling adversaries in a future post though to discuss alternative ways to state the properties.

The primary purpose of this post is to remind us of two things: First, we should always be suspicious of success! "*Model checking [...] is most informative when it returns a counter-example, not when it says that the property holds, which may be vacuous*". Second, decomposing a next-state relation into (sub) actions makes a spec more comprehensible for a human. It however does not give us a higher-level concept to reason about. Or as Lamport puts it: "Processes are in the eye of the beholder".

We end this post with a little puzzler: What hack does the linked BlockingQueue spec below (gist with better syntax highlighting) "pull" to make the `NoPStarvation`

property reveal that the spec does not guarantee starvation-freedom for *producers*? Why doesn't the same hack work for *consumers*, i.e. `NoCStarvation`

is not violated? Let me know what you find.

Thanks to Leslie Lamport and Ron Pressler for reviews of this post.

([^1]: The *-continue* parameter is not yet supported by the TLA Toolbox.)

([^2]: What does work for this spec is to use a *view* function `<<x>>`

to retain the original size of the state space even with a history variable. This is however due to the particular shape of the state space and does not generalize. A *view* is great to shoot yourself in the foot!)

```
--------------------------- MODULE BlockingQueueBlog ---------------------------
EXTENDS Naturals, Sequences, FiniteSets, TLC
C == {"a", "b"} \* Two consumers and producers...
P == {"c", "d"} \* ...which are usually better model as sets of model values.
K == 1 \* Fix queue size to 1 element
\* We assume the following properties even though the spec doesn't use CONSTANTS.
ASSUME /\ IsFiniteSet(C) /\ IsFiniteSet(P) \* Finite sets
/\ C \intersect P = {} \* A producer is no consumer and vice versa
ASSUME K \in (Nat \ {0}) \* K is a natural number without zero
------------------------------------------------------------------
\* Nondeterministically notify (wake) a process from the given waitset.
Notify(w) == IF w # {}
THEN \E i \in w: w' = w \ {i}
ELSE UNCHANGED << w >>
------------------------------------------------------------------
VARIABLES store, \* The (internal) storage for the queue
waitP, \* A waitset for producers
waitC \* A waitset for consumers
vars == << store, waitP, waitC >>
\* A type correctness invariant asserting that...
TypeOK == /\ waitP \subseteq P \* only producers are in waitP
/\ waitC \subseteq C \* only consumers are in waitC
/\ DOMAIN store \subseteq 0..K \* store is a sequence for which the
\* elements of the codomain are undefined.
(***************************************************************************)
(* Initially consumers and processes are alive and the queue is empty. *)
(***************************************************************************)
Init == /\ store = <<>>
/\ waitP = {}
/\ waitC = {}
(***************************************************************************)
(* A producer either waits if the queue is full or appends an element. *)
(***************************************************************************)
p1(self) == /\ Len(store) = K
/\ waitP' = (waitP \cup {self})
/\ UNCHANGED << store, waitC >>
p2(self) == /\ Len(store) # K
/\ Notify(waitC)
/\ store' = Append(store, self)
/\ UNCHANGED << waitP >>
(***************************************************************************)
(* A consumer either waits if the queue is empty (c1) or picks one element *)
(* (c2). c2 is the critical section from the perspective of the consumer. *)
(***************************************************************************)
c1(self) == /\ Len(store) = 0
/\ waitC' = (waitC \cup {self})
/\ UNCHANGED << store, waitP >>
c2(self) == /\ Len(store) # 0
/\ Notify(waitP)
/\ store' = Tail(store)
/\ UNCHANGED << waitC >>
(***************************************************************************)
(* Let the scheduler non-deterministically schedule consumer and *)
(* producers processes. *)
(* *)
(* Correct Next to not schedule waiting processes. Thanks to Hillel for *)
(* making me aware of this issue. *)
(***************************************************************************)
Next == \/ (\E self \in (P \ waitP): p1(self) \/ p2(self))
\/ (\E self \in (C \ waitC): c1(self) \/ c2(self))
Spec == Init /\ [][Next]_vars /\ WF_vars(Next)
THEOREM Spec => []TypeOK
------------------------------------------------------------------
\* Not all consumers and producers wait at once.
NoDeadLock == (waitP \cup waitC) # (C \cup P)
THEOREM Spec => []NoDeadLock
\* All consumers eventually dequeue elements.
NoCStarvation == \A c \in C: []<>(<<c2(c)>>_<<store>>)
\* All producers eventually enqueue elements.
NoPStarvation == \A p \in P: []<>(<<p2(p)>>_<<store>>)
=============================================================================
```