Getting Our Examples In Order

Most of the previous posts have focussed on \mathsf{Set}-monads and their algebraic presentations. This was done to avoid pulling in lots of abstract machinery, whilst allowing us to still build up good intuitions about what is going on.

Unfortunately, some phenomena only truly reveal themselves when we move away from the category of sets and functions. To develop our understanding, we need additional proving grounds for the ideas we discuss. In the current post, we shall begin discussing monads on ordered structures. This is a baby step away from \mathsf{Set}, where new possibilities emerge, but we still work with simple mathematical objects that are relatively easy to describe and visualise.

Basics of Posets and Preorders

Posets and preorders are two related mathematical structures that abstract the idea of an ordered collection of objects. A preorder P consists of a set, which we shall also denote P, and a reflexive, transitive binary relation \leq on P. That is, a relation satisfying both:

  • Reflexivity: p_1 \leq p_1.
  • Transitivity: p_1 \leq p_2 and p_2 \leq p_3 implies p_1 \leq p_3.

A poset is a preorder for which the relation \leq is also antisymmetric. That is:

p_1 \leq p_2 \;\wedge\; p_2 \leq p_1 \;\Rightarrow\; p_1 = p_2

Example: The set \mathbb{N} of natural numbers is a poset, with their usual ordering.

Example: For any set X, the powerset \mathcal{P}(X) is a poset, with the subset inclusion relation.

Example: For any set X, the finite powerset \mathcal{P}_\omega(X) can be ordered with U \leq V if the number of elements of U is less than or equal to the number of elements of V. This gives the finite powerset of X the structure of a preorder, but not a poset: for distinct elements x and y, both \{ x \} and \{ y \} are finite subsets of \{ x, y \} with the same number of elements, but they are not equal.

A morphism of preorders or posets h : (P, \leq_P) \rightarrow (Q, \leq_Q) is a function h : P \rightarrow Q such that:

p_1 \leq_P p_2 \Rightarrow h(p_1) \leq_Q h(p_2)

Such functions are referred to as monotone. The categories of preorders and monotone maps, and of partial orders and monotone maps, will be denoted \mathsf{Pre} and \mathsf{Pos} respectively.

As well as considering the categories of preorders and posets, it is worth thinking about these objects as special sorts of categories themselves. From this perspective:

  • A preorder is a category in which there is at most one morphism between a pair of objects. Such a category is sometimes described as being thin.
  • A poset is a thin category in which the only isomorphisms are the identities.

(Strictly speaking we should say that preorders and posets are small categories, but we’ll steer clear of discussing size issues.)

Remark: The fact that preorders and posets are “baby” categories is a useful tool for studying category theory. Such examples are often introduced very early in introductory category theory courses, but are nevertheless powerful aids to understanding. If a categorical construction is proving hard to grasp in full generality, considering it on preorders means that all diagrams automatically commute. This means that any equational axioms will be satisfied, once you’ve identified morphisms of the right type. Even simpler, if you consider the special case of posets, any structural isomorphisms, for example those that occur in the definition of a monoidal category, will collapse to being equalities, so again lots of things become trivial. These simplifications can remove a lot of distractions whilst playing around with unfamiliar constructions, or trying to tease out the correct abstraction of a phenomenon of interest.

Relatives of the Maybe Monad

Previously, we introduced the maybe monad on sets and functions. Recall the functor action maps a set X to the set X + 1, with an extra element added. Concretely, this operation is the coproduct with (a choice of) terminal object, so up to isomorphism the functor action is the disjoint union:

X + 1 = X \uplus \{ \star \} =  \{ (1, x) \mid x \in X \} \cup \{ (2,\star) \}

If we think of this as a construction involving binary coproducts and terminal objects, we can define a maybe monad in any category where this structure exists. In the category \mathsf{Pos} the coproduct (P, \leq_P) + (Q, \leq_Q) has underlying set P \uplus Q, and order relation x \leq y if and only if either:

  • x = (1, p_1) and y = (1, p_2) and p_1 \leq_P p_2.
  • x = (2, q_1) and y = (2, q_2) and q_1 \leq_Q q_2.

That is, we just glue together the two components, and inherit the order from the two parts.

The terminal object in \mathsf{Pos} is also similar to that in \mathsf{Set}. The underlying set is \{ \star \}, and the order relation is the inevitable \star \leq \star.

With these concrete constructions in place, on objects the maybe monad in \mathsf{Pos} simply adds an extra element to a poset, unrelated to any of the elements of the original structure.

From a computational point of view, it is sometimes useful to interpret an order relation on a structure as an information or approximation ordering. In this way, p_1 \leq p_2 is interpreted as p_1 being less informative than p_2.

If we adopt this point of view, the maybe monad is not entirely satisfactory when interpreting \star as computation failure or divergence. Such a computation should yield no information, and therefore \star should be a least element, rather than simply incomparable with everything else.

This suggests considering a construction on a poset (P, \leq_P) with underlying set P \uplus \{ \star \} as before. The order relation extends that of the maybe monad with the additional relations:

(2, \star) \leq (1,p)

for all p \in P. That is, \star is now placed at the bottom of the resulting structure. In fact, this yields a new monad on \mathsf{Pos}, often termed the lift monad. The name is derived from the idea that the original structure is lifted up above the new element. Notice how this additional construction doesn’t even make sense on mere sets, so the presence of order allows us to introduce new features – this will be a recurring theme.
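
To make the difference concrete, here is a small Haskell sketch of the two orderings. The names (Preorder, FlatMaybe, Lift) are ours, purely for illustration, and of course the compiler does not enforce the preorder laws.

    -- A preorder as a type class; leq is assumed reflexive and transitive.
    class Preorder a where
      leq :: a -> a -> Bool

    -- The maybe monad on Pos: the new element is incomparable with the rest.
    newtype FlatMaybe a = FlatMaybe (Maybe a)

    instance Preorder a => Preorder (FlatMaybe a) where
      leq (FlatMaybe (Just x)) (FlatMaybe (Just y)) = leq x y
      leq (FlatMaybe Nothing)  (FlatMaybe Nothing)  = True
      leq _                    _                    = False

    -- The lift monad: the new element sits below everything else.
    newtype Lift a = Lift (Maybe a)

    instance Preorder a => Preorder (Lift a) where
      leq (Lift Nothing)  _               = True
      leq (Lift (Just x)) (Lift (Just y)) = leq x y
      leq _               _               = False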

Relatives of the List Monad

The list monad is such a standard example that we should investigate list like constructions in our new setting. For a set X, we shall write L(X) for the set of lists of elements from X.

For a poset (P, \leq_P), we would like to build a monad \mathbb{L} such that \mathbb{L}(P, \leq) has underlying set L(P). We shall also require the unit morphism

\eta_{(P,\leq)} : (P, \leq_P) \rightarrow \mathbb{L}(P, \leq)

to map elements to singleton lists p \mapsto [p] as usual, and the multiplication to be list concatenation.

As the unit morphism must be a monotone map, we then must have for all p_1, p_2 such that p_1 \leq p_2:

[p_1] \leq [p_2]

in \mathbb{L}(P,\leq). We shall call this the singleton axiom. By reflexivity, we also need for every list l that l \leq l. In fact, these are all the relations we need, and the resulting construction yields a monad, which we shall denote \mathbb{L}_1 for convenience.

So we have found a list monad structure on \mathsf{Pos}, but it’s a bit ugly. If we think of these lists as data structures, why would we only order singleton lists? This seems a bit unnatural. Instead, it might be more natural for lists of the same length [p_1,\ldots,p_n] and [q_1,\ldots,q_n] to have

[p_1,\ldots,p_n] \leq [q_1,\ldots,q_n]

if

p_1 \leq_P q_1 \; \wedge\; \ldots \; \wedge \; p_n \leq_P q_n

We shall call this the pointwise ordering axiom. Adding pointwise ordering to the construction \mathbb{L}_1 yields a second monad, which we shall denote \mathbb{L}_2.

We could also order lists based on their endpoints, so that p_n \leq_P q_1 implies:

[p_1, \ldots, p_n] \leq [q_1,\ldots,q_m]

We shall call this the endpoints axiom. Adding this axiom to the constructions for \mathbb{L}_1 or \mathbb{L}_2 yields two more monads \mathbb{L}_3 and \mathbb{L}_4.

Varying things in another direction, we could restrict our attention to lists

[p_1, \ldots, p_n]

such that for all consecutive elements p_i, p_{i + 1}

p_i \leq_P p_{i + 1}

Clearly, we must still require the singleton axiom, and the relations implied by reflexivity. In fact, this construction yields another monad, which we shall denote \mathbb{L}_5. Adding the endpoint axiom to this construction yields a further monad \mathbb{L}_6.

Notice that we do have to be careful about these variations though. Adding in the pointwise ordering axiom to \mathbb{L}_5 or \mathbb{L}_6 does not result in a valid monad, as list concatenation won’t be monotone so the multiplication is not well-defined. For example, for natural numbers with their usual ordering, pointwise ordering yields

[0,3] \leq [1,4]

but the concatenation of the ordered list

[[0,3],[1,4]]

is

[0,3,1,4]

which is not correctly ordered.
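
We can watch this failure happen in a few lines of Haskell, using the natural numbers with their usual order as the base poset. A quick sanity check; the function names are ours.

    -- The pointwise order on lists of equal length.
    pointwiseLeq :: Ord a => [a] -> [a] -> Bool
    pointwiseLeq xs ys = length xs == length ys && and (zipWith (<=) xs ys)

    -- Is a list ordered, i.e. is each element below its successor?
    isOrdered :: Ord a => [a] -> Bool
    isOrdered xs = and (zipWith (<=) xs (drop 1 xs))

    -- [0,3] and [1,4] are ordered, and pointwiseLeq [0,3] [1,4] holds,
    -- but flattening [[0,3],[1,4]] produces a list that is not ordered:
    broken :: Bool
    broken = isOrdered (concat [[0, 3], [1, 4 :: Int]])   -- evaluates to False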

Conclusion

Moving to categories of ordered structures, even whilst restricting our attention to constructions originating with the maybe and list monads, we found a plethora of new possibilities. This newfound freedom all felt slightly wild, with many options for extending and varying constructions that were more straightforward in \mathsf{Set}. Not all the combinations of features yielded valid monads, so care was needed to (hopefully!) avoid mistakes. It is natural to wonder whether we can make things more systematic. For example, where did the six monads \mathbb{L}_1, \ldots, \mathbb{L}_6 come from? Are there more? What goes on with other constructions such as the multiset monad?

We shall look at further examples, and the question of whether there are more instructive methods we can bring to bear, in later posts.

Coproducts of Monads

Often the simplest way to form a combination of two objects in category theory is to take their coproduct. In this post, we are going to investigate coproducts of monads, beginning as usual from an algebraic perspective.

Sums of Theories

We begin by considering a pair of \mathsf{Set}-monads given by equational presentations. We shall investigate how a suitable method of combining those presentations yields a corresponding operation on monads.

Given two equational presentations, (\Sigma_1, E_1) and (\Sigma_2, E_2), we can define a new equational presentation with:

  • Operations the disjoint union \Sigma_1 \uplus \Sigma_2.
  • Equations given by the union E_1 \cup E_2.

We shall refer to this as the sum (\Sigma_1, E_1) + (\Sigma_2, E_2) of the two component theories.

This combines the two equational theories such that there is “no interaction between them”. This can be seen by considering the algebras of the resulting theory. An algebra for (\Sigma_1, E_1)  + (\Sigma_2, E_2) consists of

  1. A fixed choice of universe A.
  2. A (\Sigma_1, E_1)-algebra structure on A.
  3. A (\Sigma_2, E_2)-algebra structure on A.

The sum of theories induces an operation \mathbb{T}_1 \oplus \mathbb{T}_2 on the monads induced by these equational presentations. In fact, the resulting monad is the coproduct of \mathbb{T}_1 and \mathbb{T}_2.

Coproducts of Free Monads

Forming coproducts of monads via an operation on their algebraic presentations as we have just described has some limitations. In particular:

  1. To exploit it, we need to identify presentations for our monads of interest.
  2. We don’t get a description of the monad \mathbb{T}_1 \oplus \mathbb{T}_2 directly in terms of the component monads \mathbb{T}_1 and \mathbb{T}_2.
  3. Realistic applications will probably involve categories other than \mathsf{Set}. We would prefer constructions that generalise smoothly to other settings.

To address some of these problems, we shall begin by examining the simple case of coproducts of free monads. For a pair of signatures \Sigma_1 and \Sigma_2, the free monads for the corresponding signature functors must satisfy:

\Sigma_1^* \oplus \Sigma_2^* \cong (\Sigma_1 + \Sigma_2)^*

This follows immediately from the universal property, or via a simple direct calculation. We then recall that free monads can be constructed using least fixed points, so we have:

(\Sigma_1^* \oplus \Sigma_2^*)(X) \cong \mu Z. X + \Sigma_1(Z) + \Sigma_2(Z)

We can think of this as building the coproduct in stages. In the first stage, we begin with the set of variables X. In subsequent stages, we:

  1. Apply every operation symbol in \Sigma_1 to terms constructed in the previous stage.
  2. Apply every operation symbol in \Sigma_2 to terms constructed in the previous stage.
  3. Add back in the set of variables X again.

Example: Let \Sigma_1 and \Sigma_2 both contain a single unary operation symbol, \sigma_1 and \sigma_2 respectively. The terms over set of variables \{ x \} will be:

  1. x.
  2. x, \sigma_1(x), \sigma_2(x).
  3. x, \sigma_1(x), \sigma_2(x), \sigma_1(\sigma_1(x)), \sigma_2(\sigma_2(x)), \sigma_1(\sigma_2(x)), \sigma_2(\sigma_1(x))
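
In Haskell, we can realise the fixed point \mu Z.\; X + \Sigma_1(Z) + \Sigma_2(Z) as a recursive datatype, with the two signatures as functor parameters. A sketch only, with names of our own choosing:

    {-# LANGUAGE DeriveFunctor #-}

    -- muZ. X + Sigma1(Z) + Sigma2(Z), with the signatures as functors.
    data SumTerm f g x
      = Var x                      -- a variable
      | Op1 (f (SumTerm f g x))    -- a Sigma1 operation applied to terms
      | Op2 (g (SumTerm f g x))    -- a Sigma2 operation applied to terms

    -- The running example: each signature has a single unary operation.
    newtype U a = U a deriving Functor

    -- The term sigma2(sigma2(sigma1(x))) appearing in the next example.
    example :: SumTerm U U Char
    example = Op2 (U (Op2 (U (Op1 (U (Var 'x'))))))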

This lets us build up the elements of the coproduct of free monads, using the underlying signature functor. This is a step in the right direction, but ideally we would use the monads themselves in the construction. To try and achieve this, we notice that each term consists of layers of operations from one signature, followed by layers of operations from the other signature, eventually applied to variable symbols.

Example: Continuing the previous example, we end up with terms like:

\sigma_2(\sigma_2(\sigma_1(x)))

We can see this as a layer of two \Sigma_2 operations, followed by a layer with one \Sigma_1 operation applied to a variable.

Can we build up these layers directly, using the free monads \Sigma^*_1 and \Sigma^*_2? A naive first attempt would be to try the fixed point construction:

\mu Z. X + \Sigma^*_1(Z) + \Sigma^*_2(Z)

If we consider the first few approximants to this fixed point, we get:

  1. X + \Sigma_1^*(\emptyset) + \Sigma_2^*(\emptyset).
  2. X + \Sigma_1^*(X + \Sigma_1^*(\emptyset) + \Sigma_2^*(\emptyset)) + \Sigma_2^*(X + \Sigma_1^*(\emptyset) + \Sigma_2^*(\emptyset)).

This is not quite what we want. For example, \Sigma_1^*(X + \Sigma_1^*(\emptyset) + \Sigma_2^*(\emptyset)) will contain a copy of the variables. So we are going to end up with too many copies of the variables in our fixed point. Intuitively, \Sigma_1^*(X) muddles together both bare variables and non-trivial terms.

Hopefully this rings a bell. We have previously seen that free monads are ideal monads. Recall that an ideal monad abstracts the idea of “keeping variables separate”. An ideal monad \mathbb{T} has a subfunctor \mathbb{T}' such that:

\mathbb{T}(X)  \cong X + \mathbb{T}'(X)

In the case of free monads {\Sigma^{*}}'(X) contains all the guarded terms. These are exactly the terms that are not bare variables. As a second attempt at a fixed point construction, we consider

\mu Z. X + {\Sigma^*}'_1(Z) + {\Sigma^*}'_2(Z)

Unfortunately, this is still not quite right. The first few approximants to this fixed point are:

  1. X + {\Sigma^*}'_1(\emptyset) + {\Sigma^*}'_2(\emptyset).
  2. X + {\Sigma^*}'_1(X + {\Sigma^*}'_1(\emptyset) + {\Sigma^*}'_2(\emptyset)) + {\Sigma^*}'_2(X + {\Sigma^*}'_1(\emptyset) + {\Sigma^*}'_2(\emptyset)).

The problem now is that this construction will build up some terms in more than one way, resulting in redundant duplicate copies.

Example: Continuing our previous example, the term \sigma_1(\sigma_1(x)) arises either directly as a term in {\Sigma^*}'_1(X), or in two steps, as an element of {\Sigma^*}'_1({\Sigma^*}'_1(X)), by applying \sigma_1 to the term \sigma_1(x).

To address this, we must be more careful to build up terms in alternating layers of operations from each signature, in a unique way. To achieve this explicit alternation, we instead solve a fixed point equation in two variables:

(\Sigma^\dagger_1(X), \Sigma^\dagger_2(X)) = \mu (Z_1, Z_2). ({\Sigma^*}'_1(X + Z_2),{\Sigma^*}'_2(X + Z_1))

This will produce terms with alternating layers of operations from the two signatures. \Sigma^\dagger_1(X) will contain terms with a \Sigma_1 operation on top, and dually for \Sigma^\dagger_2. We then recover:

(\Sigma_1^* \oplus \Sigma_2^*)(X) \cong X + \Sigma^\dagger_1(X) + \Sigma^\dagger_2(X)
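
As a type-level sketch, we can transcribe this two-variable fixed point into Haskell, reading lazy datatypes as if they were least fixed points. The names are ours; Dag1 plays the role of \Sigma^\dagger_1, and dually for Dag2.

    -- The free monad over a signature functor f, and its guarded part.
    data Free f x = FVar x | FOp (f (Free f x))
    newtype Guarded f x = Guarded (f (Free f x))

    -- The two-variable fixed point: terms with a Sigma1 layer on top,
    -- whose leaves are either variables or terms with a Sigma2 layer
    -- on top, and vice versa.
    newtype Dag1 f g x = Dag1 (Guarded f (Either x (Dag2 f g x)))
    newtype Dag2 f g x = Dag2 (Guarded g (Either x (Dag1 f g x)))

    -- (Sigma1* (+) Sigma2*)(X) = X + Sigma1-dagger(X) + Sigma2-dagger(X)
    data Coprod f g x = CVar x | CLeft (Dag1 f g x) | CRight (Dag2 f g x)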

Coproducts of Ideal Monads

So far, we have rearranged the construction of a coproduct of free monads into language which involves fixed points. Has all this reshuffling got us anywhere?

Well, we could consider applying the same construction to arbitrary ideal monads. Firstly, we calculate the fixed point:

(\mathbb{T}^\dagger_1(X), \mathbb{T}^\dagger_2(X)) = \mu (Z_1, Z_2). (\mathbb{T}'_1(X + Z_2), \mathbb{T}'_2(X + Z_1))

Assuming the required fixed points exist, we can then form the coproduct as:

(\mathbb{T}_1 \oplus \mathbb{T}_2)(X) \cong X + \mathbb{T}^\dagger_1(X) + \mathbb{T}^\dagger_2(X)

It is worth considering what this says. Firstly, from the algebraic perspective we’re now building up equivalence classes of terms under provable equality, not just raw terms. The intuitive reason that this construction works is that the layers cannot interact. We can prove equalities within a layer, but once a layer of terms from the other theory is inserted, this layer is insulated from interacting with anything else. Hence, we can build up the equivalence classes of \mathbb{T}_1 \oplus \mathbb{T}_2, not just the underlying terms, inductively in layers, which is rather a pleasing outcome.

In fact, this is true for arbitrary categories and ideal monads, as long as the required fixed points exist. For example, we can also apply this result to the completely iterative monads that we saw previously.

Summing Up

The ideas in this post mainly follow Ghani and Uustalu’s “Coproducts of Ideal Monads”, which is strongly recommended for further reading. There are other explicit constructions for coproducts of monads, for example in the case of a free monad and an arbitrary monad, as described in Hyland, Plotkin and Power’s “Combining Effects: Sum and Tensor”. “Coproducts of Monads on Set” by Adámek, Milius, Bowler and Levy gives a detailed analysis in the case of \mathsf{Set}-monads. We should also point out that even in the case of \mathsf{Set}-monads, coproducts of pairs of monads do not always exist.

Monads with special presentations

Previously, we introduced strong monads, and their associated double strength maps. By considering how a monad interacts with the double strengths, other important classes of monads emerge.

Recall that for a strong monad \mathbb{T}, there are two double strength morphisms:

\mathsf{dst}, \mathsf{dst'} : \mathbb{T}(X) \otimes \mathbb{T}(Y) \rightarrow \mathbb{T}(X \otimes Y)

which are natural in X and Y. By restricting attention to the case where the monoidal structure is given by categorical products, we then notice that by combining the double strength and product structure, we can form the following composites:

  1. \mathbb{T}(X) \times \mathbb{T}(Y) \xrightarrow{\mathsf{dst}} \mathbb{T}(X \times Y) \xrightarrow{\langle \mathbb{T}(\pi_1), \mathbb{T}(\pi_2) \rangle} \mathbb{T}(X) \times \mathbb{T}(Y).
  2. \mathbb{T}(X \times Y) \xrightarrow{\langle \mathbb{T}(\pi_1), \mathbb{T}(\pi_2) \rangle} \mathbb{T}(X) \times \mathbb{T}(Y) \xrightarrow{\mathsf{dst}} \mathbb{T}(X \times Y).

As these composites are both endomorphisms, it is natural to ask when they are equal to the identity morphism of the same type.

Remark: This is a recurring theme, which we have commented on before. When doing category theory, we often find ourselves in situations where certain morphisms are assumed to be part of the available structure. It is always worthwhile asking when two composites of these morphisms with the same type are actually equal. For example, the axioms of a monoidal category ensure that all such diagrams commute (in a precise technical sense). On the other hand, for a strong monad, exploring when the two double strength maps coincide leads us to the notion of commutative monad.

A monad is said to be:

  1. Affine if \mathbb{T}(X) \times \mathbb{T}(Y) \xrightarrow{\mathsf{dst}} \mathbb{T}(X \times Y) \xrightarrow{\langle \mathbb{T}(\pi_1), \mathbb{T}(\pi_2) \rangle} \mathbb{T}(X) \times \mathbb{T}(Y) = \mathsf{id}.
  2. Relevant if \mathbb{T}(X \times Y) \xrightarrow{\langle \mathbb{T}(\pi_1), \mathbb{T}(\pi_2) \rangle} \mathbb{T}(X) \times \mathbb{T}(Y) \xrightarrow{\mathsf{dst}} \mathbb{T}(X \times Y) = \mathsf{id}.
  3. Cartesian if it is both affine and relevant. That is, when it preserves binary products, up to isomorphism.

We should immediately be a bit suspicious of these definitions. Do we get different notions if we consider interaction with \mathsf{dst}' rather than \mathsf{dst}? Fortunately, life is simple in this case, and the choice to use \mathsf{dst} or \mathsf{dst}' makes no difference, although it does take a little bit of work to establish this fact.
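
For \mathsf{Set}-monads we can experiment with these composites in Haskell. A sketch, taking \mathsf{dst} to sequence its left component first; the function names are ours.

    -- A double strength for a Set-monad, running the left computation first.
    dst :: Monad m => (m a, m b) -> m (a, b)
    dst (t, u) = do { x <- t; y <- u; return (x, y) }

    -- Composite 1: pair up, then project out again.
    affineSide :: Monad m => (m a, m b) -> (m a, m b)
    affineSide (t, u) = let p = dst (t, u) in (fmap fst p, fmap snd p)

    -- Composite 2: project out, then pair up again.
    relevantSide :: Monad m => m (a, b) -> m (a, b)
    relevantSide p = dst (fmap fst p, fmap snd p)

    -- For the maybe monad: affineSide (Just 1, Nothing) = (Nothing, Nothing),
    -- so the maybe monad is not affine. On the other hand relevantSide is
    -- the identity on Maybe, in line with the exception monad example below.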

Affineness algebraically?

As usual, when we encounter new classes of monads, we shall examine them from the point of view of algebra.

A bit of calculation shows that being affine is equivalent to requiring that the component of the monad unit at the terminal object is an isomorphism. Algebraically, \mathbb{T}(\{ x \}) consists of equivalence classes of terms:

[x], [t(x, x)], [t(t'(x, x), x)], \ldots

The set of these terms is isomorphic to the terminal object if and only if it has a single element. This can only be the case if all these equivalence classes are equal. This is the same as requiring that every term built from operation symbols and a single variable is equal to the variable itself. This will be the case if and only if every operation o satisfies:

o(x,\ldots,x) = x

That is, all the operations are idempotent.

Example: The non-empty (finite) powerset monad is affine.

Example: If we consider any algebraic presentation such that we can derive x = y, then everything can be proved equal to everything else. Such a presentation is said to be inconsistent. We consider the presented monad \mathbb{T}. There are two cases:

  1. The signature doesn’t contain a constant symbol. In this case \mathbb{T}(\emptyset) = \emptyset, and for every other X, \mathbb{T}(X) \cong 1.
  2. The signature contains a constant symbol. In this case, for all X, \mathbb{T}(X) \cong 1.

So the monads corresponding to inconsistent presentations are affine. These pathological monads, which we shall refer to as inconsistent monads, are a good source of counterexamples for naive conjectures about monads.

Counterexample: For an algebraic presentation (\Sigma, E) which is not inconsistent, if \Sigma contains a constant symbol, then the presented monad is not affine. For example, the ordinary finite powerset monad is not affine.

Relevance algebraically

Unfortunately, the condition for being relevant does not have such a convenient simplification. In this case, to understand what is going on algebraically, we shall have to look at the original definition directly, using the algebraic description of \mathsf{dst} discussed previously. For an arbitrary element

[t((x_1,y_1),\ldots,(x_n,y_n))] \in \mathbb{T}(X \times Y),

the relevance condition becomes:

[t((x_1,y_1),\ldots,(x_n,y_n))]  = [t(t((x_1,y_1),\ldots,(x_1,y_n)),\ldots, t((x_n,y_1),\ldots,(x_n,y_n)))]

which can only be the case if the following equation is derivable for every term t:

t((x_1,y_1),\ldots,(x_n,y_n)) = t(t((x_1,y_1),\ldots,(x_1,y_n)),\ldots, t((x_n,y_1),\ldots,(x_n,y_n)))

Using pairs everywhere is getting hard to read. The previous equation is equivalent to:

t(x_1,\ldots,x_n) = t(t(x_1,y_{1,2}, \ldots, y_{1,n}),\ldots, t(y_{n,1},\ldots, y_{n,n-1}, x_n))

Example: If we only have constant elements in our presentation, then the equation above holds trivially. Therefore for any set E, the exception monad (-) + E is relevant.

As the relevance condition is still rather hard to read, we specialise it to a single binary operation \times:

x_1 \times x_2 = (x_1 \times y_1) \times (y_2 \times x_2)

The equation should be familiar from our discussion of the reader monad for a binary bit.

Example: The equation above is exactly the condition satisfied by the lookup operation for the binary bit reader monad, and inductively by all terms. The binary bit reader monad is relevant. In fact, any instance of the reader monad is relevant, as can be verified by direct calculation.
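
The direct calculation is short enough to carry out concretely, representing the reader monad by plain functions. A sketch (dstR and relevantR are our names):

    -- Both double strengths for the reader monad coincide: the environment
    -- is passed unchanged to each component, so the monad is commutative.
    dstR :: (r -> a, r -> b) -> r -> (a, b)
    dstR (f, g) r = (f r, g r)

    -- The relevance composite: split a computation of pairs, then recombine.
    -- relevantR h r = (fst (h r), snd (h r)) = h r, so it is the identity.
    relevantR :: (r -> (a, b)) -> r -> (a, b)
    relevantR h = dstR (fst . h, snd . h)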

Counterexample: To confirm a monad is relevant, it is not sufficient to establish the condition for operation symbols only. For example, if we have two binary operations s,t satisfying the relevance condition above, it is not necessarily the case that the composite term t(s(w,x),s(y,z)) will satisfy it.

Example: Unsurprisingly, the inconsistent monads introduced above are also relevant. This is trivially true, as they satisfy every possible equational condition.

Summing Up

By considering the interaction between a strong monad and its double strength maps, we were led to new classes of monads. In algebraic examples, these conditions have natural interpretations.

We should ask ourselves, are these monads useful for anything? Gladly, the answer is yes. We briefly sketch a couple of examples:

  1. For a commutative monad on a symmetric monoidal category, under tame conditions, the symmetric monoidal structure lifts to the Eilenberg-Moore category. If the monad is affine or relevant, further structure such as projection and diagonal maps also lift to the Eilenberg-Moore category.
  2. Beck’s distributive laws are a vital tool for composing monads. If a monad is commutative, there are useful positive results about the existence of distributive laws for combining it with other monads. Recent work has shown that affine and relevant monads imply the existence of distributive laws in more general situations.

To discuss either of these topics properly requires more background than we have currently introduced. We may return to them in later posts.

For further background on affine and relevant monads, Jacobs’ “Semantics of Weakening and Contraction” is an excellent place to start reading (as it is for many other topics in monad theory!). The importance of affine and relevant monads for positive results about distributive laws can be found in the PhD thesis of Louis Parlant.

Completely Iterative Monads

In the previous post, we saw that free monads could be described in terms of initial algebras. We also saw that the free completely iterative monad could be constructed in a similar way, but using final coalgebras. That post completely skipped explaining what a completely iterative monad is, and why they are interesting. That will be the objective of the current post. Along the way, we shall encounter several familiar characters from previous posts, including ideal monads, free algebras, Eilenberg-Moore algebras and final coalgebra constructions.

Solving Equations

Solving equations is a central topic in algebra, one we deal with right from the mathematics we are taught in school. Fortunately, for this post, we shall be interested in solving more interesting classes of equations than we typically encounter in those early days. For a signature \Sigma, and \Sigma-algebra A, we are interested in solving systems of equations of the form

  • x_1 = t_1
  • x_2 = t_2
  • \ldots

where each t_i is a \Sigma-term over X \uplus A. The idea is that the t_i are terms that can refer to the unknowns, and also to some constant elements in A. The first thing to note about such systems of equations is that we can define infinite cycles.

Example: Let \Sigma = \{ \triangleleft, \triangleright \} be a signature with two binary operation symbols, and A a \Sigma-algebra. Consider the two mutually recursive equations:

  1. x_1 = a_1 \triangleleft x_2.
  2. x_2 = a_2 \triangleright x_1.

These equations contain the three main components we are interested in: variables expressing unknowns x_1,x_2, operation symbols \triangleleft, \triangleright, and constant values a_1, a_2 in the structure of interest. Intuitively, the solutions to this system of equations should be the values of the infinite “terms”:

  1. x_1 = a_1 \triangleleft a_2 \triangleright a_1 \triangleleft a_2 \triangleright \ldots.
  2. x_2 = a_2 \triangleright a_1 \triangleleft a_2 \triangleright a_1 \triangleleft \ldots

(Really we should bracket everything to the right, but that would hurt readability. We shall assume that all such terms implicitly bracket to the right.) Of course, these are infinite objects, and so there is no genuine term we can evaluate to define these values in this way.

In an ideal world, each such system of equations would have a unique solution. Clearly, the trivial equation x = x can be solved by any element in the underlying algebra. To avoid our desire for unique solutions forcing all our algebras to become trivial, we must restrict the class of equations we can solve. Instead, we shall demand unique solutions for all systems of equations where the t_i are not bare variables. These are referred to as guarded systems of equations.

The following example shows that we can actually consider simpler sets of guarded equations, and still retain the same expressive power.

Example: Using the same signature as the previous example, consider the following pair of equations:

  1. x_1 = a_1 \triangleleft a_2 \triangleleft x_2.
  2. x_2 = a_3 \triangleright a_4 \triangleright x_1.

The terms on the right hand side are more complex than the previous example, as they contain multiple operation symbols. By introducing further unknowns, we can describe a system of equations only using one operation in each term.

  1. x_1 = a_1 \triangleleft x'_1.
  2. x'_1 = a_2 \triangleleft x_2.
  3. x_2 = a_3 \triangleright x'_2.
  4. x'_2 = a_4 \triangleright x_1.

Assuming unique solutions exist for both sets of equations, the values for x_1 and x_2 must coincide.

Our terms now involve only one operation symbol, and a mixture of variables and constants from A. We can go further in trying to standardise the form of our equations, by separating out the constants. Again, we add more unknowns, adjusting the previous set of equations as follows:

  1. x_1 = x''_1 \triangleleft x'_1.
  2. x'_1 = x''_2 \triangleleft x_2.
  3. x_2 = x''_3 \triangleright x'_2.
  4. x'_2 = x''_4 \triangleright x_1.
  5. x''_1 = a_1.
  6. x''_2 = a_2.
  7. x''_3 = a_3.
  8. x''_4 = a_4.

Assuming a unique solution exists for these equations, the values for x_1 and x_2 must coincide with those of the previous formulations. If we view the elements a_i as extra constant symbols in our signature, this is simply a set of equations using guarded terms.

Let’s call a term flat if it contains only one operation symbol. The strategy in the previous example can be generalised to convert an arbitrary family of equations into an equivalent family involving only flat terms or constants. For a signature \Sigma, such a family of equations can be expressed as a function:

e : X \rightarrow \Sigma(X) + A

We are interested in algebras (A, a : \Sigma(A) \rightarrow A) such that for every such e there exists a unique e^{\dagger} : X \rightarrow A such that:

e^{\dagger} = [a \circ \Sigma e^{\dagger}, \mathsf{id}] \circ e

A \Sigma-algebra for which such unique solutions exist is referred to as a completely iterative algebra. The adjective “completely” emphasises that we have solutions for arbitrary sets of equations. An iterative algebra is an algebra in which unique solutions exist for finite systems of equations.

Most of the algebraic objects that are encountered in ordinary mathematics are not completely iterative.

Example: For the natural numbers with addition, the system of equations:

  1. x_1 = x'_1 + x_1.
  2. x'_1 = 1.

has no solutions.

Example: For the subsets of the natural numbers, with binary unions, the system of equations:

  1. x_1 = x'_1 \cup x_1.
  2. x'_1 = \emptyset.

has multiple solutions. In fact, any subset of natural numbers will do for x_1. The issue here is that \cup has a unit element.

We could also consider the equation x = x \cup x, which also has multiple solutions. The essential problem now is that \cup is idempotent.

We could think of both of these examples as sneakily allowing us to circumvent the restriction to guarded terms.

The free algebra over X is simply the collection of all terms over X, with formal syntactic operations. Terms are finite objects. The intuition from our first example is that in order to solve potentially mutually recursive equations, what we need is the set of “finite and infinite terms”.

As we saw last time, the free algebra at X is given by the initial algebra (\mu(\Sigma + X), \mathsf{in}), which we think of as a least fixed point. To add in the infinite terms, we instead consider the final coalgebra (\nu(\Sigma + X), \mathsf{out}). For an algebraic signature, the elements of \nu(\Sigma + X) are trees (both finite and infinite) such that:

  1. Each internal n-ary node is labelled by an n-ary operation symbol.
  2. Each leaf is labelled by either a constant symbol, or an element of X.

Intuitively, the finite trees are the ordinary terms, and the infinite trees are the extra elements that allow us to solve genuinely mutually recursive systems of equations.

If we consider the full subcategory of \mathsf{Alg}(\Sigma) consisting of the completely iterative algebras, and the corresponding restriction of the forgetful functor to \mathsf{Set}, then (\nu(\Sigma + X), \mathsf{out}^{-1} \circ \kappa_1) is the free completely iterative algebra over X. Here \kappa_1 denotes the coproduct injection.

When all these final coalgebras exist, so we have a free completely iterative algebra functor, the adjunction induces a monad. This is the monad that we described as the free completely iterative monad in the previous post.
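
Haskell makes this pleasantly concrete: read lazily, a recursive datatype of trees contains both the finite and the infinite trees, and a flat guarded system of equations is solved by corecursion. A sketch, encoding the standardised system from our running example (all names ours):

    {-# LANGUAGE DeriveFunctor #-}

    -- Finite and infinite Sigma-trees over a; read lazily, nu(Sigma + a).
    data Tree f a = Leaf a | Node (f (Tree f a))

    -- Solve a flat guarded system e : X -> Sigma(X) + A by corecursion;
    -- laziness builds the (possibly infinite) solution tree.
    solve :: Functor f => (x -> Either (f x) a) -> x -> Tree f a
    solve e x = case e x of
      Left t  -> Node (fmap (solve e) t)
      Right a -> Leaf a

    -- The standardised system of equations from the earlier example.
    data Sig t = t :<| t | t :|> t deriving Functor
    data V = X1 | X1' | X2 | X2' | A1 | A2 | A3 | A4

    sys :: V -> Either (Sig V) String
    sys X1  = Left (A1 :<| X1')
    sys X1' = Left (A2 :<| X2)
    sys X2  = Left (A3 :|> X2')
    sys X2' = Left (A4 :|> X1)
    sys A1  = Right "a1"
    sys A2  = Right "a2"
    sys A3  = Right "a3"
    sys A4  = Right "a4"

    -- solve sys X1 is the infinite tree a1 <| (a2 <| (a3 |> (a4 |> ...)))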

Completely Iterative Monads

We’ve now recovered the monad known as the free completely iterative monad from the previous post, via an analysis of unique solutions for certain families of mutually recursive equations. What we still haven’t done is sort out the question of exactly what a completely iterative monad is. That is the problem we turn to now. Again, solutions of mutually recursive guarded equations will be important.

We will now remove the restriction to equations involving only flat terms, and consider solutions to systems of equations involving more general terms. Abstractly, such a system of equations will be encoded as a function:

e : X \rightarrow \mathbb{T}(X + Y)

Here, \mathbb{T} is some monad, X is the object of unknowns, and Y is an object of parameters. We will look for solutions to this system of equations in an Eilenberg-Moore algebra (A, a). A solution for e, given an assignment for the parameters f : Y \rightarrow A, is a function e^{\dagger} : X \rightarrow A such that:

e^{\dagger} = a \circ \mathbb{T}([e^{\dagger}, f]) \circ e

We would like unique solutions to such families of equations. As we noted previously, equations such as x = x will cause us problems. This presents the question of how to identify the guarded terms, those that aren’t just bare variables.

We previously encountered ideal monads, which informally are monads that “keep variables separate”. These are exactly the gadgets that we need in order to talk about guarded equations abstractly. We recall that an ideal monad is a monad (\mathbb{T},\eta,\mu) such that:

  1. There exists an endofunctor \mathbb{T}^+ with \mathbb{T} = \mathsf{Id} + \mathbb{T}^+.
  2. The unit is the left coproduct injection.
  3. There exists \mu' : \mathbb{T}^+ \circ \mathbb{T} \Rightarrow \mathbb{T}^+ such that \mu \circ \kappa_2 \mathbb{T} = \kappa_2 \circ \mu'.

A system of equations e : X \rightarrow \mathbb{T}(X + Y) is said to be guarded if it factors through [\kappa_2, \eta \circ \kappa_2] : \mathbb{T}^+(X + Y) + Y \rightarrow \mathbb{T}(X + Y).

Finally, we are in a position to say what a completely iterative monad is!

A completely iterative monad is an ideal monad such that every guarded system of equations e : X \rightarrow \mathbb{T}(X + Y) has a unique solution in the free algebra (\mathbb{T}(Y), \mu_Y), with \eta : Y \rightarrow \mathbb{T}(Y) the valuation for the parameters.

Example: Unsurprisingly, the free completely iterative monad, constructed using final coalgebras, is a completely iterative monad.

Freedom!

We’re not quite done. We claimed that the final coalgebra construction yields the free completely iterative monad. You should always be suspicious of a claim that something is a free construction, if you’re unsure about the categories and functors involved. We now pin down these details.

Firstly, we admit to being a bit naughty in a previous post. We introduced ideal monads, but didn’t specify their morphisms. We need to address that omission first.

A morphism of ideal monads:

(\mathbb{S},\eta,\mu,\mathbb{S}^+, \mu') \rightarrow (\mathbb{T},\eta,\mu,\mathbb{T}^+, \mu')

is a monad morphism \sigma : \mathbb{S} \Rightarrow \mathbb{T} such that there exists a \sigma' : \mathbb{S}^+ \Rightarrow \mathbb{T}^+ with:

\sigma \circ \kappa_2 = \kappa_2 \circ \sigma'

Intuitively, these are monad morphisms that don’t muddle up bare variables with guarded terms.

Let \mathsf{CIM}(\mathcal{C}) be the category of completely iterative monads on \mathcal{C}, and ideal monad morphisms between them. There is an obvious forgetful functor to the endofunctor category:

\mathsf{CIM}(\mathcal{C}) \rightarrow [\mathcal{C}, \mathcal{C}]

For an ideal monad \mathbb{T}, a natural transformation \alpha : \Sigma \Rightarrow \mathbb{T} is ideal if it factors through the coproduct inclusion \mathbb{T}^+ \Rightarrow \mathbb{T}.

A completely iterative monad \mathbb{T} is free over an endofunctor \Sigma if there is a universal ideal natural transformation \iota : \Sigma \Rightarrow \mathbb{T}. That is, for every completely iterative monad \mathbb{S} and ideal natural transformation \alpha : \Sigma \Rightarrow \mathbb{S} there exists a unique ideal monad morphism \hat{\alpha} : \mathbb{T} \Rightarrow \mathbb{S} such that

\alpha = \hat{\alpha} \circ \iota

You should be a bit suspicious at this point. This doesn’t look like a normal free object definition, in particular, the restriction to ideal natural transformations is a bit odd. This is a legitimate universal property, but possibly using the term free is at odds with the usual convention, relative to the forgetful functor above. In particular, as pointed out in the original paper, the existence of all such free objects does not imply the existence of an adjoint functor. Be careful when somebody on the internet tells you something is free!

Wrapping Up

As usual, we have focussed on sets and functions, and conventionally algebraic examples as far as possible. This was for expository reasons; the theorems in this area work in much greater generality.

A good starting point for further information is “Completely Iterative Algebras and Completely Iterative Monads” by Milius. The paper has plentiful examples, and many more technical results and details than those we have sketched in this post. You will also find pointers to iterative monads and algebras in the bibliography of that paper.

Initial Ideas and Final Thoughts

We encountered algebras for endofunctors in the previous post. In this post, we shall introduce the dual notion, endofunctor coalgebras, and shall look at some special endofunctor algebras and coalgebras. These constructions have wide applications, but we shall be particularly interested in their usefulness for systematically constructing certain classes of monads.

Initial Algebras

For an endofunctor F, an initial F-algebra is simply an initial object in \mathsf{Alg}(F). Explicitly, this is an F-algebra (\mu(F),\mathsf{in}) such that for every F-algebra (A,a) there is a unique morphism ! : \mu(F) \rightarrow A such that

! \circ \mathsf{in} = a \circ F(!)

The first crucial fact to know about initial algebras, known as Lambek’s Lemma, is that \mathsf{in} is always an isomorphism. This highlights that initial algebras generalise the notion of least fixed points for monotone maps.

Lambek’s lemma immediately tells us that certain initial algebras cannot exist.

Example: The powerset functor does not have an initial algebra, as by Cantor’s theorem there is no set X such that X \cong \mathcal{P}(X).

As usual, we will motivate constructions by considering algebraic examples of initial algebras.

Example: Consider a signature \Sigma. We inductively define sets T_i, with T_0 = \emptyset, and:

  1. If c is a constant symbol in \Sigma, then c \in T_{n + 1}.
  2. If o is a k-ary operation symbol, and t_1,\ldots,t_k \in T_n, then o(t_1,\ldots,t_k) \in T_{n + 1}.

It is hopefully clear that if the signature contains no constant symbols, all of the T_i will be empty. To see what happens with a signature with constants, let \Sigma = \{ 0, + \}, with 0 a constant, and + a binary operation. We then have:

  1. T_1 = \{ 0 \}.
  2. T_2 = \{ 0, 0 + 0 \}.
  3. T_3 = \{ 0, 0 + 0, 0 + (0 + 0), (0 + 0) + 0, (0 + 0) + (0 + 0) \}.
  4. And so on …

The least fixed point of this process is the set of ground terms over the signature, and is \mu(\Sigma) for the corresponding signature functor. The algebra structure map \mathsf{in} : 1 + \mu(\Sigma)^2 \rightarrow \mu(\Sigma) maps the left component to the term 0, and on the second component:

(t_1, t_2) \mapsto t_1 + t_2

The general case for an arbitrary signature is formally identical.
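
This transcribes directly into Haskell: reading the datatype inductively, Mu f is the initial algebra, fold is the unique mediating morphism, and Sig is the signature functor for \{ 0, + \}. A sketch with our own names:

    {-# LANGUAGE DeriveFunctor #-}

    -- The initial algebra of an endofunctor, with its fold (catamorphism).
    newtype Mu f = In (f (Mu f))

    fold :: Functor f => (f a -> a) -> Mu f -> a
    fold alg (In t) = alg (fmap (fold alg) t)

    -- The signature functor for Sigma = {0, +}: Sigma(X) = 1 + X * X.
    data Sig x = Zero | Plus x x deriving Functor

    -- The ground term (0 + 0) + 0 as an element of mu(Sigma).
    example :: Mu Sig
    example = In (Plus (In (Plus (In Zero) (In Zero))) (In Zero))

    -- Evaluating ground terms in a Sigma-algebra, here the naturals.
    eval :: Mu Sig -> Int
    eval = fold alg where
      alg Zero       = 0
      alg (Plus m n) = m + n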

So the initial algebra of a signature functor is the set of ground terms, equipped with the obvious purely syntactic operations that just form new terms. As usual, we are actually interested in monads, but these don’t involve just ground terms, we need to get the variables involved. So we’re really interested in free algebras over some set, not just the initial algebra.

Example: Let \Sigma be some signature. To build the free algebra over the set X, we construct sets T_i, with T_0 = \emptyset, and:

  1. If c is a constant symbol then c \in T_{n + 1}.
  2. If x \in X then x \in T_{n + 1}.
  3. If o is a k-ary operation symbol, and t_1,\ldots,t_k \in T_n, then o(t_1,\ldots,t_k) \in T_{n + 1}.

If we take \Sigma = \{ 0, + \} as in the previous example, and X = \{ x \} for simplicity, we have:

  1. T_1 = \{ 0,  x \}.
  2. T_2 = \{ 0, x, 0 + x, x + 0, x + x, 0 + 0 \}.
  3. T_3 = \{ 0, x, 0 + x, x + 0, x + x, 0 + 0, 0 + (0 + x), 0 + (x + 0), \ldots \}.
  4. And so on…

The least fixed point of this process is the set of all such terms over X, which is \mu(\Sigma + X), where \Sigma is the corresponding signature functor. We can think of this as forming the initial algebra for a signature extended with a constant symbol for each element of X. Recalling the notation for the free monad, we have deduced that, at least at the level of objects \Sigma^*(X) = \mu(\Sigma + X).

In fact, this approach works in great generality. For an endofunctor F : \mathcal{C} \rightarrow \mathcal{C} on a category with binary coproducts, if the functor F(-) + X has an initial algebra, then \mathsf{in} \circ \kappa_1 : F(\mu(F + X)) \rightarrow \mu(F + X) is the free F-algebra over X.

From the above, it follows that if all the functors F + X have initial algebras, the free monad on F is given by:

F^*(X) = \mu(F + X)

If we restrict our attention to signature functors, this simply yields another perspective on the description of free monads we saw in the previous post. For other functors, it does yield new monads.

Example: Although there is no initial algebra for the full powerset functor, life is much better for the finite powerset functor \mathcal{P}_{\omega}. Each of the functors \mathcal{P}_{\omega} + X has an initial algebra. The elements of \mathcal{P}^*_{\omega}(\emptyset) are the hereditarily finite sets. These are finite sets, with all elements also hereditarily finite. Similarly the elements of \mathcal{P}^*_{\omega}(X) are hereditarily finite sets with atoms from X. That is, finite sets, with all elements either elements of X, or hereditarily finite sets with atoms from X.

Final Coalgebras

For an endofunctor F : \mathcal{C} \rightarrow \mathcal{C}, a coalgebra is a pair (A,a) consisting of a \mathcal{C}-object A, and a \mathcal{C}-morphism a : A \rightarrow F(A). This is the dual notion to that of an F-algebra. These form a category \mathsf{Coalg}(F), with morphisms (A,a) \rightarrow (B,b) being \mathcal{C}-morphisms h : A \rightarrow B such that

F(h) \circ a = b \circ h

Coalgebras are interesting objects, to which we could devote a great deal of discussion. If algebras are about composing things, coalgebras are about taking them apart again. In computer science, they are often used to model objects with some sort of observable behaviour. We consider one very simple example to illustrate the idea.

Example: A coalgebra for the endofunctor (-) + 1 is simply a function of the form a : A \rightarrow A + 1. We could interpret this as a device which, when we press the “go” button, either jumps to a new state, ready to go again, or moves to a failure state.

We will resist the temptation to discuss coalgebra theory more generally at this point, and move rapidly on to the construction we are interested in.

A final coalgebra (also sometimes termed a terminal coalgebra) for an endofunctor F is a final object in \mathsf{Coalg}(F). Explicitly, this is an object (\nu(F), \mathsf{out}) such that for every coalgebra (A,a) there is a unique map ! : A \rightarrow \nu(F) such that

\mathsf{out} \circ ! = F(!) \circ a

Final coalgebras generalise the notion of greatest fixed point for a monotone map, and satisfy a dual form of Lambek’s lemma.
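
Dually to the initial algebra sketch earlier, here is the final coalgebra and its unfold in Haskell, together with the “go button” device from the example above, viewed as a coalgebra for (-) + 1 (that is, for Maybe). Names ours:

    -- The final coalgebra of an endofunctor, with its unfold (anamorphism).
    newtype Nu f = Nu { out :: f (Nu f) }

    unfold :: Functor f => (a -> f a) -> a -> Nu f
    unfold c = Nu . fmap (unfold c) . c

    -- The "go button" device: counts down, then moves to the failure state.
    step :: Int -> Maybe Int
    step n = if n > 0 then Just (n - 1) else Nothing

    -- Its behaviour, an element of nu((-) + 1): three presses, then failure.
    device :: Nu Maybe
    device = unfold step 3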

Example: There is no final coalgebra for the powerset functor, as Lambek’s lemma for coalgebras would lead to a contradiction about cardinalities of powersets.

For our purposes in this post, we are not going to delve deeply into questions about the terminal coalgebra, although it does have great conceptual and mathematical significance. Informally, it can be seen as abstracting all possible behaviours that a coalgebra can exhibit, but we won’t pursue this remark now.

Instead, we’re going to ask a very naive question. Assuming they exist, the objects \mu(F + X) yielded a monad, the free monad F^*. Do the objects of the form \nu(F + X) also carry the structure of a monad as well?

This wouldn’t make a very satisfying end to this blog post if the answer were no, so unsurprisingly, the switch to considering terminal coalgebras does yield a monad. For an endofunctor F : \mathcal{C} \rightarrow \mathcal{C}, if all the required final coalgebras exist, we construct a monad in Kleisli form as follows:

  1. The object map is X \mapsto \nu(F +X).
  2. By Lambek’s lemma \mathsf{out} : \nu(F + X) \rightarrow F(\nu(F + X)) + X is an isomorphism. We take as our monad unit the composite \mathsf{out}^{-1} \circ \kappa_2 : X \rightarrow \nu(F + X).
  3. For a morphism f : X \rightarrow \nu(F + Y), we can define its Kleisli extension f^* : \nu(F + X) \rightarrow \nu(F + Y) as the unique F-algebra morphism such that f^* \circ \eta = f. That there is such an f^* requires some more theory, which hopefully we can discuss another time. There is a more explicit description of this morphism, but unfortunately it is cumbersome to describe in this format.

Obviously we need to check the appropriate axioms hold for a Kleisli triple, but this works out. The resulting monad is known as the free completely iterative monad on F. Furthermore, it is hopefully not too hard to see that, similarly to the free monad construction, this monad “keeps variables separate”. That is, the resulting monad is another example of the ideal monads that we discussed in the previous post. There is a lot more to say about completely iterative monads, and what the various adjectives mean, but we shall defer that to a later post. Instead, we shall conclude with an example.
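
In Haskell, laziness means that a Free-style datatype already contains the infinite trees, so we can sketch the Kleisli structure directly. Read the datatype coinductively; the names are ours.

    {-# LANGUAGE DeriveFunctor #-}

    -- Read lazily, this contains both finite and infinite trees, so it
    -- models nu(F + X) rather than the initial algebra mu(F + X).
    data NuT f x = VarT x | OpT (f (NuT f x)) deriving Functor

    -- The monad unit: out-inverse composed with the second injection.
    unitT :: x -> NuT f x
    unitT = VarT

    -- Kleisli extension: substitute trees for variables, corecursively.
    extT :: Functor f => (x -> NuT f y) -> NuT f x -> NuT f y
    extT k (VarT x) = k x
    extT k (OpT t)  = OpT (fmap (extT k) t)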

Example: Let \Sigma be an algebraic signature. For the corresponding signature functor, final coalgebras exist for all of the functors of the form \Sigma + X. Therefore the free completely iterative monad on \Sigma exists. The elements of \nu(\Sigma + X) are \Sigma-trees. That is, trees such that:

  1. Every internal node of arity n is labelled by an n-ary operation symbol.
  2. Every leaf node is labelled by either a constant symbol, or an element of X.

Note, these trees can be either finite or infinite, unlike in any of our previous examples, where we worked with \Sigma-terms, which are equivalently finite \Sigma-trees. Intuitively, we should think of the extra elements in \nu(\Sigma + X) as “infinite \Sigma-terms”.

Remark: In this post, we have simply introduced various initial algebras and final coalgebras. This sort of “guess and check” approach to finding the right construction is fine, but there are many more systematic strategies for establishing both the existence and a description of these objects. This is a large subject, and involves more technical background than we have introduced so far. A later post may return to this topic.

Being Free is Certainly Ideal

In this post we shall introduce two particular classes of monad that arise in practice. As usual, we shall endeavour to keep things simple, often restricting our attention to an algebraic perspective on \mathsf{Set}-monads.

Free Monads

We shall write \mathsf{Mnd}(\mathcal{C}) for the category of monads on a category \mathcal{C}, and monad morphisms between them. We shall also write [\mathcal{C},\mathcal{C}] for the category of endofunctors on \mathcal{C} and natural transformations between them. There is an obvious forgetful functor:

\mathsf{Mnd}(\mathcal{C}) \rightarrow [\mathcal{C},\mathcal{C}]

This functor does not in general have a left adjoint, so we have to consider existence of free objects individually. Let F be an endofunctor. The monad F^* is the free monad over F if there exists a universal natural transformation \theta : F \Rightarrow F^* such that for every monad \mathbb{T} and natural transformation \alpha : F \Rightarrow \mathbb{T} there exists a unique monad morphism \hat{\alpha} : F^* \Rightarrow \mathbb{T} such that

\hat{\alpha} \circ \theta = \alpha

Note this is the usual categorical notion of free object, relative to the forgetful functor above. It is important to emphasise that there are other forgetful functors from the category of monads for which we could consider free objects. This setting is normally what is intended when the term is used without qualification.

The notation F^* is fairly common, and relates to the similar notation used for free monoids, as there are many analogies.

Example: For a set E, the exception monad (-) + E is the free monad over the constant endofunctor K_E mapping everything to E. For the usual encoding of coproducts in \mathsf{Set} we have been using, the universal morphism is

\theta(e) = (2,e)

and for \alpha : K_E \Rightarrow \mathbb{T} the unique fill-in morphism is:

\hat{\alpha}(t) = \begin{cases} \eta(x), & \mbox{if } t = (1,x)  \\ \alpha(e), & \mbox{if } t = (2,e) \end{cases}
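
This universal property is direct to write down in Haskell, modelling the coproduct X + E as Either x e. A sketch; Exc, theta and hatAlpha are our names.

    {-# LANGUAGE RankNTypes #-}

    -- The exception monad (-) + E, with Left the X summand, Right the E one.
    type Exc e x = Either x e

    -- The universal natural transformation theta : K_E => (-) + E.
    theta :: e -> Exc e x
    theta = Right

    -- The unique fill-in monad morphism for any alpha : K_E => T.
    hatAlpha :: Monad t => (forall x. e -> t x) -> Exc e y -> t y
    hatAlpha _     (Left x)  = return x   -- the unit on the X summand
    hatAlpha alpha (Right e) = alpha e    -- alpha on the E summand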

Having seen the abstract definition, and one example, it is natural to look for a more systematic strategy yielding free monads. For the approach we shall adopt, we are going to need a bit more machinery first.

For an endofunctor F : \mathcal{C} \rightarrow \mathcal{C}, there is a category of endofunctor algebras \mathsf{Alg}(F), with objects pairs (A, a : F(A) \rightarrow A), and a morphism h : (A,a) \rightarrow (B,b) is a \mathcal{C}-morphism h : A \rightarrow B such that:

h \circ a = b \circ F(h)

Composition and identities are as in \mathcal{C}. These conditions are similar to those we saw in discussion of the Eilenberg-Moore category in an earlier post. In fact, the category \mathcal{C}^{\mathbb{T}} is the full subcategory of \mathsf{Alg}(\mathbb{T}) consisting of the objects satisfying the unit and multiplication axioms for an Eilenberg-Moore algebra.

There is a forgetful functor R : \mathsf{Alg}(F) \rightarrow \mathcal{C}. If this functor has a left adjoint L : \mathcal{C} \rightarrow \mathsf{Alg}(F), then this induces a monad R \circ L on \mathcal{C}. Furthermore, we have an equivalence of categories

\mathsf{Alg}(F) \simeq \mathcal{C}^{R \circ L}

Recall that such an equivalence is certainly not automatic. This returns us to the topic of monadicity, that we touched upon in an earlier post. We have not introduced enough background on monadicity to discuss this properly yet, so we take this as a fact.

If a monad is of the form R \circ L as in the discussion above, it is referred to as the algebraically free monad over F. We can now state the main facts that we need:

  1. If \mathbb{T} is the algebraically free monad over F, then it is the free monad over F.
  2. If \mathcal{C} is a complete category, and F^* exists for an endofunctor F : \mathcal{C} \rightarrow \mathcal{C}, then the forgetful functor \mathsf{Alg}(F) \rightarrow \mathcal{C} has a left adjoint, and \mathcal{C}^{F^*} \simeq \mathsf{Alg}(F).

So in the case of \mathsf{Set} monads, as the base category is complete, we can reduce the task of finding free monads to that of finding algebraically free monads.

To exploit this connection, we note that every algebraic signature \Sigma induces a signature endofunctor, which we shall also write as \Sigma : \mathsf{Set} \rightarrow \mathsf{Set}. This functor is given by the following coproduct:

\Sigma(X) = \coprod_{o \in \Sigma} X^{\mathsf{ar}(o)}

where \mathsf{ar}(o) denotes the arity of operation symbol o.

Example: Let \Sigma be a signature with a single binary operation and a constant. Then the corresponding endofunctor is:

\Sigma(X) = X \times X  + 1

A \Sigma-algebra is simply a pair (A, a : A \times A + 1 \rightarrow A). The function a is equivalent to giving a binary function A \times A \rightarrow A, corresponding to the binary operation symbol, and an element of A, corresponding to the constant element. That is the same thing as specifying an algebra for the signature \Sigma.

More generally, it is hopefully clear that for a signature \Sigma, there is an isomorphism between the category of algebras for that signature, and the category of endofunctor algebras for the corresponding signature functor.

For a fixed signature, the free algebra over a set X exists, and is given by the set of all \Sigma-terms over X. From the isomorphism above, this means that the left adjoint to the forgetful functor \mathsf{Alg}(\Sigma) \rightarrow \mathsf{Set} exists, and so therefore does the free monad over \Sigma, with \Sigma^*(X) the set of all terms over the signature. As usual, it is important to consider examples:

Example: As we have seen in earlier posts, the exception monad (-) + E is induced by an equational presentation with a signature consisting of just constant elements for each exception, and no equations. The signature functor is then the constant functor K_E, and so as we saw earlier, the exception monad is the free monad over this functor.

Example: Consider a signature with a single binary operation. This induces a functor:

\Sigma(X) = X \times X

For the free monad over this functor, if we write + for the binary operation, the set \Sigma^*(X) consists of terms such as

x, x_1 + x_2, (x_1 + x_2) + (x_3 + x_4), \ldots

We can identify these with binary trees, with the leaves labelled by elements of X. So we have constructed a binary tree monad as the free monad over (-) \times (-).
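
This free monad has a standard Haskell rendering, a sketch of which follows; the Monad instance implements substitution of terms for variables, which is exactly the free monad multiplication.

    {-# LANGUAGE DeriveFunctor #-}

    -- The free monad over a signature functor f: terms with variables in x.
    data Free f x = Var x | Op (f (Free f x)) deriving Functor

    instance Functor f => Applicative (Free f) where
      pure = Var
      mf <*> mx = mf >>= \f -> fmap f mx

    instance Functor f => Monad (Free f) where
      Var x >>= k = k x                   -- substitute for a variable
      Op t  >>= k = Op (fmap (>>= k) t)   -- push substitution under operations

    -- The signature with a single binary operation: Sigma(X) = X * X.
    data Node x = Node x x deriving Functor
    type Tree = Free Node

    -- The term x1 + (x2 + x3) as a binary tree.
    exTree :: Tree Int
    exTree = Op (Node (Var 1) (Op (Node (Var 2) (Var 3))))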

Example: Extending the previous example, for an arbitrary signature \Sigma, the resulting free monad will have \Sigma^*(X) consisting of trees such that:

  1. The leaves are labelled with elements of X, or constant symbols from the signature.
  2. The internal nodes of arity n are labelled with an operation symbol of the same arity.

Of course, we could consider \mathsf{Set} endofunctors that are not of the form of signature functors. There are certainly plenty of examples where the left adjoint we require exists. In order to get a concrete description of the induced monad, we would need an explicit description of these adjoints. This requires more technical machinery than we have had chance to introduce so far, so we shall defer this topic to a later post.

Ideal Monads

The free monad constructions in the previous section have some nice properties. If we fix an algebraic signature, and consider the free monad over the corresponding signature functor, as we observed, the elements of \Sigma^*(X) can be thought of as certain labelled trees. Looking a bit more carefully:

  1. The trees in the image of the monad unit are the trivial trees with just a leaf labelled by some element of X.
  2. The functor action \Sigma^*(h) just relabels the leaves of trees, according to the function h.
  3. The functor \Sigma^* can be restricted to the non-trivial trees, as these are preserved by the functor action. We shall denote this functor \Sigma^+.
  4. There is a natural isomorphism \Sigma^*(X) \cong X + \Sigma^+(X).
  5. Using this isomorphism, the monad unit is the left coproduct injection.
  6. The monad multiplication restricts to non-trivial terms as a natural transformation \mu' : \Sigma^+ \circ \Sigma^* \Rightarrow \Sigma^+ in the sense that \mu \circ \kappa_2 \Sigma^* = \kappa_2 \circ \mu'.

So free monads “keep the variables separate” in a well-behaved way. The categorical structure above allows us to separate the variables from the non-trivial terms. We now abstract these observations to recover another well-studied class of monads.

An ideal monad is a monad (\mathbb{T},\eta,\mu) such that:

  1. There exists an endofunctor \mathbb{T}^+ with \mathbb{T} = \mathsf{Id} + \mathbb{T}^+.
  2. The unit is the left coproduct injection.
  3. There exists \mu' : \mathbb{T}^+ \circ \mathbb{T} \Rightarrow \mathbb{T}^+ such that \mu \circ \kappa_2 \mathbb{T} = \kappa_2 \circ \mu'.

Example: Free monads are ideal monads. This is clearly true for those induced by signature functors as discussed above, but in fact this result holds for free monads yielded by more general constructions.

Obviously there would be no need for a separate notion if the only examples were free monads.

Example: The list monad we have discussed previously restricts to a non-empty list monad, and this monad is ideal. The list monad itself is not ideal. If we consider the list monad as the free monoid monad, we can see this algebraically: as x = x * 1, the unit is not a coproduct injection. Similarly, the non-ideal multiset monad restricts to the non-empty multiset monad, which is ideal.
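To make the non-empty list example concrete, here is a Haskell sketch in which the ideal decomposition is visible in the datatype itself (NE, One and Cons are our names for this sketch):

-- One x is the unit, a genuine coproduct injection;
-- Cons builds the lists of length at least two, the non-trivial summand.
data NE a = One a | Cons a (NE a)

append :: NE a -> NE a -> NE a
append (One x)     ys = Cons x ys
append (Cons x xs) ys = Cons x (append xs ys)

instance Functor NE where
  fmap f (One x)     = One (f x)
  fmap f (Cons x xs) = Cons (f x) (fmap f xs)

instance Applicative NE where
  pure = One
  fs <*> xs = fs >>= \f -> fmap f xs

instance Monad NE where
  One x     >>= k = k x
  Cons x xs >>= k = k x `append` (xs >>= k)

Note that binding into a Cons value always yields another Cons value, reflecting the fact that the multiplication restricts to the non-trivial summand.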

We do need to be a little bit careful though. The non-empty finite powerset monad is not ideal. Algebraically this monad is the free join semilattice (without unit element) monad. The idempotence equation x = x \vee x then ensures the monad unit cannot be a coproduct injection.

Algebraically, the ideal monads are the ones where no non-trivial equation of the form x = t is provable, as we have seen in the examples above.

There are other interesting classes of ideal monads, making conceptual use of this idea of being able to separate the non-trivial terms from the variables. These examples are a whole topic in themselves, and we shall reserve them for a later post.

Further background: For those interested in the mathematics of free monads, a good place to start is the final chapter of Barr and Wells’ book “Toposes, Triples and Theories”. A good source for background on ideal monads is Ghani and Uustalu’s paper “Coproducts of Ideal Monads”.

Acknowledgements: Ralph Sarkis pointed out several typos in a previous version of this post, which has greatly improved the content.

Monad bits and pieces

Previous posts have focussed quite a lot on general theory. In this post, we are going to examine a couple of specific monads in a fair amount of detail. Both can be seen as dealing with computations parameterised by a notion of external state.

The Reader or Environment Monad

For any set R, there is a monad with:

  • Endofunctor: The monad endofunctor is (-)^R.
  • Unit: The unit at X maps each element to a constant function: \eta_X(x) = \lambda r.\; x.
  • Multiplication: The multiplication acts as follows: \mu_X(f)(r) = f(r)(r).

This monad is called the reader or environment monad. The computational interpretation of this monad is that a function X \rightarrow Y^R produces an output that depends on some state or environment it can read. Note the environment is fixed, computations cannot make modifications to it – this will be addressed later.
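In Haskell, a hand-rolled version of this monad (rather than the Reader found in common libraries) can be sketched as follows:

newtype Reader r a = Reader { runReader :: r -> a }

instance Functor (Reader r) where
  fmap f (Reader g) = Reader (f . g)

instance Applicative (Reader r) where
  pure x = Reader (\_ -> x)            -- the unit: a constant function
  Reader f <*> Reader g = Reader (\r -> (f r) (g r))

-- Binding threads the same fixed environment to both computations;
-- the multiplication is join f = \r -> f r r, as in the definition above.
instance Monad (Reader r) where
  Reader g >>= k = Reader (\r -> runReader (k (g r)) r)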

Aside: This monad arises via an adjunction involving slice categories. There is an obvious forgetful functor U : \mathsf{Set} / R \rightarrow \mathsf{Set}. This has a right adjoint (-)^*, with action on objects A^* = A \times R \xrightarrow{\pi_2} R. In turn, this functor has a further right adjoint \Pi. The composite \Pi \circ (-)^* induces the reader monad.

At this level of abstraction, it is not immediately obvious that this adjunction induces the right monad. To at least convince ourselves that the endofunctors agree, we note there is then a series of bijections between hom sets:

  1. A \rightarrow \Pi(B^*)
  2. A^* \rightarrow B^*
  3. U(A^*) \rightarrow B

We then note that U(A^*) = A \times R, and via Cartesian closure, functions A \times R \rightarrow B bijectively correspond to functions A \rightarrow B^R. Therefore, \Pi \circ (-)^* \cong (-)^R. (The abstract argument generalises well beyond \mathsf{Set}.)

Our main interest is to consider the reader monad from an algebraic perspective. To keep things as simple as possible, we are going to choose R = \{ 0, 1 \}. So computations have access to a single binary bit of external data, which can be set low (0) or high (1). As usual, we would like to have an algebraic presentation of this monad. It is natural to introduce a binary lookup operation \mid, such that informally we interpret

x \mid y

as a computation which looks up what to do depending on the environment value, continuing as x if the bit is low, or y if it is high. We now need to identify some suitable equational axioms that this operation should respect. An immediate idea is to require the operation to be idempotent, that is

x \mid x \;=\; x

Intuitively, if we do x regardless of whether the environment bit is high or low, that is the same as behaving exactly as x. A second axiom that springs to mind is:

(w \mid x) \mid (y \mid z) \;=\; w \mid z

Here the idea is that:

  1. If the bit is low, we will act as the left component of the left component.
  2. If the bit is high, we will act as the right component of the right component.

In fact, it will turn out that these two axioms are exactly what we need. A simple consequence of the axioms is that:

x \mid (y \mid z) \;=\; (x \mid x) \mid (y \mid z) \;=\; (x \mid z) \;=\; (x \mid y) \mid (z \mid z) \;=\; (x \mid y) \mid z

So the operation \mid is associative, and we can now avoid the annoyance of writing brackets everywhere. We also notice that:

x \mid y \mid z \;=\; x \mid y \mid z \mid z \;=\; x \mid z

From the above equations, if we see a sequence of \mid operations, we can eliminate everything but the end points. Combining this with idempotence, it is hopefully not too hard to convince yourself that every term is equal to a unique term of the form x \mid y. So the equivalence classes of terms over X can be identified with pairs of elements from X, equivalently elements of X^{\{ 0, 1\}}.

Finally, we note that

x \mid y \mid x \;=\; x \mid x \;=\;  x

In fact, the following are equivalent for a structure over a binary operator \mid:

  1. The equations x \mid x \;=\; x and (w \mid x) \mid (y \mid z) \;=\; w \mid z hold.
  2. The equations x \mid (y \mid z) \;=\; (x \mid y) \mid z and (x \mid y) \mid x \;=\; x hold.

For the bottom to top direction, using associativity to ignore brackets:

x \mid z \;=\; x \mid y \mid x \mid z \mid y \mid z \;=\; x \mid y \mid z

Using this ability to insert arbitrary elements twice, we calculate:

w \mid z \;=\; w \mid x \mid z \;=\; w \mid x \mid y \mid z

Finally, to show idempotence, reusing the first equation we proved:

x \;=\; x \mid y \mid x \;=\; x \mid x

The first set of equations we derived from computational intuitions; the second set identifies these structures as naturally occurring algebraic objects known as rectangular bands. (A band is a set with an associative, idempotent binary operation.) So the Eilenberg-Moore category of the reader monad on a bit is the category of rectangular bands and their homomorphisms.
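To connect the algebra back to the computational reading, here is the binary lookup operation on the one bit reader monad, reusing the Reader sketch above (lkp is our name):

-- x `lkp` y : continue as x if the bit is low, as y if it is high
lkp :: Reader Bool a -> Reader Bool a -> Reader Bool a
lkp x y = Reader (\b -> runReader (if b then y else x) b)

-- Both axioms hold by unfolding definitions: lkp x x runs x whatever the
-- bit is, and lkp (lkp w x) (lkp y z) inspects the same bit at both levels,
-- so only the branches w (bit low) and z (bit high) are ever reachable.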

The State Monad

We have already encountered the state monad, as the monad induced by the Cartesian closed structure on \mathsf{Set}. More precisely, we should refer to this as the global state monad.

Explicitly, for a fixed set of states S, this monad has:

  • Endofunctor: S \Rightarrow (- \times S).
  • Unit: \eta_X(x)(s) = (x,s).
  • Multiplication: \mu_X(f)(s) = \mathsf{let}\; (g,s') = f(s) \; \mathsf{in} \; g(s').

Again, we are going to keep things as simple as possible, and consider a one bit state space, so S = \{ 0, 1 \}. We would like to find an equational presentation for this monad. An obvious place to start is to extend the presentation for the one bit reader monad above, which gives us the infrastructure to decide how to proceed based on the current state. A natural next step is to add operations to manipulate the bit. We could add two unary update operations, say with \mathop{\uparrow}(x) and \mathop{\downarrow}(x) meaning set the state high (low) and proceed as x. We can actually make do with a single bit flip operation, which we shall denote \mathop{\updownarrow}(x) with the intuitive reading of flip the state to its opposite value, and proceed as x. We quickly note that we can define:

\mathop{\uparrow}(x) \;:=\; \mathop{\updownarrow}(x) \mid x \quad\mbox{ and }\quad \mathop{\downarrow}(x) \;:=\; x \mid \mathop{\updownarrow}(x)

So the explicit bit setting operations are definable by only flipping the bit if it is the wrong value. Obviously, we expect that

\mathop{\updownarrow}\mathop{\updownarrow}(x) = x

as flipping the bit twice should leave us back where we started. We also expect flipping a bit to reverse how computation proceeds:

\mathop{\updownarrow}(x \mid y) = \mathop{\updownarrow}(y) \mid \mathop{\updownarrow}(x)

In fact, these two equations, together with the axioms for the one bit reader monad, give a presentation of the one bit state monad. Using the chosen equations, every term is equal to one of the form

s \mid t

where both s and t are either a variable, or of the form \mathop{\updownarrow}(x). Such a term encodes a function which reads the bit to choose the left or right component, returns the variable it finds there, and flips the state if appropriate. That is, it encodes an element of \{ 0, 1 \} \Rightarrow X \times \{ 0, 1 \}.
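Here is a matching Haskell sketch of the one bit state monad, together with the lookup and flip operations (all names are our own):

newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State (\s -> let (x, s') = g s in (f x, s'))

instance Applicative (State s) where
  pure x = State (\s -> (x, s))
  sf <*> sx = sf >>= \f -> fmap f sx

instance Monad (State s) where
  State g >>= k = State (\s -> let (x, s') = g s in runState (k x) s')

-- lookup: branch on the current bit, leaving it untouched
lkpS :: State Bool a -> State Bool a -> State Bool a
lkpS x y = State (\b -> runState (if b then y else x) b)

-- flip the bit, then continue
flp :: State Bool a -> State Bool a
flp x = State (\b -> runState x (not b))

The derived update operations are then setHigh x = lkpS (flp x) x and setLow x = lkpS x (flp x), matching the definitions above.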

Generalising

To be more realistic, we really should consider larger state spaces or environments. Algebraically, this means instead of a binary lookup operation \mid specifying which of two choices to take depending on a bit value, for an environment with n possible values, we require an n-ary operation. As infix notation is no longer appropriate, we shall write lookup as:

l(x_1,\ldots,x_n)

The two axioms that induce the reader monad are the obvious extensions of those for a binary bit.

You may be wondering what to do if the environment is infinite in size. There are well-defined generalisations of universal algebra where operations of these bigger arities are permitted. As long as the arities of the operations are bounded, this all pretty much works as in the finite case, so we shall stick with the finite case to keep the notation clearer.

For the state monad, we also need to think about how to change the state. Generalising the bit flip operation seems unnatural, for example should we rotate through the states in some arbitrary order? A better choice is to make the update operations primitive. Without loss of generality, it is notationally convenient to fix a state space S = \{1,\ldots,n\} of natural numbers. We introduce a unary operation u_i for each i \in S. The intuitive reading is that u_i(x) updates the state to i and continues as x. The equations we require, with their intuitive readings are:

  1. l(u_1(x),\ldots, u_n(x)) = x. If we don’t depend on the state, we can ignore it.
  2. l(l(x_{1,1},\ldots,x_{1,n}),\ldots,l(x_{n,1},\ldots,x_{n,n})) = l(x_{1,1},\ldots,x_{n,n}). If the state isn’t modified, we keep making the same choices.
  3. u_i(u_j(x)) = u_j(x). A second state change overwrites the first.
  4. u_i(l(x_1,\ldots,x_n)) = u_i(x_i). Update decides a subsequent lookup.

You may have expected the first axiom to be

l(x,\ldots,x) = x

In fact, this equation is derivable using the first two. The binary case is:

x = l(u_1(x), u_2(x)) = l(l(u_1(x), u_2(x)),l(u_1(x),u_2(x))) = l(x,x)

The general calculation is just notationally a bit harder to read.
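Reusing the State sketch from earlier, the lookup and update operations over S = \{1,\ldots,n\} can be written as follows (lkpN and upd are our names, and we represent states as Haskell Ints):

-- l(x_1,...,x_n): read the current state i and continue as x_i
lkpN :: [State Int a] -> State Int a
lkpN xs = State (\i -> runState (xs !! (i - 1)) i)

-- u_i(x): overwrite the state with i, then continue as x
upd :: Int -> State Int a -> State Int a
upd i x = State (\_ -> runState x i)

-- For example, axiom 1 holds because lkpN [upd 1 x, ..., upd n x] reads
-- the state i and immediately writes i back, while axiom 3 holds because
-- upd i (upd j x) never looks at the state left by the outer update.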

The equations above appear in “Notions of Computation Determine Monads” by Plotkin and Power. In fact, they consider a further generalisation, reflecting a more realistic computational setting. We now consider a situation in which we have a set of memory addresses A, and each address has its own independent state value.

If we assume each memory location can take n possible values, we require an n-ary lookup operation l_a for each a \in A, and update operations u_{a,i}, intuitively setting address a to value i. We require the previous equations for each address:

  1. l_a(u_{a,1}(x),\ldots, u_{a,n}(x)) = x. If we don’t depend on the state at an address, we can ignore it.
  2. l_a(l_a(x_{1,1},\ldots,x_{1,n}),\ldots,l_a(x_{n,1},\ldots,x_{n,n})) = l_a(x_{1,1},\ldots,x_{n,n}). If the state at an address isn’t modified, we keep making the same choices when looking up at that address.
  3. u_{a,i}(u_{a,j}(x)) = u_{a,j}(x). A second state change at the same address overwrites the first.
  4. u_{a,i}(l_a(x_1,\ldots,x_n)) = u_{a,i}(x_i). Update decides a subsequent lookup at the same address.

We also require some additional axioms, which encode that the states at different addresses don’t interfere with each other.

  1. l_a(l_{a'}(x_{1,1},\ldots,x_{1,n}),\ldots,l_{a'}(x_{n,1},\ldots,x_{n,n})) = l_{a'}(l_a(x_{1,1},\ldots,x_{n,1}),\ldots,l_a(x_{1,n},\ldots,x_{n,n})) where a \neq a'. Choices based on different addresses commute with each other.
  2. u_{a,i}(u_{a',j}(x)) = u_{a',j}(u_{a,i}(x)) where a \neq a'. Consecutive updates at different addresses commute with each other.
  3. u_{a,i}(l_{a'}(x_1,\ldots,x_n)) = l_{a'}(u_{a,i}(x_1),\ldots, u_{a,i}(x_n)) where a \neq a'. Updates and lookups at different addresses commute with each other.

The resulting monad is the state monad with state space \{ 1,\ldots,n \}^A, assigning one of the n values to each address. Although the algebraic perspective on the reader and state monads was somewhat involved, this does pay off. For example, we now have the data needed to introduce algebraic operations and effect handlers for these monads.

As usual, there are lots more details to take care of in realistic work, such as moving beyond the category \mathsf{Set}, but the underlying intuitions transfer well. Interested readers should read the Plotkin and Power paper mentioned above, which also covers more advanced topics, such as a monad for local state.

The ins and outs of monads

In previous posts we have seen a lot of the basic mathematical infrastructure of monads. The aim of this post is to adopt a computational view on monads, viewing \mathbb{T}(X) as effectful computations over an object X. From this perspective, we shall consider two natural questions:

  1. How can we build such computations? In other words, how do we get things into \mathbb{T}(X)?
  2. What can we do with computations once we have built them? In other words, how do we get things back out of \mathbb{T}(X)?

There is a lot of discussion of these topics in the literature. As usual, we will try and keep things simple, restricting attention to simple monads on \mathsf{Set}, and concentrating on the key ideas.

Building computations

So far, the only way we have seen to build elements of \mathbb{T}(X) is to apply the monad unit \eta_X.

Example: For the exception monad with set of exceptions E, the value \eta(x) = \kappa_1(x) \in X + E is a trivial computation that returns the value x, and doesn’t throw an exception.

Example: The finite powerset monad \mathbb{P} can be seen as a form of bounded non-determinism. \eta(x) = \{ x \} \in \mathbb{P}(X) is a trivially non-deterministic computation that always results in the value x.

Example: The list monad \mathbb{L} can be seen as a form of non-determinism which can be easier to implement in a programming language. It is unusual as there is an order on computations, and repeated values are allowed. Obviously another perspective is to see such computations as living in a list data structure. Again \eta(x) = [x] is the trivial singleton list containing only the value x.

The intuition we derive from these examples is that \eta(x) yields a trivially effectful computation in some suitable sense. To get any real use out of these computational effects, we need to be able to build less boring examples as well. To solve this problem, as usual we shall adopt an algebraic perspective. Assume we have an equational presentation (\Sigma, E). An n-ary operation o \in \Sigma induces a function \mathbb{T}(X)^n \rightarrow \mathbb{T}(X) as follows:

([t_1],\ldots,[t_n]) \mapsto [o(t_1,\ldots,t_n)]

Continuing the examples above:

Example: The exception monad has a presentation consisting of a constant e for each exception in E. These induce functions \mathsf{raise}_e : 1 \rightarrow X + E with action \star \mapsto \kappa_2(e), which we can think of as raising the exception e.

You may wonder where the notion of catching or handling an exception appears. This is a good question, and will be addressed in the next section.

Example: The finite powerset monad has a presentation with a binary operation \vee and a constant element \bot. The constant induces a function \bot : 1 \rightarrow \mathbb{P}(X) with action \star \mapsto \emptyset, which we can think of as picking out a diverging computation.

The binary operation induces an operation \vee : \mathbb{P}(X)^2 \rightarrow \mathbb{P}(X) with action (U,V) \mapsto U \cup V, which takes two non-deterministic computations, and combines their possible behaviours. For example \eta(x_1) \vee \eta(x_2) is a computation yielding either x_1 or x_2.

Example: Similarly to the previous example, the list monad has a presentation with a binary operation *, and a constant element 1. The constant induces a function 1 \rightarrow \mathbb{L}(X) with action \star \mapsto [].

The binary operation induces a function \mathbb{L}(X)^2 \rightarrow \mathbb{L}(X) with action ([x_1,\ldots,x_m],[x_{m + 1}, \ldots, x_{n}]) \mapsto [x_1,\ldots,x_n] concatenating two lists. We can interpret these operations as simply manipulating data structures, or more subtly as a form of non-determinism in which the operations respect the order on choices.

Rather than describe these operations directly in terms of substitutions, it would be better to phrase things in terms of the available monad structure. To do so, we abuse notation and let m denote the set \{1,\ldots,m \}. A family of m equivalence classes of terms over X, ([t_i])_{1 \leq i \leq m}, can be identified with a morphism t : m \rightarrow \mathbb{T}(X), and a single m-ary term with a map o : 1 \rightarrow \mathbb{T}(m). The composite t^* \circ o : 1 \rightarrow \mathbb{T}(X), where t^* denotes the Kleisli extension of t, corresponds to the substitution we are interested in:

([t_1],\ldots,[t_m]) \mapsto [o(t_1,\ldots,t_m)]

In fact, we have generalised slightly, as o is now an arbitrary term, not just an operation appearing in the signature.

Of course, we expect these mappings \mathbb{T}(X)^n \rightarrow \mathbb{T}(X) induced by Kleisli morphisms to behave in a suitably uniform way. That is, they should be natural in some sense. This actually requires a modicum of care to get right. We should really consider our mapping as being of type U_{\mathbb{T}}(X)^n \rightarrow U_{\mathbb{T}}(X). Here, U_{\mathbb{T}} : \mathsf{Set}_{\mathbb{T}} \rightarrow \mathsf{Set} is the forgetful functor from the Kleisli category. This has action on objects X \mapsto \mathbb{T}(X), so this rephrasing makes sense. We then note that the induced maps:

U_{\mathbb{T}}(X)^n \rightarrow U_{\mathbb{T}}(X)

are natural. This is sometimes phrased as being natural with respect to Kleisli morphisms. They also trivially interact well with the strength morphisms in \mathsf{Set}. In more general categories, there is a bijection between:

  • Families of morphisms U_{\mathbb{T}}(X)^n \rightarrow U_{\mathbb{T}}(X), Kleisli natural in X and interacting well with the monad strength. These are referred to as algebraic operations.
  • Morphisms 1 \rightarrow \mathbb{T}(m), referred to as generic effects.
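This bijection can be seen very concretely in Haskell for the list monad, viewed as bounded non-determinism (or2 and coin are our names):

-- the algebraic operation: a family T(X)^2 -> T(X)
or2 :: [a] -> [a] -> [a]
or2 = (++)

-- the corresponding generic effect: a morphism 1 -> T(2)
coin :: [Bool]
coin = [True, False]

-- Kleisli extension recovers the operation from the effect:
or2' :: [a] -> [a] -> [a]
or2' xs ys = coin >>= \b -> if b then xs else ys   -- or2' behaves as or2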

To give a denotational semantics for a serious programming language, a lot more details need filling in. The category \mathsf{Set} is not suitable for realistic work, but is certainly a helpful way to build up the basic intuitions.

Evaluating computations

Now we have some tools for building up elements of \mathbb{T}(X), what can we do with them? The unit and multiplication of a monad have types \mathsf{Id} \Rightarrow \mathbb{T} and \mathbb{T}^2 \Rightarrow \mathbb{T}, so they give us no way “out of the monad”.

What we are looking for is a systematic way of building well-behaved maps of type \mathbb{T}(X) \rightarrow Y for some Y. The key idea is to recall the Eilenberg-Moore adjunction. For a set X, the free algebra over X is:

F^{\mathbb{T}}(X) = (\mathbb{T}(X), \mu_X)

The universal property of this adjunction means that if we can identify another Eilenberg-Moore algebra (Y,\xi), then functions

h : X \rightarrow Y

bijectively correspond to algebra homomorphisms:

F^{\mathbb{T}}(X) \xrightarrow{\hat{h}} (Y,\xi)

That is, morphisms

\mathbb{T}(X) \rightarrow Y

respecting the algebraic structure. It is worth considering the roles of these components:

  1. The function h : X \rightarrow Y specifies how values should be interpreted.
  2. The algebra (Y, \xi) specifies how the structure used to build up an effectful computation should be evaluated, in a manner respecting the structure encoded by the monad.

As usual, this is probably best understood by considering some examples.

Example: For the exception monad over E, an Eilenberg-Moore algebra (Y, \xi : Y + E \rightarrow Y) must act as the identity on the left component of the coproduct, and send each element of the right component to any value of its choosing in Y. In other words, we specify a default value to take for each of the exceptions that might be raised. Given a function h : X \rightarrow Y, the induced algebra morphism will map values according to h, and otherwise choose the value specified by \xi when it encounters a raised exception.
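In Haskell this evaluation is just the familiar either function; a small sketch, writing the coproduct with values on the left to match the post:

type Exc e x = Either x e

-- The induced algebra morphism: interpret values with h, and replace each
-- raised exception with a chosen default value.
handleExc :: (x -> y) -> (e -> y) -> Exc e x -> y
handleExc h _   (Left x)  = h x
handleExc _ def (Right e) = def e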

Example: For the finite powerset monad \mathbb{P}, giving an Eilenberg-Moore algebra is equivalent to specifying a join semilattice structure. For example, we can consider the Eilenberg-Moore algebra on the natural numbers \mathbb{N} specified by taking the maximum of pairs of elements, and bottom value 0. The resulting homomorphism \mathbb{P}(\mathbb{N}) \rightarrow \mathbb{N} induced by the identity will map a finite set to its maximum value (inevitably defaulting to 0 if the set is empty).

Example: The pattern for the list monad is similar. An Eilenberg-Moore algebra is a monoid, so we must specify a multiplication and unit element. For example if X is the set of two by two matrices over the natural numbers, this carries a monoid structure under matrix multiplication. The identity on X then induces a map \mathbb{L}(X) \rightarrow X that multiplies a list of matrices together from left to right. Note that this example fundamentally exploits the fact that the multiplication operation is not forced to be commutative.
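In Haskell, these induced algebra morphisms for the list monad are exactly folds over a monoid; a minimal sketch (evalList is our name, and it coincides with the standard foldMap):

-- The homomorphism L(X) -> Y induced by h : X -> Y and a monoid structure on Y
evalList :: Monoid y => (x -> y) -> [x] -> y
evalList h = mconcat . map h

Instantiating y with a matrix type under matrix multiplication gives the example above, and a “maximum with default 0” monoid gives the finite powerset example, up to the passage from lists to finite sets.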

The ideas in this section are the basics of what are known as effect handlers. As with algebraic operations, to produce a denotational semantics for a realistic programming language, and more complex handlers, there are many details to take care of, and we must move beyond \mathsf{Set}. However, this simplified setting is sufficient to illustrate the key ideas, without dealing with distracting technical details.

Summary

We have seen the basic mathematical machinery to build effectful computations within a monad, and also how to evaluate these computations to values. The main points are:

  • The morphisms that allow us to build elements of \mathbb{T}(X) arise naturally from an algebraic presentation of the monad.
  • The morphisms that allow us to evaluate computations in \mathbb{T}(X) require us to make choices as to how that evaluation should proceed, specifically a suitable Eilenberg-Moore algebra, and then exploit the universal property of free algebras.

The literature in this area contains many nice examples, and is enjoyable to read. For algebraic operations and generic effects, search for work of Plotkin and Power, and for effect handlers, the work of Pretnar, Plotkin and Bauer is a good place to start reading. There is a lot more to these topics than the details we have sketched here, so it is well worth exploring further.

We have also been limited in our examples by the relatively simple monads we have introduced so far. We will see more interesting monads from a computational perspective in later posts.

Commutativity Algebraically

Recall the substitution notation we have been using:

t[t_x / x \mid x \in X]

denotes the term t, with each x \in X simultaneously replaced by the term t_x. For an equational presentation (\Sigma, E), inducing a monad \mathbb{T}, consider a pair:

([s],[t]) \in \mathbb{T}(X) \times \mathbb{T}(Y)

where [t] denotes an equivalence class with representative term t. We can then write the action of the first double strength map in terms of representatives and substitutions as:

\mathsf{dst}([s],[t]) = [t[s[(x,y)/x \mid x \in X]  / y \mid y \in Y]]

and the second:

\mathsf{dst}'([s],[t]) = [s[t[(x,y)/y \mid y \in Y] / x \mid x \in X]]

Therefore, the resulting monad is commutative if and only if for all s,t, the following equality is provable in equational logic:

t[s[(x,y)/x \mid x \in X]  / y \mid y \in Y] = s[t[(x,y)/y \mid y \in Y] / x \mid x \in X]

We shall call this the commutativity condition.

The substitutions can be made a bit easier to read with a change of notation. For two terms s,t, let x_1,\ldots,x_m and y_1,\ldots,y_n be a choice of enumeration of the variables appearing in the two terms. We can then write them as

s(x_1,\ldots, x_m) \quad\mbox{ and }\quad t(y_1,\ldots,y_n)

and writing substitution in the natural way, the commutativity condition becomes:

t(s((x_1,y_1),\ldots,(x_m,y_1)),\ldots, s((x_1,y_n),\ldots,(x_m,y_n))) = s(t((x_1,y_1),\ldots,(x_1,y_n)),\ldots,t((x_m,y_1),\ldots,(x_m,y_n)))

That’s an awful lot of formal notation and brackets to deal with. Let’s consider some implications of this condition to build intuition for what it means in practice.
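For Haskell readers, the two double strength maps are the two orders in which a pair of computations can be sequenced, so commutativity asks that the following two functions agree (which of the two deserves the name dst is a matter of convention):

dst :: Monad t => t a -> t b -> t (a, b)
dst s t = s >>= \x -> t >>= \y -> return (x, y)

dst' :: Monad t => t a -> t b -> t (a, b)
dst' s t = t >>= \y -> s >>= \x -> return (x, y)

-- For the (non-commutative) list monad the two sides differ:
--   dst  [1,2] "ab"  =  [(1,'a'),(1,'b'),(2,'a'),(2,'b')]
--   dst' [1,2] "ab"  =  [(1,'a'),(2,'a'),(1,'b'),(2,'b')]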

Example: Consider a presentation with binary operations + and \times. The action of the double strengths on [x + x'] \in \mathbb{T}(X) and [y \times y'] \in \mathbb{T}(Y) are:

\mathsf{dst}([x + x'], [y \times y']) = [((x,y) + (x',y)) \times ((x,y') + (x',y'))]

and

\mathsf{dst}'([x + x'], [y \times y']) = [((x,y) \times (x,y')) + ((x',y) \times (x',y'))]

If the monad \mathbb{T} is commutative, the following equality must be provable in equational logic:

((x,y) + (x',y)) \times ((x,y') + (x',y')) = ((x,y) \times (x,y')) + ((x',y) \times (x',y'))

As a special case of this observation, any commutative monad whose presentation contains a binary operation + must satisfy the following equation:

((x,y) + (x',y)) + ((x,y') + (x',y')) = ((x,y) + (x,y')) + ((x',y) + (x',y'))

If + is associative, this boils down to:

(x,y) + (x',y) + (x,y') + (x',y') = (x,y) + (x,y') + (x',y) + (x',y')

The only difference between the two sides is the order of the middle two summands, so this equality will certainly hold for any associative and commutative binary operation. This condition is satisfied in many natural algebraic structures, so there is at least a weak relationship between commutative operations and commutative monads.

An instructive boundary case is the following.

Example: Consider two terms s,t in which no variables appear. These are constants in our equational theory. The commutativity condition implies that:

s = t

as all the variable substitutions become trivial. Therefore any equational presentation of a commutative monad can have at most one distinct constant term. We have already encountered this phenomenon. The exception monad is only commutative when there is at most one exception constant.

“Constant counting” can be a quick way to discount the possibility that a monad for an equational presentation is commutative. For example, the monads corresponding to commutative rings or rigs (semirings) cannot be commutative, as they have distinct constant symbols for the additive and multiplicative structures. These are further natural examples where all the binary operations in the signature are commutative, but the resulting monads are not.

Another simple case is worth considering.

Example: Let s \in \mathbb{T}(X) and t \in \mathbb{T}(Y) be terms in which only one variable appears, say x_0 and y_0 respectively. Then the left hand side of the commutativity condition is:

t[s[(x,y)/x \mid x \in X]  / y \mid y \in Y] = t(s((x_0,y_0)))

and the right hand side is:

s[t[(x,y)/y \mid y \in Y] / x \mid x \in X] = s(t((x_0,y_0)))

So for \mathbb{T} to be commutative, each pair of unary terms must satisfy s(t(x)) = t(s(x)).

As an application of this special case, we introduce another commonly considered monad. For a fixed monoid (M,\times,1) the writer monad has:

  • Endofunctor: The endofunctor is the product M \times (-).
  • Unit: \eta(x) = (1,x).
  • Multiplication: \mu(m,(n,x)) = (m \times n, x).

Computationally, this monad can be seen as encoding computations that also record some additional output such as logging. The monoid structure combines the outputs from successive computations. Algebraically, it corresponds to actions of the monoid M.

The writer monad has an equational presentation with one unary operational symbol for each element of the monoid, and equations:

  • 1(x) = x.
  • p(q(x)) = r(x), whenever p \times q = r in the monoid.

Using our observation above, the writer monad is commutative if and only if the monoid M is commutative. So in this case, we do see a strong connection between the monadic and algebraic notions of commutativity.
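A quick Haskell sketch of the writer monad and its unary operations (hand-rolled rather than taken from a library):

newtype Writer m a = Writer { runWriter :: (m, a) }

instance Functor (Writer m) where
  fmap f (Writer (m, x)) = Writer (m, f x)

instance Monoid m => Applicative (Writer m) where
  pure x = Writer (mempty, x)          -- the unit pairs with 1
  Writer (m, f) <*> Writer (n, x) = Writer (m <> n, f x)

instance Monoid m => Monad (Writer m) where
  Writer (m, x) >>= k = let Writer (n, y) = k x in Writer (m <> n, y)

-- The unary operation for p in M: p(x) prefixes the output with p, so
-- act p . act q = act (p <> q) and act mempty = id, as in the presentation.
act :: Monoid m => m -> Writer m a -> Writer m a
act p (Writer (q, x)) = Writer (p <> q, x)

Two unary operations act p and act q commute on every input exactly when p <> q = q <> p, in line with the observation above.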

Another useful boundary case is to consider what the commutativity condition means for variables.

Example: Consider a variable x_0 and an arbitrary term t. The left hand side of the commutativity condition is:

t[x_0[(x,y)/x \mid x \in X]  / y \mid y \in Y] = t[(x_0,y)/y \mid y \in Y]

and the right hand side is:

x_0[t[(x,y)/y \mid y \in Y] / x \mid x \in X] = t[(x_0,y)/y \mid y \in Y]

So variables always satisfy the commutativity condition with respect to any other term. With hindsight, maybe we should not find this too surprising.

We will now consider an important example, which will point the way to getting a better handle on the unpleasant looking general commutativity condition we deduced above.

Example: We now consider an equational presentation with a constant term 0, a binary term x + x', and a unary term h(x). The equations yielded by the commutativity condition for the pairs ([h],[0]) and ([h],[x + x']) are:

h(0) = 0 \quad\mbox{ and }\quad h((x,y) + (x,y')) = h((x,y)) + h((x,y'))

We can simplify the second condition by renaming variables, and we arrive at the conditions:

h(0) = 0\quad\mbox{ and }\quad h(x + x') = h(x) + h(x')

These conditions should hopefully look familiar: they are exactly the equations required for h to be a homomorphism with respect to + and 0.

We now aim to make the connection with homomorphisms from the previous example precise by making two observations:

  1. For positive natural k and set Z, the term s(x_1,\ldots,x_m) induces an m-ary operation on \mathbb{T}(Z)^k with action (([t_{1,1}],\ldots,[t_{1,k}]),\ldots,([t_{m,1}],\ldots,[t_{m,k}])) \mapsto ([s(t_{1,1},\ldots,t_{m,1})],\ldots,[s(t_{1,k},\ldots,t_{m,k})]).
  2. Similarly, the term t(y_1,\ldots,y_n) induces an n-ary function \mathbb{T}(Z)^{n} \rightarrow \mathbb{T}(Z) with action ([t_1],\ldots, [t_n]) \mapsto [t(t_1,\ldots,t_n)].

The homomorphism condition is equivalent to saying that the n-ary function \mathbb{T}(X \times Y)^{n} \rightarrow \mathbb{T}(X \times Y) induced by t is a homomorphism with respect to the m-ary operation induced by s, taking k = n and Z = X \times Y in the first observation. In this sense, the commutativity condition can be rephrased as saying that all terms are homomorphisms with respect to each other.

From the algebraic perspective a monad is commutative in the sense that all terms can be commuted past each other as homomorphisms.

Being commutative has nothing to do with being commutative

In explanations about commutative monads, it is common to see a passing remark such as “as we’d expect, the Abelian monoid monad is commutative”. Now I might be over-interpreting these comments, but they seem to imply that the monad being commutative is related to the algebraic structure having a commutative binary operation. It is true that the Abelian monoid (a.k.a. multiset) monad is commutative, and several other commutative monads are induced by algebraic structures with commutative binary operations, but how well does this intuition hold up more generally?

Firstly, we consider if being a commutative monad implies commutativity of the binary operations in equational presentations.

Counterexample: We saw last time that the finite distribution monad is commutative. This monad is isomorphic to the monad presented by a family of binary operations \{+^r  \mid r \in (0,1) \} satisfying the following equations:

  • Idempotence: For all r \in (0,1), x +^r x = x.
  • Pseudo-commutativity: For all r \in (0,1), x +^r y = y +^{1 - r} x.
  • Pseudo-associativity: For all r,s \in (0,1), x +^r (y +^s z) = (x +^{\frac{r}{r + (1-r)s}} y) +^{r + (1-r)s} z

Algebras for this theory are called convex algebras. Intuitively, the operation x +^r y forms the probabilistic mixture rx + (1-r)y and the axioms can be understood from this point of view. The pseudo-associativity axiom is particularly unpleasant to look at, but is easier to understand in terms of re-bracketing weighted binary combinations.

The axioms presenting convex algebras don’t have explicit commutativity equations for all the binary operations, but they might be derivable. To make sure this isn’t the case, we consider the concrete description of the free algebras that we get from the isomorphism to the finite distribution monad.

Specifically, the free algebra on X has as its universe the finitely supported convex sums over X, and the operations are the obvious weighted sums:

(\sum_i p_i x_i) +^r (\sum_j q_j y_j) = \sum_i (r \times p_i) x_i + \sum_j ((1-r) \times q_j) y_j

This operation is clearly not commutative, except when r = \frac{1}{2}. So there is a commutative monad presented by an algebraic theory with (many different) non-commutative algebraic operations.

(I thank Maaike Zwart for suggesting this natural concrete counterexample)
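To see the failure of commutativity of the operations concretely, here is a rough Haskell sketch of finitely supported distributions, with no attempt made to normalise representations (Dist and mix are our names):

-- a distribution as a list of weighted points
type Dist a = [(a, Rational)]

-- x +^r y : the probabilistic mixture r*x + (1-r)*y
mix :: Rational -> Dist a -> Dist a -> Dist a
mix r xs ys = [ (x, r * p) | (x, p) <- xs ]
           ++ [ (y, (1 - r) * q) | (y, q) <- ys ]

-- mix (1/3) [(True,1)] [(False,1)] puts weight 1/3 on True, whereas
-- mix (1/3) [(False,1)] [(True,1)] puts weight 2/3 on True: as
-- distributions these differ for every r other than 1/2.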

What about the other direction, does an equational presentation where all the binary operations are commutative imply commutativity?

Counterexample: We have already seen that the exception monad (-) + E is only commutative if E is a singleton set. The exception monad is isomorphic to the monad for an equational theory with constant symbols the elements of E, and no equations. So there are both commutative and non-commutative monads with presentations containing no binary operations at all.

The previous counterexample is a bit too trivial to be satisfactory, as there are no binary operations involved at all, and we’re quantifying over the empty set when making statements about commutativity in the algebraic sense.

As a second attempt, we shall synthesise an equational presentation in which the only components involved are commutative binary operations.

Counterexample: Consider an equational presentation with two binary operations \oplus and \otimes, with the only equations being those requiring that both operations are commutative. We consider the action of the double strengths on the equivalence classes [w \oplus x] and [y \otimes z]. For the first:

\mathsf{dst}([w \oplus x], [y \otimes z]) = [((w,y)\oplus(x,y)) \otimes ((w,z)\oplus(x,z))]

and for the second:

\mathsf{dst}'([w \oplus x], [y \otimes z]) = [((w,y)\otimes(w,z)) \oplus ((x,y)\otimes(x,z))]

If this monad is commutative, we must have equality of equivalence classes:

[((w,y)\oplus(x,y)) \otimes ((w,z)\oplus(x,z))]  = [((w,y)\otimes(w,z)) \oplus ((x,y)\otimes(x,z))]

We note that the provable equalities t = t' in our theory must have the same number of occurrences of \oplus and \otimes on both sides. Therefore the two equivalence classes are distinct, and the monad is not commutative.

So we have seen that there is a monad that is not commutative, presented by an equational theory containing only commutative binary operations.

The previous two counterexamples reflect the fact that commutative binary operations should not have anything essential to do with monad commutativity. Monads presented by theories with just constants may or may not be commutative. Theories involving just binary operations may or may not induce commutative monads. Obviously there are other arities of operation to consider, but by this point hopefully it’s clear that there’s no tight relationship.

In fact, a monad being commutative does imply that certain equations must hold. In the case of a binary operation, commutativity of that operation is sufficient to satisfy some of these equations. In simple cases, for example theories with a single binary operation, this might explain some of the misleading patterns that emerge.

We shall examine commutative monads from an algebraic perspective in a later post, and see exactly what it is that can be commuted that inspires the terminology.