What’s a monad (take two)?

We introduced a monad on a category \mathcal{C} as a triple consisting of an endofunctor together with unit and multiplication natural transformations:

(\mathbb{T} : \mathcal{C} \rightarrow \mathcal{C},\eta : \mathsf{Id} \Rightarrow \mathbb{T},\mu : \mathbb{T}^2 \Rightarrow \mathbb{T})

The unit and multiplication are required to satisfy three axioms (which specify that they form a monoid in a suitably abstract sense). Establishing that we have a monad this way involves quite a bit of work. We have to verify that:

  • Our construction \mathbb{T} is a well-defined endofunctor on the base category.
  • Our candidate unit \eta is a legitimate natural transformation.
  • Our candidate multiplication \mu is a legitimate natural transformation.
  • The unitality and associativity axioms hold.

This can be quite fiddly in practice. For example, verifying the associativity axiom requires working with objects of the form \mathbb{T}^3(A). If our endofunctor is complicated, this can lead to cumbersome calculations and plenty of scope for errors.

An Alternative Formulation

A monad in extension form (sometimes also referred to as Kleisli form) on a category \mathcal{C} is a triple consisting of:

  1. An operation on objects \mathbb{T} : \mathsf{obj}(\mathcal{C}) \rightarrow \mathsf{obj}(\mathcal{C}).
  2. An \mathsf{obj}(\mathcal{C})-indexed family of unit morphisms \eta_A : A \rightarrow \mathbb{T}(A).
  3. An extension operation on morphisms (-)^* : \mathcal{C}(A,\mathbb{T}(B)) \rightarrow \mathcal{C}(\mathbb{T}(A),\mathbb{T}(B)).

These must satisfy three axioms:

  1. (\eta_A)^* = \mathsf{id}_{\mathbb{T}(A)}.
  2. f^* \circ \eta_A = f.
  3. g^* \circ f^* = (g^* \circ f)^*.
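To make the three axioms concrete, here is a minimal sketch using the list monad on sets, written in Python. The names `unit`, `extend`, `f` and `g` are illustrative choices, not taken from the text above, and checking the axioms on a few sample values is only a sanity check, not a proof.

```python
def unit(a):
    """Unit eta_A : A -> T(A), sending a value to a singleton list."""
    return [a]


def extend(f):
    """Extension (-)^*: turn f : A -> T(B) into f^* : T(A) -> T(B)."""
    def f_star(xs):
        # Apply f to every element and concatenate the resulting lists.
        return [y for x in xs for y in f(x)]
    return f_star


# Two sample functions of type A -> T(B), chosen purely for illustration.
def f(n):
    return [n, n + 1]


def g(n):
    return [str(n), str(n) + "!"]


sample = [1, 2, 3]

# Axiom 1: (eta_A)^* = id on T(A).
axiom1 = extend(unit)(sample) == sample

# Axiom 2: f^* . eta_A = f.
axiom2 = all(extend(f)(unit(a)) == f(a) for a in sample)

# Axiom 3: g^* . f^* = (g^* . f)^*.
lhs = extend(g)(extend(f)(sample))
rhs = extend(lambda a: extend(g)(f(a)))(sample)
axiom3 = lhs == rhs

print(axiom1, axiom2, axiom3)
```

Note that nothing here mentions an action of \mathbb{T} on morphisms or any naturality condition; only the object mapping, the unit components and the extension operation are given.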

Although they might seem less natural from the point of view of category theory, these axioms can be significantly easier to verify in practice.

If we refer to our previous definition of monad as a monad in monoid form, we can convert freely between the two at our convenience.

Given a monad in monoid form (\mathbb{T},\eta,\mu), we can produce a monad in Kleisli form as follows:

  1. The mapping \mathbb{T} is the object mapping of the functor.
  2. The unit components are those of \eta.
  3. The extension mapping is A \xrightarrow{f} \mathbb{T}(B) \mapsto \mathbb{T}(A) \xrightarrow{\mathbb{T}(f)} \mathbb{T}^2(B) \xrightarrow{\mu_B} \mathbb{T}(B).
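The passage from monoid form to extension form can be sketched in Python, again assuming the list monad as the running example. The functions `fmap`, `unit`, `join` and `neighbours` are hypothetical names introduced here; `join` plays the role of \mu and `fmap` the action of \mathbb{T} on morphisms.

```python
def fmap(f):
    """Action of T on a morphism f : A -> B, giving T(f) : T(A) -> T(B)."""
    return lambda xs: [f(x) for x in xs]


def unit(a):
    """Unit component eta_A : A -> T(A)."""
    return [a]


def join(xss):
    """Multiplication component mu_A : T^2(A) -> T(A), flattening one level."""
    return [x for xs in xss for x in xs]


def extend(f):
    """Derived extension: f^* = mu_B . T(f)."""
    return lambda xs: join(fmap(f)(xs))


# A sample function A -> T(B), for illustration only.
def neighbours(n):
    return [n - 1, n + 1]


result = extend(neighbours)([0, 10])
print(result)  # [-1, 1, 9, 11]
```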

In the other direction, given a monad in extension form (\mathbb{T}, (\eta_A)_{A \in \mathsf{obj}(\mathcal{C})}, (-)^*), we construct a monad in monoid form as follows:

  1. We extend \mathbb{T} to an endofunctor by defining the action on morphisms as f : A \rightarrow B \mapsto (\eta_B \circ f)^* : \mathbb{T}(A) \rightarrow \mathbb{T}(B).
  2. The components of the unit are the \eta_A.
  3. The component of the multiplication at A is (\mathsf{id}_{\mathbb{T}(A)})^* : \mathbb{T}^2(A) \rightarrow \mathbb{T}(A).
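The reverse construction can be sketched the same way. The example below assumes a simple "writer" style monad, where \mathbb{T}(A) consists of pairs of a log string and a value; this choice of monad and all function names are assumptions made for illustration. Only `unit` and `extend` are given directly, and the functor action and multiplication are derived exactly as in the three steps above.

```python
def unit(a):
    """Unit eta_A : A -> T(A), pairing a value with the empty log."""
    return ("", a)


def extend(f):
    """Extension: given f : A -> T(B), build f^* : T(A) -> T(B)."""
    def f_star(pair):
        log, a = pair
        more_log, b = f(a)
        # Concatenate the existing log with the log produced by f.
        return (log + more_log, b)
    return f_star


def fmap(f):
    """Derived functor action: T(f) = (eta_B . f)^*."""
    return extend(lambda a: unit(f(a)))


# Derived multiplication: mu_A = (id_{T(A)})^*.
join = extend(lambda ta: ta)

mapped = fmap(len)(("saw a word; ", "monad"))
flattened = join(("outer; ", ("inner; ", 42)))

print(mapped)     # ('saw a word; ', 5)
print(flattened)  # ('outer; inner; ', 42)
```

The derived `join` concatenates the two logs of a nested pair, which is the multiplication one would expect to write by hand for this monad.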

Verifying that the required properties hold in each direction, and that these mappings are mutually inverse, is a routine if long-winded exercise. We shall omit the details.
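To give a flavour of what is involved, here is one sample step, sketched under the assumption that the extension is defined from a monad in monoid form by f^* = \mu_B \circ \mathbb{T}(f) for f : A \rightarrow \mathbb{T}(B). The second axiom of the extension form then follows from naturality of \eta and one of the unit axioms:

```latex
\begin{aligned}
f^* \circ \eta_A
  &= \mu_B \circ \mathbb{T}(f) \circ \eta_A
     && \text{definition of } (-)^* \\
  &= \mu_B \circ \eta_{\mathbb{T}(B)} \circ f
     && \text{naturality of } \eta \\
  &= \mathsf{id}_{\mathbb{T}(B)} \circ f
     && \text{unit axiom} \\
  &= f
\end{aligned}
```

The remaining checks follow the same pattern of unfolding definitions and applying naturality or a monad axiom.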

The key new feature is the extension operation. As usual, it is interesting to consider what the operation is doing from the point of view of algebra. Consider a \mathsf{Set} monad with equational presentation (\Sigma,E). Recall that the elements of \mathbb{T}(A) are equivalence classes of \Sigma-terms over A, quotiented by provable equality in equational logic. As we have observed before, a function

f : A \rightarrow \mathbb{T}(B)

can be identified with an A-indexed family of equivalence classes of terms, which we shall denote:

([t^f_x])_{x \in A}

The action of f^* is defined on representatives of equivalence classes as follows:

[t] \mapsto [t[t^f_x/x \mid x \in A]]

So the Kleisli extension is a substitution operation on representatives of equivalence classes.
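The substitution reading can be sketched in Python for the simplest case, where the set of equations E is empty, so that equivalence classes are just individual terms. The `Var` and `Op` classes, the function names and the sample signature with a binary symbol `mul` are all assumptions introduced for this illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    """A variable, i.e. an element of the set A viewed as a term."""
    name: str


@dataclass(frozen=True)
class Op:
    """An operation symbol from the signature applied to argument terms."""
    symbol: str
    args: Tuple["Term", ...]


Term = Union[Var, Op]


def unit(x):
    """Unit eta_A : A -> T(A), viewing a variable as a term."""
    return Var(x)


def extend(f):
    """Extension: f^* substitutes the term f(x) for each variable x."""
    def f_star(t):
        if isinstance(t, Var):
            return f(t.name)
        # Recurse through the operation symbol, substituting in each argument.
        return Op(t.symbol, tuple(f_star(s) for s in t.args))
    return f_star


# A sample assignment of terms over {a, b, c} to the variables x and y.
assignment = {
    "x": Op("mul", (Var("a"), Var("b"))),
    "y": Var("c"),
}

t = Op("mul", (Var("x"), Var("y")))
substituted = extend(lambda v: assignment[v])(t)

expected = Op("mul", (Op("mul", (Var("a"), Var("b"))), Var("c")))
print(substituted == expected)
```

With a non-empty E, the same substitution would be applied to a representative of each class; that it respects provable equality is what makes the operation well defined on equivalence classes.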

Different Perspectives

Aside from the possible practical benefits in terms of easier verification, there are other reasons to prefer one form over another:

  • The monoid form is phrased entirely in terms of categories, functors and natural transformations. We can therefore generalize the definition to any other bicategory: categories become 0-cells, functors 1-cells and natural transformations 2-cells. This is an extremely fruitful direction of generalization. For example, it immediately yields a definition of monad suitable for enriched category theory.
  • The extension form emphasizes different aspects. There is a generalization of monads called relative monads, in which we move beyond endofunctors. The required definition takes the extension form as its starting point.

We shall talk about various generalizations of monads in later posts, once we have more basics under our belts.
