Cybernetic agents vs Cartesian Frames

Dec 9, 2020
math

LessWrong draft. Now lives on LW!!!


In this post, I try to

Put some more formalism to the relationship between a Cartesian frame and the so-called “cybernetic” model of an agent as an input/output system.
Connect this to the emerging field of “compositional game theory”

Cybernetic agents contra Cartesian frames

Let’s recall the basic idea of a Cartesian frame: We have an “agent” which interacts with an “environment”. This relationship is maximally abstracted, so that the only data is

A set \(A\) of “ways the agent can be” - we can sometimes think of these as strategies that the agent can adopt.
A set \(E\) of “ways the environment can be”
A function \(\cdot: A \times E \to W\) which gives the “state of the world” for each agent-environment pair - let’s write this as \(a \cdot e \in W\) for \(a \in A, e \in E\).

(This is the same thing as a Chu space.) As discussed in the original post on Cartesian frames, it is interesting to contrast this view of “agency” with the cybernetic model of an agent as an input-output system. The agent in a Cartesian frame does not necessarily “see” part of the environment, nor does it produce some “output” - it simply chooses a strategy, or a “thing to be”, which then interacts with the environment.

We could define a cybernetic agent as follows:

A set \(A\) of strategies, again.
An input set \(I\) and an output set \(O\)
A function \(\cdot: A \times I \to O\), matching an output to an input according to the strategy.

This definition essentially just describes the agent and its interface with the environment. What is an “environment” for such an agent? We can abstract it down to the following information: what input does the agent end up receiving, and what happens with the output? The most general version of this is just a function \(O \to W\), which assembles the world-state out of the output (you could also say that this should be a function \(A \times O \to W\) - since the agent is a part of the world. Here I take the viewpoint that the internal state of the agent is not supposed to be relevant except through the output).

So given a cybernetic agent \((A,I,O,\cdot)\), we obtain a Cartesian frame \((A, I \times (O \to W),\diamond)\), where \(a \diamond (i,c) = c(a \cdot i)\).
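
To make this construction concrete, here is a minimal sketch in Python for finite sets. The particular sets `I`, `O`, `W` and the encoding of functions as tuples are my own illustrative assumptions, not part of the definition.

```python
from itertools import product

# Hypothetical toy sets (assumptions for illustration only).
I = ["cold", "hot"]      # inputs
O = ["stay", "move"]     # outputs
W = ["good", "bad"]      # world states

# Strategy set A: here, all functions I -> O, each stored as a tuple of
# outputs indexed in the same order as I.
A = list(product(O, repeat=len(I)))

def act(a, i):
    """The cybernetic action a . i: the output strategy a assigns to input i."""
    return a[I.index(i)]

# Environments of the induced frame are pairs (i, c) with c: O -> W.
# Each c is stored as a tuple of world states indexed in the same order as O.
continuations = list(product(W, repeat=len(O)))
E = [(i, c) for i in I for c in continuations]

def frame_eval(a, e):
    """The frame action a <> (i, c) = c(a . i)."""
    i, c = e
    return c[O.index(act(a, i))]
```

For instance, `frame_eval(("stay", "move"), ("hot", ("good", "bad")))` first computes the output `"move"` and then applies the continuation to it.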

Composing agents

One interesting feature of cybernetic agents is that they can be composed in a very natural way. Let’s write \((A,\cdot): I \to O\) given a cybernetic agent \((A,I,O,\cdot)\), to sort of show that the agent is going “from \(I\) to \(O\)”.

Then given \((A,\square): X \to Y\) and \((B,\diamond): Y \to Z\), we obtain a composite cybernetic agent \((B,\diamond) \circ (A,\square): X \to Z\), given by the following data:

The set of strategies is \(A \times B\)
The function \(\cdot: A \times B \times X \to Z\) is simply \((a,b)\cdot x = b \diamond (a \square x)\).
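
As a rough sketch, the composite can be written in a few lines of Python. I represent an agent as a pair (list of strategies, action function); the example agents `adder` and `scaler` are hypothetical and only there to have something to run.

```python
def compose(first, second):
    """Compose (A, sq): X -> Y with (B, di): Y -> Z into (A x B, .): X -> Z."""
    A, sq = first
    B, di = second
    strategies = [(a, b) for a in A for b in B]

    def act(ab, x):
        a, b = ab
        # (a, b) . x = b di (a sq x)
        return di(b, sq(a, x))

    return strategies, act

# Hypothetical example agents on integers.
adder = ([0, 1], lambda a, x: x + a)     # strategy a adds a to the input
scaler = ([2, 3], lambda b, y: b * y)    # strategy b multiplies the input by b

strategies, act = compose(adder, scaler)
```

Here `act((1, 3), 4)` feeds `4` through the adder with strategy `1` and then through the scaler with strategy `3`.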

This gives a “category of cybernetic agents”:

The objects are sets
The maps \(X \to Y\) are agents \((A,\cdot): X \to Y\).
Composition is as above
The identity is the trivial agent \((\{*\},\cdot): X \to X\) where \(* \cdot x = x\).

(Apart from some “coherence” issues, due to the fact that \(A \times (B \times C)\) is not literally equal to \((A \times B) \times C\), but just isomorphic. These can be resolved in various ways.)
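
The coherence issue is easy to see in a small experiment. The sketch below (with hypothetical agents `f`, `g`, `h` on integers) composes three agents in both bracketings: the strategy sets are differently nested tuples, but the behaviour agrees once the strategies are re-bracketed. The identity agent uses `None` as a stand-in for \(*\).

```python
def compose(first, second):
    """Sequential composite; strategies are pairs (a, b)."""
    A, act_a = first
    B, act_b = second
    strategies = [(a, b) for a in A for b in B]
    return strategies, lambda ab, x: act_b(ab[1], act_a(ab[0], x))

# The trivial agent ({*}, .) with * . x = x; None plays the role of *.
identity = ([None], lambda _star, x: x)

# Hypothetical agents, assumptions for illustration only.
f = ([0, 1], lambda a, x: x + a)
g = ([2, 3], lambda b, y: b * y)
h = ([1, 2], lambda c, z: z - c)

left = compose(compose(f, g), h)     # strategies look like ((a, b), c)
right = compose(f, compose(g, h))    # strategies look like (a, (b, c))

def same_behaviour(inputs):
    """Check that both bracketings act identically after re-bracketing."""
    for a in f[0]:
        for b in g[0]:
            for c in h[0]:
                for x in inputs:
                    if left[1](((a, b), c), x) != right[1]((a, (b, c)), x):
                        return False
    return True

# Composing with the identity gives strategies (None, a) rather than a.
id_then_f = compose(identity, f)
```

So `left[0]` and `right[0]` are different lists, while `same_behaviour(range(5))` holds, which is the "isomorphic but not equal" situation described above.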

How does this interact with our construction of Cartesian frames from agents? Intuitively, we are putting the two agents next to each other - saying to the first “the thing that’s going to happen to your output is that it’s going to be fed into this other agent”, and to the second “the way you’re gonna get your input is going to be from the output of this other agent”. There is a construction that corresponds to this in the Cartesian frames framework - namely, we might expect the frame of the composed agent to be a sub-tensor (maybe up to extensional equivalence) of the frame of the individual agents. Or to use another term, the two agents should be sisters in the frame given by the composite agent. Or to use yet another term, the two agents should be multiplicative subagents of the composite agent.

Let’s investigate this:

Fix two agents \((A,\square): X \to Y\) and \((B,\diamond) : Y \to Z\). These correspond, as noted, to Cartesian frames \(C = (A,X \times (Y \to W))\) and \(D= (B,Y \times (Z \to W))\). Their composite corresponds to the Cartesian frame \(E = (A \times B, X \times (Z \to W))\).

The tensor product under consideration is \(C \otimes D = (A \times B, Hom(C,D^*))\). I claim that \(E\) is a sub-tensor - so I want to identify the environment of \(E\) with a subset of the environment of \(C \otimes D\). This means that our goal will be to find a map \(X \times (Z \to W) \to Hom(C,D^*)\) with “the right properties”. Note that a morphism \(C \to D^*\) in turn corresponds to two maps \(A \to Y \times (Z \to W)\), \(B \to (X \times (Y\to W))\).

Let \(x \in X\) and \(c: Z \to W\). How do we build a map \(A \to Y \times (Z \to W)\)? Well, where should we send \(a \in A\)? There’s really only one way to get a \(y\), namely by \(a \square x\), so we can output \((a \square x, c) \in Y \times (Z \to W)\). Okay, how do we build a map \(B \to (X \times (Y \to W))\)? Again, let’s take \(b \in B\). Well, we already have an \(x\), and we can build a map \(Y \to W\) by taking \(y\) to \(c(b \diamond y)\).

Now we need to check that this is indeed a map of Cartesian frames, but this is just an exercise in writing out definitions, which I will skip.

Now we have a map \(X \times (Z \to W) \to Hom(C,D^*)\). Clearly this map is an injection (as long as \(A\) and \(B\) are nonempty) - the map \(B \to X \times (Y \to W)\) gives back our original \(x \in X\) no matter the input \(b\), and our map \(A \to Y \times (Z \to W)\) gives back the original \(c: Z \to W\) no matter the input \(a\). So we can identify \(X \times (Z \to W)\) with its image. It is another tedious exercise to verify that this preserves the action, which is to say that the resulting map \(C \otimes D \to E\) is actually a morphism of Cartesian frames.
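
For small finite sets, both skipped exercises can be checked by brute force. The sketch below is only a sanity check under my own toy assumptions (all sets are `[0, 1]`, both agents act by XOR, and functions are stored as tuples indexed by their argument); it is not a proof.

```python
from itertools import product

# Toy data (assumptions for illustration only).
X = Y = Z = [0, 1]
W = ["w0", "w1"]
A = B = [0, 1]

def sq(a, x):      # first agent, a sq x in Y
    return a ^ x

def di(b, y):      # second agent, b di y in Z
    return b ^ y

# All functions Z -> W, stored as tuples so that c[z] is the value at z.
funcs_Z_W = list(product(W, repeat=len(Z)))

def embed(x, c):
    """Send an environment (x, c) of E to the pair of maps (g, h)."""
    # g: A -> Y x (Z -> W), a |-> (a sq x, c)
    g = {a: (sq(a, x), c) for a in A}
    # h: B -> X x (Y -> W), b |-> (x, y |-> c(b di y)), tabulated over Y
    h = {b: (x, tuple(c[di(b, y)] for y in Y)) for b in B}
    return g, h

def is_morphism(g, h):
    """Check that evaluating g(a) against b in D* equals a against h(b) in C."""
    for a in A:
        for b in B:
            y, c = g[a]
            lhs = c[di(b, y)]      # in D*: (y, c) meets b as c(b di y)
            x, k = h[b]
            rhs = k[sq(a, x)]      # in C: a meets (x, k) as k(a sq x)
            if lhs != rhs:
                return False
    return True

environments_E = [(x, c) for x in X for c in funcs_Z_W]
images = {
    (tuple(sorted(g.items())), tuple(sorted(h.items())))
    for g, h in (embed(x, c) for x, c in environments_E)
}
```

Every `embed(x, c)` passes `is_morphism`, and `images` has as many elements as `environments_E`, so the map is injective in this toy case.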

The second component of being a sub-tensor is that \(C \simeq (A, B \times X \times (Z \to W))\) (and analogously for \(D\)). This is not necessarily true in our case. This is essentially because the original frame \(C\) has “too many environments”. In \(C\), any function \(Y \to W\) is possible. But in the frame \((A, B \times X \times (Z \to W))\), \(w \in W\) is only allowed to depend on \(y \in Y\) through the value \(b \diamond y \in Z\). Suppose as an extreme example that \(B \times Y \to Z\) is constant - for all \(b\) and \(y\), \(b \diamond y = z_0\) for some constant \(z_0\). Then there is nothing to be done - in the composite agent, the output of the first agent will be irrelevant.

Thus, composing agents in this way constrains the set of environments in a way that’s not allowed for multiplicative subagents.
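
The extreme example can be played out numerically. In this sketch (toy sets and the XOR agent are my assumptions), I tabulate the world state each strategy \(a\) produces against every environment, once for \(C\) and once for the restricted frame \((A, B \times X \times (Z \to W))\) with a constant second agent.

```python
from itertools import product

# Toy data (assumptions for illustration only).
X = [0, 1]
Y = [0, 1]
Z = [0]                 # a single value z0 = 0
W = ["w0", "w1"]
A = [0, 1]
B = [0, 1]

def sq(a, x):           # first agent
    return a ^ x

def di(b, y):           # constant second agent: always z0
    return 0

# Frame C: environments (x, k) with k: Y -> W stored as a tuple indexed by y.
C_envs = [(x, k) for x in X for k in product(W, repeat=len(Y))]

def eval_C(a, env):
    x, k = env
    return k[sq(a, x)]

# Restricted frame: environments (b, x, c) with c: Z -> W as a tuple.
R_envs = [(b, x, c) for b in B for x in X for c in product(W, repeat=len(Z))]

def eval_R(a, env):
    b, x, c = env
    return c[di(b, sq(a, x))]

def rows(envs, evaluate):
    """One row of world states per strategy a."""
    return {a: tuple(evaluate(a, e) for e in envs) for a in A}

rows_C = rows(C_envs, eval_C)
rows_R = rows(R_envs, eval_R)
```

In `rows_C` the two strategies have different rows, while in `rows_R` they are identical: the restricted frame cannot tell the strategies apart, so it has lost information that \(C\) had.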

Open games

I came up with the above while thinking about how to connect Cartesian Frames to open games. Open games are a “compositional” approach to game theory. You can read a bit about these in this blog post or this paper.

An open game is a bit like a more complicated cybernetic agent. It consists of the following data:

A set \(\Sigma\) of strategies
A set \(X\) of states and a set \(Y\) of responses or moves
A “play” function \(p: \Sigma \times X \to Y\)
A set \(R\) of utilities
A set \(S\) of coutilities
A “coplay” function \(c: \Sigma \times X \times R \to S\)
A “best response” relation between \(x \in X, f: Y \to R\), and \(\sigma,\sigma' \in \Sigma\).

We write this as \((\Sigma,p,c,B): \begin{pmatrix}X \\ S\end{pmatrix} \to \begin{pmatrix} Y \\ R\end{pmatrix}\)

In other words \(B(x,f,\sigma,\sigma')\) can either be true or false - we say “\(\sigma'\) is a best response to \(\sigma\) given that we’re in state \(x\) and “context” \(f: Y \to R\)”.

Okay, so what the hell is going on here? An open game is a cybernetic agent which cares about something - but may contain subagents which are at odds with each other.

The first three items are just an ordinary cybernetic agent, although I’m now calling the collection of strategies \(\Sigma\). The set \(R\) of “utilities” should be thought of as a “generalized reward signal”. This is the “viewport” where we see what effect our actions have on the world. Depending on our strategy and the input, this then flows backwards via the “coplay” function to possibly become the view of some other agent in the past.

The “best response” relation basically says “For each subagent, given that we’re in state \(x\), and that the correspondence between our output and the reward signal is given by \(f: Y \to R\), and that every other subagent is going to move as in strategy \(\sigma\), is it optimal for me to move as in strategy \(\sigma'\)?”.

A basic example of an open game is given by a utility-optimizing agent - this is an open game of type \(\begin{pmatrix}X \\ \{*\}\end{pmatrix} \to \begin{pmatrix}Y \\ \mathbb{R}\end{pmatrix}\), for some sets \(X,Y\). The set of strategies is \(\Sigma = X \to Y\), simply the set of functions. The play function \(p: (X \to Y) \times X \to Y\) is the obvious evaluation map. The coplay function is trivial, since \(S = \{*\}\). So we only have to specify the best response relation. This is given by “\(B(x,f,\sigma,\sigma')\) is true if \(\sigma' \in \Sigma\) maximizes the function \(\Sigma \to \mathbb{R}\) given by \(f(p(-,x))\)” - i.e., if \(\sigma'\) is a utility-maximizing strategy in the current state. This does not depend on \(\sigma\) at all because there are no “other subagents” to account for.
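
Here is a minimal sketch of this example for finite sets. The sets `X`, `Y` and the `utility` context are hypothetical; strategies are stored as dicts from states to moves, and `None` stands in for the single element of \(\{*\}\).

```python
from itertools import product

# Toy data (assumptions for illustration only).
X = ["rain", "sun"]          # states
Y = ["umbrella", "hat"]      # moves

# Sigma = X -> Y: every function from states to moves, as a dict.
Sigma = [dict(zip(X, outs)) for outs in product(Y, repeat=len(X))]

def play(sigma, x):
    """The evaluation map p(sigma, x) = sigma(x)."""
    return sigma[x]

def coplay(sigma, x, r):
    """Trivial coplay, since the coutility set is {*}."""
    return None

def best_response(x, f, sigma, sigma_prime):
    """True iff sigma_prime maximizes s |-> f(p(s, x)) over Sigma.

    The current strategy sigma is ignored: there are no other subagents.
    """
    best = max(f(play(s, x)) for s in Sigma)
    return f(play(sigma_prime, x)) == best

# A hypothetical context f: Y -> R.
utility = {"umbrella": 1.0, "hat": 0.0}.get
```

Note that only the value of \(\sigma'\) at the current state matters: `best_response("rain", utility, Sigma[0], s)` holds for every `s` with `s["rain"] == "umbrella"`, whatever `s` does in the state `"sun"`.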

Given open games \((\Sigma_1,p_1,c_1,B_1): \begin{pmatrix} X \\ T \end{pmatrix} \to \begin{pmatrix} Y \\ S \end{pmatrix}\) and \((\Sigma_2,p_2,c_2,B_2): \begin{pmatrix} Y \\ S \end{pmatrix} \to \begin{pmatrix} Z \\ R \end{pmatrix}\), how can we compose them to an open game \(\begin{pmatrix} X \\ T \end{pmatrix} \to \begin{pmatrix} Z \\ R \end{pmatrix}\)?

The intuition here is that we put the two agents next to each other, as subagents of a larger “agent”. The output of one agent becomes the input to the next, and the “coutility”, the information that the second agent passes backwards, becomes the utility of the first agent.

Just as in the case of cybernetic agents, the strategy set is simply the product: \(\Sigma = \Sigma_1 \times \Sigma_2\)

The play function of the composite agent is the obvious thing: given \((\sigma_1,\sigma_2,x) \in \Sigma_1 \times \Sigma_2 \times X\), we apply the first play function, then the second one: \(p(\sigma_1,\sigma_2,x) = p_2(\sigma_2,p_1(\sigma_1,x))\). The coplay is slightly more convoluted: we have \(\sigma_1,\sigma_2,x\) as before, and also \(r \in R\). To get \(t \in T\), clearly we need to use the coplay function of the first game - but to use that, we need an \(s \in S\). We get this by using the play function of the first game to get a \(y\), then using the coplay of the second game. All in all, \(c(\sigma_1,\sigma_2,x,r) = c_1(\sigma_1,x,c_2(\sigma_2,p_1(\sigma_1,x),r))\) (don’t worry if this is confusing).

The best-response relation is the most convoluted - and the most interesting. We have \(x \in X\), \(f: Z \to R\) and \(\sigma_1,\sigma_2\). What is the best response? Now we should think of each component game as a subagent - what is the best response \(\sigma_1' \in \Sigma_1\)? Well, the structure of the first open game will tell us that - if we can supply it with \(x \in X\) (which we already have) and a function \(Y \to S\). What is the function \(Y \to S\)? It’s determined by the second game! Given that the other subagents move as in the existing strategy (that is, according to \(\sigma_2\)), what happens if the first agent chooses output \(y \in Y\)? The second agent will make a choice \(p_2(\sigma_2,y) \in Z\) based on that, which will become an \(f(p_2(\sigma_2,y)) \in R\), which will then pass backwards to the first agent as \(c_2(\sigma_2,y,f(p_2(\sigma_2,y)))\). So that is the function \(Y \to S\) that the first agent has as its context. Let’s call this function \(k\).

The context for the second agent is easier - the function \(Z \to R\) is given, and the state \(y \in Y\) should simply be whatever the first agent is currently passing forward (since we are choosing “best response if nobody else changes their strategy”), i.e. \(p_1(\sigma_1,x)\).

Putting it all together, we have \(B(x,f,(\sigma_1,\sigma_2),(\sigma_1',\sigma_2'))\) if and only if \(B_1(x,k,\sigma_1,\sigma_1')\) and \(B_2(p_1(\sigma_1,x),f,\sigma_2,\sigma_2')\).
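
The whole composition fits in a short Python sketch. The `OpenGame` record and the two example games (`decision`, which picks a move, and `relay`, which has a single strategy and simply passes the utility back as coutility) are hypothetical names I made up for illustration.

```python
from collections import namedtuple

# An open game as a record: strategies, play p, coplay c, best response B.
OpenGame = namedtuple("OpenGame", ["strategies", "play", "coplay", "best"])

def compose(g1, g2):
    """Sequential composite of g1: (X, T) -> (Y, S) and g2: (Y, S) -> (Z, R)."""
    strategies = [(s1, s2) for s1 in g1.strategies for s2 in g2.strategies]

    def play(s, x):
        s1, s2 = s
        return g2.play(s2, g1.play(s1, x))

    def coplay(s, x, r):
        s1, s2 = s
        y = g1.play(s1, x)
        return g1.coplay(s1, x, g2.coplay(s2, y, r))

    def best(x, f, s, s_new):
        (s1, s2), (s1_new, s2_new) = s, s_new
        # Context k: Y -> S for the first game, assuming the second keeps s2.
        k = lambda y: g2.coplay(s2, y, f(g2.play(s2, y)))
        return (g1.best(x, k, s1, s1_new)
                and g2.best(g1.play(s1, x), f, s2, s2_new))

    return OpenGame(strategies, play, coplay, best)

# Hypothetical first game: ignore the state, play the chosen move,
# and prefer whichever move maximizes the context k.
moves = ["L", "R"]
decision = OpenGame(
    strategies=moves,
    play=lambda s, x: s,
    coplay=lambda s, x, r: None,
    best=lambda x, k, s, s_new: k(s_new) == max(k(y) for y in moves),
)

# Hypothetical second game: one strategy, relabels the move, and hands the
# utility straight back as coutility. Every strategy is a best response.
relay = OpenGame(
    strategies=[None],
    play=lambda s, y: y.lower(),
    coplay=lambda s, y, r: r,
    best=lambda y, f, s, s_new: True,
)

game = compose(decision, relay)
payoff = {"l": 0, "r": 5}.get    # context f: Z -> R
```

With this context, `game.best(None, payoff, ("L", None), ("R", None))` holds: the first subagent sees \(k(\text{R}) = 5\) through the relay, so switching to `"R"` is a best response.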

The point of all this is to describe game-theoretic games “compositionally”, i.e as built up out of smaller parts. Here’s a figure from the paper I linked above, “Compositional Game Theory”:

[Figure from “Compositional Game Theory”: a simultaneous move game built out of open games]

This diagram shows how to build a “simultaneous move game” out of the following components:

Two utility-maximizing open games \(\begin{pmatrix} \{*\} \\ \{*\} \end{pmatrix} \to \begin{pmatrix} X_1 \\ \mathbb{R} \end{pmatrix}\) and \(\begin{pmatrix} \{*\} \\ \{*\} \end{pmatrix} \to \begin{pmatrix} X_2 \\ \mathbb{R} \end{pmatrix}\)
A function \(q: X_1 \times X_2 \to \mathbb{R} \times \mathbb{R}\) which calculates the utility that each agent receives given the moves \((x_1,x_2)\).
