Capability Myths Demolished

Posted in Uncategorized by Mike Stay on 2016 February 16

A nice summary of erights’ Capability Myths Demolished paper.

the morning paper

Capability Myths Demolished – Miller et. al 2003

Pretty much everyone is familiar with an ACL-based approach to security. Despite having been around for a very long time, the capabilities approach to security is less well-known. Today’s paper choice provides an excellent introduction to the capabilities model and how it compares to ACLs. Along the way we’ll learn about 7 fundamental properties of security systems, and which combinations of those are required to offer certain higher-level guarantees. Capabilities are central to the type system of the Pony language which we’ll be looking at tomorrow.

Let’s start out by looking at one of the fundamental differences between ACLs and capabilities, the direction of the relationship between subject and resource. Consider a classic access matrix such as the one below. Each row is a subject, and each column a resource. The entry in a given cell describes the permissions the subject has…

View original post 949 more words

Serializing Javascript Closures

Posted in Uncategorized by Mike Stay on 2016 January 21

In Javascript, the eval function lets you dynamically look up variables in scope, so you can do a terrible hack like this to sort-of serialize a closure if you’re in the same scope as the variables being closed over:

Given StringMap.js and atLeastFreeVarNames.js from SES, one can define the following:

var s = function(f) {
  // Find free vars in f. (This depends, of course, on
  // Function.prototype.toString being unchanged.)
  var code = f.toString();
  var free = ses.atLeastFreeVarNames(code);
  // Construct code that evaluates to an environment object.
  var env = ["({"];
  for (var i = 0, len = free.length; i < len; ++i) {
    env.push('":(function(){try{return eval("(');
    env.push(')")}catch(_){return {}}})()');
  return "({code:" + JSON.stringify(code) + ",env:" + env.join("") + "})";

// See or
// for versions of stringify that handle cycles.
var t = function(x) { return '(' + JSON.stringify(x) + ')'; }

Then you can use these definitions to serialize inline definitions that only close over “stringifiable” objects or objects behind cut points (see deserialization below):

var baz = {x: 1};
var serializedClosure = t(eval(s(
  function bar(foo) { console.log('hi'); return ++(baz.x)+foo /*fiddle*/; }
// serializedClosure === '({"code":"function bar(foo) { console.log('hi'); return ++(baz.x)+foo /*fiddle*/; }","env":{"function":{},"bar":{},"foo":{},"console":{},"log":{},"hi":{},"return":{},"baz":{"x":1},"x":{},"fiddle":{}}})'

The string serializedClosure can then be stashed somewhere.  When it’s time to deserialize, do the following:

var d = function(closure, localBindings) {
  localBindings = localBindings || {};
  return function() {
    with(closure.env) {
      with (localBindings) {
        return eval("(" + closure.code + ")").apply(this, arguments);
var closure = eval(serializedClosure);
// Hook up local values if you want them.
var deserializedFn = d(closure, {console: console});
// Prints "2foo" to the console.
// Prints "3bar" to the console.

If you want to store the updated state, just re-stringify the closure:

var newSerializedClosure = t(closure);
// newSerializedClosure === '({"code":"function bar(foo) { console.log('hi'); return ++(baz.x)+foo /*fiddle*/; }","env":{"function":{},"bar":{},"foo":{},"console":{},"log":{},"hi":{},"return":{},"baz":{"x":3},"x":{},"fiddle":{}}})'
// Note that baz.x is now 3.

As I said, a very ugly hack, but still might be useful somewhere.

HDRA part 1a

Posted in Uncategorized by Mike Stay on 2015 May 30

Again, comments are appreciated.

2-categories and lambda calculus

This is a follow up post, but not the promised sequel, from A 2-Categorical Approach to the Pi Calculus where I’ll try to correct various errors I made and try to clarify some things.

First, lambda calculus was invented by Church to solve Hilbert’s decision problem, also known as the Entscheidungsproblem. The decision problem was third on a list of problems he presented at a conference in 1928, but was not, as I wrote last time, “Hilbert’s third problem”, which was third on a list of problems he laid out in 1900.

Second, the problem Greg Meredith and I were trying to address was how to prevent reduction under a “receive” prefix in the pi calculus, which is intimately connected with preventing reduction under a lambda in lambda calculus. All programming languages that I know of, whether eager or lazy, do not reduce under a lambda.

There are two approaches in literature to the semantics of lambda calculus. The first is denotational semantics, which was originally concerned with what function a term computes. This is where computability and type theory live. Denotational semantics treats alpha-beta-eta equivalence classes of lambda terms as morphisms in a category. The objects in the category are called “domains”, and are usually “\omegaCPOs“, a special kind of poset. Lambek and Scott used this approach to show that alpha-beta-eta equivalence classes of lambda terms with one free variable form a cartesian closed category where composition is given by substitution.

The type of a term describes the structure of its normal form. Values are terms that have no beta reduction; they’re either constants or lambda abstractions. The types in lambda calculus are either base types or lambda abstractions, respectively. (While Lambek and Scott introduced new term constructors for products, they’re not strictly necessary, because the lambda term \lambda z.ztu can be used for the pair \langle t,u\rangle with projections \lambda p.p(\lambda xy.x) and \lambda p.p(\lambda xy.y).)

The second is operational semantics, which is more concerned with how a function is computed than what it computes. All of computational complexity theory and algorithmic analysis lives here, since we have to count the number of steps it takes to complete a computation. Operational semantics treats lambda terms as objects in a category and rewrite rules as morphisms. It is a very syntactical approach; the set of terms is an algebra of some endofunctor, usually presented in Backus-Naur Form. This set of terms is then equipped with some equivalence relations and reduction relations. For lambda calculus, we mod out terms by alpha, but not beta. The reduction relations often involve specifying a reduction context, which means that the rewrites can’t occur just anywhere in a term.

In pi calculus, terms don’t have a normal form, so we can’t define an equivalence on pi calculus terms by comparing their normal forms. Instead, we say two terms are equivalent if they behave the same in all contexts, i.e. they’re bisimilar. Typing becomes rather more complicated; pi calculus types describe the structure of the current term and rewrites should do something weaker than preserve types on the nose; it suggests using a double category, but that’s for next time.

Seely suggested modeling rewrites with 2-morphisms in a 2-category and showed how beta and eta were lax adjoints.  We’re suggesting a different, more operational way of modeling rewrites with 2-morphisms.  Define a 2-category L with

  • V and T as generating objects,
  • -, -(=), and \lambda as generating 1-morphisms,
  • alpha equivalence as an identity 2-morphism, and
  • beta reduction as a generating 2-morphism.

A closed term is one with no free variables, and the hom category L(I, T) consisting of closed lambda terms and beta reductions between them is a typical category you’d get from looking for the operational semantics of lambda calculus.

The 2-category L is a fine semantics for the variant of lambda calculus where beta can apply anywhere within a term, but there are too many morphisms if you want to model the lambda calculus without reduction under a prefix. Given closed terms t, t'\colon I \to T such that t beta reduces to t', we can whisker beta by \lambda x to get

\displaystyle \lambda x.\beta\colon \lambda x.t \Rightarrow \lambda x.t'.

To cut down the extra 2-morphisms, we need to model reduction contexts. The difference between the precontexts above and the contexts we need is the notion of the “topmost” context, where we can see enough of the surrounding term to determine that reduction is possible. To model a reduction context, we make a new 2-category L' by introducing a new generating 1-morphism

\displaystyle [-]\colon T \to T

and say that a 1-morphism with signature T^{\otimes n} \to T that factors as a precontext followed by [-] is an n-hole term context. We also interpret beta with a 2-morphism between reduction contexts rather than between precontexts. The hom category L'(I, T) is the operational semantics of lambda calculus without reduction under lambda.

In the next post, I’ll relate Seely’s 2-categorical approach and our 2-categorical by extending to double categories and using Melliès and Zeilberger’s notion of type refinement.


Posted in Uncategorized by Mike Stay on 2015 May 18

Greg Meredith and I have a short paper that’s been accepted to Higher-Dimensional Rewriting and Applications
(HDRA) 2015
on modeling the asynchronous polyadic pi calculus with 2-categories. We avoid domain theory entirely and model the operational semantics directly; full abstraction is almost trivial. As a nice side-effect, we get a new tool for reasoning about consumption of resources during a computation.

It’s a small piece of a much larger project, which I’d like to describe here in a series of posts. This post will talk about lambda calculus for a few reasons. First, lambda calculus is simpler, but complex enough to illustrate one of our fundamental insights. Lambda calculus is to serial computation what pi calculus is to concurrent computation; lambda calculus talks about a single machine doing a computation, while pi calculus talks about a network of machines communicating over a network with potentially random delays. There is at most one possible outcome for a computation in the lambda calculus, while there are many possible outcomes in a computation in the pi calculus. Both the lazy lambda calculus and the pi calculus, however, have as an integral part of their semantics the notion of waiting for a sub-computation to complete before moving onto another one. Second, the denotational semantics of lambda calculus in Set is well understood, as is its generalization to cartesian closed categories; this semantics is far simpler than the denotational semantics of pi calculus and serves as a good introduction. The operational semantics of lambda calculus is also simpler than that of pi calculus and there is previous work on modeling it using higher categories.


Alonzo Church invented the lambda calculus as part of his attack on Hilbert’s third problem, also known as the Entscheidungsproblem, which asked for an algorithm to solve any mathematical problem. Church published his proof that no such algorithm exists in 1936. Turing invented his eponymous machines, also to solve the Entscheidungsproblem, and published his independent proof a few months after Church. When he discovered that Church had beaten him to it, Turing proved in 1937 that the two approaches were equivalent in power. Since Turing machines were much more “mechanical” than the lambda calculus, the development of computing machines relied far more on Turing’s approach, and it was only decades later that people started writing compilers for more friendly programming languages. I’ve heard it quipped that “the history of programming languages is the piecemeal rediscovery of the lambda calculus by computer scientists.”

The lambda calculus consists of a set of “terms” together with some relations on the terms that tell how to “run the program”. Terms are built up out of “term constructors”; in the lambda calculus there are three: one for variables, one for defining functions (Church denoted this operation with the Greek letter lambda, hence the name of the calculus), and one for applying those functions to inputs. I’ll talk about these constructors and the relations more below.

Church introduced the notion of “types” to avoid programs that never stop. Modern programming languages also use types to avoid programmer mistakes and encode properties about the program, like proving that secret data is inaccessible outside certain parts of the program. The “simply-typed” lambda calculus starts with a set of base types and takes the closure under the binary operation \to to get a set of types. Each term is assigned a type; from this one can deduce the types of the variables used in the term. An assignment of types to variables is called a typing context.

The search for a semantics for variants of the lambda calculus has typically been concerned with finding sets or “domains” such that the interpretation of each lambda term is a function between domains. Scott worked out a domain D such that the continuous functions from D to itself are precisely the computable ones. Lambek and Scott generalized the category where we look for semantics from Set to arbitrary cartesian closed categories (CCCs).

Lambek and Scott constructed a CCC out of lambda terms; we call this category the syntactical category. Then a structure-preserving functor from the syntactical category to Set or some other CCC would provide the semantics. The syntactical category has types as objects and equivalence classes of certain terms as morphisms. A morphism in the syntactical category goes from a typing context to the type of the term.

John Baez has a set of lecture notes from Fall 2006 through Spring 2007 describing Lambek and Scott’s approach to the category theory of lambda calculus and generalizing it from cartesian closed categories to symmetric monoidal closed categories so it can apply to quantum computation as well: rather than taking a functor from the syntactical category into Set, we can take a functor into Hilb instead. He and I also have a “Rosetta stone” paper summarizing the ideas and connecting them with the corresponding generalization of the Curry-Howard isomorphism.

The Curry-Howard isomorphism says that types are to propositions as programs are to proofs. In practice, types are used in two different ways: one as propositions about data and the other as propositions about code. Programming languages like C, Java, Haskell, and even dynamically typed languages like JavaScript and Python use types to talk about propositions that data satisfies: is it a date or a name? In these languages, equivalence classes of programs constitute constructive proofs. Concurrent calculi are far more concerned about propositions that the code satisfies: can it reach a deadlocked state? In these languages, it is the rewrite rules taking one term to another that behave like proofs. Melliès and Zeilberger’s excellent paper “Functors are Type Refinement Systems” relates these two approaches to typing to each other.

Note that Lambek and Scott’s approach does not have the sets of terms or variables as objects! The algebra that defines the set of terms plays only a minor role in the category; there’s no morphism in the CCC, for instance, that takes a term t and a variable x to produce the term \lambda x.t. This failure to capture the structure of the term in the morphism wasn’t a big deal for lambda calculus because of “confluence” (see below), but it turns out to matter a lot more in calculi like Milner’s pi calculus that describe communicating over a network, where messages can be delayed and arrival times matter for the end result (consider, for instance, two people trying to buy online the last ticket to a concert).

The last few decades have seen domains becoming more and more complicated in order to try to “unerase” the information about the structure of terms that gets lost in the domain theory approach and recover the operational semantics. Fiore, Moggi, and Sangiorgi, Stark and Cattani, Stark, and Winskel all present domain models of the pi calculus that recursively involve the powerset in order to talk about all the possible futures for a term. Industry has never cared much about denotational semantics: the Java Virtual Machine is an operational semantics for the Java language.

What we did

Greg Meredith and I set out to model the operational semantics of the pi calculus directly in a higher category rather than using domain theory. An obvious first question is, “What about types?” I was particularly worried about how to relate this approach to the kind of thing Scott and Lambek did. Though it didn’t make it into the HDRA paper and the details won’t make it into this post, we found that we’re able to use the “type-refinement-as-a-functor” idea of Melliès and Zeilberger to show how the algebraic term-constructor functions relate to the morphisms in the syntactical category.

We’re hoping that this categorical approach to modeling process calculi will help with reasoning about practical situations where we want to compose calculi; for instance, we’d like to put a hundred pi calculus engines around the edges of a chip and some ambient calculus engines, which have nice features for managing the location of data, in the middle to distribute work among them.

Lambda calculus

The lambda calculus consists of a set of “terms” together with some relations on the terms. The set T of terms is defined recursively, parametric in a countably infinite set V of variables. The base terms are the variables: if x is an element of V, then x is a term in T. Next, given any two terms t, t' \in T, we can apply one to the other to get t(t'). We say that t is in the head position of the application and t' in the tail position. (When the associativity of application is unclear, we’ll also use parentheses around subterms.) Finally, we can abstract out a variable from a term: given a variable x and a term t, we get a term \lambda x.t.

The term constructors define an algebra, a functor LC from Set to Set that takes any set of variables V to the set of terms T = LC(V). The term constructors themselves become functions:

\displaystyle   \begin{array}{rll}  -\colon & V \to T &\mbox{variable}\\  -(-)\colon & T \times T \to T &\mbox{application}\\  \lambda\colon & V \times T \to T &\mbox{abstraction}  \end{array}

Church described three relations on terms. The first relation, alpha, relates any two lambda abstractions that differ only in the variable name. This is exactly the same as when we consider the function f(x) = x^2 to be identical to the function f(y) = y^2. The third relation, eta, says that there’s no difference between a function f and a “middle-man” function that gets an input x and applies the function f to it: \lambda x.f(x) = f. Both alpha and eta are equivalences.

The really important relation is the second one, “beta reduction”. In order to define beta reduction, we have to define the free variables of a term: a variable occurring by itself is free; the set of free variables in an application is the union of the free variables in its subterms; and the free variables in a lambda abstraction are the free variables of the subterm except for the abstracted variable.

\displaystyle   \begin{array}{rl}  \mathrm{FV}(x) = & \{x\} \\  \mathrm{FV}(t(t')) = & \mathrm{FV}(t) \cup \mathrm{FV}(t') \\  \mathrm{FV}(\lambda x.t) = & \mathrm{FV}(t) / \{x\} \\  \end{array}

Beta reduction says that when we have a lambda abstraction \lambda x.t applied to a term t', then we replace every free occurrence of x in t by t':

\displaystyle  (\lambda x.t)(t') \downarrow_\beta t\{t' / x\},

where we read the right hand side as “t with t' replacing x.” We see a similar replacement of y in action when we compose the following functions:

\displaystyle   \begin{array}{rl}  f(x) = & x + 1 \\  g(y) = & y^2 \\  g(f(x)) = & (x + 1)^2 \\  \end{array}

We say a term has a normal form if there’s some sequence of beta reductions that leads to a term where no beta reduction is possible. When the beta rule applies in more than one place in a term, it doesn’t matter which one you choose to do first: any sequence of betas that leads to a normal form will lead to the same normal form. This property of beta reduction is called confluence. Confluence means that the order of performing various subcomputations doesn’t matter so long as they all finish: in the expression (2 + 5) * (3 + 6) it doesn’t matter which addition you do first or whether you distribute the expressions over each other; the answer is the same.

“Running” a program in the lambda calculus is the process of computing the normal form by repeated application of beta reduction, and the normal form itself is the result of the computation. Confluence, however, does not mean that when there is more than one place we could apply beta reduction, we can choose any beta reduction and be guaranteed to reach a normal form. The following lambda term, customarily denoted \omega, takes an input and applies it to itself:

\displaystyle \omega = \lambda x.x(x)

If we apply \omega to itself, then beta reduction produces the same term, customarily called \Omega:

\displaystyle \Omega = \omega(\omega)

\displaystyle \Omega \downarrow_\beta \Omega.

It’s an infinite loop! Now consider this lambda term that has \Omega as a subterm:

\displaystyle (\lambda x.\lambda y.x)(\lambda x.x)(\Omega)

It says, “Return the first element of the pair (identity function, \Omega)”. If it has an answer at all, the answer should be “the identity function”. The question of whether it has an answer becomes, “Do we try to calculate the elements of the pair before applying the projection to it?”

Lazy lambda calculus

Many programming languages, like Java, C, JavaScript, Perl, Python, and Lisp are “eager”: they calculate the normal form of inputs to a function before calculating the result of the function on the inputs; the expression above, implemented in any of these languages, would be an infinite loop. Other languages, like Miranda, Lispkit, Lazy ML, and Haskell and its predecessor Orwell are “lazy” and only apply beta reduction to inputs when they are needed to complete the computation; in these languages, the result is the identity function. Abramsky wrote a 48-page paper about constructing a domain that captures the operational semantics of lazy lambda calculus.

The idea of representing operational semantics directly with higher categories originated with R. A. G. Seely, who suggested that beta reduction should be a 2-morphism; Barney Hilken and Tom Hirschowitz have also contributed to looking at lambda calculus from this perspective. In the “Rosetta stone” paper that John Baez and I wrote, we made an analogy between programs and Feynman diagrams. The analogy is precise as far as it goes, but it’s unsatisfactory in the sense that Feynman diagrams describe processes happening over time, while Lambek and Scott mod out by the process of computation that occurs over time. If we use 2-categories that explicitly model rewrites between terms, we get something that could potentially be interpreted with concepts from physics: types would become analogous to strings, terms would become analogous to space, and rewrites would happen over time.

The idea from the “algebra of terms” perspective is that we have objects V and T for variables and terms, term constructors as 1-morphisms, and the nontrivial 2-morphisms generated by beta reduction. Seely showed that this approach works fine when you’re unconcerned with the context in which reduction can occur.

This approach, however, doesn’t work for lazy lambda calculus! Horizontal composition in a 2-category is a functor, so if a term t reduces to a term t', then by functoriality, \lambda x.t must reduce to \lambda x.t'—but this is forbidden in the lazy lambda calculus! Functoriality of horizontal composition is a “relativity principle” in the sense that reductions in one context are the same as reductions in any other context. In lazy programming languages, on the other hand, the “head” context is privileged: reductions only happen here. It’s somewhat like believing that measuring differences in temperature is like measuring differences in space, that only the difference is meaningful—and then discovering absolute zero. When beta reduction can happen anywhere in a term, there are too many 2-morphisms to model lazy lambda calculus.

In order to model this special context, we reify it: we add a special unary term constructor [-]\colon T \to T that marks contexts where reduction is allowed, then redefine beta reduction so that the term constructor [-] behaves like a catalyst that enables the beta reduction to occur. This lets us cut down the set of 2-morphisms to exactly those that are allowed in the lazy lambda calculus; Greg and I did essentially the same thing in the pi calculus.

More concretely, we have two generating rewrite rules. The first propagates the reduction context to the head position in an application; the second is beta reduction restricted to a reduction context.

\displaystyle  [t(t')]\, \downarrow_{ctx}\, [[t](t')]

\displaystyle  [[\lambda x.t](t')]\, \downarrow_\beta\, [t]\, \{t'/x\}

When we surround the example term from the previous section with a reduction context marker, we get the following sequence of reductions:

\displaystyle   \begin{array}{rl}  & [(\lambda x.\lambda y.x)(\lambda x.x)(\Omega)] \\    \downarrow_{ctx}& [[(\lambda x.\lambda y.x)(\lambda x.x)](\Omega)] \\    \downarrow_{ctx}& [[[\lambda x.\lambda y.x](\lambda x.x)](\Omega)] \\    \downarrow_{\beta}& [[\lambda y.(\lambda x.x)](\Omega)]\\    \downarrow_{\beta}& [\lambda x.x] \\  \end{array}

At the start, none of the subterms were of the right shape for beta reduction to apply. The first two reductions propagated the reduction context down to the projection in head position. At that point, the only reduction that could occur was at the application of the projection to the first element of the pair, and after that to the second element. At no point was \Omega ever in a reduction context.

Compute resources

In order to run a program that does anything practical, you need a processor, time, memory, and perhaps disk space or a network connection or a display. All of these resources have a cost, and it would be nice to keep track of them. One side-effect of reifying the context is that we can use it as a resource.

The rewrite rule \downarrow_{ctx} increases the number of occurrences of [-] in a term while \downarrow_\beta decreases the number. If we replace \downarrow_{ctx} by the rule

\displaystyle   [t(t')]\, \downarrow_{ctx'}\, [t](t')

then the number of occurences of [-] can never increase. By forming the term [[\cdots[t]\cdots]], we can bound the number of beta reductions that can occur in the computation of t.

If we have a nullary constructor c\colon 1 \to T, then we can define [t] = c(t) and let the program dynamically decide whether to evaluate an expression eagerly or lazily.

In the pi calculus, we have the ability to run multiple processes at the same time; each [-] in that situation represents a core in a processor or computer in a network.

These are just the first things that come to mind; we’re experimenting with variations.


We figured out how to model the operational semantics of a term calculus directly in a 2-category by requiring a catalyst to carry out a rewrite, which gave us full abstraction without needing a domain based on representing all the possible futures of a term. As a side-effect, it also gave us a new tool for modeling resource consumption in the process of computation. Though I haven’t explained how yet, there’s a nice connection between the “algebra-of-terms” approach that uses V and T as objects and Lambek and Scott’s approach that uses types as objects that uses Melliès and Zeilberger’s ideas about type refinement. Next time, I’ll talk about the pi calculus and types.

Q, Jokers, and Clowns

Posted in Uncategorized by Mike Stay on 2014 August 5

Conor McBride has a beautiful functional pearl, “Clowns to the Left of me, Jokers to the Right”, in which he discusses the idea of suspending the computation of a catamorphism. I just wanted to point out the relation of his work to the q-derivative. Since Star Trek’s character Q is a trickster god like Loki, Anansi, or Coyote, q-derivatives also fit nicely with McBride’s theme.

The usual definition of the derivative is

\displaystyle\frac{df(x)}{dx} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{(x + h) - x}.

Instead of translating x by an infinitesimal amount h, we can scale x by an infinitesimal amount q; these two definitions coincide in the limit:

\displaystyle\frac{df(x)}{dx} = \lim_{q \to 1} \frac{f(qx) - f(x)}{qx - x}.

However, when people talk about the q-derivative, they usually mean the operator we get when we don’t take the limit and q \ne 1. It should probably be called the “q-difference”, but we’ll see that the form of the difference is so special that it deserves the exceptional name.

The q-derivative, as one would hope, behaves linearly:

\begin{array}{rl} \displaystyle \frac{d_qf(x)}{d_qx} & \displaystyle = \frac{f(qx) - f(x)}{qx - x} \\ \\  \displaystyle \frac{d_q(f(x) + g(x))}{d_qx} & \displaystyle = \frac{(f(qx) + g(qx)) - (f(x) + g(x))}{qx - x} \\  & \displaystyle = \frac{(f(qx) - f(x)) + (g(qx)- g(x))}{qx - x} \\  & \displaystyle = \frac{d_qf(x)}{d_qx} + \frac{d_qg(x)}{d_qx}\end{array}

Even better, the q-derivative of a power of x is separable into a coefficient that depends only on q and a single smaller power of x:

\begin{array}{rl}\displaystyle \frac{d_q x^n}{d_qx} & \displaystyle = \frac{(qx)^n - x^n}{qx - x} \\  &\displaystyle = \frac{q^n - 1}{q - 1}\frac{x^n}{x} \\  &\displaystyle = [n]_q x^{n-1},\end{array}


\displaystyle [n]_q = \frac{q^n - 1}{q - 1} = 1 + q + \cdots + q^{n-1}.

Clearly, as q \to 1, [n]_q \to n. Whereas n counts the number of ways to insert a point into an ordered list of n-1 items, [n]_q counts the number of ways to insert a linearly independent ray passing through the origin into an (n-1)-dimensional vector space over a field with q elements with a given ordered basis.

The q-derivative even works when q is an operator rather than a number. Polynomials work in any rig, so if x is, say, a function instead of a number, q could be the Fourier transform.

Let’s lift this to types. The derivative of a datatype is the type of its one-holed contexts, so we expect the q-derivative to have a similar interpretation. When we take Q and X to be types, the Q-derivative of a tuple X^n is

\displaystyle [n]_Q X^{n-1} = X^{n-1} + (QX)X^{n-2} + (QX)^2X^{n-3} + \cdots + (QX)^{n-1}.

Each option factors into two parts: the ‘clowns’ are a power of QX to the left of the hole, followed by the ‘jokers’, a power of X after the hole. This type is the one we expect for the intermediate state when mapping a function f\colon X \to Q\times X over the tuple; any function g\colon X \to Q can be lifted to such a function f(x) = (g(x), x).

Similarly, the Q-derivative of the list datatype 1/(1-X) is 1/(1-QX)(1-X); that is, the ‘clowns’ form the list of the outputs of type Q \times X for the elements that have already been processed and the ‘jokers’ form the list of the elements yet to process.

When we abstract from thinking of q as a number to thinking of it as an operator on ring elements, the corresponding action on types is to think of Q as a functor. In this case, QX is not a pair type, but rather a parametric type Q[X]. One might, for example, consider mapping the function f(x) = 1/x over a list of real numbers. The resulting outputs will be real except when the input is zero, so we’ll want to adjoin an undefined value. Taking Q = Option, the Q-derivative of List[Int] is List[Option[Int]]\times List[Int], consisting of the list of values that have already been processed, followed by the list of values yet to process.

Symmetric monoidal closed objects

Posted in Uncategorized by Mike Stay on 2012 April 12

I conjecture that there’s a compact closed bicategory Th(SMCC) such that the 2-category hom(Th(SMCC), Prof) of

  • sylleptic monoidal functors (of bicategories),
  • braided monoidal transformations and
  • monoidal modifications

is equivalent as a 2-category to the 2-category SMCC of

  • symmetric monoidal closed categories,
  • braided monoidal closed functors, and
  • monoidal closed natural isomorphisms.

If we model Th(SMCC) in the compact closed bicategory 3Cob2, then we get

where I didn’t draw

  • the right unitor
  • the pentagon equation for the associator
  • the triangle equations for the unitors
  • the hexagon equations for the braiding
  • the yanking for internal hom
  • a,b,c,l,r composed with their formal inverses equals the identity

but I think they’re pretty obvious given this other stuff.


Posted in Uncategorized by Mike Stay on 2011 October 31

I used this Halloween as motivation to start working out a few months ago, because I wanted to be a passable Wolverine; I made a lot of progress, but not so much that I feel comfortable posting pictures of muscles. Fortunately, my mom is a seamstress and made me a great jacket. Here’s the result:

Monster manual

Posted in Uncategorized by Mike Stay on 2011 April 13

What magical creature lives deep in the earth’s crust and has a horrible lisp? A BATHOLITHK.

Better than anything 2

Posted in Uncategorized by Mike Stay on 2011 April 1

Better than This Week from Baez,
Better than Lou Kauffman’s knots,
Better than all of the string theory
Witten’s forgot,

Better than Feynman’s graffiti,
Better than Calabi-Yau,
Better than Pauli’s neutrino,
Better than mu and than tau,

Better than Curie, Poincaré,
And Niels and Albert at Solvay
Better than anything except being in love.

More views of the metapixel

Posted in Uncategorized by Mike Stay on 2011 March 10

Double spiral:

Single spiral the other way:

Double spiral the other way:

Hooped-up 24-cell

Posted in Uncategorized by Mike Stay on 2011 March 1

The 24-cell is really cool. It’s the only self-dual regular polytope that’s not a simplex or a polygon. Its vertices form the multiplicative group of units in the Hurwitz quaternion ring. It tiles Euclidean 4-space. Spheres inscribed in the octahedra give the closest packing of spheres in 4 dimensions. And, well, there are 24 of them.

We ran out of tape while making the inner cuboctahedron–one of the triangular sides is missing. But we came darn close! Compare:

Yudkowski interview

Posted in Uncategorized by Mike Stay on 2011 February 16

Queensland flooding

Posted in Uncategorized by Mike Stay on 2011 February 9

Polarization for breakfast

Posted in Uncategorized by Mike Stay on 2011 February 5

Legos this morning

Posted in Uncategorized by Mike Stay on 2011 February 5

This one ended up looking rather shark-like:

William chose–who else?–Yoda to pilot the green ship.

Willie’s fugue

Posted in Uncategorized by Mike Stay on 2011 January 23

Toccata in D minor

Posted in Uncategorized by Mike Stay on 2011 January 22

This is what about ten minutes every other day or so for a year gets you.


Posted in Uncategorized by Mike Stay on 2011 January 18

Apollo. adj 1. brave, without fear (from L. ab- not + pollo chicken).

Silicon carbide

Posted in Uncategorized by Mike Stay on 2011 January 17

I took this photo of my chunk of silicon carbide.

With some time I could probably find a crystal pure enough to emit light.

Doodling like Vi

Posted in Uncategorized by Mike Stay on 2011 January 15


Say you’re me and you just watched Vi Hart’s video on infinity elephants and you totally missed the joke about Mr. Tusks even though you read Dinosaur Comics all the time but you liked the bit about Apollonian gaskets, which don’t blow out in Battle Mountain like the one in your car did but rather on the way to the L1 point and then you need Richard Feynman to tell you why. You thought she was going to draw the tiniest camels going through the eye of a needle, but you suppose that would ruin the hyperbole in the parable, so the ellipsis was justified. Anyway, you decide to avoid circular reasoning and doodle rectangles instead, filling them up with squares.

Eventually you wonder which ones you can fill up with finitely many squares and which ones you need infinitely many for, and so you start with squares and build up some rectangles. One square can only make one rectangle, itself. Two squares can only make one rectangle, but it can be lying down or standing up so you decide to say they’re different. Three squares make four rectangles and four squares make eight rectangles, and then you start thinking about Vi Hart’s video on binary trees. So you put the numbers into a tree, but it looks kind of stern, so you add some fareys to cheer it up. Then you see that the height of a rectangle is just the sum of its neighbors’ heights, and similarly for the width. You see lots of nice patterns in the dimensions involving flipping things over or running them backwards, kind of like the Blues Brothers’ police car when it was being chased by the Neo Nazis or that V6 racecar that Johann sent to Frederick.

Now instead of breadth, you decide to go for depth. Making the rectangles very long or very tall is too boring, so you add one square each time, alternately making it longer and taller. 1 1 2 3 5 8 13 21… You get the Fibonacci numbers; the limiting ratio is the golden ratio, [1+sqrt(5)]/2 to 1. This rectangle is the worst at being approximated by repeated squares so it shows up in systems where repetition is bad, like the angle at which plant leaves grow so they overlap the least and gather the most sunlight or how sunflowers pack the most seeds into a flowerhead and Roger Penrose thinks the Fibonacci spirals in the microtubules in the neurons in your brain are doing quantum error correction.

You decide to look at other irrational numbers to see if they have any nice patterns. e does. pi doesn’t. square roots of positive integers give repeating palindromes! You wonder whether all palindromes occur and if not which of the lyrics to Weird Al’s song Bob are special that way. And maybe then you make up a palindrome with vi hart’s name in it and turn it into a square root.


Maybe you decide that you want to doodle some circles after all, so you start with this gasket and figure out where the circles touch the line. The numbers look very familiar. You wonder what the areas of the circles are and how the gasket relates to the modular group and Poincare’s half-plane model of the hyperbolic plane and wish you had time to just sit in math class and doodle…

Math doodles

Posted in Uncategorized by Mike Stay on 2010 December 30

Vi Hart‘s got some great stuff.


Posted in Uncategorized by Mike Stay on 2010 July 19

Scimitry: the property that something remains the same after part has been cut off.  The Serpinski triangle is scimitric, since you can cut off the bottom two triangles and be left with something isomorphic to the whole.

A first attempt at re-winding Escher’s “Ascending and Descending”

Posted in Uncategorized by Mike Stay on 2010 May 19

And he dreamed, and behold a ladder set up on the earth, and the top of it reached to heaven: and behold the angels of God ascending and descending on it.

Edit (May 20):

Even though it’s not a conformal transformation, this version looks better in a lot of ways.

Rather than cramming the whole picture into a single window frame, it presumes there’s a concentric set of these castles, each half as small as the previous, and built within its open internal patio. Doing it really well would involve extending the walls out to the edge of the outer wall that obscures them.

I’m not this guy

Posted in Uncategorized by Mike Stay on 2010 May 18


Posted in Uncategorized by Mike Stay on 2010 May 13

Blind faith is the best antisceptic.

Tag clouds

Posted in Uncategorized by Mike Stay on 2010 May 7

Tag clouds are cumulonymous.

The animated gif of Dorian Grey

Posted in Uncategorized by Mike Stay on 2010 May 3

My friend Jas wrote a Perl module, Perl::Visualize, that makes Perl/Gif polyglots. He later adapted his technique to Javascript/Gif polyglots. Some guy generalized a quine to print out the source code of the program together with a comment containing the next frame in Conway’s Game of Life. So we have programs that can age themselves, as well as programs that double as pictures. The “aging” process can repeat. So someone make a JS/Gif polyglot quine that renders consecutive frames of Dorian Grey!

NY Times article on conditional probability

Posted in Uncategorized by Mike Stay on 2010 April 27

There’s actually very good justification for this method of reasoning: it maximizes entropy.

5-axis mill

Posted in Uncategorized by Mike Stay on 2010 April 9

I really like the different tones on the metal, from polished to brushed, to whatever cool thing they did to get their name in lettering on the back.


Posted in Uncategorized by Mike Stay on 2010 April 8

Idea for an annotation engine:

  • Annotation has the form (search query, regular expression, content)
    • search query should be in a form where given the content and URL of a page you can tell if it ought to match the query.
    • execute the search query; for each hit
      • run the regex on the result; if it matches
        • attach the content to $1
  • When an annotation is created, cache the first n URLs that the search query and regex succeed on
  • When on a page and you want to know if it has annotations
    • if the URL is in the cache or the current page’s content matches the search query,
      • run the regexp; if it succeeds,
        • attach the content
        • cache the URL
  • automatic linkifying of references?
    • content post-processors
      • automatic linkifying
      • media inclusion
      • translation
      • whatever
  • notes on notes?
    • rather than keep a local database of annotations, publish them on a blog tagged as an annotation
    • then can annotate the blog
  • concept pages?
    • auto-linkify takes you to a concept page on the blog
  • links & trackbacks?
    • auto-linkify adds trackbacks to cached URLs

Ketamine and the near-death experience

Posted in Uncategorized by Mike Stay on 2010 April 5

“All features of a classic NDE [near-death experience] can be reproduced by the intravenous administration of 50 – 100 mg of ketamine… It can reproduce all features of the NDE, including travel through a dark tunnel into light, the conviction that one is dead, ‘telepathic communion with God’, hallucinations, out-of-body experiences and mystical states (see ketamine references above). If given intravenously, it has a short action with an abrupt end. Grinspoon and Bakalar (1981, p34) wrote of: ‘…becoming a disembodied mind or soul, dying and going to another world. Childhood events may also be re-lived. The loss of contact with ordinary reality and the sense of participation in another reality are more pronounced and less easily resisted than is usually the case with LSD. The dissociative experiences often seem so genuine that users are not sure that they have not actually left their bodies.'”


Posted in Uncategorized by Mike Stay on 2010 March 30

Monads for functional JavaScript programming

Posted in Uncategorized by Mike Stay on 2010 March 26


Philip Wadler’s excellent paper “Monads for Functional Programming” is about how to achieve some of the benefits of using side-effecting languages using a purely functional language. Wadler uses Haskell to illustrate his examples. To me, that has the effect of preaching to the choir—you can’t do anything in a purely functional language without a monad, so anyone proficient in Haskell will already have had to understand the point he’s trying to make.

“It’s easier to port programmers than to port programs”, so I’ll take a different tack: purely functional programs are far easier to test. If you are concerned about the reliability of your code, programming in a functional style is something to consider seriously. So in my opinion, the examples should be given in the impure language; I’m going to use JavaScript.

In one example, Wadler assumes that computing 1/0 throws an error. Since JavaScript says 1/0 === Infinity instead, I’m going to use silentmatt’s BigInteger library, which does throw an error.

In what follows, I simply give a direct translation from Wadler’s Haskell code in section 2 of his paper.


A Pietà for the modern age

Posted in Uncategorized by Mike Stay on 2010 March 26

Crochet your own action figures

Posted in Uncategorized by Mike Stay on 2010 March 23

She’s selling the patterns for $3.50 each on her Etsy page. See also this collection of Lord of the Rings characters.

Artificial photosynthesis

Posted in Uncategorized by Mike Stay on 2010 March 19

96% efficiency at turning CO2 and H20 into sugar.

PTSD in Haiti

Posted in Uncategorized by Mike Stay on 2010 March 17

Lucas Williams writes:

[This was written as a response to close friend of mine who is a social worker in the greater Hartford area. She asked me if I had any anecdotes from Haiti that she could share at an upcoming conference on mental health and trauma. I tried to jot down just a few thoughts, but found that I couldn’t stop typing. This is a letter to a friend, not an essay or paper, but I want to share it. I hope that it can be of help to someone else, either to a returning aid worker, or someone who hasn’t been to Haiti yet wants to understand, to feel a little bit of what it is like.]

Dear *****

I’m in Miami right now, I landed last night. I’ve been in Haiti, working as an Emergency Department technician in a field hospital for almost 6 weeks, and it’s more than a little weird to be lying on a friends couch watching TV. I head back to Port-au-Prince on Saturday for another 5 weeks. I tried to write you a short reply about post-trauma and it turned into a novel. By the second paragraph I realized I was really writing it for me, but it just kept on going. I don’t know if this is going to be much help, but I’m sending it anyways, because each time I try and clean it up it just gets messier, and longer.

I think about post-traumatic stress disorder a lot. Before I left for Haiti I thought of it as a Western illness. It wasn’t that I thought it wasn’t real — I just believed that you had to be emotionally “fragile” before the trauma in order to be severely affected by it afterwards. We had a psychologist on the plane with us, and my first thought when I heard she was coming was “what the hell is she going to do?” Haiti is the poorest country in this hemisphere. These people are tough as nails, they won’t need a shrink, and besides, there’s no way she could be effective dealing with such a monstrous language barrier. Less than 48 hours from the time we landed I would feel very differently.


Burninatin’ the countryside

Posted in Uncategorized by Mike Stay on 2010 March 15

With consummate Vs!

Pretty jewelry

Posted in Uncategorized by Mike Stay on 2010 March 9

The artist says the bands of color are due to a diffraction grating; the base is recycled titanium. Is it etched, or ground, or what? The page only says “a multistep process”.

Birdfeeders in UK are splitting Blackcaps into two species

Posted in Uncategorized by Mike Stay on 2010 March 5

There’s a mutation that sends blackcaps in the wrong direction when migrating for the winter, and they end up in Britain. Before humans started putting out bird feeders, they would merely die. But now, they’re surviving and living to reproduce. They spend part of their time in Europe, but they overlap less with the birds that migrate over the alps; the two groups are now more genetically distinct between groups than within them. This gap arose in around 50 years; if it holds up, the groups may cease to interbreed entirely and will become two different species.

Digital compositing

Posted in Uncategorized by Mike Stay on 2010 March 3

Stargate studios shows off:

Phort Tiger

Posted in Uncategorized by Mike Stay on 2010 February 25

When I was seven, my parents gave me the frame of a clubhouse for my birthday; it had two walls and a roof. I and the neighborhood kids added the other walls; when someone down the street replaced the shingles on their roof, we took the discarded ones and shingled ours. We did half of it wrong before figuring out how shingles are supposed to go (start at the bottom!) Someone else found the remains of an alphabet used to put a surname on a mailbox. It was missing the letter “F”, so the clubhouse became “Phort Tiger”, anticipating the F -> PH meme by a decade and a half. We seceded from the union and declared our backyard to be the sovereign nation of Tigeria.

Mechanistic creativity

Posted in Uncategorized by Mike Stay on 2010 February 25

Computers are better now at face recognition than humans. My brother Doug has written photoshop filters that can do a watercolor painting over a pencil sketch given a photo. And now, David Cope has produced really beautiful music from a computer; the genius of it is his grammatical analysis of music:

Again, Cope hit the books, hoping to discover research into what that something was. For hundreds of years, musicologists had analyzed the rules of composition at a superficial level. Yet few had explored the details of musical style; their descriptions of terms like “dynamic,” for example, were so vague as to be unprogrammable. So Cope developed his own types of musical phenomena to capture each composer’s tendencies — for instance, how often a series of notes shows up, or how a series may signal a change in key. He also classified chords, phrases and entire sections of a piece based on his own grammar of musical storytelling and tension and release: statement, preparation, extension, antecedent, consequent. The system is analogous to examining the way a piece of writing functions. For example, a word may be a noun in preparation for a verb, within a sentence meant to be a declarative statement, within a paragraph that’s a consequent near the conclusion of a piece.

This kind of endeavor is precisely what the science of teaching is about; if Cope can teach a computer to make beautiful music, he can teach me to make beautiful music. By abstracting away the particular notes and looking at what makes music Bach-like as opposed to Beethoven-like or Mozart-like, he has shown us where new innovation will occur: first, in exploring the space, and second, in adding new dimensions to that space.

Algorithmic thermodynamics

Posted in Uncategorized by Mike Stay on 2010 February 22

John Baez and I just wrote a paper entitled “Algorithmic Thermodynamics.” Li and Vitányi coined this phrase for their study of the Kolmogorov complexity of physical microstates; in their model, given an encoding x of a macrostate (a measurement of a set of observables of the system to some accuracy), the entropy S(x) of the system is a sum of two parts, the algorithmic entropy K(x) and the uncertainty entropy H(x). The algorithmic entropy is roughly the length of the shortest program producing x, while the uncertainy is a measure of how many microstates there are that satisfy the description x. So roughly the microstates in their model are outputs of Turing machines.

In our model, microstates are inputs to Turing machines, specifically inputs that cause the machine to halt and give an output. Then we specify a macrostate using some observables of the program (computable functions from bit strings to real numbers, like the length, or runtime, or output of the program). Once we’ve specified the macrostate by giving the average values \overline{C_i} of some observables C_i, we can ask what distribution on microstates (halting programs) maximizes the entropy; this will be a Gibbs distribution

\displaystyle p(x) = \frac{1}{Z} \exp\left(-\sum_i \beta_i C_i(x)\right),


\displaystyle Z = \sum_{x \in X} \exp\left(-\sum_i \beta_i C_i(x)\right)


\displaystyle -\frac{\partial}{\partial \beta_i} \ln Z = \overline{C_i}.

The entropy of this system is

\displaystyle S(p) = -\sum_{x \in X} p(x) \ln p(x);

from this formula we can derive definitions of the conjugates of the observables, just like in statistical mechanics.

If we pick some observable C_j—say, the runtime of the program—to play the role of the energy E, then its conjugate \beta_j plays the role of inverse temperature 1/T:

\displaystyle \frac{1}{T} = \left.\frac{\partial S}{\partial E}\right|_{C_{i \ne j}}.

Given observables to play the roles of volume and number of particles—say, the length and output, respectively—we can similarly define analogs of pressure and chemical potential. Given these, we can think about thermodynamic cycles like those that power heat engines, or study the analogs to Maxwell’s relations, or study chemical reactions–all referring to programs instead of pistons.

And since the observables are arbitrary computable functions of the program bit string, we can actually recover Li and Vitányi’s meaning for ‘algorithmic thermodynamics’ by interpreting the output as a description of a physical macrostate; so our use of the term includes theirs as a special case.

Tutorial on how to overlay tiles on Google maps

Posted in Uncategorized by Mike Stay on 2010 February 21

The Science of Benjamin Button

Posted in Uncategorized by Mike Stay on 2010 January 28

England on fasting

Posted in Uncategorized by Mike Stay on 2010 January 27


Posted in Uncategorized by Mike Stay on 2010 January 27


Posted in Uncategorized by Mike Stay on 2010 January 27


Posted in Uncategorized by Mike Stay on 2010 January 27