Renormalization and Computation 4
This is the fourth in a series of posts on Yuri Manin’s pair of papers. In the previous posts, I laid out the background; this time I’ll actually get around to his result.
A homomorphism from the Hopf algebra into a target algebra is called a character. The functor that assigns an action to a path, whether classical or quantum, is a character. In the classical case, it’s into the rig and we take an infimum over paths; in the quantum it’s into the rig
and we take an integral over paths. Moving from the quantum to the classical case is called Maslov dequantization.
Manin mentions that the runtime of a parallel program is a character akin to the classical action, with the runtime of the composition of two programs being the sum of the respective runtimes, while the runtime of two parallel programs is the maximum of the two. A similar result holds for nearly any cost function. He also points out that computably enumerable reals form a rig
He examines Rota-Baxter operators as a way to generalize what “polar part” means and extend the theorems on Hopf algebra renormalization to such rigs.
In the second paper, he looks at my work with Calude as an example of a character. He uses our same argument to show that lots of measures of program behavior have the property that if the measure hasn’t stopped growing after reaching a certain large amount with respect to the program size, then the density of finite values the measure could take decreases like Surprisingly, though he referred to these results as cutoffs, he didn’t actually use them anywhere for doing regularization.
Reading between the lines, he might be suggesting something like approximating the Kolmogorov complexity that he uses later by using a time cutoff, motivated by results from our paper: there’s a constant depending only on the programming language such that if you run the
th program
steps and it hasn’t stopped, then the density of times near
at which it could stop is roughly
Levin suggested using a computable complexity that’s the sum of the program length and the log of the number of time steps; I suppose you could “regularize” the Kolmogorov complexity by adding to the length of the program, renormalize, and then let
go to zero, but that’s not something Manin does.
Instead, he proposed two other constructions suitable for renormalization; here’s the simplest. Given a partial computable function define the computably enumerable
by
if
is defined, and 0 otherwise. Now define
When is undefined,
which has a pole at
When
is defined,
converges everywhere except
Birkhoff decomposition would separate these two cases, though I’m not sure what value
would take or what it would mean.
The other construction involves turning into a permutation
and inventing a function that has poles when the permutation has fixpoints.
So Manin’s idea of renormalizing the halting problem is to do some uncomputable stuff to get an easy-to-renormalize function and then throw the Brikhoff decomposition at it; since we know the halting problem is undecidable, perhaps the fact that he didn’t come up with a new technique for extracting information about the problem is unsurprising, but after putting in so much effort to understand it, I was left rather disappointed: if you’re going to allow yourself to do uncomputable things, why not just solve the halting problem directly?
I must suppose that his intent was not to tackle this hard problem, but simply to play with the analogy he’d noticed; it’s what I’ve done in other papers. And being forced to learn renormalization was exhilarating! I have a bunch of ideas to follow up; I’ll write them up as I get a chance.
Renormalization and Computation 3
This is the third in a series of posts on Yuri Manin’s recent pair of papers applying Hopf algebra renormalization to the Halting problem. Last time I talked about the way people usually do renormalization; this time I’ll talk about the recent work by Kreimer, Connes, and others in exposing the underlying Hopf algebra in this process.
A Hopf algebra is
- An
-module for a commutative rig
which means you can add vectors and multiply them by a scalar.
- An algebra, which means you can take two vectors and multiply them. This operation is associative; there’s also a unit vector that satisfies left- and right-unit laws.
- A bialgebra, which means there’s also a coassociative comultiplication and counit, and the structures all work together. When the tensor product is the cartesian product, the comultiplication duplicates the vector and the counit is the constant map to 1 in the base field. Even when the tensor product isn’t the cartesian product, it can still be useful to think of it that way.
- A bialgebra with an involution, called the antipode.
A group is very like a Hopf algebra; in fact, a group object in the category of vector spaces and linear maps is a cocommutative Hopf algebra. You can multiply group elements and there’s a multiplicative unit; you can duplicate and delete them in equations; and you can invert them.
It turns out that Feynman diagrams form a Hopf algebra if you poke yourself in one eye and squint. First, a cut of an oriented graph
(i.e directed graph with no parallel edges) picks an upper set
and a lower set
of vertices such that
- given an oriented wheel in the graph, its vertices either all belong to the upper set or all belong to the lower set, and
- any edge connecting a vertex
in the upper set to a vertex
in the lower set must be directed from
to
Now, given a set of Feynman diagrams, consider all formal linear combinations of graph cuts. This is a vector space because you can add these things pointwise and multiply them by a scalar. We can make it into a bialgebra by defining multiplication to be a linear map
with unit
and comultiplication to be a linear map
where ranges over all cuts of
with counit
It’s graded: just count the number of vertices. And we can turn it into a Hopf algebra by defining the antipode to be a linear map such that
Each algebra homomorphism (not necessarily preserving the Hopf algebra structure) from to an algebra
defines a way to assign a (generalized) probability amplitude to each diagram. The set
of such homomorphisms becomes a group when we note that the functor
is contravariant, so the comultiplication in
gets mapped to a multiplication.
Next: given a complex group (that is, a group that’s also a complex manifold so the multiplication and inverse are complex-analytic functions), a Birkhoff decomposition of a loop
is an analytic continuation of the loop to
- a holomorphic function
on the standard disk inside the circle
- a holomorphic function
on the complement of this disk in the projective complex plane
- such that on the unit circle the original loop is reproduced as
where the product and the inverse on the right are taken in the group
Notice that
is a well defined element of
Take Now imagine our regularization parameter is a complex number
and we have some map
that’s singular at
Then the Connes-Kreimer theorem says that the Birkhoff decomposition always exists and gives an explicit formula. Hopf algebra renormalization is simply rearranging the terms in the Birkhoff decomposition:
where is the convolution product.
As I understand this, is isomorphic to
Given a linear combination of graphs,
gives you back a Laurent polynomial in
which you can split into terms with negative exponents (the polar part) and those with positive exponents (the renormalized part).
Renormalization and Computation 2
This is the second in a series of posts covering Yuri Manin’s ideas involving Hopf algebra renormalization of the Halting problem. Last time I showed how perturbing a quantum harmonic oscillator gave a sum over integrals involving interactions with the perturbation; we can keep track of the integrals using Feynman diagrams, though in the case of a single QHO they weren’t very interesting.
One point about the QHO needs emphasis at this point. Given a wavefunction describing the state of the QHO, it must be the case that we get some value when we measure the energy; so if we sum up the norms of the probability amplitudes, we should get unity:
This is called the normalization condition.
When we perturb the QHO, the states are no longer the energy eigenvectors of the new Hamiltonian. We can express the new eigenvectors
in terms of the old ones:
where is the strength of the perturbation, and we reexpress our wavefunction in this new basis:
Since we’re working with a new set of coefficients, we have to make sure they sum up to unity, too:
This is the renormalization condition. So renormalization is about making sure things sum up right once you perturb the system.
I want to talk about renormalization in quantum field theory; the trouble is, I don’t actually know quantum field theory, so I’ll just be writing up what little I’ve gathered from reading various things and conversations with Dr. Baez. I’ve likely got some things wrong, so please let me know and I’ll fix them.
A field is a function defined on spacetime. Scalar fields are functions with a single output, whereas vector fields are functions with several outputs. The electromagnetic field assigns a single number, called the electric field, and a vector, called the magnetic field, to every point in spacetime. When you have two electrons and move one of them, it feels a reaction force and loses momentum; the other electron doesn’t move until the influence, traveling at the speed of light, reaches it. Conservation of momentum says that the momentum has to be somewhere; it’s useful to consider it to be in the electromagnetic field.
When you take the Fourier transform of the field, you get a function that assigns values to harmonics of the field; in the case of electromagnetism, the transformed field assigns a value to each color
of light. Quantizing this transformed field amounts to making
into a creation operator, just like
in the QHO example from last time. So we have a continuum of QHOs, each indexed by a color
(By the way—the zero-dimensional Fourier transform is the identity function, so the QHO example from last time can be thought of both as the field at the unique point in spacetime and the field at the unique frequency.)
When we move to positive-dimensional fields, we get more interesting pictures, like these from quantum electrodynamics:

Here, our coupling constant is the fine structure constant where
is the charge of the electron. For each vertex, we write down our coupling constant times
times a delta function saying that the incoming momentum minus the outgoing momentum equals zero. For each internal line, we write down a propagator—a function representing the transfer of momentum from one point to another; it’s a function of the four-momentum
—and multiply all this stuff together. Then we integrate over all four-momenta and get something that looks like
The trouble is, this integral usually gives infinity for an answer. We try to work around this in two steps: first, we regularize the integral by introducing a length scale This represents the point at which gravity starts being important and we need to move to a more fundamental theory. In the quantum field theory of magnetic domains in iron crystals, the length scale is the inter-atom distance in the lattice. Regularization makes the integral finite for
away from some singularity.
There are a few different ways of regularizing; one is to use as a momentum cutoff:
This obviously converges, and solutions to this are always a sum of three parts:
- The first part diverges as
either logarithmically or as a power of
- The second part is finite and independent of
- The third part vanishes as
Renormalization in this case amounts to getting rid of the first part.
These three parts represent three different length scales: at lengths larger than all quantum or statistical fluctuations are negligible, and we can use the mean field approximation and do classical physics. At lengths between
and
we use QFT to calculate what’s going on. Finally, at lengths smaller than
we need a new theory to describe what’s going on. In the case of QED, the new theory is quantum gravity; string theory and loop quantum gravity are the serious contenders for the correct theory.
The problem with this regularization scheme is that it doesn’t preserve gauge invariance, so usually physicists use another regularization scheme, called “dimensional regularization”. Here, we compute
which gives us an expression involving gamma functions of , where gamma is the continuous factorial function, and then we set
The solutions to this are also a sum of three terms—a divergent part, a finite part, and a vanishing part—and then renormalization gets rid of the divergent part.
Assume we have some theory with a single free parameter . We’d like to calculate a function
perturbatively in terms of
, where
represents some physical quantity, and we know
. We assume
takes the form
and assume that this definition gives us divergent integrals for the The first step is regularization: instead of
we have a new function
Now we get to the business of renormalization! We solve this problem at each order; if the theory is renormalizable, knowing the solution at the previous order will give us a constraint for the next order, and we can subtract off all the divergent terms in a consistent way:
- Order
Here,
Since it’s a constant, it has to match
so
In this approximation, the coupling constant takes the classical value.
- Order
Let
where
Plugging this into the definition of
we get
Using
we get
which diverges as
In the case of QED, this says that the charge on the electron is infinite. While the preferred interpretation these days is that quantum gravity is a more fundamental theory that takes precedence on very small scales (a Planck length is to a proton as a proton is to a meter), when the theory was first introduced, there was no reason to think that we’d need another theory. So the interpretation was that with an infinite charge, an electron would be able to extract an infinite amount of energy from the electromagnetic field. Then the uncertainty principle would create virtual particles of all energies, which would exist for a time inversely proportional to the energy. The particles can be charged, so they line up with the field and dampen the strength just like dielectrics. In this interpretation, the charge on the electron depends on the energy of the particles you’re probing it with.
So to second order,
A theory is therefore only renormalizable if the divergent part of
is independent of
In QED it is. We can now define
as the limit
- Higher orders.
In a renormalizable theory, the process continues, with the counterterms entirely specified by knowing
Renormalization and Computation 1
Yuri Manin recently put two papers on the arxiv applying the methods of renormalization to computation and the Halting problem. Grigori Mints invited me to speak on Manin’s results at the weekly Stanford logic seminar because in his second paper, he expands on some of my work.
In these next few posts, I’m going to cover the idea of Feynman diagrams (mostly taken from the lecture notes for the spring 2004 session of John Baez’s Quantum Gravity seminar); next I’ll talk about renormalization (mostly taken from Andrew Blechman’s overview and B. Delamotte’s “hint”); third, I’ll look at the Hopf algebra approach to renormalization (mostly taken from this post by Urs Schreiber on the n-Category Café); and finally I’ll explain how Manin applies this to computation by exploiting the fact that Feynman diagrams and lambda calculus are both examples of symmetric monoidal closed categories (which John Baez and I tried to make easy to understand in our Rosetta stone paper), together with some results on the density of halting times from my paper “Most programs stop quickly or never halt” with Cris Calude. I doubt all of this will make it into the talk, but writing it up will make it clearer for me.
Renormalization is a technique for dealing with the divergent integrals that arise in quantum field theory. The quantum harmonic oscillator is quantum field theory in 0+1 dimensions—it describes what quantum field theory would be like if space consisted of a single point. It doesn’t need renormalization, but I’m going to talk about it first because it introduces the notion of a Feynman diagram.
“Harmonic oscillator” is a fancy name for a rock on a spring. The force exerted by a spring is proportional to how far you stretch it:
The potential energy stored in a stretched spring is the integral of that:
and to make things work out nicely, we’re going to choose The total energy
is the sum of the potential and the kinetic energy:
By choosing units so that we get
where is momentum.
Next we quantize, getting a quantum harmonic oscillator, or QHO. We set taking units where
Now
If we define a new observable then
We can think of as
and write the energy eigenvectors as polynomials in
The creation operator adds a photon to the mix; there’s only one way to do that, so
The annihilation operator
destroys one of the photons; in the state
, there are
photons to choose from, so
Schrödinger’s equation says so
This way of representing the state of a QHO is known as the “Fock basis”.
Now suppose that we don’t have the ideal system, that the quadratic potential is only a good local approximation to the real potential
. Then we can write the total as
where
is a function of position and momentum, or equivalently of
and
and
is small.
Now we solve Schrödinger’s equation perturbatively. We know that
and we assume that
so that it makes sense to solve it perturbatively. Define
and
After a little work, we find that
and integrating, we get
We feed this equation back into itself recursively to get
So here we have a sum of a bunch of terms; the th term involves
interactions with the potential interspersed with evolving freely between the interactions, and we integrate over all possible times at which those interactions could occur.
Here’s an example Feynman diagram for this simple system, representing the fourth term in the sum above:
![]()
The lines represent evolving under the free Hamiltonian , while the dots are interactions with the potential
.
As an example, let’s consider and choose
so that
When
acts on a state
we get
So at each interaction, the system either gains a photon or changes phase and loses a photon.
A particle moving in a quadratic potential in -dimensional space gives the tensor product of
QHOs, which is QFT in a space where there are
possible harmonics. Quantum electrodynamics (QED) amounts to considering infinitely many QHOs, one for each possible energy-momentum, which forms a continuum. The diagrams for QED start to look more familiar:

The vertices are interactions with the electromagnetic field. The straight lines are electrons and the wiggly ones are photons; between interactions, they propagate under the free Hamiltonian.
Transplants without rejection
Cured diabetes in mice via pancreas transplant, but probably works on every other organ, too.
The Guinea Pigs’ Club
Stories of experimental facial reconstruction on burn victims in WWII. Also see “Billy Bishop’s Flying School“.
Colorblindness cured
Following up on my previous comments here, scientists have cured color blindness in monkeys:
Neitz’s team injected their monkeys’ eyes with viruses carrying a gene that makes L-opsin, one of three proteins released when color-detecting cone cells are hit by different wavelengths of light. Male squirrel monkeys naturally lack the L-opsin gene; like people who share their condition, they’re unable to distinguish between red and green.
At first, the two monkeys behaved no differently than before. Though quick to earn a grape juice reward by picking out blue and yellow dots from a background of gray dots on a computer screen, they banged the screen randomly when presented with green or red dots.
But after five months, something clicked. The monkeys picked out red and green, again and again. At the biological level, Neitz can’t say precisely what happened — the monkeys, named Sam and Dalton, are alive and healthy, their brains unscanned and undissected — but their actions left no doubt.
They think it will work identically in humans. If so, this means that we could do the same thing for the mutant version of L-opsin that tetrachromat women have, and make anyone (even a man) into a tetrachromat. Or, even more excitingly, a gene for infrared or ultraviolet light.
Monad for weakly monoidal categories
We’ve got free and forgetful functors Define
Given a category
the category
has
- binary trees with
labeled leaves as objects and
- binary trees with
labeled leaves together with the natural isomorphisms from the definition of a weakly monoidal category as its morphisms.
The multiplication collapses two layers of trees down to one. The unit
gives a one-leaf tree.
An algebra of the monad is a category together with a functor
such that
and
Define
Then the associator should be a morphism
However, it isn’t immediately evident that the associator that comes from does the job, since just applying
to
gives
for the source instead of
,
which we get by replacing with its definition above. We need an isomorphism
so we can define Now we use the equations an algebra has to satisfy to derive this isomorphism. Since
the following two objects are equal:
Therefore, the isomorphism we wanted is simply equality and
It also means that
satisfies the pentagon equation.
A similar derivation works for the unitors and the triangle equation.
A morphism of algebras is a functor such that
Now
and
so we have the coherence laws for a strict monoidal functor.
Also,
so it preserves the associator as well. The unitors follow in the same way, so morphisms of these algebras are strict monoidal functors that preserve the associator and unitors.
Theocosmology
A little article considering the implications of the assertion that spirit is matter.
Where the Wild Things Are
If I were to do a movie of where the wild things are, it would be dark. When the forest grows in his room, it’s a creepy scene: the paint bubbles up, discolors, and starts to peel; the strands of shag carpet turns into centipedes that burrow into the floor, which rots away into the detrius of the forest floor. The forest itself is dark and malicious. Max’s wolf suit becomes real wolf skins, with a bare wolf skull for a helmet. He’s both exhilarated and scared of his new power; he runs, and finds that he’s supernaturally fast, like a wolf. His fingernails are sharp and hard, like claws, and he leaves gashes in the trees as he runs by.
When he reaches the water, he summons a storm to drive him across the ocean, but as he gets nearer, it blows out of control, culminating in the water-thing:

Once Max gets past the water thing, he lands on the beach–perhaps barely surviving the storm and avoiding rocks. Bedraggled, soaked, and exhausted, he moves from the wind-lashed shore for what seems to be shelter in the forest. Then he hears the wild things and gets a glimpse of the terrible yellow eyes. The dog-like thing with a horn on its nose would look something like this:
http://www.crawley-creatures.com/conceptual/hound.htm
He’s chased back to the beach when he remembers his powers, turns, and his eyes burst into a bright yellow flame; he flashes his eyes quickly at each of the attacking Wild Things, and a shock wave knocks each one back. Then returning to the dog-like one, he stares it down and the Thing writhes in pain, yelping; the others begin to comprehend the extent of his power. He finally quenches his eye magic, and the dog lies panting, trying to recover its strength.
He leads the wild things to war and conquers the entire forest. He’s crowned with even more power, then has a night-on-bald-mountain kind of rumpus, until he chases down and kills (or just nearly?) a dog in the forest and realizes it’s his own. Horrified, he gives the command to stop; the wild things do, but not willingly, and by morning his power has ebbed to the point that he can’t hold them at bay any more. He flees to the boat and has just enough magic to launch himself back to the opposite shore, and the dream fades. He finds himself awaking from a fever, with a washcloth on his forehead and some soup waiting for him.
Two separate organisms on their way to becoming symbiotic
The single-celled Hatena and the algae Nephrosolmis live independent lives: Hatena has a “mouth” with which it eats smaller creatures and organic material; Nephrosolmis gets its food from sunlight. But when Hatena eats Nephrosolmis, the algae grows inside it, discards its organelles, changes its mouth into an eyespot, and swims toward light. The algae makes enough food to keep them both alive. When Hatena reproduces, one daughter keeps all the algae, and the other goes hunting again.
See also solar-powered sea slugs.
Functors and monads
In many languages you have type constructors; given a type A and a type constructor Lift, you get a new type Lift<A>. A functor is a type constructor together with a function
lift: (A -> B) -> (Lift<A> -> Lift<B>)
that preserves composition and identities. If h is the composition of two other functions g and f
h (a) = g (f (a)),
then lift (h) is the composition of lift (g) and lift (f)
lift (h) (la) = lift (g) (lift (f) (la)),
where the variable la has the type Lift<A>. Similarly, if h is the identity function on variables of type A
h (a: A) = a,
then lift (h) will be the identity on variables of type Lift<A>
lift (h) (la) = la.
Examples:
- Multiplication
Lift<>adjoins an extra integer to any type:Lift<A> = Pair<A, int>
The function
lift()pairs upfwith the identity function on integers:lift (f) = (f, id)
- Concatenation
Lift<>adjoins an extra string to any type:Lift<A> = Pair<A, string>
The function
lift()pairs upfwith the identity function on strings:lift (f) = (f, id)
- Composition
LetEnvbe a type representing the possible states of the environment andEffect = Env -> Env
Also, we’ll be explicit in the type of the identity function
id<A>: A -> A id<A> (a) = a,
so one possible
Effectisid<Env>, the “trivial side-effect”.Then
Lift<>adjoins an extra side-effect to any type:Lift<A> = Pair<A, Effect>
The function
lift()pairs upfwith the identity on side-effects:lift (f) = (f, id<Effect>)
- Lists
The previous three examples used the Pair type constructor to adjoin an extra value. This functor is slightly different. Here,Lift<>takes any typeAto a list ofA’s:Lift<A> = List<A>
The function
lift()is the function map():lift = map
- Double negation, or the continuation passing transform
In a nice type system, there’s theUnittype, with a single value, and there’s also theEmptytype, with no values (it’s “uninhabited”). The only function of typeX -> Emptyis the identity functionid<Empty>. This means that we can think of types as propositions, where a proposition is true if it’s possible to construct a value of that type. We interpret the arrow as implication, and negation can be defined as “arrowing intoEmpty“: letF = EmptyandT = F -> F. ThenT -> F = F(sinceT -> Fis uninhabited) andTis inhabited since we can construct the identity function of type F -> F. Functions correspond to constructive proofs. “Negation” of a proof is changing it into its contrapositive form:If A then B => If NOT B then NOT A.
Double negation is doing the contrapositive twice:
IF A then B => If NOT NOT A then NOT NOT B.
Here,
Lift<>is double negation:Lift<A> = (A -> F) -> F.
The function lift takes a proof to its double contrapositive:
lift: (A -> B) -> ((A -> F) -> F) -> ((B -> F) -> F) lift (f) (k1) (k2) = k1 (lambda (a) { k2 (f (a)) })
Monads
A monad is a functor together with two functions
m: Lift<Lift<A>> -> Lift<A> e: A -> Lift<A>
satisfying some conditions I’ll get to in a minute.
Examples:
- Multiplication
If you adjoin two integers,m()multiplies them to get a single integer:m: Pair<Pair<A, int>, int> -> Pair<A, int> m (a, i, j) = (a, i * j).
The function
e()adjoins the multiplicative identity, or “unit”:e: A -> Pair<A, int> e (a) = (a, 1)
- Concatenation
If you adjoin two strings,m()concatenates them to get a single string:m: Pair<Pair<A, string>, string> -> Pair<A, string> m (a, s, t) = (a, s + t).
The function
e()adjoins the identity for concatenation, the empty string:e: A -> Pair<A, string> e (a) = (a, "")
- Composition
If you adjoin two side-effects,m()composes them to get a single effect:m: Pair<Pair<A, Effect>, Effect> -> Pair<A, Effect> m (a, s, t) = (a, t o s),
where
(t o s) (x) = t (s (x)).
The function
e()adjoins the identity for composition, the identity function onEnv:e: A -> Pair<A, Effect> e (a) = (a, id<Env>)
- Lists
If you have two layers of lists,m()flattens them to get a single layer:m: List<List<A>> -> List<A> m = flatten
The function
e()makes any element of A into a singleton list:e: A -> List<A> e (a) = [a]
- Double negation, or the continuation passing transform
If you have a quadruple negation,m()reduces it to a double negation:m: ((((A -> F) -> F) -> F) -> F) -> ((A -> F) -> F) m (k1) (k2) = k1 (lambda (k3) { k3 (k2) })The function
e()is just reverse application:e: A -> (A -> F) -> F e (a) (k) = k (a)
The conditions that e and m have to satisfy are that m is associative and e is a left and right unit for m. In other words, assume we have
llla: Lift<Lift<Lift<A>>> la: Lift<A>
Then
m (lift (m) (llla)) = m (m (llla))
and
m (e (la)) = m (lift (e) (la)) = la
Examples:
- Multiplication:
There are two different ways we can use lifting with these two extra functionse()andm(). The first is applyinglift()to them. When we applylifttom(), it acts on three integers instead of two; but becauselift (m) = (m, id),
it ignores the third integer:
lift (m) (a, i, j, k) = (a, i * j, k).
Similarly, lifting
e()will adjoin the multiplicative unit, but will leave the last integer alone:lift (e) = (e, id) lift (e) (a, i) = (a, 1, i)
The other way to use lifting with
m()ande()is to applyto their input types. This specifiesLift<>AasPair<A', int>, so the first integer gets ignored:m (a, i, j, k) = (a, i, j * k) e (a, i) = (a, i, 1)
Now when we apply
m()to all of these, we get the associativity and unit laws. For associativity we getm (lift (m) (a, i, j, k)) = m(a, i * j, k) = (a, i * j * k) m (m (a, i, j, k)) = m(a, i, j * k) = (a, i * j * k)
and for unit, we get
m (lift (e) (a, i)) = m (a, 1, i) = (a, 1 * i) = (a, i) m (e (a, i)) = m (a, i, 1) = (a, i * 1) = (a, i)
- Concatenation
There are two different ways we can use lifting with these two extra functionse()andm(). The first is applyinglift()to them. When we apply lift tom(), it acts on three strings instead of two; but becauselift (m) = (m, id),
it ignores the third string:
lift (m) (a, s, t, u) = (a, s + t, u).
Similarly, lifting
e()will adjoin the empty string, but will leave the last string alone:lift (e) = (e, id) lift (e) (a, s) = (a, "", s)
The other way to use lifting with
m()ande()is to applyLift<>to their input types. This specifiesAasPair<A', string>, so the first string gets ignored:m (a, s, t, u) = (a, s, t + u) e (a, s) = (a, s, 1)
Now when we apply
m()to all of these, we get the associativity and unit laws. For associativity we getm (lift (m) (a, s, t, u)) = m(a, s + t, u) = (a, s + t + u) m (m (a, s, t, u)) = m(a, s, t + u) = (a, s + t + u)
and for unit, you get
m (lift (e) (a, s)) = m (a, "", s) = (a, "" + s) = (a, s) m (e (a, s)) = m (a, s, "") = (a, s + "") = (a, s)
- Composition
There are two different ways we can use lifting with these two extra functionse()andm(). The first is applyinglift()to them. When we apply lift tom(), it acts on three effects instead of two; but becauselift (m) = (m, id<Effect>),
it ignores the third effect:
lift (m) (a, s, t, u) = (a, t o s, u).
Similarly, lifting
e()will adjoin the identity function, but will leave the last string alone:lift (e) = (e, id<Effect>) lift (e) (a, s) = (a, id<Env>, s)
The other way to use lifting with
m()ande()is to applyLift<>to their input types. This specifiesAasPair<A', Effect>, so the first effect gets ignored:m (a, s, t, u) = (a, s, u o t) e (a, s) = (a, s, id<Env>)
Now when we apply
m()to all of these, we get the associativity and unit laws. For associativity we getm (lift (m) (a, s, t, u)) = m(a, t o s, u) = (a, u o t o s) m (m (a, s, t, u)) = m(a, s, u o t) = (a, u o t o s)
and for unit, you get
m (lift (e) (a, s)) = m (a, id<Env>, s) = (a, s o id<Env>) = (a, s) m (e (a, s)) = m (a, s, id<Env>) = (a, id<Env> o s) = (a, s)
- Lists
There are two different ways we can use lifting with these two extra functionse()andm(). The first is applyinglift()to them. When we apply lift tom(), it acts on three layers instead of two; but becauselift (m) = map (m),
it ignores the third (outermost) layer:
lift (m) ([[[a, b, c], [], [d, e]], [[]], [[x], [y, z]]]) = [[a, b, c, d, e], [], [x, y, z]]
Similarly, lifting
e()will make singletons, but will leave the outermost layer alone:lift (e) ([a, b, c]) = [[a], [b], [c]]
The other way to use lifting with
m()ande()is to applyLift<>to their input types. This specifiesAasList<A'>, so the *innermost* layer gets ignored:m ([[[a, b, c], [], [d, e]], [[]], [[x], [y, z]]]) = [[a, b, c], [], [d, e], [], [x], [y, z]] e ([a, b, c]) = [[a, b, c]]
Now when we apply
m()to all of these, we get the associativity and unit laws. For associativity we getm (lift (m) ([[[a, b, c], [], [d, e]], [[]], [[x], [y, z]]])) = m([[a, b, c, d, e], [], [x, y, z]]) = [a, b, c, d, e, x, y, z] m (m ([[[a, b, c], [], [d, e]], [[]], [[x], [y, z]]])) = m([[a, b, c], [], [d, e], [], [x], [y, z]]) = [a, b, c, d, e, x, y, z]and for unit, we get
m (lift (e) ([a, b, c])) = m ([[a], [b], [c]]) = [a, b, c] m (e ([a, b, c])) = m ([[a], [b], [c]]) = [a, b, c]
Monads in Haskell style, or “Kleisli arrows”
Given a monad (Lift, lift, m, e), a Kleisli arrow is a function
f: A -> Lift<B>,
so the e() function in a monad is already a Kleisli arrow. Given
g: B -> Lift<C>
we can form a new Kleisli arrow
(g >>= f): A -> Lift<C> (g >>= f) (a) = m (lift (g) (f (a))).
The operation >>= is called “bind” by the Haskell crowd. You can think of it as composition for Kleisli arrows; it’s associative, and e() is the identity for bind. e() is called “return” in that context. Sometimes code is less complicated with bind and return instead of m and e.
If we have a function f: A -> B, we can turn it into a Kleisli arrow by precomposing with e():
(e o f): A -> Lift<B> (e o f) (a) = e (f (a)) = return (f (a)).
Example:
- Double negation, or the continuation passing style transform
We’re going to (1) show that the CPS transform of a function takes a continuation and applies that to the result of the function. We’ll also (2) show that for two functionsr, s,CPS (s o r) = CPS (s) >>= CPS (r),
(1) To change a function
f: A -> Binto a Kleisli arrow (i.e. continuized function)CPS (f): A -> (B -> X) -> X, we just compose withe—or in the language of Haskell, wereturnthe result:CPS (f) (a) (k) = return (f (a)) (k) = (e o f) (a) (k) = e (f (a)) (k) = k (f (a))
(2) Given two Kleisli arrows
f: A -> (B -> F) -> F
and
g: B -> (C -> F) -> F,
we can bind them:
(g >>= f) (a) (k) = m (lift (g) (f (a))) (k) // defn of bind = lift (g) (f (a)) (lambda (k3) { k3 (k) }) // defn of m = f (a) (lambda (b) { (lambda (k3) { k3 (k) }) (g (b)) }) // defn of lift = f (a) (lambda (b) { g (b) (k) }), // applicationwhich is just what we wanted.
In particular, if
fandgare really just continuized functionsf = (e o r) g = (e o s)
then
(g >>= f) (a) (k) = f (a) (lambda (b) { g (b) (k) }) // by above = (e o r) (a) (lambda (b) { (e o s) (b) (k) }) // defn of f and g = (e o r) (a) (lambda (b) { k (s (b)) }) // defn of e = (e o r) (a) (k o s) // defn of composition = (k o s) (r (a)) // defn of e = k (s (r (a))) // defn of composition = (e o (s o r)) (a) (k) // defn of e = CPS (s o r) (a) (k) // defn of CPSso
CPS (s) >>= CPS (r) = CPS (s o r).
My talk at Perimeter Institute
I spent last week at the Perimeter Institute, a Canadian institute founded by Mike Lazaridis (CEO of RIM, maker of the BlackBerry) that sponsors research in cosmology, particle physics, quantum foundations, quantum gravity, quantum information theory, and superstring theory. The conference, Categories, Quanta, Concepts, was organized by Bob Coecke and Andreas Döring. There were lots of great talks, all of which can be found online, and lots of good discussion and presentations, which unfortunately can’t. (But see Jeff Morton’s comments.) My talk was on the Rosetta Stone paper I co-authored with Dr. Baez.
‘t Hooft, Baez on becoming a good theoretical physicist
Nobel Laureate Gerard ‘t Hooft: Stuff you need to know in order to do theoretical physics, with links to sites & papers that teach it
My advisor, John Baez: How to teach stuff, Lists of good books for learning the necessary math & physics
Good object oriented coding practices = object-capability security
| OOP idea | Corresponding attribute of the Principle of least authority |
| Separation of duties | Separation of Authority |
| Information hiding | Integrity |
| Message passing | Authorization |
| Dependency injection | Authority injection |
An analogy from Mike Samuel, the Caja Tech Lead.
Great flash tutorial
http://www.kongregate.com/labs
Also http://www.kongregate.com/forums/11/topics/23746 for doing it all with free tools.
One-liner webserver
mkfifo backpipe; while true; do head -1 backpipe | (echo -n .; cut -f2 -d\ ) | xargs -L1 cat | nc -l 8080 > backpipe; done;
This is for a mac; for netcat on linux you’d say nc -l -p 8080. Note that this is in no way secure! A directory traversal attack is the most obvious one; there’s probably an injection attack, too.
Syntactic string diagrams
I hit on the idea of making lambda a node in a string diagram, where its inputs are an antivariable and a term in which the variable is free, and its output is the same term, but in which the variable is bound. This allows a string diagram notation for lambda calculus that is much closer to the syntactical description than the stuff in our Rosetta Stone paper. Doing it this way makes it easy to also do pi calculus and blue calculus.
There are two types, V (for variable) and T (for term). I’ve done untyped lambda calculus, but it’s straightforward to add subscripts to the types V and T to do typed lambda calculus.
There are six function symbols:
Lambda takes an antivariable and a term that may use the corresponding variable.
This turns an antivariable “x” introduced by lambda into the term “x”.
(Application) This takes
and
and produces
.
These two mean we can duplicate and delete terms.
The rule is the real meat of the computation. The “P” is an arbitrary subdiagram. The effect is replacing the “A” application node with the “P” subdiagram, modulo some wiring.
I label the upwards arrows out of lambdas with a variable name in parentheses; this is just to assist in matching up the syntactical representation with the string diagram.
In the example, I surround part of the diagram with a dashed line; this is the part to which the rule applies. Within that, I surround part with a dash-dot line; this is the subdiagram P in the rule.
When I do blue calculus this way, there are a few more function symbols and the relations aren’t confluent, but the flavor is very much the same.

String diagrams for untyped lambda calculus

An example calculation
Imaginary time
| Statics (geometric = no time): | ||
| [xlength] | x coordinate | |
| [X] | proportionality constant | |
| [ylength] | y coordinate | |
| [ylength / xlength] | slope | |
| [X ylength^2 / xlength^2] | proportional to curvature | |
| [X ylength^2 / xlength^2] | original shape | |
| [X ylength^2 / xlength] | distortion | |
| Statics (with energy): | ||
| [xlength] | x coordinate | |
| [mass xlength / time^2] | force due to stretching spring by dx | |
| [ylength] | y coordinate | |
| [ylength / xlength] | slope at s | |
| [mass ylength^2 / time^2 xlength] | stretching energy density | |
| [mass ylength^2 / time^2 xlength] | gravitational energy density | |
| [mass ylength^2 / time^2] | energy (dS = 0 at equilibrium) | |
| Dynamics ( |
||
| [time] | time | |
| [mass] | mass | |
| [ylength] | y coordinate | |
| [ylength / i time] | i * velocity | |
| [mass ylength^2 / time^2] | kinetic energy | |
| [mass ylength^2 / time^2] | potential energy | |
| [mass ylength^2 / i time] | i * action |
See also Toby Bartels‘ sci.physics post.
Caja on Yahoo!
Here at Google, I’m working on an open-source project called Caja. The name is Spanish for box or vault, and is pronounced “KA-hah”.
The general idea of Caja could be summed up as “virtual iFrames”. An iFrame is a little webpage stuck inside a bigger one, like a gadget in iGoogle or YAP. Web browsers use a security policy called the “same-domain policy”, which means that only a web page that came from Google’s servers should be allowed to cause changes to your Google-hosted data: you don’t want to allow the “pet turtle” gadget to delete all your email. So the way iGoogle protects your email is by putting the turtle gadget on a different domain, http://gmodules.com.
The same-domain policy does a good job of making it hard for gadgets to work together and a mediocre job of insulating mutually suspicious gadgets from each other; cross-site scripting (XSS) is a constant threat for any web site. Making sure you properly sanitize every use of user-supplied information is like trying to avoid getting a cold while surrounded by forty sniffling kindergarteners.
Also, even if you do manage to prevent XSS entirely, iFrames do nothing to prevent redirecting the page: it’s trivial for a gadget to tell your web browser to go to a page that looks like the Google login page, but really sends your password to the bad guys. All it has to do is include this line of code:
<script>window.top.location = “http://www.evil.com/phishing.html”;</script>
Caja addresses all these issues. On a gadget site like YAP, instead of sending your web browser a page with a bunch of iFrames, each of which causes your web browser to fetch a gadget, Yahoo’s server fetches the gadgets first and rewrites them with Caja, inserting code that looks at every operation the gadget tries to do. It also replaces the objects that code usually has access to, like window above, and replaces them with fake ones. The fake window object doesn’t have a working top.location property, so the gadget can’t redirect the page. The DOM objects sanitize strings passed to innerHTML and remove script blocks, so XSS is impossible. By letting two gadgets see the same variable, they can communicate with each other. Outgoing links are rewritten so that they pass through Yahoo’s proxy server, where the links can be checked in real time for malware.
With the lauch of My Yahoo! and Yahoo! Mail gadgets, we’ve got 275 million users. The best part is that we’re hardly mentioned anywhere: it’s so unobtrusive, that developers don’t even notice the restrictions. (But you can read about Caja on Yahoo’s site.) iGoogle’s sandbox also allows you to play with Caja today; it should go live next month, and we’re hoping to get Caja into several other Google properties as the year progresses.
How to misread Lord of the Rings
This was a great piece by Andrew Rilstone, that, except for on archive.org, seems to have disappeared from the internet. So here it is again. (more…)
I finally understand the state transformer monad!
It’s the monad arising from the currying adjunction between and
attaches an environment
to
:
This is the type of a function that takes a state of type and outputs a result of type
and a new state of type
. So
is the ability to depend on state and
is the ability to change the state.
The natural transformation is evaluation:
takes a function and an input point and evaluates the function at that point. So we get
by evaluating the in the middle, while the unit
is just the curried identity on pairs:
Wonderful talk on applying game design to app design
I had an idea like this for training users on a capability-based UI.
http://lostgarden.com/2008/10/princess-rescuing-application-slides.html
William’s witticisms
I was running an I jumped over a plate an I dint crack my head! I’n't that amazing? I went weeeooo squlksh an I falled in some applesauce.
I have a swim head. My head pops off an I have a swim body.
(William rubs Bruce’s feet.)
Bruce: That feels good!
William: You say funny words!
Wm: Ba’guys punch people.
MJ: Are you a bad guy? You kicked Aidan.
Wm: Ba’guys don’t kick.
Marty (pretending to sword fight): ching ching! Ching!
Wm: (sings) Ching, ching, ching! I love to ching!
Wm (running downstairs to us): I’m NAKED!
MJ: We noticed.
Wm: Yeah. It’s funny. HA HA HA HA!
Mike (noticing Martin’s blue lip): Did you drink the gatorade we told you not to drink?
Marty: William gave it to me and said I should drink it and I drank it.
PhD
I’ll be categorifying the lambda calculus. The nicest way of doing this would be to develop a good notion of a 2-theory, of which “the theory of a cartesian closed category” would be one example. It’s already known how to do this for cartesian categories: the kind of 2-theory has products itself, and has one type, a morphism
for the product in C and a morphism
for the unit; and several 2-morphisms for the diagrams.
Closed categories, however, need some notion of “op,” which is contravariant. Figuring out the nicest way of doing this will be some fun research. Paul Mellies has some ideas.
Hopefully, this kind of 2-theory will also let me talk about things like the blue calculus or Tyler Close’s web calculus (where objects are web services and POST is method invocation; lambda terms are stateless objects responding to the unique “call” method) that he uses in waterken.
Blue cell
It’s not immediately obvious how the blue calculus lets you do object references, so here’s a worked example.
I’m going to define a process, called Reference, that takes a new variable and turns it into an object reference. Because it’s easier to type, I’m going to use \ for lambda and v for nu, and when I apply a term to something other than a variable, you’re supposed to do the embedding from lambda calculus into blue calculus in your head. I’m leaving it out because the embedding behaves the same as lambda calculus and things are confusing enough as it is.
Reference = \ref.vc.( c | c = \x.(ref <= Pair c (x | cx)) )
Notice the Pair combinator (Pair = \xyz.zxy); it’s being applied to two processes that function like write and read, respectively, and is waiting for a third; if the third input is K, we get Kxy = x, and if it’s KI, we get KIxy = Iy = y.
You create an object by feeding it a variable for the reference and an initial state:
Reference ref u = vc.(cu | c = \x.(ref <= Pair c (x | cx)))
Now if you have a reference with some state in parallel with a writer process, it updates the state:
(Reference ref u | ref K w)
= ( vc.(cu | c = \x.(ref <= Pair c (x | cx))) | ref K w )
= vc.(cu | c = \x.(ref <= Pair c (x | cx)) | ref K w ) [by scope migration]
= vc.(\x.(ref <= Pair c (x | cx)) u | c = \x.(ref <= Pair c (x | cx)) | ref K w ) [by resource fetching on c]
= vc.(ref <= Pair c (u | cu) | c = \x.(ref <= Pair c (x | cx)) | ref K w ) [by small beta reduction on u]
= vc.(c = \x.(ref <= Pair c (x | cx)) | Pair c (u | cu) K w ) [by resource fetching on ref]
= vc.(c = \x.(ref <= Pair c (x | cx)) | K c (u | cu) w ) [by definition of Pair = \xyz.zxy]
= vc.(c = \x.(ref <= Pair c (x | cx)) | cw ) [by definition of K = \xy.x]
= vc.(cw | c = \x.(ref <= Pair c (x | cx))) [by symmetry of |]
= Reference ref w
Similarly, if you put it in parallel with a reader process,
(Reference ref u | ref KI w x y … z )
= vc.(c = \x.(ref <= Pair c (x | cx)) | K I c (u | cu) w x y … z) [by the same as above]
= vc.(c = \x.(ref <= Pair c (x | cx)) | I (u | cu) w x y … z) [by the definition of K]
= vc.(c = \x.(ref <= Pair c (x | cx)) | (u | cu) w x y … z) [by the definition of I]
= vc.(c = \x.(ref <= Pair c (x | cx)) | uwxy…z | cuwxy…z) [by distributivity]
= vc.(c = \x.(ref <= Pair c (x | cx)) | uwxy…z | \x.(ref <= Pair c (x | cx)) uwxy…z) [by resource fetching on c]
= vc.(c = \x.(ref <= Pair c (x | cx)) | uwxy…z | (ref <= Pair c (u | cu)) wxy…z) [by small beta reduction on u]
= vc.(c = \x.(ref <= Pair c (x | cx)) | uwxy…z | (ref <= Pair c (u | cu))) [by small beta reduction on w, x, y, ..., z]
= vc.(c = \x.(ref <= Pair c (x | cx)) | uwxy…z | \x.(ref <= Pair c (x | cx)) u) [by extensionality]
= vc.(c = \x.(ref <= Pair c (x | cx)) | uwxy…z | cu) [by extensionality]
= vc.(cu | c = \x.(ref <= Pair c (x | cx)) | uwxy…z) [by extensionality]
= (Reference ref u | uwxy…z)
Blue calculus
The lambda calculus is a Turing-complete language, but doesn’t handle concurrency very well; in particular, there’s no concept of a reference as opposed to a value. The lambda calculus has three productions:
The first production, , is a variable; the second is an application of one term to another; and the third is abstraction.
The beta-reduction rule says
if the variables in
are all free in
.
The part in brackets reads “with replacing
.” There are different evaluation strategies that one can use with the lambda calculus; one strategy that always reduces the term to its normal form if it has one is called “lazy evaluation.” With this strategy, you only reduce a term if it’s necessary to do so.
The blue calculus builds this strategy into the language. Here is an application term in the blue calculus:
Blue calculus splits application into four parts, or “processes”:
- a “small” application
in which a term
may only be applied to a variable
;
- a resource assignment
in which a variable
is associated to a one-time use of the term
;
- a new variable declaration
that introduces a new bound variable into scope; and
- a symmetric monoidal product of terms, thought of as “parallel processes”
The monoidal unit here is the “do-nothing” term
.
Blue calculus also splits beta reduction into two parts:
- a “small” beta reduction
if
is free in
;
- and a resource fetching rule that gets the value of a variable only when it’s needed
.
By splitting things up this way, the blue calculus allows patterns like nondeterministic choice. The blue calculus term
has two valid reductions,
or
.
It also allows for resource contention; here, the resource assignment acts as a mutex:
or
.
The application term is a linear term. If the term
uses
more than once, then only the first usage will be able to fetch the resource: a resource assignment is consumed by the small beta rule. However, the blue calculus also supports resource replication:
This allows the blue calculus to subsume both the lambda calculus and the pi calculus in a nice way. Blue terms are formed as follows:
Lambda terms embed like this:
Pi terms embed like this:
There are reduction rules for scope migration, associativity, distributivity, replication, and so forth; see the paper for more details.
The upshot of all this is that blue calculus variables are really references, and so one can build the whole theory of concurrent objects on top of it.
PhD
“I’m pleased to let you know that the Board of Graduate Studies has approved your registration as a candidate of the degree of Doctor of Philosophy…”
Hou tu pranownse Inglish
There are rules for pronouncing American English given the spelling; there are just a lot of them, and they’re ugly. No one’s going to get educated people to change how they spell–they’ve invested too much in learning it, and there’s too much legacy. But the way people speak, their dialect, varies widely with geography. So why not introduce a new one that pronounces words as they’re written?
Ankylodoxy
I started a blog to record my Gospel Doctrine lessons. It’s called ankylodoxy, as opposed to orthodoxy.

































leave a comment