Burning bright

Posted in Uncategorized by Mike Stay on 2009 October 29

Bengal tiger portrait shoot, with all four varieties.

Devils on Mars

Posted in Uncategorized by Mike Stay on 2009 October 29


Renormalization and Computation 4

Posted in Category theory, Math, Programming, Quantum by Mike Stay on 2009 October 15

This is the fourth in a series of posts on Yuri Manin’s pair of papers. In the previous posts, I laid out the background; this time I’ll actually get around to his result.

A homomorphism from the Hopf algebra into a target algebra is called a character. The functor that assigns an action to a path, whether classical or quantum, is a character. In the classical case, it’s into the rig (\mathbb{R} \cup \{\infty\}, \min, \infty, +, 0), and we take an infimum over paths; in the quantum it’s into the rig (\mathbb{C}, +, 0, \cdot, 1), and we take an integral over paths. Moving from the quantum to the classical case is called Maslov dequantization.

Manin mentions that the runtime of a parallel program is a character akin to the classical action, with the runtime of the composition of two programs being the sum of the respective runtimes, while the runtime of two parallel programs is the maximum of the two. A similar result holds for nearly any cost function. He also points out that computably enumerable reals \mathbb{R}^{c.e.} form a rig (\mathbb{R}^{c.e.}\cup \{\infty\}, \max, \infty, +, 0). He examines Rota-Baxter operators as a way to generalize what “polar part” means and extend the theorems on Hopf algebra renormalization to such rigs.
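To make the runtime character concrete, here’s a minimal Python sketch. The `Atom`, `Seq`, and `Par` classes and the `runtime` function are names I made up for illustration; they aren’t from Manin’s paper, and the costs are arbitrary numbers.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    cost: float  # runtime of a primitive step (hypothetical units)


@dataclass(frozen=True)
class Seq:
    first: "Program"   # runs first
    second: "Program"  # runs after `first` finishes


@dataclass(frozen=True)
class Par:
    left: "Program"   # runs at the same time as `right`
    right: "Program"


Program = Union[Atom, Seq, Par]


def runtime(p: Program) -> float:
    """Sequential composition adds runtimes; parallel composition takes the max."""
    if isinstance(p, Atom):
        return p.cost
    if isinstance(p, Seq):
        return runtime(p.first) + runtime(p.second)
    if isinstance(p, Par):
        return max(runtime(p.left), runtime(p.right))
    raise TypeError(f"not a program: {p!r}")


# Two steps in parallel, followed by a third step.
example = Seq(Par(Atom(3), Atom(5)), Atom(2))
print(runtime(example))
```

The two ways of combining programs map to the two operations (+ and max) in the target of the character.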

In the second paper, he looks at my work with Calude as an example of a character. He uses the same argument we did to show that many measures of program behavior share a property: if the measure hasn’t stopped growing after reaching a certain large value relative to the program size, then the density of finite values the measure could take decreases like 1/\log(t). Surprisingly, though he referred to these results as cutoffs, he didn’t actually use them anywhere for doing regularization.

Reading between the lines, he might be suggesting something like approximating the Kolmogorov complexity that he uses later by using a time cutoff, motivated by results from our paper: there’s a constant c depending only on the programming language such that if you run the nth program cn^2 steps and it hasn’t stopped, then the density of times near t > cn^2 at which it could stop is roughly 1/\log(t).

Levin suggested using a computable complexity that’s the sum of the program length and the log of the number of time steps; I suppose you could “regularize” the Kolmogorov complexity by adding \Lambda \log(t) to the length of the program, renormalize, and then let \Lambda go to zero, but that’s not something Manin does.

Instead, he proposed two other constructions suitable for renormalization; here’s the simplest. Given a partial computable function f:\mathbb{Z}^+\to \mathbb{Z}^+, define the computably enumerable \bar{f}:\mathbb{N}\to\mathbb{N} by \bar{f}(k) = f(k) if f(k) is defined, and 0 otherwise. Now define

\displaystyle \Psi(k,f;z) = \sum_{n=0}^{\infty} \frac{z^n}{\left(1+n\bar{f}(k)\right)^2}.

When f(k) is undefined, \Psi(k,f;z) = 1/(1-z), which has a pole at z=1. When f(k) is defined, the series for \Psi(k,f;z) converges on the whole closed unit disk |z| \le 1, so it is finite at z=1. Birkhoff decomposition would separate these two cases, though I’m not sure what value \Psi_+(k,f;1) would take or what it would mean.
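Here’s a small numerical sketch of the two cases at z=1, using partial sums of the series above. The function name `psi_partial` and the sample value \bar{f}(k) = 3 are my own choices for illustration.

```python
def psi_partial(fbar_k: int, z: float, terms: int) -> float:
    """Partial sum of sum_n z^n / (1 + n * fbar_k)^2 over n = 0 .. terms-1."""
    return sum(z ** n / (1 + n * fbar_k) ** 2 for n in range(terms))


# f(k) undefined, so fbar(k) = 0: a geometric series, which blows up at z = 1.
print(psi_partial(0, 1.0, 1000), psi_partial(0, 1.0, 2000))

# f(k) defined (here fbar(k) = 3): the partial sums at z = 1 settle down.
print(psi_partial(3, 1.0, 1000), psi_partial(3, 1.0, 2000))

# Inside the unit disk the undefined case agrees with 1/(1 - z).
print(psi_partial(0, 0.5, 200), 1 / (1 - 0.5))
```

Of course, a program can’t know which case it’s in without knowing whether f(k) halts, which is where the uncomputability hides.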

The other construction involves turning \bar{f} into a permutation (x,y) \mapsto (x+\bar{f}(y),y), and inventing a function that has poles when the permutation has fixpoints.

So Manin’s idea of renormalizing the halting problem is to do some uncomputable stuff to get an easy-to-renormalize function and then throw the Birkhoff decomposition at it. Since we know the halting problem is undecidable, perhaps it’s unsurprising that he didn’t come up with a new technique for extracting information about the problem. Still, after putting in so much effort to understand it, I was left rather disappointed: if you’re going to allow yourself to do uncomputable things, why not just solve the halting problem directly?

I must suppose that his intent was not to tackle this hard problem, but simply to play with the analogy he’d noticed; it’s what I’ve done in other papers. And being forced to learn renormalization was exhilarating! I have a bunch of ideas to follow up; I’ll write them up as I get a chance.

Renormalization and Computation 3

Posted in Uncategorized by Mike Stay on 2009 October 10

This is the third in a series of posts on Yuri Manin’s recent pair of papers applying Hopf algebra renormalization to the Halting problem. Last time I talked about the way people usually do renormalization; this time I’ll talk about the recent work by Kreimer, Connes, and others in exposing the underlying Hopf algebra in this process.

A Hopf algebra is

  • An R-module for a commutative rig R, which means you can add vectors and multiply them by a scalar.
  • An algebra, which means you can take two vectors and multiply them. This operation is associative; there’s also a unit vector that satisfies left- and right-unit laws.
  • A bialgebra, which means there’s also a coassociative comultiplication and counit, and the structures all work together. When the tensor product is the cartesian product, the comultiplication duplicates the vector and the counit is the constant map to 1 in the base field. Even when the tensor product isn’t the cartesian product, it can still be useful to think of it that way.
  • A bialgebra with an involution, called the antipode.

A group is very like a Hopf algebra; in fact, a group object in the category of vector spaces and linear maps is a cocommutative Hopf algebra. You can multiply group elements and there’s a multiplicative unit; you can duplicate and delete them in equations; and you can invert them.

It turns out that Feynman diagrams form a Hopf algebra if you poke yourself in one eye and squint with the other. First, a cut C of an oriented graph g (i.e., a directed graph with no parallel edges) picks an upper set g^C and a lower set g_C of vertices such that

  • given an oriented wheel in the graph, its vertices either all belong to the upper set or all belong to the lower set, and
  • any edge connecting a vertex v in the upper set to a vertex w in the lower set must be directed from v to w.
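The two conditions can be checked mechanically. Here’s a rough Python sketch that enumerates the cuts of a small graph. I’m assuming that “oriented wheel” means a directed cycle, that the upper and lower sets partition the vertices, and that either set may be empty; the function names are mine.

```python
from itertools import combinations


def reachable(edges, start):
    """Vertices reachable from `start` along directed edges (including `start`)."""
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for a, b in edges:
            if a == v and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def is_cut(vertices, edges, upper):
    upper = set(upper)
    lower = set(vertices) - upper
    # Second condition: an edge between the sets must run from upper to lower.
    for a, b in edges:
        if a in lower and b in upper:
            return False
    # First condition: no directed cycle may have vertices on both sides.
    # Two distinct vertices share a cycle iff each is reachable from the other.
    reach = {v: reachable(edges, v) for v in vertices}
    for u in upper:
        for w in lower:
            if w in reach[u] and u in reach[w]:
                return False
    return True


def cuts(vertices, edges):
    """Yield (upper, lower) pairs for every cut of the graph."""
    vs = sorted(vertices)
    for size in range(len(vs) + 1):
        for upper in combinations(vs, size):
            if is_cut(vs, edges, upper):
                yield set(upper), set(vs) - set(upper)


# A single edge a -> b: "b above a" is not allowed, the other three splits are.
for upper, lower in cuts({"a", "b"}, [("a", "b")]):
    print(sorted(upper), sorted(lower))
```

For a two-vertex cycle, only the two trivial cuts survive, since the cycle can’t be split.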

Now, given a set of Feynman diagrams, consider all formal linear combinations of graph cuts. This is a vector space because you can add these things pointwise and multiply them by a scalar. We can make it into a bialgebra by defining multiplication to be a linear map

\displaystyle m(g \otimes g') = g \coprod g'

with unit (here 0 denotes the empty graph, the identity for \coprod, rather than the zero vector)

\displaystyle \begin{array}{rl}e:I & \to H \\ 1 & \mapsto 0, \end{array}

and comultiplication to be a linear map

\displaystyle \Delta(g) = \sum_C g^C \otimes g_C,

where C ranges over all cuts of g, with counit

\displaystyle \begin{array}{rl}\epsilon:H & \to I \\ \sum_g a_g g& \mapsto a_0. \end{array}

It’s graded: just count the number of vertices. And we can turn it into a Hopf algebra by defining the antipode S:H \to H to be a linear map such that

\displaystyle \begin{array}{rl}S(1) & = 1 \\  S(g) & \displaystyle = -g-\sum_C S(g^C) \coprod g_C \\ & \displaystyle = -g-\sum_C g^C \coprod S(g_C).\end{array}

Each algebra homomorphism (not necessarily preserving the Hopf algebra structure) from H to an algebra \mathcal{A} defines a way to assign a (generalized) probability amplitude to each diagram. The set \mbox{hom}(H, \mathcal{A}) of such homomorphisms becomes a group when we note that the functor \mbox{hom}(-, \mathcal{A}) is contravariant, so the comultiplication in H gets mapped to a multiplication.

Next: given a complex group G (that is, a group that’s also a complex manifold so the multiplication and inverse are complex-analytic functions), a Birkhoff decomposition of a loop \phi:S^1 \to G is an analytic continuation of the loop to

  • a holomorphic function \phi_+ on the standard disk inside the circle
  • a holomorphic function \phi_- on the complement of this disk in the projective complex plane
  • such that on the unit circle the original loop is reproduced as

    \displaystyle \phi = \phi_{+} \phi_{-}^{-1},

    where the product and the inverse on the right are taken in the group G. Notice that \phi_+(0) is a well defined element of G.

Take G = \mbox{hom}(H, \mathbb{C}). Now imagine our regularization parameter is a complex number \Lambda and we have some map \phi:\mathbb{C} \to G that’s singular at \Lambda = 0. Then the Connes-Kreimer theorem says that the Birkhoff decomposition always exists and gives an explicit formula. Hopf algebra renormalization is simply rearranging the terms in the Birkhoff decomposition:

\displaystyle \phi_{+} = \phi \star \phi_{-},

where \star is the convolution product.

As I understand this, \phi is isomorphic to \tilde{\phi}:H \to \mbox{hom}(\mathbb{C},\mathbb{C}). Given a linear combination of graphs, \tilde{\phi} gives you back a Laurent polynomial in \Lambda which you can split into terms with negative exponents (the polar part) and those with nonnegative exponents (the renormalized part).
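The splitting step itself is simple bookkeeping. Here’s a minimal Python sketch, representing a Laurent polynomial in \Lambda as a dictionary from exponents to coefficients. The representation and the name `split_laurent` are my own, and I’m assuming the constant term belongs with the renormalized part, since that part is supposed to be well defined at \Lambda = 0.

```python
def split_laurent(coeffs):
    """Split {exponent: coefficient} into (polar part, renormalized part)."""
    polar = {k: c for k, c in coeffs.items() if k < 0}
    regular = {k: c for k, c in coeffs.items() if k >= 0}
    return polar, regular


# A made-up example: 3/Lambda^2 - 1/Lambda + 7 + 2*Lambda
example = {-2: 3.0, -1: -1.0, 0: 7.0, 1: 2.0}
polar, regular = split_laurent(example)

print(polar)    # the part that blows up at Lambda = 0
print(regular)  # the part that stays finite at Lambda = 0

# Evaluating the renormalized part at Lambda = 0 leaves just the constant term.
print(regular.get(0, 0.0))
```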

Renormalization and Computation 2

Posted in General physics, Math, Quantum by Mike Stay on 2009 October 10

This is the second in a series of posts covering Yuri Manin’s ideas involving Hopf algebra renormalization of the Halting problem. Last time I showed how perturbing a quantum harmonic oscillator gave a sum over integrals involving n interactions with the perturbation; we can keep track of the integrals using Feynman diagrams, though in the case of a single QHO they weren’t very interesting.

One point about the QHO needs emphasis at this point. Given a wavefunction \psi = \sum_{n=0}^{\infty} \psi_n |n\rangle describing the state of the QHO, it must be the case that we get some value when we measure the energy; so if we sum up the squared norms of the probability amplitudes, we should get unity:

\displaystyle \langle \psi|\psi \rangle = \sum_{n=0}^{\infty} \langle n | \psi_n^* \psi_n | n \rangle = \sum_{n=0}^{\infty} |\psi_n|^2 = 1.

This is called the normalization condition.

When we perturb the QHO, the states |n\rangle are no longer the energy eigenvectors of the new Hamiltonian. We can express the new eigenvectors |m\rangle in terms of the old ones:

\displaystyle |m\rangle = \sum_{n=0}^{\infty}\lambda^n m_n|n\rangle,

where \lambda is the strength of the perturbation, and we reexpress our wavefunction in this new basis:

\displaystyle \psi = \sum_{m=0}^{\infty} \psi'_m |m\rangle

Since we’re working with a new set of coefficients, we have to make sure they sum up to unity, too:

\displaystyle \sum_{m=0}^{\infty} |\psi'_m|^2 = 1.

This is the renormalization condition. So renormalization is about making sure things sum up right once you perturb the system.
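As a toy illustration of “making sure things sum up right”, here’s a Python sketch that rescales a list of coefficients so their squared norms sum to one. Truncating to finitely many coefficients is my simplification, and the sample amplitudes are made up.

```python
import math


def renormalize(amplitudes):
    """Rescale a finite list of amplitudes so the squared norms sum to 1."""
    total = math.sqrt(sum(abs(a) ** 2 for a in amplitudes))
    if total == 0:
        raise ValueError("cannot normalize the zero vector")
    return [a / total for a in amplitudes]


# Hypothetical coefficients in the new basis, before rescaling.
psi_prime = renormalize([1 + 1j, 0.5, -0.25j])
print(sum(abs(a) ** 2 for a in psi_prime))
```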

I want to talk about renormalization in quantum field theory; the trouble is, I don’t actually know quantum field theory, so I’ll just be writing up what little I’ve gathered from reading various things and conversations with Dr. Baez. I’ve likely got some things wrong, so please let me know and I’ll fix them.

A field is a function defined on spacetime. Scalar fields are functions with a single output, whereas vector fields are functions with several outputs. The electromagnetic field can be described by assigning a single number, called the electric potential, and a vector, called the magnetic vector potential, to every point in spacetime. When you have two electrons and move one of them, it feels a reaction force and loses momentum; the other electron doesn’t move until the influence, traveling at the speed of light, reaches it. Conservation of momentum says that the momentum has to be somewhere; it’s useful to consider it to be in the electromagnetic field.

When you take the Fourier transform of the field, you get a function that assigns values to harmonics of the field; in the case of electromagnetism, the transformed field \phi assigns a value to each color k of light. Quantizing this transformed field amounts to making \phi(k) into a creation operator, just like z in the QHO example from last time. So we have a continuum of QHOs, each indexed by a color k. (By the way—the zero-dimensional Fourier transform is the identity function, so the QHO example from last time can be thought of both as the field at the unique point in spacetime and the field at the unique frequency.)

When we move to positive-dimensional fields, we get more interesting pictures, like these from quantum electrodynamics:
Feynman diagrams
Here, our coupling constant is the fine structure constant \alpha = e^2/\hbar c, where e is the charge of the electron. For each vertex, we write down our coupling constant times -i times a delta function saying that the incoming momentum minus the outgoing momentum equals zero. For each internal line, we write down a propagator—a function representing the transfer of momentum from one point to another; it’s a function of the four-momentum k—and multiply all this stuff together. Then we integrate over all four-momenta and get something that looks like

\displaystyle \int_0^\infty f(k) d^4 k.

The trouble is, this integral usually gives infinity for an answer. We try to work around this in two steps: first, we regularize the integral by introducing a length scale \Lambda. This represents the point at which gravity starts being important and we need to move to a more fundamental theory. In the quantum field theory of magnetic domains in iron crystals, the length scale is the inter-atom distance in the lattice. Regularization makes the integral finite for \Lambda away from some singularity.

There are a few different ways of regularizing; one is to use \Lambda as a momentum cutoff:

\displaystyle \int_0^\Lambda f(k) d^4 k.

This obviously converges, and solutions to this are always a sum of three parts:

  • The first part diverges as \Lambda \to \infty, either logarithmically or as a power of \Lambda.
  • The second part is finite and independent of \Lambda.
  • The third part vanishes as \Lambda \to \infty.

Renormalization in this case amounts to getting rid of the first part.
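Here’s a one-dimensional toy version of the three-part split in Python. The integrand f(k) = 1/(k+1) + 1/(k+1)^2 is something I picked because it can be integrated by hand; it doesn’t come from any actual Feynman diagram. Its cutoff integral is \ln(\Lambda+1) + 1 - 1/(\Lambda+1), which I’ve regrouped as \ln\Lambda plus 1 plus a remainder.

```python
import math


def f(k):
    # Toy integrand, chosen only because its integral is known in closed form.
    return 1 / (k + 1) + 1 / (k + 1) ** 2


def cutoff_integral(cutoff, steps=200_000):
    """Integrate f from 0 to `cutoff` with the midpoint rule."""
    h = cutoff / steps
    return h * sum(f((i + 0.5) * h) for i in range(steps))


def three_parts(cutoff):
    divergent = math.log(cutoff)                           # grows logarithmically
    finite = 1.0                                           # independent of the cutoff
    vanishing = math.log1p(1 / cutoff) - 1 / (cutoff + 1)  # goes to 0
    return divergent, finite, vanishing


cutoff = 100.0
print(cutoff_integral(cutoff), sum(three_parts(cutoff)))

# "Renormalizing" here means dropping the log and letting the cutoff grow.
for big in (1e2, 1e4, 1e6):
    divergent, finite, vanishing = three_parts(big)
    print(big, finite + vanishing)
```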

These three parts represent three different length scales: at lengths larger than \Lambda, all quantum or statistical fluctuations are negligible, and we can use the mean field approximation and do classical physics. At lengths between \Lambda and 1/\Lambda, we use QFT to calculate what’s going on. Finally, at lengths smaller than 1/\Lambda, we need a new theory to describe what’s going on. In the case of QED, the new theory is quantum gravity; string theory and loop quantum gravity are the serious contenders for the correct theory.

The problem with this regularization scheme is that it doesn’t preserve gauge invariance, so usually physicists use another regularization scheme, called “dimensional regularization”. Here, we compute

\displaystyle \int_0^\infty f(k) d^\Lambda k

which gives us an expression involving gamma functions of \Lambda, where gamma is the continuous factorial function, and then we set \Lambda = 4 - \epsilon. The solutions to this are also a sum of three terms—a divergent part, a finite part, and a vanishing part—and then renormalization gets rid of the divergent part.
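To see where the divergent part comes from, recall that the gamma function has a simple pole at zero: \Gamma(x) \approx 1/x - \gamma for small x, where \gamma \approx 0.5772 is the Euler-Mascheroni constant. The Python sketch below checks this numerically for \Gamma(\epsilon/2). That particular factor is my assumption about what a typical expression looks like; the post doesn’t specify one.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gamma_minus_pole(eps):
    """Gamma(eps/2) with its divergent 2/eps term subtracted."""
    return math.gamma(eps / 2) - 2 / eps


for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    # The first column blows up as eps -> 0; the second approaches -EULER_GAMMA.
    print(eps, math.gamma(eps / 2), gamma_minus_pole(eps))
```

The 2/\epsilon term plays the role of the divergent part, and what’s left over stays finite as \epsilon \to 0.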

Assume we have some theory with a single free parameter g. We’d like to calculate a function F(x) perturbatively in terms of g, where F represents some physical quantity, and we know F(x_0) = g'. We assume F takes the form

\displaystyle F(x) = g + g^2 F_2(x) + g^3 F_3(x) + \cdots

and assume that this definition gives us divergent integrals for the F_n. The first step is regularization: instead of F we have a new function

\displaystyle F_\Lambda(x) = g + g^2 F_{2,\Lambda}(x) + g^3 F_{3,\Lambda}(x) + \cdots

Now we get to the business of renormalization! We solve this problem at each order; if the theory is renormalizable, knowing the solution at the previous order will give us a constraint for the next order, and we can subtract off all the divergent terms in a consistent way:

  1. Order g.

    Here, F_\Lambda(x) = g + O(g^2). Since it’s a constant, it has to match F(x_0) = g', so g = g' + O(g'^2). In this approximation, the coupling constant takes the classical value.

  2. Order g^2.

    Let g = g' + G_2(g') + G_3(g') + \cdots, where G_n(g') \sim O(g'^n). Plugging this into the definition of F_\Lambda, we get

    \displaystyle F_\Lambda(x) = g' + G_2(g') + g'^2 F_{2,\Lambda}(x) + O(g'^3).

    Using F(x_0) = g', we get G_2(g') = -g'^2F_{2,\Lambda}(x_0), which diverges as \Lambda \to \infty. In the case of QED, this says that the charge on the electron is infinite. While the preferred interpretation these days is that quantum gravity is a more fundamental theory that takes precedence on very small scales (a Planck length is to a proton as a proton is to a meter), when the theory was first introduced, there was no reason to think that we’d need another theory. So the interpretation was that with an infinite charge, an electron would be able to extract an infinite amount of energy from the electromagnetic field. Then the uncertainty principle would create virtual particles of all energies, which would exist for a time inversely proportional to the energy. The particles can be charged, so they line up with the field and dampen the strength just like dielectrics. In this interpretation, the charge on the electron depends on the energy of the particles you’re probing it with.

    So to second order,

    \displaystyle F_\Lambda(x) = g' + g'^2\left(F_{2,\Lambda}(x) - F_{2,\Lambda}(x_0)\right) + O(g'^3).

    A theory is therefore only renormalizable if the divergent part of F_{2,\Lambda}(x) is independent of x. In QED it is. We can now define F(x) as the limit

    \displaystyle F(x) = \lim_{\Lambda \to \infty} F_\Lambda(x).

  3. Higher orders.

    In a renormalizable theory, the process continues, with the counterterms entirely specified by knowing F(x_0).
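Here’s a toy numerical version of the second-order step in Python. I’m assuming F_{2,\Lambda}(x) = \log(\Lambda/x), which is purely illustrative: its divergent part \log\Lambda is independent of x, so the subtraction works. The function names and sample numbers are mine.

```python
import math


def F2(x, cutoff):
    # Hypothetical regularized second-order term; diverges like log(cutoff).
    return math.log(cutoff / x)


def bare_coupling(g_ren, x0, cutoff):
    """g = g' + G_2(g') with G_2(g') = -g'^2 F2(x0), to second order."""
    return g_ren - g_ren ** 2 * F2(x0, cutoff)


def F_second_order(x, g_ren, x0, cutoff):
    """F_Lambda(x) = g' + g'^2 (F2(x) - F2(x0)), dropping O(g'^3)."""
    return g_ren + g_ren ** 2 * (F2(x, cutoff) - F2(x0, cutoff))


g_ren, x0 = 0.1, 1.0
for cutoff in (1e3, 1e6, 1e9):
    # The bare coupling keeps drifting with the cutoff; the prediction does not.
    print(cutoff, bare_coupling(g_ren, x0, cutoff), F_second_order(2.0, g_ren, x0, cutoff))
```

In this toy the prediction works out to g' - g'^2 \log(x/x_0) for every cutoff, so the limit \Lambda \to \infty is trivial.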

Renormalization and Computation 1

Posted in Category theory, Math, Programming, Quantum by Mike Stay on 2009 October 7

Yuri Manin recently put two papers on the arxiv applying the methods of renormalization to computation and the Halting problem. Grigori Mints invited me to speak on Manin’s results at the weekly Stanford logic seminar because in the second paper, Manin expands on some of my work.

In these next few posts, I’m going to cover the idea of Feynman diagrams (mostly taken from the lecture notes for the spring 2004 session of John Baez’s Quantum Gravity seminar); next I’ll talk about renormalization (mostly taken from Andrew Blechman’s overview and B. Delamotte’s “hint”); third, I’ll look at the Hopf algebra approach to renormalization (mostly taken from this post by Urs Schreiber on the n-Category Café); and finally I’ll explain how Manin applies this to computation by exploiting the fact that Feynman diagrams and lambda calculus are both examples of symmetric monoidal closed categories (which John Baez and I tried to make easy to understand in our Rosetta stone paper), together with some results on the density of halting times from my paper “Most programs stop quickly or never halt” with Cris Calude. I doubt all of this will make it into the talk, but writing it up will make it clearer for me.

Renormalization is a technique for dealing with the divergent integrals that arise in quantum field theory. The quantum harmonic oscillator is quantum field theory in 0+1 dimensions—it describes what quantum field theory would be like if space consisted of a single point. It doesn’t need renormalization, but I’m going to talk about it first because it introduces the notion of a Feynman diagram.

“Harmonic oscillator” is a fancy name for a rock on a spring. The restoring force exerted by a spring is proportional to how far you stretch it and points back toward equilibrium:

\displaystyle F = -kx.

The potential energy stored in a stretched spring is minus the integral of that:

\displaystyle V_0 = \frac{1}{2}kx^2 + C,

and to make things work out nicely, we’re going to choose C = -1/2. The total energy H_0 is the sum of the potential and the kinetic energy:

\displaystyle H_0 = V_0 + T = \frac{1}{2}kx^2 + \frac{1}{2}mv^2 - \frac{1}{2}.

By choosing units so that k = m = 1, we get

\displaystyle H_0 = \frac{x^2}{2} + \frac{p^2}{2} - \frac{1}{2},

where p is momentum.

Next we quantize, getting a quantum harmonic oscillator, or QHO. We set p = -i \frac{\partial}{\partial x}, taking units where \hbar = 1. Now

\begin{array}{rl}\displaystyle [x, p]x^n & \displaystyle = (xp - px)x^n \\ & \displaystyle = \left(- x i \frac{\partial}{\partial x} + i \frac{\partial}{\partial x} x\right)x^n \\ & \displaystyle = -i(nx^n - (n+1)x^n) \\ & \displaystyle = ix^n.\end{array}
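The commutator is easy to check by machine. Here’s a minimal Python sketch that represents a polynomial in x as a list of coefficients (index = power) and applies x and p = -i \partial/\partial x to it. The helper names are mine.

```python
def mul_x(poly):
    """Multiply a polynomial by x: shift every coefficient up one power."""
    return [0j] + list(poly)


def apply_p(poly):
    """Apply p = -i d/dx to a polynomial."""
    return [-1j * n * c for n, c in enumerate(poly)][1:] or [0j]


def sub(a, b):
    """Subtract polynomials, padding the shorter and trimming trailing zeros."""
    size = max(len(a), len(b))
    a = list(a) + [0j] * (size - len(a))
    b = list(b) + [0j] * (size - len(b))
    out = [u - v for u, v in zip(a, b)]
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out


def commutator_xp(poly):
    """Compute (xp - px) applied to a polynomial."""
    return sub(mul_x(apply_p(poly)), apply_p(mul_x(poly)))


def monomial(n):
    return [0j] * n + [1 + 0j]


for n in range(4):
    # Each result should be i times the monomial x^n.
    print(n, commutator_xp(monomial(n)))
```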

If we define a new observable z = \frac{p + ix}{\sqrt{2}}, then

\begin{array}{rl} \displaystyle z z^* & \displaystyle = \frac{(p + ix)}{\sqrt{2}} \frac{(p - ix)}{\sqrt{2}} \\ & = \frac{1}{2}(p^2 + i(xp - px) + x^2) \\ & = \frac{1}{2}(p^2 -1 + x^2) \\ & = H_0.\end{array}

We can think of z^* as \frac{d}{dz} and write the energy eigenvectors as polynomials in z:

\displaystyle H_0 z^n = z \frac{d}{dz} z^n = n z^n.

The creation operator z adds a photon to the mix; there’s only one way to do that, so z\cdot z^n = 1 z^{n+1}. The annihilation operator \frac{d}{dz} destroys one of the photons; in the state z^n, there are n photons to choose from, so \frac{d}{dz} z^n = n z^{n-1}.

Schrödinger’s equation says i \frac{d}{dt} \psi = H_0 \psi, so

\displaystyle \psi(t) = \sum_{n=0}^{\infty} e^{-itn} a_n z^n.

This way of representing the state of a QHO is known as the “Fock basis”.

Now suppose that we don’t have the ideal system, that the quadratic potential V_0 = \frac{1}{2}kx^2 - \frac{1}{2} is only a good local approximation to the real potential V_0 + \lambda V. Then we can write the total Hamiltonian as H = H_0 + \lambda V, where V is a function of position and momentum, or equivalently of z and \frac{d}{dz}, and \lambda is small.

Now we solve Schrödinger’s equation perturbatively. We know that

\displaystyle \psi(t) = e^{-itH} \psi(0),

and we assume that

\displaystyle e^{-itH}\psi(0) \approx e^{-itH_0} \psi(0)

so that it makes sense to solve it perturbatively. Define

\displaystyle \psi_1(t) = e^{itH_0} \psi(t) = e^{itH_0} e^{-itH}\psi(0)

and

\displaystyle V_1(t) = e^{itH_0} \lambda V e^{-itH_0}.

After a little work, we find that

\displaystyle \frac{d}{dt}\psi_1(t) = -i V_1(t) \psi_1(t),

and integrating, we get

\displaystyle \psi_1(t) = -i\int_0^t V_1(t_0) \psi_1(t_0) dt_0 + \psi(0).

We feed this equation back into itself recursively to get

\begin{array}{rl}\displaystyle \psi_1(t) & \displaystyle = -i \int_0^t V_1(t_0) \left[-i\int_0^{t_0} V_1(t_1) \psi_1(t_1) dt_1 + \psi(0) \right] dt_0 + \psi(0) \\ & \displaystyle = \left[\psi(0)\right] + \left[\int_0^t i^{-1} V_1(t_0)\psi(0) dt_0\right] + \left[\int_0^t\int_0^{t_0} i^{-2} V_1(t_0)V_1(t_1) \psi_1(t_1) dt_1 dt_0\right] \\ & \displaystyle = \sum_{n=0}^{\infty} \int_{t \ge t_0 \ge \ldots \ge t_{n-1} \ge 0} i^{-n} V_1(t_0)\cdots V_1(t_{n-1}) \psi(0) dt_{n-1}\cdots dt_0 \\ \displaystyle \psi(t) = e^{-itH_0}\psi_1(t) & \displaystyle = \sum_{n=0}^{\infty} (-\lambda i)^n \int_{t \ge t_0 \ge \ldots \ge t_{n-1} \ge 0} e^{-i(t-t_0)H_0} V e^{-i(t_0-t_1)H_0} V \cdots V e^{-i(t_{n-1}-0)H_0} \psi(0) dt_{n-1}\cdots dt_0.\end{array}

So here we have a sum of a bunch of terms; the nth term involves n interactions with the potential interspersed with evolving freely between the interactions, and we integrate over all possible times at which those interactions could occur.

Here’s an example Feynman diagram for this simple system, representing the fourth term in the sum above:

Three interactions with the perturbation.

The lines represent evolving under the free Hamiltonian H_0, while the dots are interactions with the potential V.

As an example, let’s consider V = (z + \frac{d}{dz}) and choose \lambda = \frac{1}{\sqrt{2}} so that \lambda V = p. When V acts on a state \psi = z^n, we get V \psi = z^{n+1} + nz^{n-1}. So at each interaction, the system either gains a photon or changes phase and loses a photon.
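Here’s a short Python sketch of this example, with states written as lists of coefficients of powers of z (index = number of photons). The function names `create`, `annihilate`, and `apply_V` are mine.

```python
def create(poly):
    """Multiply by z: add one photon."""
    return [0] + list(poly)


def annihilate(poly):
    """Apply d/dz: remove one photon, with a factor n for the n ways to pick it."""
    return [n * c for n, c in enumerate(poly)][1:] or [0]


def apply_V(poly):
    """Apply V = z + d/dz."""
    up = create(poly)
    down = annihilate(poly)
    down = down + [0] * (len(up) - len(down))  # pad to a common length
    return [u + d for u, d in zip(up, down)]


z_squared = [0, 0, 1]

# V z^2 = z^3 + 2z, stored as coefficients of z^0, z^1, z^2, z^3.
print(apply_V(z_squared))

# The free Hamiltonian z d/dz just counts photons: H_0 z^2 = 2 z^2.
print(create(annihilate(z_squared)))

# Three interactions in a row, starting from the vacuum.
state = [1]
for _ in range(3):
    state = apply_V(state)
print(state)
```

Applying V three times to the vacuum gives z^3 + 3z, which adds up the ways the three interactions can create and destroy photons (ignoring the free evolution in between).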

A particle moving in a quadratic potential in n-dimensional space gives the tensor product of n QHOs, which is QFT in a space where there are n possible harmonics. Quantum electrodynamics (QED) amounts to considering infinitely many QHOs, one for each possible energy-momentum, which forms a continuum. The diagrams for QED start to look more familiar:
Feynman diagrams
The vertices are interactions with the electromagnetic field. The straight lines are electrons and the wiggly ones are photons; between interactions, they propagate under the free Hamiltonian.

Fun mechanical geometry site

Posted in Uncategorized by Mike Stay on 2009 October 6

Transplants without rejection

Posted in Uncategorized by Mike Stay on 2009 October 4

Cured diabetes in mice via pancreas transplant, but probably works on every other organ, too.