Sunday, August 28, 2016

Deriving Quantum Theory from Basic Symmetries -- a slightly new approach


Quantum mechanics remains intuitively mysterious, in spite of its practical successes.   It is, however, very mathematically elegant.  

A literature has arisen attempting to explain why, mathematically speaking, quantum mechanics is actually completely natural and inevitable.   One aspect of this literature has remained intuitively unsatisfactory to me: the argument as to why complex numbers, rather than real numbers, should be used to quantify uncertainty.

Recently, though, I think I found an argument that feels right to me.  Here I will sketch that argument.  I haven't done all the math to prove the details (due to lack of time, as usual), so this will be sketchy and there might be something wrong here.   (I almost always put together an argument at this hand-wavy level before plunging into doing a proof.  But in recent years I'm so overwhelmed with other things to do, that I often just leave it at the hand-wavy argument and don't bother to do the proof.  This isn't as fully satisfying, but life in our current realm of scarcity is full of such trade-offs...)

Part of the background here is Saul Youssef's formulation of quantum theory in terms of exotic probabilities.  Youssef argues (physicist style, not mathematician style) that the Kolmogorov axioms, if one removes the requirement that probabilities be real numbers, can equally well be satisfied by complex, quaternionic or octonionic probabilities.   He then argues that if we assume probabilities are complex numbers, the Schrödinger equation essentially follows.   His complex-number probabilities are subtly different from the amplitudes normally used in quantum mechanics.
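
To convey the flavor of this in code (a toy sketch, not Youssef's formalism verbatim): complex probabilities of mutually exclusive alternatives still add, Kolmogorov-style, but if one links a complex probability z to observed frequency via |z|^2, as Youssef does, then alternatives can interfere.  The particular numbers below are made up for illustration.

```python
# Toy sketch of complex probabilities: additivity over exclusive
# alternatives is Kolmogorov-style, but observed frequencies go as |z|^2,
# so alternatives can interfere.  The values are arbitrary placeholders.
zA = complex(0.6, 0.0)     # hypothetical complex probability of alternative A
zB = complex(-0.6, 0.0)    # hypothetical complex probability of alternative B

z_AorB = zA + zB           # additivity, as in the usual probability axioms

freqA = abs(zA) ** 2           # 0.36
freqB = abs(zB) ** 2           # 0.36
freq_AorB = abs(z_AorB) ** 2   # 0.0 -- destructive interference,
                               # not freqA + freqB = 0.72
print(freqA, freqB, freq_AorB)
```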

If we buy Youssef's formulation (as being, at least, one meaningful version of the QM truth), then the question "why QM?" pretty much comes down to "why complex probabilities?"   This is what I think I may see a novel argument for -- based on tweaking some outstanding work by Knuth and Skilling on the algebraic foundations of inference.

Why Tuple Probabilities?

Most of my argument in this post will be mathematical (with big gaps where the details are lacking).   But I'll start with a more philosophical-ish point.

Let's assume the same basic setup regarding observers and observations that I used in my previous blog post, on maximum entropy.  

I will make a conceptual argument why in some cases it makes sense to think of probabilities as tuples with more than one entry.

The key question at hand is: How should O1 reason about the observations that O can distinguish but O1 cannot?    For instance, suppose O1 cannot distinguish obs1 from obs2, at least not in some particular context (e.g. in the context of having observed obs3).  

Suppose then there are two observations in bin t1, obsA and obsB.  Either of these could be obs1, and then the other would be obs2.    We have an utter lack of knowledge here. 

One way to phrase this situation is to postulate a number of different parallel “universes”, so that instead of a proposition

P = “obsA = obs1”

having a truth value on its own, it will have a relative truth value

tv(P,U)

where U is a certain universe.

The use of the word "universe" here is questionable.   But that's a fairly boring point, I think -- the application of human language to these rarefied domains is bound to be confusing.   Had Everett referred to the multiple universes of quantum theory as "potentiality planes" or some such, his theory would have been much less popular even with the exact same content.  "Many potentiality planes hypothesis" just doesn't have the same zing as "Many worlds" or "Many universes"!

But instead of saying there are many worlds/universes, one could just as well say that there are many potentiality planes, and the oddity of quantum theory is that it doesn't explain how any of them becomes actual -- it just shifts around the weights of the different potentialities.  So either actuality is a bogus concept, or it needs to be explained by something other than quantum theory (in the Everett approach, as opposed to e.g. the von Neumann approach, in which there is a "collapse" operation that translates certain potentialities into actualities).

Verbiage aside,  the above gives rise to the question: What are the logical requirements for these values tv(P,U)?

It seems to me that, by looking at the basic commonsensical symmetry properties of the values tv(P,U), we can show that these must be complex numbers, thus arriving at a sort of abstract logical derivation of Youssef's complex truth values.

We start out with the requirement that sometimes tv(P,U) and tv(P,U1) should be different ... so that the dependence on U is not degenerate.
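
To pin down the intended move in a trivial Python sketch (with made-up placeholder values): once O1 must track two universes in parallel, the truth value of P is naturally a pair, one component per universe.

```python
# Minimal sketch: with two parallel "universes" U and U1, the truth value
# of a proposition becomes a pair, one component per universe.
# The numeric values are hypothetical placeholders.
tv = {("obsA = obs1", "U"): 1.0,
      ("obsA = obs1", "U1"): 0.0}

P = "obsA = obs1"
tv_pair = (tv[(P, "U")], tv[(P, "U1")])  # pair-valued truth value of P
assert tv_pair[0] != tv_pair[1]          # non-degenerate dependence on U
```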

Tuplizing Knuth and Skilling

The rest of my argument follows Knuth and Skilling's beautiful paper "Foundations of Inference."   What I would like to do is follow their arguments, but looking at valuations that are tuples (e.g. pairs) of reals rather than single reals.   This section of the blog post will only make sense to you after you read their paper carefully.

I note that Goyal, Knuth and Skilling have used a different method to argue why complex quantum amplitudes exist, based on extending their elementary symmetry arguments from "Foundations of Inference" in a different way than I suggest here; see their paper "Origin of Complex Quantum Amplitudes and Feynman's Rules."

I think that is a fantastic paper, yet I feel it doesn't quite get at the essence.  I think one can get complex probabilities without veering that far from the very nice ideas in "Foundations of Inference," and without introducing so many other complications.  However, most likely the way to make the ideas in this blog post fully rigorous would be to borrow and slightly tweak a bunch of the math from the Goyal, Knuth and Skilling paper.  We're all sorta rearranging the same basic math in related ways.

To avoid problems with Wordpress plugins, I have deviated in notation from Knuth and Skilling a little here:

·      where they use a plus in a circle, I use "+."
·      where they use an x with a circle in it, I use "*."
·      where they use the typical "times" symbol for direct product, I use X
·      where they use a period with a circle around it, I use "o."

Mostly: their "circle around" an operator has become my "period after."

Consider a space of tuples of ordered lattice elements, e.g.

x =  (x1, x2, ..., xk)

where each xi is a lattice element.

Define a partial order on such tuples via

x < y iff x1 < y1 and ... and xk < yk

Note, this is not a total order.  For instance, one complex number x is less than another complex number y if both the real and imaginary parts of x are less than the corresponding parts of y.   If the real part of x is bigger than the real part of y, but the imaginary part of x is smaller than the imaginary part of y, then x and y are not comparable according to this partial order.
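
Here is the same point as a minimal Python sketch, with plain reals standing in for lattice elements:

```python
# Componentwise strict partial order on pairs, with reals standing in
# for lattice elements.
def less(x, y):
    return x[0] < y[0] and x[1] < y[1]

assert less((1.0, 2.0), (3.0, 4.0))        # comparable
x, y = (3.0, 1.0), (1.0, 2.0)
assert not less(x, y) and not less(y, x)   # incomparable in either direction:
                                           # a genuinely partial order
```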

Next, define join on tuples as e.g.

(x1,x2) OR (y1, y2) = (x1 OR y1 , x2 OR y2)

and define cross on tuples as e.g.

(x1, x2) X (y1, y2) = (x1 X y1, x2 X y2) 
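
A minimal sketch of these componentwise operations, using finite sets as stand-in lattice elements (join as set union, cross as Cartesian product):

```python
from itertools import product

# Componentwise join (set union) and cross (Cartesian / direct product)
# on pairs of finite sets standing in for lattice elements.
def join(x, y):
    return (x[0] | y[0], x[1] | y[1])

def cross(x, y):
    return (set(product(x[0], y[0])), set(product(x[1], y[1])))

x = ({1}, {2, 3})
y = ({4}, {3})
print(join(x, y))   # ({1, 4}, {2, 3})
print(cross(x, y))  # ({(1, 4)}, {(2, 3), (3, 3)})
```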

Next, define a valuation on tuples, where x^ denotes the value associated with the tuple x.  Suppose that values are tuples of real numbers.

We will define addition on value tuples via

x^ +. y^ = (x OR y)^

and multiplication on tuples via

x^ *. y^ = (x X y)^
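
For orientation, here is what the requested morphism looks like in the familiar single-real case -- ordinary probabilities under a uniform measure on finite sets -- where the valuation sends join to addition and direct product to multiplication:

```python
from itertools import product

# The morphism in the familiar single-real case: valuation = normalized
# size of a finite event set; join of disjoint events maps to +, direct
# product of events in independent spaces maps to *.
def val(event, space_size):
    return len(event) / space_size

a, b = {1, 2}, {3}                  # disjoint events in a 4-element space
assert val(a | b, 4) == val(a, 4) + val(b, 4)                # join -> sum

c, d = {1, 2}, {5, 6, 7}            # events in independent spaces (sizes 4, 8)
assert val(set(product(c, d)), 32) == val(c, 4) * val(d, 8)  # X -> product
```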

Consider a chain [x, t] of two tuples, so that x < t.

We may suppose chaining is associative, so that

[[x, y], [y, z]] , [z, t] = [x, y], [[y, z], [z, t]]

We may associate each chain with a value tuple p(x|t); associativity then implies

p(x|z) = p(x|y) o. p(y|z)

where o. represents a composition operator.
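
If, as I conjecture below, the composition operator on pair values turns out to be complex multiplication, then chaining looks just like Youssef's chain rule.  A toy numerical check, with made-up values:

```python
# Toy check: if pair values are complex numbers and o. is complex
# multiplication, chaining composes associatively.  Values are arbitrary.
p_xy = complex(0.6, 0.2)    # hypothetical p(x|y)
p_yz = complex(0.1, -0.5)   # hypothetical p(y|z)
p_zt = complex(0.3, 0.4)    # hypothetical p(z|t)

lhs = (p_xy * p_yz) * p_zt
rhs = p_xy * (p_yz * p_zt)
assert abs(lhs - rhs) < 1e-12   # associativity of composition along a chain
```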

In terms of chains, what the partiality of the order < means is that sometimes two lattice-tuples can't be arranged in a chain in either order -- they are in this sense "logically incommensurable"; neither one implies nor refutes the other.

That is Knuth and Skilling's setup, ported to tuples of lattice elements rather than individual lattice elements, in what seems to me the most straightforward way.

Now the question is how many of their arguments carry over to the case of tuples I'm considering here.  I haven't had time to do the calculations, but after some thinking, my intuitive conclusion is that they probably all should.   (If I'm wrong please tell me why!)

Symmetries 1-5 and Axioms 1-5 seem to all work OK, based on a quick think.   The symmetries involving < must be considered as restricted to those cases where the partial order < actually holds between the tuples involved.

And now I verge into some sketchier educated guesses.

First, it seems to me that their Associativity Theorem (Appendix A) should still work OK on tuples.

It should work because, if there is a different regraduation isomorphism Psi for each component of a tuple, then the tuple of these componentwise isomorphisms should be an isomorphism on tuples of numbers.   Plus acts componentwise anyway.
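
As a sanity check on this componentwise-regraduation idea, here is a toy example: an associative operator on each component, regraded componentwise into ordinary addition (the particular operator and regrade are just illustrative choices, not the general construction):

```python
import math

# Sketch: a componentwise associative operator can be regraded componentwise.
# Example operator on each component: a (+) b = a + b + a*b, which the
# regrade Psi(u) = log(1 + u) turns into ordinary addition, since
# log(1 + a + b + a*b) = log((1+a)*(1+b)) = log(1+a) + log(1+b).
op = lambda a, b: a + b + a * b
Psi = lambda u: math.log(1.0 + u)

op_pair = lambda p, q: (op(p[0], q[0]), op(p[1], q[1]))   # componentwise op
Psi_pair = lambda p: (Psi(p[0]), Psi(p[1]))               # componentwise regrade

x, y = (0.5, 1.0), (2.0, 3.0)
lhs = Psi_pair(op_pair(x, y))
rhs = tuple(a + b for a, b in zip(Psi_pair(x), Psi_pair(y)))
assert all(abs(l - r) < 1e-12 for l, r in zip(lhs, rhs))
```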

Second (the bigger leap), it seems to me that their Multiplication Theorem (Appendix B) should work for complex numbers, much like it does for real numbers.  But their proof will obviously need to be altered a lot to handle this case -- i.e. the case where the function Psi in their multiplication functional equation maps into pairs of reals rather than single reals.

It's clear that the complex exponential fulfills this pairwise version of their "multiplication equation."   So one direction works, trivially.  Uniqueness is the trickier part.
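
Here is the trivial direction as a quick numerical check: any Psi(r) = exp(c r) with complex constant c, viewed as a pair of reals, satisfies the pairwise multiplication equation (the constant and test values below are arbitrary):

```python
import cmath

# Quick numerical check of the trivial direction: the complex exponential,
# viewed as a pair of reals, satisfies Psi(a + b) = Psi(a) * Psi(b).
c = complex(0.3, 1.7)          # arbitrary complex constant
def Psi(r):
    return cmath.exp(c * r)

a, b = 0.9, 2.4
assert abs(Psi(a + b) - Psi(a) * Psi(b)) < 1e-9
```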

If the Multiplication Theorem holds for pair-valued Psi, yielding the complex exponential, this would explain very satisfyingly (to me) why quantifying propositions with pairs leads to complex number values.  

Basically: we want multiplication to be morphic to the direct product, and to do that on pairs you arrive at complex numbers (because complex multiplication is the only way to multiply pairs that has the needed symmetries -- or so I am conjecturing....).

But why complex probabilities, rather than quaternionic or octonionic ones?   Octonionic probabilities would not be associative, hence could not be morphic to the (associative) direct product.   The argument to rule out quaternionic probabilities may be subtler, as commutativity is not strictly needed for Knuth and Skilling's arguments.  On the other hand, I suspect that, given the weaker sort of ordering I've used for my tuples, commutativity may end up being needed to make some of the arguments work.   This needs more detailed digging.
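
To make the ladder concrete, here is a small Python sketch of the Cayley-Dickson construction, which builds the complexes from pairs of reals, the quaternions from pairs of complexes, and the octonions from pairs of quaternions -- and shows commutativity failing at the quaternion level and associativity failing at the octonion level:

```python
# Cayley-Dickson construction on nested pairs: reals -> complexes ->
# quaternions -> octonions.  Numbers are plain ints; pairs are tuples.

def neg(x):
    return (neg(x[0]), neg(x[1])) if isinstance(x, tuple) else -x

def conj(x):
    # Conjugation: reals are self-conjugate; conj((a, b)) = (conj(a), -b).
    return (conj(x[0]), neg(x[1])) if isinstance(x, tuple) else x

def add(x, y):
    return (add(x[0], y[0]), add(x[1], y[1])) if isinstance(x, tuple) else x + y

def sub(x, y):
    return add(x, neg(y))

def mul(x, y):
    # Cayley-Dickson product: (a, b)(c, d) = (ac - conj(d)b, da + b conj(c))
    if isinstance(x, tuple):
        a, b = x
        c, d = y
        return (sub(mul(a, c), mul(conj(d), b)),
                add(mul(d, a), mul(b, conj(c))))
    return x * y

# Complex level: i * i = -1, and multiplication commutes.
i_c = (0, 1)
assert mul(i_c, i_c) == (-1, 0)

# Quaternion level: i * j = k but j * i = -k  (non-commutative).
zero_c, one_c = (0, 0), (1, 0)
q_i = (i_c, zero_c)
q_j = (zero_c, one_c)
q_k = (zero_c, i_c)
assert mul(q_i, q_j) == q_k
assert mul(q_j, q_i) == neg(q_k)

# Octonion level: (e1 e2) e4 = -(e1 (e2 e4))  (non-associative).
zero_q = (zero_c, zero_c)
one_q = (one_c, zero_c)
e1 = (q_i, zero_q)
e2 = (q_j, zero_q)
e4 = (zero_q, one_q)
assert mul(mul(e1, e2), e4) == neg(mul(e1, mul(e2, e4)))
```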

So -- in brief -- the use of complex numbers emerges from the realization that single numbers are not enough, but that if we go too far beyond the complex numbers, we can't have the (join, direct product) algebra anymore.   But the (join, direct product) algebra is assumed as elemental: join is just set union, and the direct product is just the elementary "taking of all combinations" of elements in two sets.

Summary

To recap:

We start with basic symmetries of lattices, because logical propositions are basic, and propositions form a lattice. 

We want to associate tuples of numbers with tuples of lattice elements, because it's nice to be able to measure and compare things quantitatively. 

We want to combine these number tuples in ways that are morphic to join and direct product.  

But to get this morphism to work for the case where the tuples are pairs, we get the complex numbers.  

And we (I think) can't get it to work for tuples bigger than pairs. 

But we need tuples that are at least pairs, to model the case where multiple possibilities cannot be distinguished and must be considered in parallel.  

So we must value propositions with complex numbers.

Quod Erat Handwavium.... 

(QEH.   I like that!  Surely not original....)

Propositional Logic as Pre-Physics

I am reminded of something I read when I was 16 years old, back in the early 1980s, reading through Gravitation, the classic General Relativity text by Misner, Thorne and Wheeler.   (I didn't understand it fully the first time through -- I gradually grokked it better over the next 2.5 years as I went through undergrad and grad courses on differential geometry -- but it was good for me to get a basic view of GR first to guide my study of differential geometry.)   One of the nice aspects of that book, at least for me on that first reading, was the large number of digressive asides in inset boxes.   One of these asides regarded Wheeler's speculative idea of "propositional logic as pregeometry".   He was speculating that, somehow, the geometry of spacetime would emerge from the large-scale statistics of logical propositions.   This notion has stuck in my mind ever since.

The emergence of general relativity -- and hence the geometry of spacetime -- from large-scale statistics has been a topic of lots of recent attention ("gravity as entropy," etc. -- I have exploited this in my speculations on causal webs, for example).    Large-scale statistics of what?   Of underlying quantum reality.   But of course Wheeler already knew very well that quantum mechanics was modeled using lattices, and that lattice structure modeled the structure of logical propositions.

In the ordinary quantum logic framework, one looks at meet and join operations and constructs a non-Boolean logic in this way.   In Youssef's exotic probability framework, one sticks with the Boolean lattice, and then puts complex valuations on the elements of the Boolean lattice.  

One thing that Knuth and Skilling show is that the algebra of joins and direct products is important.  They show that this algebra morphs directly to the algebra of numerical sums and products on (real-valued) probabilities.   What I suggest (but have only hand-waved about, not yet actually shown rigorously) is that, if one looks at tuples of lattice elements, then this algebra (joins and direct products) maps directly to the algebra of numerical sums and products on complex-valued probabilities.  Thus getting Youssef's exotic probability formulation of quantum mechanics out of some basic symmetries.

I still feel Wheeler's intuition was basically right, but, as often occurs, getting the details right involves multiple complexities...
