thoughts on abstraction — Tarik Yildirim

Why is it always the case that the formulation of deeper physics requires more abstract mathematics? Why does understanding get better as it zooms out?

Side Note: Notice that there are two ways of zooming out. First, you can abstract by ignoring details. This is actually great for applications, but not good for understanding. It operates more like chunking, coarse-graining, forming equivalence classes etc. You end up sacrificing accuracy for the sake of practicality. Second, you can abstract in the sense of finding an underlying structure that allows you to see two phenomena as different manifestations of the same phenomenon. This is actually the meaning that we will be using throughout the blogpost. While coarse graining is easy, discovering an underlying structure is hard. You need to understand the specificity of a phenomenon which you normally consider to be general.

For instance, a lot of people are unsatisfied with the current formulation of quantum physics, blaming it for being too instrumental. Yes, the math is powerful. Yes, the predictions turn out to be correct. But the mathematical machinery (function spaces etc.) feels alien, even after one gets used to it over time. Or compare the down-to-earth Feynman diagrams with the amplituhedron theory... Again, you have a case where a stronger and more abstract beast is posited to dethrone a multitude of earthlings.

Is the alienness a price we have to pay for digging deeper? The answer is unfortunately yes. But this should not be surprising at all:

We should not expect to be able to explain deeper physics (which is so removed from our daily lives) using basic mathematics inspired from mundane physical phenomena. Abstraction gives us the necessary elbow room to explore realities that are far-removed from our daily lives.
You can use the abstract to explain the specific, but you cannot proceed the other way around. Hence, as you understand more, you inevitably need to go higher up in abstraction. For instance, you may hope that a concept as simple as the notion of a division algebra will be powerful enough to explain all of physics, but you will sooner or later be gravely disappointed. There is probably a deeper truth lurking behind such a concrete pattern.

Abstraction as Compression

The simplicities of natural laws arise through the complexities of the languages we use for their expression.

- Eugene Wigner
That the simplest theory is best, means that we should pick the smallest program that explains a given set of data. Furthermore, if the theory is the same size as the data, then it is useless, because there is always a theory that is the same size as the data that it explains. In other words, a theory must be a compression of the data, and the greater the compression, the better the theory. Explanations are compressions, comprehension is compression!

Chaitin - Metaphysics, Metamathematics and Metabiology
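Chaitin's point can be made tangible with a rough sketch. The "smallest program" for a data set cannot be computed in general, so the example below uses an off-the-shelf compressor (zlib) as a crude stand-in: it only gives an upper bound on how short a description can be. The two data sets and the names lawful, lawless and compressed_size are illustrative assumptions, not anything taken from Chaitin.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (a crude proxy for 'theory size')."""
    return len(zlib.compress(data, 9))


n = 10_000

# "Lawful" data: produced by a tiny rule (a pattern that repeats every 7 bytes).
lawful = bytes(i % 7 for i in range(n))

# "Lawless" data: pseudo-random bytes with no short rule the compressor can exploit.
rng = random.Random(0)  # fixed seed so the run is repeatable
lawless = bytes(rng.getrandbits(8) for _ in range(n))

lawful_size = compressed_size(lawful)
lawless_size = compressed_size(lawless)

print(f"raw size:           {n} bytes")
print(f"lawful compressed:  {lawful_size} bytes")
print(f"lawless compressed: {lawless_size} bytes")
```

The lawful data shrinks to a small fraction of its raw size, while the lawless data stays roughly as large as it was. In Chaitin's terms, only the first admits a "theory" that is smaller than the data it explains.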

We cannot encode more without going more abstract. This is a fundamental feature of the human brain. Either you have complex patterns based on basic math or you have simple patterns based on abstract math. In other words, complexity is either apparent or hidden, never gotten rid of. (That is, there is no loss of information.) By replacing one source of cognitive strain (complexity) with another source of cognitive strain (abstraction), we can lift our analysis to higher-level complexities.

In this sense, progress in physics is destined to be of an unsatisfactory nature. Our theories will keep getting more abstract (and difficult) at each successive information compression.

Don't think of this as a human tragedy though! Even machines will need abstract mathematics to understand deeper physics, because they too will be working under resource constraints. No matter how much more energy and resources you summon, the task of simulating a faithful copy of the universe will always require more.

As Bransford points out, people rarely remember written or spoken material word for word. When asked to reproduce it, they resort to paraphrase, which suggests that they were able to store the meaning of the material rather than making a verbatim copy of each sentence in the mind. We forget the surface structure, but retain the abstract relationships contained in the deep structure.
Jeremy Campbell - Grammatical Man (Page 219)

Depending on context, category theoretical techniques can yield proofs shorter than set theoretical techniques can, and vice versa. Hence, a machine that can sense when to switch between these two languages can probe the vast space of all true theories faster. Of course, you will need human aid (enhanced with machine learning algorithms) to discern which theories are interesting and which are not.

Abstraction is probably used by our minds as well, allowing them to decrease the number of neurons used without sacrificing explanatory power.

Rolnick and Max Tegmark of the Massachusetts Institute of Technology proved that by increasing depth and decreasing width, you can perform the same functions with exponentially fewer neurons. They showed that if the situation you’re modeling has 100 input variables, you can get the same reliability using either 2^100 neurons in one layer or just 2^10 neurons spread over two layers. They found that there is power in taking small pieces and combining them at greater levels of abstraction instead of attempting to capture all levels of abstraction at once.
“The notion of depth in a neural network is linked to the idea that you can express something complicated by doing many simple things in sequence,” Rolnick said. “It’s like an assembly line.”
- Foundations Built for a General Theory of Neural Networks (Kevin Hartnett)
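To get a feel for the size of the gap, here is a back-of-the-envelope calculation. It assumes the figures in the quoted passage are 2^100 neurons for the single-layer network and 2^10 neurons for the two-layer one. The variable names are only illustrative.

```python
# Neuron counts, assuming the quoted figures are 2^100 (one layer) and 2^10 (two layers).
shallow_neurons = 2 ** 100
deep_neurons = 2 ** 10  # 1024

# How many times more neurons the shallow network needs.
saving_factor = shallow_neurons // deep_neurons  # equals 2^90

print(f"shallow network: {shallow_neurons:.3e} neurons")
print(f"deep network:    {deep_neurons} neurons")
print(f"saving factor:   {saving_factor:.3e}")
```

Under these assumptions, adding a single layer turns a count far beyond any physical brain or machine into roughly a thousand neurons.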

In a way, the success of neural network models with increased depth reflects the hierarchical aspects of the phenomena themselves. We end up mirroring nature more closely as we try to economize our models.

Abstraction as Unlearning

Abstraction is not hard for technical reasons. (On the contrary, abstract things are easier to manipulate due to their greater simplicity.) It is hard because it involves unlearning. (That is why people who are better at forgetting are also better at abstracting.)

Side Note: Originality of the generalist is artistic in nature and lies in the intuition of the right definitions. Originality of the specialist is technical in nature and lies in the invention of the right proof techniques.

Globally, unlearning can be viewed as the Herculean struggle to go back to the tabula rasa state of a beginner's mind. (In some sense, what takes a baby a few months to learn takes humanity hundreds of years to unlearn.) We discard one by one what has been useful in manipulating the world in favor of getting closer to the truth.

Here are some beautiful observations of a physicist about the cognitive development of his own child:

My 2-year old’s insight into quantum gravity. If relative realism is right then ‘physical reality’ is what we experience as a consequence of looking at the world in a certain way, probing deeper and deeper into more and more general theories of physics as we have done historically (arriving by now at two great theories, quantum and gravity) should be a matter of letting go of more and more assumptions about the physical world until we arrive at the most general theory possible. If so then we should also be able to study a single baby, born surely with very little by way of assumptions about physics, and see where and why each assumption is taken on. Although Piaget has itemized many key steps in child development, his analysis is surely not about the fundamental steps at the foundation of theoretical physics. Instead, I can only offer my own anecdotal observations.
Age 11 months: loves to empty a container, as soon as empty fills it, as soon as full empties it. This is the basic mechanism of waves (two competing urges out of phase leading to oscillation).
Age 12-17 months: puts something in drawer, closes it, opens it to see if it is still there. Does not assume it would still be there. This is a quantum way of thinking. It’s only after repeatedly finding it there that she eventually grows to accept classical logic as a useful shortcut (as it is in this situation).
Age 19 months: comes home every day with mother, waves up to dad cooking in the kitchen from the yard. One day dad is carrying her. Still points up to kitchen saying ‘daddy up there in the kitchen’. Dad says no, daddy is here. She says ‘another daddy’ and is quite content with that. Another occasion, her aunt Sarah sits in front of her and talks to her on my mobile. When asked, Juliette declares the person speaking to her ‘another auntie Sarah’. This means that at this age Juliette’s logic is still quantum logic in which someone can happily be in two places at the same time.
Age 15 months (until the present): completely unwilling to shortcut a lego construction by reusing a group of blocks, insists on taking the bits fully apart and then building from scratch. Likewise always insists to read a book from its very first page (including all the front matter). I see this as part of her taking a creative control over her world.
Age 20-22 months: very able to express herself in the third person ‘Juliette is holding a spoon’ but finds it very hard to learn about pronouns especially ‘I’. Masters ‘my’ first and but overuses it ‘my do it’. Takes a long time to master ‘I’ and ‘you’ correctly. This shows that an absolute coordinate-invariant world view is much more natural than a relative one based on coordinate system in which ‘I’ and ‘you’ change meaning depending on who is speaking. This is the key insight of General Relativity that coordinates depend on a coordinate system and carry no meaning of themselves, but they nevertheless refer to an absolute geometry independent of the coordinate system. Actually, once you get used to the absolute reference ‘Juliette is doing this, dad wants to do that etc’ it’s actually much more natural than the confusing ‘I’ and ‘you’ and as a parent I carried on using it far past the time that I needed to. In the same way it’s actually much easier to do and teach differential geometry in absolute coordinate-free terms than the way taught in most physics books.
Age 24 months: until this age she did not understand the concept of time. At least it was impossible to do a bargain with her like ‘if you do this now, we will go to the playground tomorrow’ (but you could bargain with something immediate). She understood ‘later’ as ‘now’.
Age 29 months: quite able to draw a minor squiggle on a bit of paper and say ‘look a face’ and then run with that in her game-play. In other words, very capable of abstract substitutions and accepting definitions as per pure mathematics. At the same time pedantic, does not accept metaphor (‘you are a lion’ elicits ‘no, I’m me’) but is fine with simile, ‘is like’, ‘is pretending to be’.
Age 31 months: understands letters and the concept of a word as a line of letters but sometimes refuses to read them from left to right, insisting on the other way. Also, for a time after one such occasion insisted on having her books read from last page back, turning back as the ‘next page’. I interpret this as her natural awareness of parity and her right to demand to do it her own way.
Age 33 months (current): Still totally blank on ‘why’ questions, does not understand this concept. ‘How’ and ‘what’ are no problem. Presumably this is because in childhood the focus is on building up a strong perception of reality, taking on assumptions without question and as quickly as possible, as it were drinking in the world.
... and just in the last few days: remarked ‘oh, going up’ for the deceleration at the end of going down in an elevator, ‘down and a little bit up’ as she explained. And pulling out of my parking spot insisted that ‘the other cars are going away’. Neither observation was prompted in any way. This tells me that relativity can be taught at preschool.
- Algebraic Approach to Quantum Gravity I: Relative Realism (S. Majid)

Abstraction for Survival

The idea, according to research in Psychology of Aesthetics, Creativity, and the Arts, is that thinking about the future encourages people to think more abstractly—presumably becoming more receptive to non-representational art.
- How to Choose Wisely (Tom Vanderbilt)

Why do some people (like me) get deeply attracted to abstract subjects (like Category Theory)?

One of the reasons could be related to the point made above. Abstract things have higher chances of survival and staying relevant because they are less likely to be affected by the changes unfolding through time. (Similarly, in the words of Morgan Housel, "the further back in history you look, the more general your takeaways should be.") Hence, if you have a hunger for timelessness or a worry about being outdated, then you will be naturally inclined to move up the abstraction chain. (No wonder I am also obsessed with the notion of time.)

Side Note: The more abstract the subject, the less willing the community around it is to let you attach your name to your new discoveries. Why? Because the half-life of discoveries at higher levels of abstraction is much longer and therefore your name will live on for a much longer period of time. (That is, it makes sense for the community to be prudent.) After being trained in mathematics for so many years, I was shocked to see how easily researchers in other fields could “arrogantly” attach their names to basic findings. Later I realized that this behavior was not out of arrogance. These fields were so far away from truth (i.e. operating at very low levels of abstraction) that the half-life of discoveries was very short. If you wanted to attach your name to a discovery, mathematics had a high-risk-high-return pay-off structure while these other fields had a low-risk-low-return structure.

But the higher you move up the abstraction chain, the harder it becomes for you to innovate usefully. There is less room to play around since the objects of study have far fewer properties. Most of the meaningful ideas have already been fleshed out by others who came before you.

In other words, in the realm of ideas, abstraction acts as a lever between probability of longevity and probability of success. If you aim for a higher probability of longevity, then you need to accept a lower probability of success.

That is why abstract subjects are unsuitable for university environments. The pressure of the "publish or perish" mentality pushes PhD students towards quick and riskless incremental research. Abstract subjects, on the other hand, require risky innovative research which may take a long time to unfold and may result in nothing publishable.

Now you may be wondering whether the discussion in the previous section is in conflict with the discussion here. How can abstraction be both a process of unlearning and a means for survival? Is not the evolutionary purpose of learning to increase the probability of survival? I would say that it all depends on your time horizon. To survive the immediate future, you need to learn how your local environment operates and truth is not your primary concern. But as your time horizon expands into infinity, what is useful and what is true become indistinguishable, as your environment shuffles through all allowed possibilities.