analogies vs metaphors

“The existence of analogies between central features of various theories implies the existence of a general abstract theory which underlies the particular theories and unifies them with respect to those central features.”
- Eliakim Hastings Moore

Conceptual similarities manifest themselves as analogies, where one recognizes that two structures X and Y have a common meaningful core, say A, which can be pulled up to a higher level. The resulting relationship is symmetric in the sense that the structure A specializes to both X and Y. In other words, one can say either “X is like Y via A” or “Y is like X via A”.

Analogy.png

The analogy gets codified in the more general structure A, which in turn is mapped back onto X and Y. (I say “onto” because A represents a bigger set than both X and Y.) Discovering A is revelatory in the sense that one recognizes that X and Y are special instances of a more general phenomenon, not disparate structures.

Metaphors play a role similar to that of analogies. They too increase our total understanding, but unlike analogies, they are not symmetric in nature.

Say there are two structures X and Y where X is more complex but also more familiar than Y. (In practice, X often happens to be an object we have an intuitive grasp of due to repeated daily interaction.) Discovering a metaphor, say M, involves finding a way of mapping X onto Y. (I say “onto” because X - via M - ends up subsuming Y inside its greater complexity.)

Metaphor.png

The explanatory effect comes from M pulling Y up to the familiar territory of X. All of a sudden, in an almost magical fashion, Y too starts to feel intuitive. Many paradigm shifts in the history of science were due to such discrete jumps. (e.g. Maxwell characterizing the electromagnetic field as a collection of wheels, pulleys and fluids.)

Notice that you want your analogy A to be as faithful as possible, capturing as many essential features of X and Y as it can. If you generalize too much, you will end up with a useless A with no substance. Similarly, for each given Y, you want your metaphor pair (X,M) to be as tight as possible, while not letting X stray away from the domain of the familiar.

You may be wondering what happens if we dualize our approaches in the above two schemes.

  • Analogies. Instead of trying to rise above the pair (X,Y), why not try to go below it? In other words, why not consider specializations that both X and Y map onto, rather than focus on generalizations that map onto X and Y?

  • Metaphors. Instead of trying to approach Y from above, why not try to approach it from below? In other words, why not consider metaphors that map the simple into the complex rather than focus on those that map the complex onto the simple?

The answer to both questions is the same: We do not, because the dual constructions do not require any ingenuity, and even if they turn out to be very fruitful, the outcomes do not illuminate the original inputs.

Let me expand on what I mean.

  • Analogies enhance our analytic understanding of the world of ideas. They are tools of the consciousness, which can not deal with the concrete (specialized) concepts head on. For instance, since it is insanely hard to study integers directly, we abstract and study more general concepts such as commutative rings instead. (Even then the challenge is huge. You could devote your whole life to ring theory and still die as confused as a beginner.)

    In the world of ideas, one can easily create more specialized concepts by taking conjunctions of various X’s and Y’s. Studying such concepts may turn out to be very fruitful indeed, but it does not further our understanding of the original X’s and Y’s. For instance, the study of Lie groups is exceptionally interesting, but it does not further our understanding of manifolds or groups.

  • Metaphors enhance our intuitive understanding of the world of things. They are tools of the unconsciousness, which is familiar with what is more immediate, and what is more immediate also happens to be what is more complex. Instruments allow us to probe what is remote from experience, namely the small and the big, and both turn out to be stranger but also simpler than the familiar stuff we encounter in our immediate daily lives.

    • What is smaller than us is simpler because it emerged earlier in the evolutionary history. (Compare atoms and cells to humans.)

    • What is bigger than us is simpler because it is an inanimate aggregate rather than an emergent life. (Those galaxies may be impressive, but their complexity pales in comparison to ours.)

    In the world of things, it is easy to come up with metaphors that map the simple into the complex. For instance, with every new technological paradigm shift, we go back to biology (whose complexity is way beyond anything else) and attack it with the brand new metaphor of the emerging Zeitgeist. During the industrial revolution we conceived of the brain as a hydraulic system, which in retrospect sounds extremely naive. Now, during the digital revolution, we are conceiving of it as - surprise, surprise - a computational system. These may be productive endeavors, but the discovery of the trigger metaphors itself is a no-brainer.

Now is a good time to make a few remarks on a perennial mystery, namely the mystery of why metaphors work at all.

It is easy to understand why analogies work since we start off with a pair of concepts (X,Y) and use it as a control while moving methodically upwards towards a general A. In the case of metaphors, however, we start off with a single object Y, and then look for a pair (X,M). Why should such a pair exist at all? I believe the answer lies in a combination of the following two quotes.

"We can so seldom declare what a thing is, except by saying it is something else."
- George Eliot

“Subtle is the Lord, but malicious He is not.”
- Albert Einstein

Remember, when Einstein characterized gravitation as curvature, he did not really tell us what gravity is. He just stated something unfamiliar in terms of something familiar. This is how all understanding works. Yes, science is progressing, but all we are doing is just making a bunch of restatements with no end in sight. Absolute truth is not accessible to us mere mortals.

“Truths are illusions which we have forgotten are illusions — they are metaphors that have become worn out and have been drained of sensuous force, coins which have lost their embossing and are now considered as metal and no longer as coins.”
- Friedrich Nietzsche

The reason we can come up with metaphors of any practical significance is that nature subtly keeps recycling the same types of patterns in different places and at different scales. This is what Einstein means when he says that the Lord is not malicious, and it is why nature is open to rational inquiry in the first place.

Unsurprisingly, Descartes himself, the founder of rationalism, was also a big believer in the universality of patterns.

Descartes followed this precept by liberal use of scaled-up models of microscopic physical events. He even used dripping wine vats, tennis balls, and walking-sticks to build up his model of how light undergoes refraction. His statement should perhaps also be taken as evidence of his belief in the universality of certain design principles in the machinery of Nature which he expects to reappear in different contexts. A world in which everything is novel would require the invention of a new science to study every phenomenon. It would possess no general laws of Nature; everything would be a law unto itself.

John D. Barrow - Universe That Discovered Itself (Page 107)

Of course, universality does not make it any easier to discover a great metaphor. It still requires a special talent and a trained mind to intuit one out of the vast number of possibilities.

Finding a good metaphor is still more of an art than a science. (Constructing a good analogy, on the other hand, is more of a science than an art.) Perhaps one day computers will be able to completely automate the search process. (Currently, as I pointed out in a previous blog post, they are horrible at the horizontal type of thinking required for spotting metaphors.) This will result in a disintermediation of mathematical models. In other words, computers will simply map reality back onto itself and push us out of the loop altogether.

Let us wrap up all the key observations we made so far in a single table:

analogies vs metaphors.png

Now let us take a brief detour into metaphysics before we have one last look at the above dichotomy.

Recall the epistemology-ontology duality:

  • An idea is said to be true when every body obeys it.

  • A thing is said to be real when every mind agrees to it.

This is a slightly different formulation of the good old mind-body duality.

  • Minds are bodies experienced from inside.

  • Bodies are minds experienced from outside.

While minds and bodies are dynamic entities evolving in time, true ideas and real things reside inside a static Platonic world.

  • Minds continuously shuffle through ideas, looking for the true ones, unable to hold onto any for a long time. Nevertheless truth always seems to be within reach, like a carrot dangling in front.

  • Minds desperately attach names to phenomena, seeking permanency within the constant flux. Whatever they refer to as a real thing eventually turns out to be unstable and ceases to be.

Hence, the dichotomy between true ideas and real things can be thought of as the (static) Being counterpart of the mind-body duality which resides in (dynamic) Becoming. In fact, it would not be inappropriate to call the totality of all true ideas the God-mind and the totality of all real things the God-body.

Anyway, enough metaphysics. Let us now go back to our original discussion.

In order to find a good metaphor, our minds scan through the X’s that we are already experientially familiar with. The hope is to be able to pump up our intuition about a thing through another thing. Analogies on the other hand help us probe the darkness and bring to light the previously unseen. Finding a good A is like pulling a rabbit out of a hat, pulling something that was out-of-experience into experience. The process looks as follows.

  1. First you encounter a pair of concepts (X,Y) in the shared public domain, literally composed of ink printed on paper or pixels lighting up on a screen.

  2. Your mind internalizes (X,Y) by turning it back into idea form, hopefully in the fashion that was intended by its originator mind.

  3. You generalize (X,Y) to A within the world of ideas through careful reasoning and aesthetic guidance.

  4. You share A with other minds by turning it into a thing, expressed in a certain language, on a certain medium. (An idea put in a communicable form is essentially a thing that can be experienced by all minds.)

  5. The end result is one more useful concept in the shared public domain.

Analogies lift the iceberg, so to speak, by bringing completely novel ideas into existence and revealing more of the God-mind. In fact, the entirety of our technology, including the technology of reasoning via analogies, can be viewed as a tool for accelerating the transformation of ideas into things. We, and other intermediary minds like us, are the means through which God is becoming more and more aware of itself.

Remember, as time progresses, the evolutionary entities (i.e. minds) decrease in number and increase in size and complexity. Eventually, they get

  • so good at modeling the environment that their ideas start to resemble more and more the true ideas of the God-mind, and

  • so good at controlling the environment that they become increasingly indistinguishable from it and the world of things starts to acquire a thoroughly mental character.

In the limit, when the revelation of the God-mind is complete, the number of minds finally dwindles down to one, and the One, now synonymous with the God-mind, dispenses with analogies or metaphors altogether.

  • As nothing seems special any more, the need to project the general onto the special ceases.

  • As nothing feels unfamiliar any more, the need to project the familiar onto the unfamiliar ceases.

Of course, this comes at the expense of time stopping altogether. Weird, right? My personal belief is that revelation will never reach actual completion. Life will hover over the freezing edge of permanency for as long as it can, and at some point, will shatter in such a spectacular fashion that it will have to begin from scratch all over again, just as it did last time around.

hypothesis vs data driven science

Science progresses in a dualistic fashion. You can either generate a new hypothesis out of existing data and conduct science in a data-driven way, or generate new data for an existing hypothesis and conduct science in a hypothesis-driven way. For instance, when Kepler was looking at the astronomical data sets to come up with his laws of planetary motion, he was doing data-driven science. When Einstein came up with his theory of General Relativity and asked experimenters to verify the theory’s prediction for the anomalous rate of precession of the perihelion of Mercury's orbit, he was doing hypothesis-driven science.

Similarly, technology can be problem-driven (the counterpart of “hypothesis-driven” in science) or tool-driven (the counterpart of “data-driven” in science). When you start with a problem, you look for what kind of (existing or not-yet-existing) tools you can throw at the problem, and in what kind of combination. (This is similar to thinking about what kind of experiments you can do to generate relevant data to support a hypothesis.) Conversely, when you start with a tool, you try to find a use case in which you can deploy it. (This is similar to starting off with a data set and digging around to see what kind of hypotheses you can extract out of it.) Tool-driven technology development is much more risky and stochastic. It is taboo for most technology companies, since investors do not like random tinkering and prefer funding problems with high potential economic value and entrepreneurs who “know” what they are doing.

Of course, new tools allow you to ask new kinds of questions of the existing data sets. Hence, problem-driven technology (by developing new tools) leads to more data-driven science. And this is exactly what is happening now, at a massive scale. With the development of cheap cloud computing (and storage) and deep learning algorithms, scientists are equipped with some very powerful tools to attack old data sets, especially in complex domains like biology.


Higher Levels of Serendipity

One great advantage of data-driven science is that it involves tinkering and “not really knowing what you are doing”. This leads to fewer biases and more serendipitous connections, and thereby to the discovery of more transformative ideas and hitherto unknown interesting patterns.

Hypothesis-driven science has a direction from the beginning. Hence surprises are hard to come by, unless you have exceptionally creative intuitive capabilities. For instance, the theory of General Relativity was based on one such intuitive leap by Einstein. (There has not been such a great leap since then. So it is extremely rare.) Quantum Mechanics on the other hand was literally forced by experimental data. It was so counterintuitive that people refused to believe it. All they could do was turn their intuition off and listen to the data.

Previously data sets were not huge, so scientists could literally eyeball them. Today this is no longer possible. That is why scientists now need computers, algorithms and statistical tools to help them decipher new patterns.

Governments do not give money to scientists so that they can tinker around and do whatever they want. So a scientist applying for a grant needs to know what he is doing. This forces everyone to be in a hypothesis-driven mode from the beginning and thereby leads to fewer transformative ideas in the long run. (Hat tip to Mehmet Toner for this point.)

Science and technology are polar opposite endeavors. Governments funding science the way investors fund technology is a major mistake, and also an important reason why today some of the most exciting science is being done inside closed private companies rather than open academic communities.


Less Democratic Landscape

There is another good reason why the best scientists are leaving academia. You need good quality data to do science within the data-driven paradigm, and since data is so easily monetizable, the largest data sets are being generated by private companies. So it is not surprising that the most cutting-edge research in fields like AI is being done inside companies like Google and Facebook, which also provide the necessary compute power to play around with these data sets.

While hypothesis generation gets better when it is conducted in a decentralized, open manner, the natural tendency of data is to be centralized under one roof where it can be harmonized and maintained consistently at a high quality. As they say, “data has gravity”. Once you pass certain critical thresholds, data starts generating strong positive feedback effects and thereby attracts even more data. That is why investors love it. Using smart data strategies, technology companies can build a moat around themselves and render their business models a lot more defensible.

In a typical private company, what data scientists do is throw thousands of different neural networks at massive internal data sets and simply observe which one gets the job done best. This of course is empiricism in its purest form, no different from blindly screening millions of compounds during a drug development process. As they say, just throw it against a wall and see if it sticks.
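To make the flavor of this blind screening concrete, here is a minimal sketch in Python, assuming scikit-learn and a synthetic data set as stand-ins for a company’s internal data and tooling: configurations are sampled at random, each model is trained without any guiding hypothesis, and whichever scores best on held-out data wins.

```python
import random

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

random.seed(0)

# Synthetic stand-in for a company's internal data set.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

best_score, best_config = 0.0, None
for trial in range(30):  # "thousands", in the caricature above
    # Sample a network blindly: no hypothesis about what should work.
    config = {
        "hidden_layer_sizes": tuple(random.choice([16, 64, 256])
                                    for _ in range(random.randint(1, 3))),
        "alpha": 10 ** random.uniform(-5, -1),  # L2 regularization strength
    }
    model = MLPClassifier(max_iter=300, random_state=trial, **config)
    model.fit(X_train, y_train)        # train it...
    score = model.score(X_val, y_val)  # ...and just observe whether it sticks
    if score > best_score:
        best_score, best_config = score, config

print(best_config, best_score)
```

Nothing in this loop cares why the winning configuration works; the validation score is the only arbiter, which is precisely the point being made above.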

This brings us to a major problem about big-data-driven science.


Lack of Deep Understanding

There is now a better way. Petabytes allow us to say: "Correlation is enough." We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.

Chris Anderson - The End of Theory

We can not understand the complex machine learning models we are building. In fact, we train them the same way one trains a dog. That is why they are called black-box models. For instance, when the stock market experiences a flash crash we blame the algorithms for getting into a stupid loop, but we never really understand why they do so.

Is there any problem with this state of affairs if these models get the job done, make good predictions and (even better) earn us money? Cannot scientists adopt the same pragmatic attitude as technologists, focus on results only, content themselves with the successful manipulation of nature, and leave true understanding aside? Are not the data sizes already too huge for human comprehension anyway? Why do we expect machines to be able to explain their thought processes to us? Perhaps they are the beginnings of a higher-level life form, and we should learn to trust them with the activities they are better at than we are?

Perhaps we have been under an illusion all along and our analytical models have never really penetrated that deeply into nature anyway?

Closed analytic solutions are nice, but they are applicable only for simple configurations of reality. At best, they are toy models of simple systems. Physicists have known for centuries that the three-body problem or three dimensional Navier Stokes do not afford a closed form analytic solutions. This is why all calculations about the movement of planets in our solar system or turbulence in a fluid are all performed by numerical methods using computers.

Carlos E. Perez - The Delusion of Infinite Precision Numbers

Is it a surprise that as our understanding gets more complete, our equations become harder to solve?

To illustrate this point of view, we can recall that as the equations of physics become more fundamental, they become more difficult to solve. Thus the two-body problem of gravity (that of the motion of a binary star) is simple in Newtonian theory, but unsolvable in an exact manner in Einstein’s Theory. One might imagine that if one day the equations of a totally unified field are written, even the one-body problem will no longer have an exact solution!

Laurent Nottale - The Relativity of All Things (Page 305)

It seems like the entire history of science is a progressive approximation to an immense computational complexity via increasingly sophisticated (but nevertheless quite simplistic) analytical models. This trend obviously is not sustainable. At some point we should perhaps just stop theorizing and let the machines figure out the rest:

In new research accepted for publication in Chaos, they showed that improved predictions of chaotic systems like the Kuramoto-Sivashinsky equation become possible by hybridizing the data-driven, machine-learning approach and traditional model-based prediction. Ott sees this as a more likely avenue for improving weather prediction and similar efforts, since we don’t always have complete high-resolution data or perfect physical models. “What we should do is use the good knowledge that we have where we have it,” he said, “and if we have ignorance we should use the machine learning to fill in the gaps where the ignorance resides.”

Natalie Wolchover - Machine Learning’s ‘Amazing’ Ability to Predict Chaos

Statistical approaches like machine learning have often been criticized for being dumb. Noam Chomsky has been especially vocal about this:

You can also collect butterflies and make many observations. If you like butterflies, that's fine; but such work must not be confounded with research, which is concerned to discover explanatory principles.

- Noam Chomsky as quoted in Colorless Green Ideas Learn Furiously

But these criticisms are akin to calling reality itself dumb, since what we feed into the statistical models are basically virtualized fragments of reality. Analytical models conjure up abstract epiphenomena to explain phenomena, while statistical models use phenomena to explain phenomena and turn reality directly onto itself. (The reason why deep learning is so much more effective than its peers among machine learning models is that it is hierarchical, just like reality is.)

This brings us to the old dichotomy between facts and theories.


Facts vs Theories

Long before the computer scientists came onto the scene, there were prominent humanists (and historians) fiercely defending fact against theory.

The ultimate goal would be to grasp that everything in the realm of fact is already theory... Let us not seek for something beyond the phenomena - they themselves are the theory.

- Johann Wolfgang von Goethe

Reality possesses a pyramid-like hierarchical structure. It is governed from the top by a few deep high-level laws, and manifested in its utmost complexity at the lowest phenomenological level. This means that there are two strategies you can employ to model phenomena.

  • Seek the simple. Blow your brains out, discover some deep laws and run simulations that can be mapped back to phenomena.

  • Bend the complexity back onto itself. Labor hard to accumulate enough phenomenological data and let the machines do the rote work.

One approach is not inherently superior to the other, and both are hard in their own ways. Deep theories are hard to find, and good quality facts (data) are hard to collect and curate in large quantities. Similarly, a theory-driven (mathematical) simulation is cheap to set up but expensive to run, while a data-driven (computational) simulation (of the same phenomena) is cheap to run but expensive to set up. In other words, while a data-driven simulation is parsimonious in time, a theory-driven simulation is parsimonious in space. (Good computational models satisfy a dual version of Occam’s Razor. They are heavy in size, with millions of parameters, but light to run.)
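A toy sketch of this trade-off, using a simple pendulum as the phenomenon (the pendulum, the step sizes, and the polynomial surrogate are my own illustrative choices, not anyone’s actual models): the theory-driven route is a few lines of equations that must be re-integrated for every query, while the data-driven route pays an up-front cost to fit a parameter-heavy surrogate that afterwards answers queries almost instantly.

```python
import numpy as np

def simulate(theta0, t_end=5.0, dt=1e-3, g=9.81, length=1.0):
    """Theory-driven: a tiny model (one ODE), but every query costs many integration steps."""
    theta, omega = theta0, 0.0
    for _ in range(int(t_end / dt)):
        omega -= (g / length) * np.sin(theta) * dt  # Euler step for a pendulum
        theta += omega * dt
    return theta

# Data-driven: spend effort up front building a parameter-heavy surrogate...
train_angles = np.linspace(-1.5, 1.5, 200)
train_outputs = np.array([simulate(a) for a in train_angles])  # expensive setup
surrogate = np.polyfit(train_angles, train_outputs, deg=9)     # heavy in parameters, in spirit

# ...which afterwards is almost free to evaluate.
query = 0.7
print("theory-driven:", simulate(query))               # slow per query, tiny description
print("data-driven  :", np.polyval(surrogate, query))  # fast per query, big description
```

The surrogate here has only ten coefficients rather than millions of parameters, but the asymmetry is the same: parsimony in space on one side, parsimony in time on the other.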

Some people try to mix the two philosophies, injecting our causal models into the machines to enjoy the best of both worlds. I believe that this approach is fundamentally mistaken, even if it proves to be fruitful in the short run. Rather than biasing the machines with our theories, we should just ask them to economize their own thought processes and thereby come up with their own internal causal models and theories. After all, abstraction is just a form of compression, and when we talk about causality we (in practice) mean causality as it fits into the human brain. In the actual universe, everything is completely interlinked with everything else, and causality diagrams are unfathomably complicated. Hence, we should be wary of pre-imposing our theories on machines whose intuitive powers will soon surpass ours.

Remember that, in biological evolution, the development of unconscious (intuitive) thought processes came before the development of conscious (rational) thought processes. It should be no different for the digital evolution.

Side Note: We suffered an AI winter for mistakenly trying to flip this order and asking machines to develop rational capabilities before developing intuitive capabilities. When a scientist comes up with a hypothesis, it is a simple, effable distillation of an unconscious intuition which is of an ineffable, complex statistical form. In other words, it is always “statistics first”. Sometimes the progression from the statistical to the causal takes place out in the open among a community of scientists (as happened in the smoking-causes-cancer research), but more often it just takes place inside the mind of a single scientist.


Continuing Role of the Scientist

Mohammed AlQuraishi, a researcher who studies protein folding, wrote an essay exploring a recent development in his field: the creation of a machine-learning model that can predict protein folds far more accurately than human researchers. AlQuraishi found himself lamenting the loss of theory over data, even as he sought to reconcile himself to it. “There’s far less prestige associated with conceptual papers or papers that provide some new analytical insight,” he said, in an interview. As machines make discovery faster, people may come to see theoreticians as extraneous, superfluous, and hopelessly behind the times. Knowledge about a particular area will be less treasured than expertise in the creation of machine-learning models that produce answers on that subject.

Jonathan Zittrain - The Hidden Costs of Automated Thinking

The role of scientists in the data-driven paradigm will obviously be different but not trivial. Today’s world-champions in chess are computer-human hybrids. We should expect the situation for science to be no different. AI is complementary to human intelligence and in some sense only amplifies the already existing IQ differences. After all, a machine-learning model is only as good as the intelligence of its creator.

He who loves practice without theory is like the sailor who boards ship without a rudder and compass and never knows where he may cast.

- Leonardo da Vinci

Artificial intelligence (at least in its current form) is like a baby. Either it can be spoon-fed data or it gorges on everything. But, as we know, what makes great minds great is what they choose not to consume. This is where the scientists come in.

Deciding which experiments to conduct and which data sets to use are no trivial tasks. Choosing which portion of reality to “virtualize” is an important judgment call. Hence all data efforts are inevitably hypothesis-laden and therefore non-trivially involve the scientist.

For 30 years quantitative investing started with a hypothesis, says a quant investor. Investors would test it against historical data and make a judgment as to whether it would continue to be useful. Now the order has been reversed. “We start with the data and look for a hypothesis,” he says.

Humans are not out of the picture entirely. Their role is to pick and choose which data to feed into the machine. “You have to tell the algorithm what data to look at,” says the same investor. “If you apply a machine-learning algorithm to too large a dataset often it tends to revert to a very simple strategy, like momentum.”

The Economist - March of the Machines

True, each data generation effort is hypothesis-laden and each scientist comes with a unique set of biases generating a unique set of judgment calls, but at the level of society, these biases eventually get washed out through (structured) randomization via sociological mechanisms and historical contingencies. In other words, unlike the individual, the society as a whole operates in a non-hypothesis-laden fashion, and eventually figures out the right angle. The role (and the responsibility) of the scientist (and the scientific institutions) is to cut the length of this search period as short as possible by simply being smart about it, in a fashion that is not too different from how enzymes speed up chemical reactions by lowering activation energy costs. (A scientist’s biases are actually his strengths since they implicitly contain lessons from eons of evolutionary learning. See the side note below.)

Side Note: There is this huge misunderstanding that evolution progresses via chance alone. Pure randomization is a sign of zero learning. Evolution on the other hand learns over time and embeds this knowledge in all complexity levels, ranging all the way from genetic to cultural forms. As the evolutionary entities become more complex, the search becomes smarter and the progress becomes faster. (This is how protein synthesis and folding happen incredibly fast within cells.) Only at the very beginning, in its simplest form, does evolution try out everything blindly. (Physics is so successful because its entities are so stupid and comparatively much easier to model.) In other words, the commonly raised argument against the possibility of evolution achieving so much based on pure chance alone is correct. As mathematician Gregory Chaitin points out, “real evolution is not at all ergodic, since the space of all possible designs is much too immense for exhaustive search”.

Another venue where scientists keep playing an important role is in transferring knowledge from one domain to another. Remember that there are two ways of solving hard problems: diving into the vertical (technical) depths and venturing across horizontal (analogical) spaces. Machines are horrible at venturing horizontally precisely because they do not get to the gist of things. (This was the criticism of Noam Chomsky quoted above.)

Deep learning is kind of a turbocharged version of memorization. If you can memorize all that you need to know, that’s fine. But if you need to generalize to unusual circumstances, it’s not very good. Our view is that a lot of the field is selling a single hammer as if everything around it is a nail. People are trying to take deep learning, which is a perfectly fine tool, and use it for everything, which is perfectly inappropriate.

- Gary Marcus as quoted in Warning of an AI Winter


Trends Come and Go

Generally speaking, there is always a greater appetite for digging deeper for data when there is a dearth of ideas. (Extraction becomes more expensive as you dig deeper, as in mining operations.) Hence, the current trend of data-driven science is partially due to the fact that scientists themselves have run out of sensible falsifiable hypotheses. Once the hypothesis space becomes rich again, the pendulum will inevitably swing back. (Of course, who will be doing the exploration is another question. Perhaps it will be the machines, and we will be doing the dirty work of data collection for them.)

As mentioned before, data-driven science operates stochastically in a serendipitous fashion and hypothesis-driven science operates deterministically in a directed fashion. Nature on the other hand loves to use both stochasticity and determinism together, since optimal dynamics reside - as usual - somewhere in the middle. (That is why there are tons of natural examples of structured randomness such as Levy Flights.) Hence we should learn to appreciate the complementarity between data-drivenness and hypothesis-drivenness, and embrace the duality as a whole rather than trying to break it.


If you liked this post, you will also enjoy the older post Genius vs Wisdom where genius and wisdom are framed respectively as hypothesis-driven and data-driven concepts.

physics as study of ignorance

Contemporary physics is based on the following three main sets of principles:

  1. Variational Principles

  2. Statistical Principles

  3. Symmetry Principles

Various combinations of these principles led to the birth of the following fields:

  • Study of Classical Mechanics (1)

  • Study of Statistical Mechanics (2)

  • Study of Group and Representation Theory (3)

  • Study of Path Integrals (1 + 2)

  • Study of Gauge Theory (1 + 3)

  • Study of Critical Phenomena (2 + 3)

  • Study of Quantum Field Theory (1 + 2 + 3)

Notice that all three sets of principles are based on ignorances that arise from us being inside the structure we are trying to describe. 

  1. Variational Principles arise due to our inability to experience time as a continuum. (Path information is inaccessible.)

  2. Statistical Principles arise due to our inability to experience space as a continuum. (Coarse graining is inevitable.)

  3. Symmetry Principles arise due to our inability to experience spacetime as a whole.  (Transformations are undetectable.)

Since Quantum Field Theory is based on all three principles, it seems like most of the structure we see arises from these sets of ignorances themselves. From the hypothetical outside point of view of God, none of these ignorances are present and therefore none of the entailed structures are present either.

The study of physics is not complete yet, but its historical progression suggests that its future depends on us discovering new aspects of our ignorances:

  1. Variational Principles were discovered in the 18th Century.

  2. Statistical Principles were discovered in the 19th Century.

  3. Symmetry Principles were discovered in the 20th Century.

The million dollar question is what principle we will discover in the 21st Century. Will it help us merge General Relativity with Quantum Field Theory or simply lead to the birth of brand new fields of study?

emergence of life

Cardiac rhythm is a good example of a network that includes DNA only as a source of protein templates, not as an integral part of the oscillation network. If proteins were not degraded and needing replenishment, the oscillation could continue indefinitely with no involvement of DNA...

Functional networks can therefore float free, as it were, of their DNA databases. Those databases are then used to replenish the set of proteins as they become degraded. That raises several more important questions. Which evolved first: the networks or the genomes? As we have seen, attractors, including oscillators, form naturally within networks of interacting components, even if these networks start off relatively uniform and unstructured. There is no DNA, or any equivalent, for a spiral galaxy or for a tornado. It is very likely, therefore, that networks of some kinds evolved first. They could have done so even before the evolution of DNA. Those networks could have existed by using RNA as the catalysts. Many people think there was an RNA world before the DNA-protein world. And before that? No one knows, but perhaps the first networks were without catalysts and so very slow. Catalysts speed-up reactions. They are not essential for the reaction to occur. Without catalysts, however, the processes would occur extremely slowly. It seems likely that the earliest forms of life did have very slow networks, and also likely that the earliest catalysts would have been in the rocks of the Earth. Some of the elements of those rocks are now to be found as metal atoms (trace elements) forming important parts of modern enzymes.

Noble - Dance to the Tune of Life (Pages 83, 86)

Darwin unlocked evolution by understanding its slow nature. (He was inspired by the recent geological discoveries indicating that water - given enough time - can carve out entire canyons.) Today we are still under the influence of a similar Pre-Darwinian bias. Just as we were biased in favor of fast changes (and could not see the slow moving waves of evolution), we are biased in favor of fast entities. (Of course, what is fast or slow is defined with respect to the rate of our own metabolisms.) For instance, we get surprised when we see a fast-forwarded video of growing plants, because we equate life with motion and regard slow moving life forms as inferior.

Evolution favors the fast and therefore life is becoming increasingly faster at an increasingly faster rate. Think of catalyzed reactions, myelinated neurons, etc. Replication is another such accelerator technology. Although we tend to view it as a must-have quality of life, what is really important for the definition of life is repeating “patterns”, and such patterns can emerge without any replication mechanisms. In other words, what matters is persistence. Replication mechanisms speed up the evolution of new forms of persistence. That is all. Let me reiterate: evolution has only two ingredients, constant variation and constant selection. (See the Evolution as a Physical Theory post.) Replication is not fundamental.
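To make the “persistence without replication” point more tangible, here is a minimal toy sketch (the numbers, the viability band, and the noise level are arbitrary assumptions of mine): a fixed set of patterns is perturbed at every step (constant variation), patterns that drift out of a viability band simply cease to persist (constant selection), and fresh ones form spontaneously from the background. Nothing is ever copied, yet the ensemble concentrates on the persistent forms over time.

```python
import random

random.seed(0)

def fraction_persistent(patterns, band=2.0):
    return sum(abs(p) < band for p in patterns) / len(patterns)

# A population of "patterns" (just numbers here); nothing ever gets copied.
population = [random.uniform(-10, 10) for _ in range(1000)]
print("persistent at start:", fraction_persistent(population))  # ~0.2

for _ in range(200):
    new_population = []
    for p in population:
        p += random.gauss(0, 0.5)                            # constant variation
        if abs(p) < 2.0:                                     # constant selection:
            new_population.append(p)                         #   persistent patterns remain,
        else:
            new_population.append(random.uniform(-10, 10))   #   others dissolve; fresh ones form
    population = new_population

print("persistent at end  :", fraction_persistent(population))  # noticeably higher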

Unfortunately most people still think that replicators came first and led to the emergence of functional (metabolic) networks later, although this order is extremely unlikely since replicators have an error-correction problem and need supportive taming mechanisms (e.g. metabolic networks) right from the start.

In our present state of ignorance, we have a choice between two contrasting images to represent our view of the possible structure of a creature newly emerged at the first threshold of life. One image is the replicator model of Eigen, a molecular structure tightly linked and centrally controlled, replicating itself with considerable precision, achieving homeostasis by strict adherence to a rigid pattern. The other image is the "tangled bank" of Darwin, an image which Darwin put at the end of his Origin of Species to make vivid his answer to the question, What is Life?, an image of grasses and flowers and bees and butterflies growing in tangled profusion without any discernible pattern, achieving homeostasis by means of a web of interdependences too complicated for us to unravel.

The tangled bank is the image which I have in mind when I try to imagine what a primeval cell would look like. I imagine a collection of molecular species, tangled and interlocking like the plants and insects in Darwin's microcosm. This was the image which led me to think of error tolerance as the primary requirement for a model of a molecular population taking its first faltering steps toward life. Error tolerance is the hallmark of natural ecological communities, of free market economies and of open societies. I believe it must have been a primary quality of life from the very beginning. But replication and error tolerance are naturally antagonistic principles. That is why I like to exclude replication from the beginnings of life, to imagine the first cells as error-tolerant tangles of non-replicating molecules, and to introduce replication as an alien parasitic intrusion at a later stage. Only after the alien intruder has been tamed, the reconciliation between replication and error tolerance is achieved in a higher synthesis, through the evolution of the genetic code and the modern genetic apparatus.

The modern synthesis reconciles replication with error tolerance by establishing the division of labor between hardware and software, between the genetic apparatus and the gene. In the modern cell, the hardware of the genetic apparatus is rigidly controlled and error-intolerant. The hardware must be error-intolerant in order to maintain the accuracy of replication. But the error tolerance which I like to believe inherent in life from its earliest beginnings has not been lost. The burden of error tolerance has merely been transferred to the software. In the modern cell, with the infrastructure of hardware firmly in place and subject to a strict regime of quality control, the software is free to wander, to make mistakes and occasionally to be creative. The transfer of architectural design from hardware to software allowed the molecular architects to work with a freedom and creativity which their ancestors before the transfer could never have approached.

Dyson - Infinite in All Directions (Pages 92-93)

Notice how Dyson frames replication mechanisms as stabilizers allowing metabolic networks to take even further risks. In other words, replication not only speeds up evolution but also enlarges the configuration space for it. So we see not only more variation per second but also more variation at any given time.

Going back to our original question…

Life was probably unimaginably slow at the beginning. In fact, such life forms are probably still out there. Are spiral galaxies alive for instance? What about the entire universe? We may be just too local and too fast to see the grand patterns.

As Noble points out in the excerpt above, our bodies contain catalyst metals which are remnants of our deep past. Those metals were forged inside stars far away from us and shot across space via supernova explosions. (This is how all heavy atoms in the universe were formed.) In other words, they used to be participants in vast-scale metabolic networks.

In some sense, life never emerged. It was always there to begin with. It is just speeding up over time, and thereby the life forms of today are becoming blind to the life forms of deep yesterdays.

It is really hard not to be mystical about all this. Have you ever felt bad about disrupting repeating patterns for instance, no matter how physical they are? You can literally hurt such patterns. They are the most embryonic forms of life, some of which are as old as those archaic animals who still hang around in the deep oceans. Perhaps we should all work a little on our artistic sensitivities which would in turn probably give rise to a general increase in our moral sensitivities.


How Fast Will Things Get?

Life is a nested hierarchy of complexity layers and the number of these layers increases over time. We are already forming many layers above ourselves, the most dramatic of which is the entirety of our technological creations, namely what Kevin Kelly calls the Technium.

Without doubt, we will look pathetically slow to the newly emerging electronic forms of life. Just as we have a certain degree of control over slow-moving plants, they too will (need us but also) harvest us for their own good. (This is already happening as we are becoming more and more glued to our screens.)

But how much faster will things eventually get?

According to the generally accepted theories, our universe started off with a big bang and went through a very fast evolution that resulted in a sudden expansion of space. While physics has since been slowing down, biology (including new electronic forms) is picking up speed at a phenomenal rate.

Of all the sustainable things in the universe, from a planet to a star, from a daisy to an automobile, from a brain to an eye, the thing that is able to conduct the highest density of power - the most energy flowing through a gram of matter each second - lies at the core of your laptop.

Kelly - What Technology Wants (Page 59)

Evolution seems to be taking us toward a very strange end, an end that seems to contain life forms exhibiting features very much like those of the beginning states of physics: extreme speed and density. (I had brought up this possibility at the end of the Evolution as a Physical Theory post as well.)

Of course, flipping this logic, the physical background upon which life is currently unfolding is probably alive as well. I personally believe that this indeed is the case. To understand what I mean, we will first need to make an important conceptual clarification and then dive into Quantum Mechanics.



Autonomy as the Flip-Side of Control

Autonomy and control are two sides of the same coin, just like one man's freedom fighter is always another man's terrorist. In particular, what we can not exert any control over looks completely autonomous to us.

But how do you measure autonomy?

Firstly, notice that autonomy is a relative concept. In other words, nothing can be autonomous in and of itself. Secondly, the degree of autonomy correlates with the degree of unanticipatability. For instance, something will look completely autonomous to you only if you can not model its behavior at all. But what would such behavior literally look like? Any guesses? Yes, that is right: it would look completely random.

Random often means inability to predict... A random series should show no discernible pattern, and if one is perceived then the random nature of the series is denied. However, the inability to discern a pattern is no guarantee of true randomness, but only a limitation of the ability to see a pattern... A series of ones and noughts may appear quite random for use as a sequence against which to compare the tossing of a coin, head equals one, tails nought, but it also might be the binary code version of a well known song and therefore perfectly predictable and full of pattern to someone familiar with binary notation.

Shallis - On Time (Pages 122-124)

The fact that randomness is in the eye of the beholder (and that absolute randomness is an ill-defined notion) is the central tenet of the Bayesian school of probability. The spirit is also similar to how randomness is defined in algorithmic complexity theory, which I do not find surprising at all since computer scientists are empiricists at heart.

Kolmogorov randomness defines a string (usually of bits) as being random if and only if it is shorter than any computer program that can produce that string. To make this precise, a universal computer (or universal Turing machine) must be specified, so that "program" means a program for this universal machine. A random string in this sense is "incompressible" in that it is impossible to "compress" the string into a program whose length is shorter than the length of the string itself. A counting argument is used to show that, for any universal computer, there is at least one algorithmically random string of each length. Whether any particular string is random, however, depends on the specific universal computer that is chosen.

Wikipedia - Kolmogorov Complexity

Here a completely different terminology is used to say basically the same thing:

  • “compressibility” = “explanability” = “anticipatability”

  • “randomness can only be defined relative to a specific choice of a universal computer” = “randomness is in the eye of the beholder”
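This equivalence can be played with directly. In the sketch below, zlib acts as a crude stand-in for the “specific universal computer” and compressed length as a stand-in for Kolmogorov complexity (which is itself uncomputable, so this is only an illustration of the idea): bytes from a cryptographic random source barely compress, while a fully patterned sequence, Shallis’s “well known song”, compresses dramatically.

```python
import os
import zlib

def compressed_ratio(data: bytes) -> float:
    """Crude proxy for Kolmogorov complexity: the shorter the description, the less random."""
    return len(zlib.compress(data)) / len(data)

coin_flips = os.urandom(4096)                     # patternless to any compressor we own
hidden_song = bytes(i % 16 for i in range(4096))  # fully patterned, once you know the rule

print("coin flips :", compressed_ratio(coin_flips))   # ~1.0: incompressible, looks "random"
print("hidden song:", compressed_ratio(hidden_song))  # << 1: full of pattern after all
```

Of course a better compressor (a beholder with sharper eyes) could find pattern where zlib sees none, which is exactly the point: the verdict is always relative to the machine doing the looking.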



Quantum Autonomy

Quantum Mechanics has randomness built into its very foundations. Whether this randomness is absolute or the theory itself is currently incomplete is not relevant. There is a maximal degree of unanticipatability (i.e. autonomy) in Quantum Mechanics and it is practically uncircumventable. (Even the most deterministic interpretations of Quantum Mechanics fall back on artificially introduced stochastic background fields.)

Individually, quantum collapses are completely unpredictable, but collectively they exhibit a pattern over time. (For more on such structured forms of randomness, read this older blog post.) This is actually what allows us to tame the autonomy of quantum states in practice: although we can not exert any control over them at any point in time, we can control their behavior over a period of time. Of course, as life evolves and gets faster (as pointed out in the beginning of this post), it will be able to probe time periods at ever more frequent rates and thereby tighten its grip on quantum phenomena.
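A small illustration of “individually unpredictable, collectively patterned” (the amplitude below is an arbitrary example value, and this is of course a classical simulation of the statistics, not of any mechanism): no single simulated collapse can be called in advance, yet the long-run frequency settles onto the stable value given by the Born rule, which is exactly the handle we use to tame them in practice.

```python
import random

random.seed(1)
amplitude_up = 0.6
p_up = amplitude_up ** 2  # Born rule: probability = |amplitude|^2 = 0.36

collapses = [random.random() < p_up for _ in range(100_000)]    # each one unpredictable
print("first ten collapses:", collapses[:10])                   # no usable pattern here
print("long-run frequency :", sum(collapses) / len(collapses))  # ~0.36: stable collective pattern
```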

Another way to view maximal unanticipatability is to frame it as maximal complexity. Remember that every new complexity layer emerges through a complexification process. Once a functional network with a boundary becomes complex enough, it starts to behave more like an “actor” with teleological tendencies. Once it becomes ubiquitous enough, it starts to display an ensemble-behavior of its own, forming a higher layer of complexity and hiding away its own internal complexities. All fundamentally unanticipatable phenomena in nature are instances of such actors who seem to have a sense of unity (a form of consciousness?) that they “want” to preserve.

Why should quantum phenomena be an exception? Perhaps Einstein was right and God does not play dice, and that there are experimentally inaccessible deeper levels of reality from which quantum phenomena emerge? (Bohm was also thinking this way.) Perhaps it is turtles all the way down (and up)?

Universe as a Collection of Nested Autonomies

Fighting for power is the same thing as fighting for control, and gaining control of something necessitates outgrowing the complexity of that thing. That is essentially why life is becoming more complex and autonomous over time.

Although each complexity layer can accommodate a similar level of maximal complexity within itself before starting to spontaneously form a new layer above itself, due to the nested nature of these layers, total complexity rises as new layers emerge. (e.g. We are more complex than our cells since we contain their complexity as well.)

It is not surprising that social sciences are much less successful than natural sciences. Humans are not that great at modeling other humans. This is expected. You need to out-compete in complexity what you desire to anticipate. Each layer can hope to anticipate only the layers below it. Brains are not complex enough to understand themselves. (It is amazing how we equate smartness with the ability to reason about lower layers like physics, chemistry etc. Social reasoning is actually much more sophisticated, but we look down on it since we are naturally endowed with it.)

Side Note: Generally speaking, each layer can have generative effects only upwards and restrictive effects only downwards. Generative effects can be bad for you as in having cancer cells and restrictive effects can be good for you as in having a great boss. Generative effects may falsely look restrictive in the sense that what generates you locks you in form, but it is actually these effects themselves which enable the exploration of the form space in the first place. Think at a population level, not at an individual level. Truth resides there.

Notice that as you move up to higher levels, autonomy becomes harder to describe. Quantum Mechanics, which currently seems to be the lowest level of autonomy, is open to mathematical scrutiny, but higher levels can only be simulated via computational methods and are not analytically accessible.

I know, you want to ask “What about General Relativity? It describes higher level phenomena.” My answer to that would be “No, it does not.”

General Relativity does not model a higher level complexity. It may be very useful today but it will become increasingly irrelevant as life dominates the universe. As autonomy levels increase all over, trying to predict galactic dynamics with General Relativity will be as funny and futile as using Fluid Dynamics to predict the future carbon dioxide levels in the atmosphere without taking into consideration the role of human beings. General Relativity models the aggregate dynamics of quantum “decisions” made at the lowest autonomy level. (We refer to this level-zero as “physics”.) It is predictive as long as higher autonomy levels do not interfere.

God as the Highest Level of Autonomy

The universe shows evidence of the operations of mind on three levels. The first level is elementary physical processes, as we see them when we study atoms in the laboratory. The second level is our direct human experience of our own consciousness. The third level is the universe as a whole. Atoms in the laboratory are weird stuff, behaving like active agents rather than inert substances. They make unpredictable choices between alternative possibilities according to the laws of quantum mechanics. It appears that mind, as manifested by the capacity to make choices, is to some extent inherent in every atom. The universe as a whole is also weird, with laws of nature that make it hospitable to the growth of mind. I do not make any clear distinction between mind and God. God is what mind becomes when it has passed beyond the scale of our comprehension. God may be either a world-soul or a collection of world-souls. So I am thinking that atoms and humans and God may have minds that differ in degree but not in kind. We stand, in a manner of speaking, midway between the unpredictability of atoms and the unpredictability of God. Atoms are small pieces of our mental apparatus, and we are small pieces of God's mental apparatus. Our minds may receive inputs equally from atoms and from God.

Freeman Dyson - Progress in Religion

I remember the moment when I ran into this exhilarating paragraph of Dyson’s. It was such a relief to find a high-caliber thinker who also interprets quantum randomness as choice-making. Nevertheless, with all due respect, I would like to clarify two points that I hope will help you understand Dyson’s own personal theology from the point of view of the philosophy outlined in this post.

  • There are many many levels of autonomies. Dyson points out only the most obvious three. (He calls them “minds” rather than autonomies.)

    • Atomic. Quantum autonomy is extremely pure and in your face.

    • Human. A belief in our own autonomy comes almost by default.

    • Cosmic. Universe as a whole feels beyond our understanding.

  • Dyson defines God as “what mind becomes when it has passed beyond the scale of our comprehension” and then he refers to the entirety of the universe as God as well. I on the other hand would have defined God as the top level autonomy and not referred to human beings or the universe at all, for the following two reasons:

    • God should not be human centric. Each level should be able to talk about its own God. (There are many things out there that would count you as part of their God.)

      • Remember that the levels below you can exert only generative effects on you. It is only the levels above that can restrict you. In other words, God is what constrains you. Hence, striving for freedom is equivalent to striving for Godlessness. (It is no surprise that people turn more religious when they are physically weak or mentally susceptible.) Of course, complete freedom is an unachievable fantasy. What makes humans human is the nurturing (i.e. controlling) cultural texture they are born into. In fact, human babies can not even survive without a minimal degree of parental and cultural intervention. (Next time you look into your parents’ eyes, remember that part of your God resides in there.) Of course, we also have a certain degree of freedom in choosing what to be governed by. (Some let money govern them for instance.) At the end of the day, God is a social phenomenon. Every single higher level structure we create (e.g. governments selected by our votes, algorithms trained on our data) governs us back. Even the ideas and feelings we restrict ourselves by arise via our interactions with others and do not exist in a vacuum.

    • Most of the universe currently seems to exhibit only the lowest level of autonomy. Not everywhere is equally alive.

      • However, as autonomy reaches higher levels, it will expand in size as well, due to the nested and expansionary nature of complexity generation. (Atomic autonomy lacks extensiveness in the most extreme sense.) So eventually the top level autonomy should grow in size and seize the whole of reality. What happens then? How can such an unfathomable entity exercise control over the entire universe, including itself? Is not auto-control paradoxical in the sense that one can not out-compete in complexity oneself? We should not expect to be able to answer such tough questions, just like we do not expect a stomach cell to understand human consciousness. Higher forms of life will be wildly different and smarter than us. (For instance, I bet that they will be able to manipulate the spacetime fabric which seems to be an emergent phenomenon.) In some sense, it is not surprising that there is such a proliferation of religions. God is meant to be beyond our comprehension.

Four men, who had been blind from birth, wanted to know what an elephant was like; so they asked an elephant-driver for information. He led them to an elephant, and invited them to examine it; so one man felt the elephant's leg, another its trunk, another its tail and the fourth its ear. Then they attempted to describe the elephant to one another. The first man said "The elephant is like a tree". "No," said the second, "the elephant is like a snake". "Nonsense!" said the third, "the elephant is like a broom". "You are all wrong," said the fourth, "the elephant is like a fan". And so they went on arguing amongst themselves, while the elephant stood watching them quietly.

- The Indian folklore story of the blind men and the elephant, as adapted from E. J. Robinson’s Tales and Poems of South India by P. T. Johnstone in the Preface of Sketches of an Elephant

genius vs wisdom

Genius maxes out upon birth and gradually diminishes. Wisdom displays the opposite dynamics. It is nonexistent at birth and gradually builds up until death. That is why genius is often seen as a potentiality and wisdom as an actuality. (The young have potentiality, not the old.)

Midlife crises tend to occur around the time when wisdom surpasses genius. That is why earlier maturation correlates with an earlier “mid” life crisis. (On the other hand, greater innate genius does not result in a delayed crisis since it entails faster accumulation of wisdom.)


"Every child is an artist. The problem is how to remain an artist once we grow up."
- Pablo Picasso

Here Picasso is actually asking you to maintain your genius at the cost of accumulating less wisdom. That is why creative folks tend to be quite unwise (and require the assistance of experienced talent managers to succeed in the real world). They methodically wrap themselves inside protective environments that allow them to pause or postpone their maturation.

Generally speaking, the greater control you have over your environment, the less wisdom you need to survive. That is why the wisest people originate from tough, low-survival-rate conditions, and rich families have a hard time raising unspoiled kids without simulating artificial scarcities. (Poor folks have the opposite problem and therefore simulate artificial abundances by displaying more love, empathy etc.)


"Young man knows the rules and the old man knows the exceptions."
- Oliver Wendell Holmes Sr.

Genius is hypothesis-driven and wisdom is data-driven. That is why mature people tend to prefer experimental (and historical) disciplines, while young people tend to dominate theoretical (and ahistorical) disciplines.

The old man can be rigid but he can also display tremendous cognitive fluidity because he can transcend the rules, improvise and dance around the set of exceptions. In fact, he no longer thinks of the exceptions as "exceptions" since an exception can only be defined with respect to a certain collection of rules. He directly intuits them as unique data points and thus is not subject to the false positives generated by operational definitions. (The young man on the other hand has not explored the full territory of possibilities yet and thus needs a practical guide no matter how crude.)

Notice that the old man can not transfer his knowledge of exceptions to the young man because that knowledge is in the form of an ineffable complex neural network that has been trained on tons of data. (Apprentice-master relationships are based on mimetic learning.) Rules on the other hand are much more transferable since they are of linguistic nature. (They are not only transferable but also a lot more compact in size, compared to the set of exceptions.) Of course, the fact that rules are transferable does not mean that the transfers actually occur! (Trivial things are deemed unworthy by the old man and important things get ignored by the young man. It is only the stuff in the middle that gets successfully transferred.)

Why is it much harder for old people to change their minds? Because wisdom is data-driven, and in a data-driven world, bugs (and biases) are buried inside large data sets and therefore much harder to find and fix. (In a hypothesis driven world, all you need to do is to go through the much shorter list of rules, hypotheses etc.)


The Hypothesis-Data duality highlighted in the previous section can be recast as young people being driven more by rational thinking vs. old people being driven more by intuitional thinking. (In an older blog post, we had discussed how education should focus on cultivating intuition, which leads to a superior form of thinking.)

We all start out life with a purely intuitive mindset. As we learn we come up with certain heuristics and rules, resulting in an adulthood that is dominated by rationality. Once we accumulate enough experience (i.e. data), we get rid of these rules and revert back to an intuitive mindset, although at a higher level than before. (That is why the old get along very well with kids.)

Artistic types (e.g. Picasso) tend to associate genius with the tabula-rasa intuitive fluidity of the newborn. Scientific types tend to associate it with the rationalistic peak of adulthood. (That is why they start to display insecurities after they themselves pass through this peak.)

As mentioned in the previous section, rules are easily transferable across individuals. Results of intuitive thinking on the other hand are non-transferable. From a societal point of view, this is a serious operational problem and the way it is overcome is through a mechanism called “trust”. Since intuition is a black box (like all machine learning models are), the only way you can transfer it is through a wholesale imitation of the observed input-outputs. (i.e. mimetic learning) In other words, you can not understand black box models, you can only have faith in them.

As we age and become more intuition-driven, our trust in trust increases. (Of course, children are dangerously trusting to begin with.) Adulthood on the other hand is dominated by rational thinking and therefore corresponds to the period when we are most distrustful of each other. (No wonder why economists are such distrustful folks. They always model humans as ultra-rationalistic machines.)

Today we vastly overvalue the individual over the society, and the rational over the intuitional. (Just look at how we structure school curriculums.) We decentralized society and trivialized the social fabric by centralizing trust. (Read the older blogpost Blockchain and Decentralization) We no longer trust each other because we simply do not have to. Instead we trust the institutions that we collectively created. Our analytical frameworks have reached an individualist zenith in Physics, which is currently incapable of guaranteeing the reality of other peoples’ points of view. (Read the older blogpost Reality and Analytical Inquiry) We banished faith completely from public discourse and have even demanded that God be verifiable.

In short, we seem to be heading to the peak adulthood phase of humanity, facing a massive mid-life crisis. Our collective genius has become too great for our own good.

In this context, the current rise of data-driven technological paradigms is not surprising. Humanity is entering a new intuitive post-midlife-crisis phase. Our collective wisdom is now being encoded in the form of disembodied black-box machine-learning models which will keep getting more and more sophisticated over time. (At some point, we may dispense with our analytical models altogether.) Social fabric on the other hand will keep being stretched as more types of universally-trusted centralized nodes emerge and enable new forms of indirect intuition transfer.

Marx was too early. He viewed socialism in a human way as a rationalistic inevitability, but it will probably arrive in an inhuman fashion via intuitionistic technologies. (Still calling such a system socialism would be vastly ironic, since it will rest on a complete absence of trust among individuals.) Of course, not all decision making will be centralized. Remember that the human mind itself emerged for addressing non-local problems. (There is still a lot of local decision making going on within our cells etc.) The “hive” mind will be no different, and as usual, whether a problem in the gray zone is local or non-local will be determined through a tug-of-war.

The central problem of ruler-ship, as Scott sees it, is what he calls legibility. To extract resources from a population the state must be able to understand that population. The state needs to make the people and things it rules legible to agents of the government. Legibility means uniformity. States dream up uniform weights and measures, impress national languages and ID numbers on their people, and divvy the country up into land plots and administrative districts, all to make the realm legible to the powers that be. The problem is that not all important things can be made legible. Much of what makes a society successful is knowledge of the tacit sort: rarely articulated, messy, and from the outside looking in, purposeless. These are the first things lost in the quest for legibility. Traditions, small cultural differences, odd and distinctive lifeways … are all swept aside by a rationalizing state that preserves (or in many cases, imposes) only what can be understood and manipulated from the 2,000 foot view. The result, as Scott chronicles with example after example, are many of the greatest catastrophes of human history.

Tanner Greer - Tradition is Smarter Than You

states vs processes

We think of all dynamical situations as consisting of a space of states and a set of laws codifying how these states are woven across time, and refer to the actual manifestation of these laws as processes.

Of course, one can argue whether it is sensible to split reality into states and processes, but so far it has been very fruitful to do so.
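To fix ideas, here is a minimal way of making the split precise (a toy picture assuming discrete time, purely for illustration): a dynamical situation is a pair consisting of a state space $S$ and a law $f : S \to S$, and a process is the actual trajectory obtained by repeatedly applying the law to an initial state:

$$ s_0 \;\mapsto\; f(s_0) \;\mapsto\; f^2(s_0) \;\mapsto\; \cdots $$

Everything below can be read against this toy picture: invariances, cycles and orbitals are all statements about how such trajectories behave.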


1. Interchangeability

1.1. Simplicity as Interchangeability of States and Processes

In mathematics, structures (i.e. persisting states) tend to be exactly what is preserved by transformations (i.e. processes). That is why Category Theory works, why you can study processes in lieu of states without losing information. (Think of continuous maps vs topological spaces.) State-centric and process-centric perspectives each have their own practical benefits, but they are completely interchangeable in the sense that both Set Theory (the state-centric perspective) and Category Theory (the process-centric perspective) can be taken as the foundation of all of mathematics.
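As a tiny illustration of this interchangeability, take the state-centric statement "X and Y have the same elements up to relabeling". It can be re-expressed purely in terms of processes, with no mention of elements at all: there exist maps

$$ f : X \to Y, \qquad g : Y \to X, \qquad g \circ f = \mathrm{id}_X, \qquad f \circ g = \mathrm{id}_Y . $$

In the category of sets, such a pair of maps exists exactly when there is a bijection between X and Y; the maps alone carry all the information.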

Physics is similar to mathematics. Studying laws is basically the same thing as studying properties. Properties are whatever is preserved by laws and can also be seen as whatever gives rise to laws. (Think of electric charge vs electrodynamics.) This observation may sound deep, but (as with any deep observation) it is actually tautologous, since we can study only what does not change through time, and only what does not change through time allows us to study time itself. (The study of time is equivalent to the study of laws.)
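The electric charge example can be made concrete. Maxwell's equations (the laws) force the continuity equation, which is exactly the statement that charge (the property) is preserved:

$$ \frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{j} = 0 . $$

Read in the other direction, Noether's theorem ties this conserved charge back to a symmetry of the laws themselves, so one can pass from the property to the law and back.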

A couple of side notes:

  • There are no intrinsic (as opposed to extrinsic) properties in physics since physics is an experimental subject and all experiments involve an interaction. (Even mass is an extrinsic property, manifesting itself only dynamically.) Now here is the question that gets to the heart of the above discussion: If there exist only extrinsic properties and nothing else, then what holds these properties? Nothing! This is basically the essence of Radical Ontic Structural Realism and exactly why states and processes are interchangeable in physics. There is no scaffolding.

  • You have probably heard about the vast efforts and resources being poured into the validation of certain conjectural particles. Gauge theory tells us that the search for new particles is basically the same thing as the search for new symmetries, which are of course nothing but processes.

  • Choi–Jamiołkowski isomorphism helps us translate between quantum states and quantum processes.
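For the curious, the correspondence can be stated in one line (finite-dimensional case, one common convention):

$$ J(\Phi) \;=\; (\Phi \otimes \mathrm{id})\big(|\Omega\rangle\langle\Omega|\big), \qquad |\Omega\rangle = \sum_{i=1}^{d} |i\rangle \otimes |i\rangle , $$

so every process $\Phi$ (a completely positive map) gets encoded as a state $J(\Phi)$ (a positive operator on the doubled system), and the encoding is invertible.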

Long story short, at the foundational level, states and processes are two sides of the same coin.


1.2. Complexity as Non-Interchangeability of States and Processes

You understand that you are facing complexity exactly when you end up having to study the states themselves along with the processes. In other words, in complex subjects, the interchangeability of state-centric and process-centric perspectives no longer makes any practical sense. (That is why stating a problem in the right manner matters a lot in complex subjects. The right statement is half the solution.)

For instance, in biology, bioinformatics studies states and computational biology studies processes. (Beware that the nomenclature in the biology literature has not stabilized yet.) Similarly, in computer science, the study of databases (i.e. states) and the study of programs (i.e. processes) are completely different subjects. (You can view programs themselves as databases and study how to generate new programs out of programs. But then you are simply operating in one higher dimension. The philosophy does not change.)

There is actually a deep relation between biology and computer science (similar to the one between physics and mathematics) which was discussed in an older blog post.


2. Persistence

The search for signs of persistence can be seen as the fundamental goal of science. There are two extreme views in metaphysics on this subject:

  • Heraclitus says that the only thing that persists is change. (i.e. Time is real, space is not.)

  • Parmenides says that change is illusionary and that there is just one absolute static unity. (i.e. Space is real, time is not.)

The duality of these points of view was most eloquently pointed out by the physicist John Wheeler, who said "Explain time? Not without explaining existence. Explain existence? Not without explaining time".

Persistences are very important because they generate other persistences. In other words, they are the building blocks of our reality. For instance, states in biology are complex simply because biology strives to resist change by building persistence upon persistence.


2.1. Invariances as State-Persistences

From a state perspective, the basic building blocks are invariances, namely whatever does not change across processes.

Study of change involves an initial stage where we give names to substates. Then we observe how these substates change with respect to time. If a substate changes to the point where it no longer fits the definition of being A, we say that substate (i.e. object) A failed to survive. In this sense, study of survival is a subset of study of change. The only reason why they are not the same thing is because our definitions themselves are often imprecise. (From one moment to the next, we say that the river has survived although its constituents have changed etc.)

Of course, the ambiguity here is on purpose. Otherwise without any definiens, you do not have an academic field to speak of. In physics for instance, the definitions are extremely precise, and the study of survival and the study of change completely overlap. In a complex subject like biology, states are so rich that the definitions have to be ambiguous. (You can only simulate the biological states in a formal language, not state a particular biological state. Hence the reason why computer science is a better fit for biology than mathematics.)


2.2. Cycles as Process-Persistences

Processes become state-like when they enter into cyclic behavior. That is why recurrence is so prevalent in science, especially in biology.

As an anticipatory affair, biology prefers regularities and predictabilities. Cycles are very reliable in this sense: They can be built on top of each other, and harnessed to record information about the past and to carry information to the future. (Even behaviorally we exploit this fact: It is easier to construct new habits by attaching them to old habits.) Life, in its essence, is just a perpetuation of a network of interacting ecological and chemical cycles, all of which can be traced back to the grand astronomical cycles.

Prior studies have reported that 15% of expressed genes show a circadian expression pattern in association with a specific function. A series of experimental and computational studies of gene expression in various murine tissues has led us to a different conclusion. By applying a new analysis strategy and a number of alternative algorithms, we identify baseline oscillation in almost 100% of all genes. While the phase and amplitude of oscillation vary between different tissues, circadian oscillation remains a fundamental property of every gene. Reanalysis of previously published data also reveals a greater number of oscillating genes than was previously reported. This suggests that circadian oscillation is a universal property of all mammalian genes, although phase and amplitude of oscillation are tissue-specific and remain associated with a gene’s function. (Source)

A cyclic process traces out what is called an orbital, which is like an invariance smeared across time. An invariance is a substate preserved by a process, namely a portion of a state that is mapped identically to itself. An orbital too is mapped to itself by the cyclic process, but not identically so. (Each orbital point moves forward in time to another orbital point and eventually ends up at its initial position.) Hence orbitals and process-persistency can be viewed respectively as generalizations of invariances and state-persistency.
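In symbols, with a process $f$ acting on states (discrete time for simplicity): an invariance is a fixed point, while an orbital is a cycle that is preserved only as a whole,

$$ f(x) = x \qquad \text{vs.} \qquad O = \{x,\, f(x),\, \dots,\, f^{\,n-1}(x)\}, \quad f^{n}(x) = x, \quad f(O) = O . $$

Setting $n = 1$ recovers the invariance, which is why state-persistency is the special case and process-persistency the generalization.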


3. Information

In practice, we have perfect knowledge of neither the states nor the processes. Since we can not move both feet at the same time, in our quest to understand nature, we assume that we have perfect knowledge of either the states or the processes.

  • Assumption: Perfect knowledge of all the actual processes but imperfect knowledge of the state
    Goal: Dissect the state into explainable and unexplainable parts
    Expectation: State is expected to be partially unexplainable due to experimental constraints on measuring states.

  • Assumption: Perfect knowledge of a state but no knowledge of the actual processes
    Goal: Find the actual (minimal) process that generated the state from the library of all possible processes.
    Expectation: State is expected to be completely explainable due to perfect knowledge about the state and the unbounded freedom in finding the generating process.

The reason I highlighted the expectations here is that it is quite interesting how our psychological stance towards the unexplainable (which is almost always - in our typical dismissive tone - referred to as noise) differs in each case.

  • In the presence of perfect knowledge about the processes, we interpret the noisy parts of states as absence of information.

  • In the absence of perfect knowledge about the processes, we interpret the noisy parts of states as presence of information.

The flip side of the above statements is that, in our quest to understand nature, we use the word information in two opposite senses.

  • Information is what is explainable.

  • Information is what is inexplainable.


3.1. Information as the Explainable

In this case, noise is the ideal left-over product after everything else is explained away, and is considered normal and expected. (We even gave the name “normal” to the most commonly encountered noise distribution.)

This point of view is statistical and is best exemplified by the field of statistical mechanics, where massive numbers of micro degrees of freedom can be safely ignored due to their random nature and canned into highly regular noise distributions.
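Here is a minimal sketch of this stance in code (a toy regression with made-up numbers, nothing from statistical mechanics proper): once the assumed process is subtracted out, what remains is a structureless residue that we are happy to dismiss as "normal" noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assume perfect knowledge of the process: y = 2x + 1, plus measurement noise.
x = np.linspace(0, 10, 200)
y = 2 * x + 1 + rng.normal(scale=0.5, size=x.size)

# Explain away as much of the observed state as the known process allows.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# The left-over part has no further structure: zero mean, small spread.
# From this perspective it represents an absence of information.
print(f"fitted law: y = {slope:.2f}x + {intercept:.2f}")
print(f"residual mean ~ {residuals.mean():.3f}, std ~ {residuals.std():.3f}")
```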


3.2. Information as the Inexplainable

In this case, noise is the only thing that can not be compressed further or explained away. It is surprising and unnerving. In computer speak, one would say “It is not a bug, it is a feature.”

This point of view is algorithmic and is best exemplified by the field of algorithmic complexity which looks at the notion of complexity from a process centric perspective.
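To feel the difference between the two senses of information, here is a rough sketch that uses off-the-shelf zlib compression as a stand-in for the ideal compressor (which, strictly speaking, is uncomputable): a highly regular state shrinks to a short description, while a random state refuses to shrink, and that incompressible residue is precisely what the algorithmic perspective calls information.

```python
import os
import zlib

regular = b"0123456789" * 1000   # a state generated by a tiny process
random_ = os.urandom(10_000)     # a state with no generating shortcut

for name, data in [("regular", regular), ("random", random_)]:
    compressed = zlib.compress(data, 9)
    print(f"{name}: {len(data)} bytes -> {len(compressed)} bytes")

# The regular string compresses to a tiny fraction of its size (explainable),
# while the random bytes barely compress at all (inexplainable, i.e. noise).
```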

fitness and virality

In general, restricting your audience enables you to design more effectively around your users' tastes and needs. Resulting structures are more fitting. The downside is that their speed of adoption is slower due to the lower virality coefficients associated with dispersed audiences.

Prone to critical thresholds, the growth of social structures like marketplaces and social media platforms is very sensitive to the speed of adoption. This is the primary reason why social verticals repeatedly fail to take off while non-social verticals easily succeed. Those that take off are usually subgraphs of already existing general graphs and therefore suffer from serious design defects.


This discussion is related to another blog post where I viewed abstraction as a lever between probability of longevity and probability of success.

  • In the practical realm, general and useful structures are easier to find but also easier to kill. (They eventually get dismantled by verticals which can more efficiently solve each of the collectively-addressed problems.) In the theoretical realm, abstract and useful results are harder to kill but also harder to find.

  • In the practical realm general structures emerge first and verticals come later. In the theoretical realm specific results emerge first and abstractions come later.

These dichotomies stem from the difference between serving and understanding. The former gets better as you zoom in; the latter gets better as you zoom out.

thoughts on abstraction

Why is it always the case that the formulation of deeper physics requires more abstract mathematics? Why does understanding get better as it zooms out?

Side Note: Notice that there are two ways of zooming out. First, you can abstract by ignoring details. This is actually great for applications, but not good for understanding. It operates more like chunking, coarse-graining, forming equivalence classes etc. You end up sacrificing accuracy for the sake of practicality. Second, you can abstract in the sense of finding an underlying structure that allows you to see two phenomena as different manifestations of the same phenomenon. This is actually the meaning that we will be using throughout the blogpost. While coarse graining is easy, discovering an underlying structure is hard. You need to understand the specificity of a phenomenon which you normally consider to be general.

For instance, a lot of people are unsatisfied with the current formulation of quantum physics, blaming it for being too instrumental. Yes, the math is powerful. Yes, the predictions turn out to be correct. But the mathematical machinery (function spaces etc.) feels alien, even after one gets used to it over time. Or compare the down-to-earth Feynman diagrams with the amplituhedron theory... Again, you have a case where a stronger and more abstract beast is posited to dethrone a multitude of earthlings.

Is the alienness a price we have to pay for digging deeper? The answer is unfortunately yes. But this should not be surprising at all:

  • We should not expect to be able to explain deeper physics (which is so removed from our daily lives) using basic mathematics inspired by mundane physical phenomena. Abstraction gives us the necessary elbow room to explore realities that are far removed from our daily lives.

  • You can use the abstract to explain the specific but you can not proceed the other way around. Hence, as you understand more, you inevitably need to go higher up in abstraction. For instance, you may hope that a concept as simple as the notion of a division algebra will be powerful enough to explain all of physics, but you will sooner or later be gravely disappointed. There is probably a deeper truth lurking behind such a concrete pattern.



Abstraction as Compression

The simplicities of natural laws arise through the complexities of the languages we use for their expression.

- Eugene Wigner

That the simplest theory is best, means that we should pick the smallest program that explains a given set of data. Furthermore, if the theory is the same size as the data, then it is useless, because there is always a theory that is the same size as the data that it explains. In other words, a theory must be a compression of the data, and the greater the compression, the better the theory. Explanations are compressions, comprehension is compression!

Chaitin - Metaphysics, Metamathematics and Metabiology

We can not encode more without going more abstract. This is a fundamental feature of the human brain. Either you have complex patterns based on basic math or you have simple patterns based on abstract math. In other words, complexity is either apparent or hidden, never gotten rid of. (i.e. There is no loss of information.) By replacing one source of cognitive strain (complexity) with another source of cognitive strain (abstraction), we can lift our analysis to higher-level complexities.

In this sense, progress in physics is destined to be of an unsatisfactory nature. Our theories will keep getting more abstract (and difficult) at each successive information compression. 

Don't think of this as a human tragedy though! Even machines will need abstract mathematics to understand deeper physics, because they too will be working under resource constraints. No matter how much more energy and resources you summon, the task of simulating a faithful copy of the universe will always require more.

As Bransford points out, people rarely remember written or spoken material word for word. When asked to reproduce it, they resort to paraphrase, which suggests that they were able to store the meaning of the material rather than making a verbatim copy of each sentence in the mind. We forget the surface structure, but retain the abstract relationships contained in the deep structure.

Jeremy Campbell - Grammatical Man (Page 219)

Depending on context, category theoretical techniques can yield proofs shorter than set theoretical techniques can, and vice versa. Hence, a machine that can sense when to switch between these two languages can probe the vast space of all true theories faster. Of course, you will need human aid (enhanced with machine learning algorithms) to discern which theories are interesting and which are not.

Abstraction is probably used by our minds as well, allowing them to decrease the number of neurons used without sacrificing explanatory power.

Rolnick and Max Tegmark of the Massachusetts Institute of Technology proved that by increasing depth and decreasing width, you can perform the same functions with exponentially fewer neurons. They showed that if the situation you’re modeling has 100 input variables, you can get the same reliability using either 2^100 neurons in one layer or just 2^10 neurons spread over two layers. They found that there is power in taking small pieces and combining them at greater levels of abstraction instead of attempting to capture all levels of abstraction at once.

“The notion of depth in a neural network is linked to the idea that you can express something complicated by doing many simple things in sequence,” Rolnick said. “It’s like an assembly line.”

- Foundations Built for a General Theory of Neural Networks (Kevin Hartnett)

In a way, the success of neural network models with increased depth reflects the hierarchical aspects of the phenomena themselves. We end up mirroring nature more closely as we try to economize our models.
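A back-of-the-envelope sketch of the depth-for-width trade (with hypothetical layer sizes of my own choosing, not the construction used in the Rolnick-Tegmark proof): simply counting the parameters of a fully-connected network already shows how a little depth can substitute for a lot of width.

```python
def mlp_param_count(layer_sizes):
    """Number of weights and biases in a fully-connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Hypothetical architectures, each with 100 inputs and 1 output:
wide_and_shallow = [100, 1_000_000, 1]     # one huge hidden layer
narrow_and_deep = [100, 1_000, 1_000, 1]   # two modest hidden layers

print(mlp_param_count(wide_and_shallow))   # ~102 million parameters
print(mlp_param_count(narrow_and_deep))    # ~1.1 million parameters
```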


Abstraction as Unlearning

Abstraction is not hard because of technical reasons. (On the contrary, abstract things are easier to manipulate due to their greater simplicities.) It is hard because it involves unlearning. (That is why people who are better at forgetting are also better at abstracting.)

Side Note: Originality of the generalist is artistic in nature and lies in the intuition of the right definitions. Originality of the specialist is technical in nature and lies in the invention of the right proof techniques.

Globally, unlearning can be viewed as the Herculean struggle to go back to the tabula rasa state of a beginner's mind. (In some sense, what takes a baby a few months to learn takes humanity hundreds of years to unlearn.) We discard one by one what has been useful in manipulating the world in favor of getting closer to the truth.

Here are some beautiful observations of a physicist about the cognitive development of his own child:

My 2-year old’s insight into quantum gravity. If relative realism is right then ‘physical reality’ is what we experience as a consequence of looking at the world in a certain way, probing deeper and deeper into more and more general theories of physics as we have done historically (arriving by now at two great theories, quantum and gravity) should be a matter of letting go of more and more assumptions about the physical world until we arrive at the most general theory possible. If so then we should also be able to study a single baby, born surely with very little by way of assumptions about physics, and see where and why each assumption is taken on. Although Piaget has itemized many key steps in child development, his analysis is surely not about the fundamental steps at the foundation of theoretical physics. Instead, I can only offer my own anecdotal observations.

Age 11 months: loves to empty a container, as soon as empty fills it, as soon as full empties it. This is the basic mechanism of waves (two competing urges out of phase leading to oscillation).

Age 12-17 months: puts something in drawer, closes it, opens it to see if it is still there. Does not assume it would still be there. This is a quantum way of thinking. It’s only after repeatedly finding it there that she eventually grows to accept classical logic as a useful shortcut (as it is in this situation).

Age 19 months: comes home every day with mother, waves up to dad cooking in the kitchen from the yard. One day dad is carrying her. Still points up to kitchen saying ‘daddy up there in the kitchen’. Dad says no, daddy is here. She says ‘another daddy’ and is quite content with that. Another occasion, her aunt Sarah sits in front of her and talks to her on my mobile. When asked, Juliette declares the person speaking to her ‘another auntie Sarah’. This means that at this age Juliette’s logic is still quantum logic in which someone can happily be in two places at the same time.

Age 15 months (until the present): completely unwilling to shortcut a lego construction by reusing a group of blocks, insists on taking the bits fully apart and then building from scratch. Likewise always insists to read a book from its very first page (including all the front matter). I see this as part of her taking a creative control over her world.

Age 20-22 months: very able to express herself in the third person ‘Juliette is holding a spoon’ but finds it very hard to learn about pronouns especially ‘I’. Masters ‘my’ first and but overuses it ‘my do it’. Takes a long time to master ‘I’ and ‘you’ correctly. This shows that an absolute coordinate-invariant world view is much more natural than a relative one based on coordinate system in which ‘I’ and ‘you’ change meaning depending on who is speaking. This is the key insight of General Relativity that coordinates depend on a coordinate system and carry no meaning of themselves, but they nevertheless refer to an absolute geometry independent of the coordinate system. Actually, once you get used to the absolute reference ‘Juliette is doing this, dad wants to do that etc’ it’s actually much more natural than the confusing ‘I’ and ‘you’ and as a parent I carried on using it far past the time that I needed to. In the same way it’s actually much easier to do and teach differential geometry in absolute coordinate-free terms than the way taught in most physics books.

Age 24 months: until this age she did not understand the concept of time. At least it was impossible to do a bargain with her like ‘if you do this now, we will go to the playground tomorrow’ (but you could bargain with something immediate). She understood ‘later’ as ‘now’.

Age 29 months: quite able to draw a minor squiggle on a bit of paper and say ‘look a face’ and then run with that in her game-play. In other words, very capable of abstract substitutions and accepting definitions as per pure mathematics. At the same time pedantic, does not accept metaphor (‘you are a lion’ elicits ‘no, I’m me’) but is fine with simile, ‘is like’, ‘is pretending to be’.

Age 31 months: understands letters and the concept of a word as a line of letters but sometimes refuses to read them from left to right, insisting on the other way. Also, for a time after one such occasion insisted on having her books read from last page back, turning back as the ‘next page’. I interpret this as her natural awareness of parity and her right to demand to do it her own way.

Age 33 months (current): Still totally blank on ‘why’ questions, does not understand this concept. ‘How’ and ‘what’ are no problem. Presumably this is because in childhood the focus is on building up a strong perception of reality, taking on assumptions without question and as quickly as possible, as it were drinking in the world.

... and just in the last few days: remarked ‘oh, going up’ for the deceleration at the end of going down in an elevator, ‘down and a little bit up’ as she explained. And pulling out of my parking spot insisted that ‘the other cars are going away’. Neither observation was prompted in any way. This tells me that relativity can be taught at preschool.

- Algebraic Approach to Quantum Gravity I: Relative Realism (S. Majid)


Abstraction for Survival

The idea, according to research in Psychology of Aesthetics, Creativity, and the Arts, is that thinking about the future encourages people to think more abstractly—presumably becoming more receptive to non-representational art.

- How to Choose Wisely (Tom Vanderbilt)

Why do some people (like me) get deeply attracted to abstract subjects (like Category Theory)?

One of the reasons could be related to the point made above. Abstract things have higher chances of survival and staying relevant because they are less likely to be affected by the changes unfolding through time. (Similarly, in the words of Morgan Housel, "the further back in history you look, the more general your takeaways should be.") Hence, if you have a hunger for timelessness or a worry about being outdated, then you will be naturally inclined to move up the abstraction chain. (No wonder why I am also obsessed with the notion of time.)

Side Note: The more abstract the subject, the less the community around it is willing to let you attach your name to your new discoveries. Why? Because the half-life of discoveries at higher levels of abstraction is much longer and therefore your name will live on for a much longer period of time. (i.e. It makes sense to be prudent.) After being trained in mathematics for so many years, I was shocked to see how easily researchers in other fields could “arrogantly” attach their names to basic findings. Later I realized that this behavior was not out of arrogance. These fields were so far away from truth (i.e. operating at very low levels of abstraction) that the half-life of discoveries was very short. If you wanted to attach your name to a discovery, mathematics had a high-risk-high-return pay-off structure while these other fields had a low-risk-low-return structure.

But the higher you move up in the abstraction chain, the harder it becomes for you to innovate usefully. There is less room to play around since the objects of study have much fewer properties. Most of the meaningful ideas have already been fleshed out by others who came before you.

In other words, in the realm of ideas, abstraction acts as a lever between probability of longevity and probability of success. If you aim for a higher probability of longevity, then you need to accept the lower probability of success.

That is why abstract subjects are unsuitable for university environments. The pressure of the "publish or perish" mentality pushes PhD students towards quick and riskless incremental research. Abstract subjects on the other hand require risky innovative research which may take a long time to unfold and result in nothing publishable.

Now you may be wondering whether the discussion in the previous section is in conflict with the discussion here. How can abstraction be both a process of unlearning and a means for survival? Is not the evolutionary purpose of learning to increase the probability of survival? I would say that it all depends on your time horizon. To survive the immediate future, you need to learn how your local environment operates and truth is not your primary concern. But as your time horizon expands into infinity, what is useful and what is true become indistinguishable, as your environment shuffles through all allowed possibilities.

a visual affair

We vastly overvalue visual input over other sensory inputs since most of our bandwidth is devoted to vision:

Source: David McCandless - The Beauty of Data Visualization (The small white corner represents the total bandwidth that we can actually be aware of.)

This bias infiltrates both aesthetics and science:

  • The set of people you find beautiful will change drastically if you lose your eyesight. (Get a full body massage and you will see what I mean.)

  • We explain auditory phenomena in terms of mathematical metaphors that burgeoned out of visual inputs. There are no mathematical metaphors with an auditory origin, and therefore no scientific explanations of visual phenomena in terms of auditory expressions. Rationality is a strictly visual affair. In fact, the word "idea" has etymological roots going back to the Greek "idein" - "to see". (No wonder why deep neural networks mimicking the structure of our visual system have become so successful in machine learning challenges.)