digital vs physical businesses

In the first part, I will analyze how digital businesses and physical businesses are complementary to each other via the following dualities:

  1. Risk of Death vs Potential for Growth

  2. Controlling Supply vs Controlling Demand

  3. Scale Effects vs Network Effects

  4. Mind vs Body

  5. Borrowing Space vs Borrowing Time

In the second part, I will analyze how the rise of digital businesses against physical businesses is triggering the following trends:

  1. Culture is Shifting from Space to Time

  2. Progress is Accelerating

  3. Science is Becoming More Data-Driven

  4. Economy is Getting Lighter

  5. Power is Shifting from West to East

Duality 1: Risk of Death vs Potential for Growth

Since information is frictionless, every digital startup has the potential for fast growth. But since the same fact holds for every other startup as well, there is also the potential for a sudden downfall. That is why defensibility (i.e. the ability to survive after reaching success) is often mentioned as the number one criterion by investors in such companies.

Physical businesses face the inverse reality: they are harder to grow but easier to defend, due to factors like high barriers to entry, limited real estate, and hard-to-set-up distribution networks. That is why the competitive landscape is the issue most scrutinized by investors in such companies.

Duality 2: Controlling Supply vs Controlling Demand

In the physical world, limited by scarcity, economic power comes from controlling supply; in the digital world, overwhelmed by abundance, economic power comes from controlling demand.
- Ben Thompson - Ends, Means and Antitrust

Although Ben’s point is quite clear, it is worth expanding on it a little.

In the physical world, supply is much more limited than demand and therefore whoever controls the supply wins.

  • Demand. Physical consumption is about hoarding in space, which is for all practical purposes infinite. Since money is digital in nature, I can buy any object in any part of the world at the speed of light, and that object will immediately become mine.

  • Supply. Extracting new materials and nurturing new talent take a lot of time. In other words, in the short run, the supply of physical goods is severely limited.

In the digital world, demand is much more limited than supply and therefore whoever controls the demand wins:

  • Demand. Digital consumption is information based and therefore cognitive in nature. Since one can pay attention to only so many things at once, it is restricted mainly to the time dimension. For instance, for visual information, daily screen time is the limiting factor on how much can be consumed.

  • Supply. Since information travels at the speed of light, every bit in the world is only a touch away from you. Hence, in the short run, supply is effectively unlimited.

Duality 3: Scale Effects vs Network Effects

The physical economy is dominated by geometric dynamics, since distances matter. (The keyword here is space.) The digital economy, on the other hand, is information based, and information travels at the speed of light, which is for all practical purposes instantaneous. Hence distances do not matter; only connectivities do. In other words, the dynamics are topological, not geometric. (The keyword here is network.)

Side Note: Our memories too work topologically. We remember the order of events (i.e. temporal connectivity) easily but have a hard time situating them in absolute time. (Often we just remember the dates of significant events and then try to date everything else relative to them.) But while we are living, we focus on the continuous duration (i.e. the temporal distance), not the discrete events themselves. That is why the greater the number of things we are preoccupied with and the less we can feel the duration, the more quickly time seems to pass. In memory though, the reverse happens: since the focus is on events (everything else is cleared out!), the greater the number of events, the less quickly time seems to have passed.

This nicely ties back to the previous discussion about defensibility. Physical businesses are harder to grow because that is precisely how they protect themselves. They reside in space and scale effects help them make better use of time through efficiency gains. Digital businesses on the other hand reside in time and network effects help them make better use of space through connectivity gains. Building protection is what is hard and also what is valuable in each case.

Side Note: Just as economic value continuously trickles down to the space owners (i.e. land owners) in the physical economy, it trickles down to “time owners” in the digital economy (i.e. companies who control your attention throughout the day).

Scale does not correlate with defensible value in the digital world, just as connectivity does not correlate with defensible value in the physical world. Investors are perennially confused about this, since scale is so easy to see and our reptilian brains are so susceptible to being impressed by it.

Of course, at the end of the day, all digital businesses thrive on physical infrastructures and all physical businesses thrive on digital infrastructures. This leads to an interesting mixture.

  • As a structure grows, it suffers from internal complexities, which arise from increased interdependencies among a growing number of parts.

  • Similarly, greater connectivity requires greater internal scale. In fact, scalability is a huge challenge for fast-growing digital businesses.

Hence, physical businesses thrive on scale effects but suffer from negative internal network effects (which are basically software problems), and digital businesses thrive on network effects but suffer from negative internal scale effects (which are basically hardware problems). In other words, these two types of businesses are dependent on each other to be able to generate more value.

  • As physical businesses get better at leveraging software solutions to manage their complexity issues, they will break scalability records.

  • As digital businesses get better at leveraging hardware solutions to manage their scalability issues, they will break connectivity records.

Note that we have now ventured beyond the world of economics and entered the much more general world of evolutionary dynamics. Time has two directional arrows:

  • Complexity. Correlates closely with size. Increases over time, as in plants being more complex than cells.

  • Connectivity. Manifests itself as “entropy” at the lowest complexity level (i.e. physics). Increases over time, as evolutionary entities become more interlinked.

Evolution always pushes for greater scale and connectivity.

Side Note: "The larger the brain, the larger the fraction of resources devoted to communications compared to computation." says Sejnowski. Many scientists think that evolution has already reached an efficiency limit for the size of the biological brain. A great example of a digital entity (i.e. the computing mind) whose growing size is limited by the accompanying growing internal complexity which manifests itself in the form of internal communication problems.

Duality 4: Mind vs Body

All governments desire to increase the value of their economies but also feel threatened by the evolutionary inclination of the economic units to push for greater scale and connectivity. Western governments (e.g. US) tend to be more sensitive about size. They monitor and explicitly break up physical businesses that cross a certain size threshold. Eastern governments (e.g. China) on the other hand tend to be more sensitive about connectivity. They monitor and implicitly take over digital businesses that cross a certain connectivity threshold. (Think of the strict control of social media in China versus the supreme freedom of all digital networks in the US.)

Generally speaking, the Western world falls on the right-hand side of the mind-body duality, while the Eastern world falls on the left-hand side.

  • As mentioned above, Western governments care more about the physical aspects of reality (like size) while Eastern governments care more about the mental aspects of reality (like connectivity).

  • Western sciences equate the mind with the brain, and thereby treat software as hardware. Eastern philosophies are infused with panpsychic ideas, ascribing consciousness (i.e. mind-like properties) to the entirety of the universe, and thereby treat hardware as software.

We can think of the duality between digital and physical businesses as the social version of the mind-body duality. When you die, your body gets recycled back into the ecosystem. (This is no different than the machinery inside a bankrupt factory getting recycled back into the economy.) Your mind on the other hand simply disappears. What survive are the impressions you made on other minds. Similarly, when digital businesses die, they leave behind only memories in the form of broken links and cached pages, and therefore need “tombstones” to be remembered. Physical businesses on the other hand leave behind items which continue to circulate in second-hand markets and buildings which change hands to serve new purposes.

Duality 5: Borrowing Space vs Borrowing Time

Banking too is moving from the space dimension to the time dimension, and this is happening in a very subtle way. Yes, banks are becoming increasingly digital, but this is not what I am talking about at all. Digitalized banks are more efficient at delivering the exact same services, continuing to serve the old banking needs of the physical economy. What I am talking about is the unique banking needs of the new digital economy. What do I mean by this?

Remember, physical businesses reside in space and scale effects help them make better use of time through efficiency gains. Digital businesses on the other hand reside in time and network effects help them make better use of space through connectivity gains. Hence, their borrowing needs are polar opposite: Physical businesses need to borrow time to accelerate their defensibility in space, while digital businesses need to borrow space to accelerate their defensibility in time. (What matters in the long run is only defensibility!)

But what does it mean to borrow time or space?

  • Lending time is exactly what regular banks do. They give you money and charge you an interest rate, which can be viewed as the cost of moving (discounting) the money you will be making in the future to now. In other words, banks are in the business of creating contractions in the time dimension, not unlike creating wormholes through time. (The standard discounting formula after this list makes this concrete.)

  • Definition of space for a digital company depends on the network it resides in. This could be a specific network of people, businesses etc. A digital company does not defend itself by scale effects, it defends itself by network effects. Hence its primary goal is to increase the connectivity of its network. In other words, a digital company needs creation of wormholes through space, not through time. Whatever facilitates further stitching of its network satisfies its “banking needs”.
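As promised above, here is the standard present-value formula (textbook finance, not something specific to this post) that prices the bank’s “wormhole through time”: an amount $FV$ to be received $t$ years from now is worth today

$$PV = \frac{FV}{(1+r)^t}$$

where $r$ is the interest rate. The interest you pay is precisely the cost of contracting that future value back to the present.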

The bankers of the digital economy are the existing deeply penetrated networks like Alibaba, WeChat, LinkedIn, Facebook, Amazon, etc. What masquerades as a marketing expense for a digital company to rent the connectivity of these platforms is actually in part a “banking” expense, not unlike the interest payments made to a regular bank.

Trend 1: Culture is Shifting from Space to Time

Culturally we are moving from geometry to topology, more often deploying topological rather than geometric language while narrating our lives. We meet our friends in online networks rather than physical spaces.

The correlation between the rise of the digital economy and the rise of the experience economy (and its associated cultural offshoots like the hipster and decluttering movements) is not a coincidence. Experiential goods (not just those that are information-based) exhibit the same dynamics as digital goods. They are completely mental and reside in the time dimension.

Our sense of privacy too is shifting from the space dimension to the time dimension. We are growing less sensitive about sharing objects and more sensitive about sharing experiences. We are participating in a myriad of sharing economies, but also becoming more ruthless about time optimization. (What is interpreted as a general decline in attention span is actually a protective measure erected by the digital natives, forcing everyone to cut their narratives short.) Increasingly we are spending less time with people, although we look more social from the outside since we share so many objects with each other.

Our sense of aesthetics has started to incorporate time rather than banish it. We leave surfaces unfinished and prefer raw and natural-looking rather than polished and new-looking materials. We have all become wabi-sabi fans, preferring to buy stuff on which time has taken (or seems to have taken) its toll.

Even physics is caught in the Zeitgeist. The latest theories all claim that time is fundamental and space is emergent. The popular opinion among physicists used to be the opposite. Einstein had put the final nail in the coffin by completely spatializing time into what is called spacetime, an unchanging four-dimensional block universe. He famously said that “the distinction between past, present, and future is only a stubbornly persistent illusion.”

Trend 2: Progress is Accelerating

As economies and consumption patterns shift to time dimension, we feel more overwhelmed by the demands on our time, and life seems to progress at a faster rate.

Let us dig deeper into this seemingly trivial observation. First recall the following two facts:

  1. In a previous blog post, I had talked about the effect of aging on the perception of time. As you accumulate more experience and your library of cognitive models grows, you become more adept at chunking experience and shifting into an automatic mode. What used to be processed consciously now starts getting processed unconsciously. (This is no different than stable software patterns eventually trickling down and hardening to become hardware patterns.)

  2. In a previous blog post, I had talked about how the goal of education is to learn how not to think, not how to think. In other words, “chunking” is the essence of learning.

Combining these two facts we deduce the following:

  • Learning accelerates perception of time.

This observation in turn is intimately related to the following fact:

  • Progress is accelerating.

What exactly is this relation?

Remember, at the micro-level, both learning and progress suffer from the diminishing returns of S-curves. However, at the macro-level, both overcome these limits via sheer creativity and manage to stack S-curves on top of each other to form a (composite) exponential curve that shoots to infinity.
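Here is a minimal Python sketch of that stacking (all parameters are illustrative assumptions of mine, not figures from the post): each logistic S-curve saturates on its own, yet their sum keeps climbing roughly exponentially.

```python
import numpy as np

def s_curve(t, ceiling, midpoint, steepness=1.0):
    """A single logistic S-curve: slow start, rapid rise, then saturation."""
    return ceiling / (1.0 + np.exp(-steepness * (t - midpoint)))

t = np.linspace(0, 50, 501)

# Stack S-curves: each new paradigm arrives later but has a higher ceiling.
# (Illustrative assumption: ceilings double, midpoints arrive every 10 units.)
composite = sum(s_curve(t, ceiling=2.0 ** k, midpoint=10.0 * k) for k in range(1, 5))

# Each curve individually flattens out (micro-level diminishing returns),
# but the composite keeps climbing -- the macro-level exponential of progress.
print(composite[::100])  # sampled values of the composite curve
```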

This structural similarity is not a coincidence: Progress is simply the social version of learning. However, progress happens out in the open, while learning takes place internally within each of our minds and therefore cannot be seen. That is why we cannot see learning in time, but can nevertheless feel its acceleration by reflecting it off time.

Side Note: For those of you who know about Ken Wilber’s Integral Theory, what we found here is that “learning” belongs to the upper-left quadrant while “progress” belongs to the lower-right quadrant. The infinitary limiting point is often called Nirvana in personal learning and Singularity in social progress.

Recall how we framed the duality between digital and physical businesses as the social version of the mind-body duality. True, from the individual’s perspective, progress seems to happen out in the open. However, from the perspective of the mind of the society (represented by the aggregation of all things digital), progress “feels” like learning.

Hence, going back to the beginning of this discussion, your perception of time accelerates for two dual reasons:

  1. Your data processing efficiency increases as you learn more.

  2. Data you need to process increases as society learns more.

Time is about change. Perception of time is about processed change, and how much change your mind can process is a function of both your data processing efficiency (which defines your bandwidth) and the speed of data flow. (You can visualize bandwidth as the diameter of a pipe.) As society learns more (i.e. progresses further), you become bombarded with more change. Thankfully, as you learn more, you also become more capable of keeping up with change.

There is an important caveat here though.

  1. Your mind loses its plasticity over time.

  2. The type of change you need to process changes over time.

The combination of these two facts is very problematic. Data processing efficiency is sustained by the cognitive models you develop through experience, based on past data sets. Hence, their continued efficiency is guaranteed only if the future is similar to the past, which of course is increasingly not the case.

As mentioned previously, the exponential character of progress stems from the stacking of S-curves on top of each other. Each new S-curve represents a discontinuous creative jump, a paradigm shift that requires a significant revision of existing cognitive models. As progress becomes faster and life expectancy increases, individuals encounter a greater number of such challenges within their lifetimes. This means that they are increasingly at risk of being left behind due to the plasticity of their minds decreasing over time.

This is exactly why the elderly enjoy nostalgia and wrap themselves inside time capsules like retirement villages. Their desire to stop time creates a demographic tension that will become increasingly more palpable in the future, as the elderly become increasingly more irrelevant while still clinging onto their positions of power and keeping the young at bay.

Trend 3: Science is Becoming More Data-Driven

The rise of the digital economy can be thought of as the maturation of the social mind. Society as a whole is aging, not just us. You can tell this also from how science is shifting from being hypothesis-driven to being data-driven, thanks to digital technologies. (Take a look at the blog post I have written on this subject.) The social mind is moving from conscious thinking to unconscious thinking, becoming more intuitive and getting wiser in the process.

Trend 4: Economy is Getting Lighter

As software is taking over the world, information is being infused into everything and our use of matter is getting smarter.

Automobiles weigh less than they once did and yet perform better. Industrial materials have been replaced by nearly weightless high-tech know-how in the form of plastics and composite fiber materials. Stationary objects are gaining information and losing mass, too. Because of improved materials, high-tech construction methods, and smarter office equipment, new buildings today weigh less than comparable ones from the 1950s. So it isn’t only your radio that is shrinking, the entire economy is losing weight too.

Kevin Kelly - New Rules for the New Economy (Pages 73-74)

Energy use in the US has stayed flat despite enormous growth. We now make less use of atoms, and the share of tangibles in total equity value is continuously decreasing. As R. Buckminster Fuller said, our economies are being ephemeralized thanks to technological advances that allow us to do "more and more with less and less until eventually [we] can do everything with nothing."

This trend will probably, in a rather unexpected way, ease the global warming problem. (Remember, it is the sheer mass of what is being excavated and moved around that is responsible for the generation of greenhouse gases.)

Trend 5: Power is Shifting from West to East

Now I will venture further still and bring religion into the picture. There are some amazing historical dynamics at work that can be recognized only by elevating ourselves and looking at the big picture.

First, let us take a look at the Western world.

  • Becoming. The West chose a pragmatic, action-oriented attitude towards Becoming and did not directly philosophize about it.

  • Being. Western religions are built on the notion of Being. Time is deemed to be an illusion and God is thought of as a static all-encompassing Being, not too different from the entirety of Mathematics. There is believed to be an order behind the messy unfolding of Becoming, an order that is waiting to be discovered by us. It is with this deep conviction that Newton managed to discover the first mathematical formalism to predict natural phenomena. There is nothing in the history of science that is comparable to this achievement. Only a religious zeal could have generated the sort of tenacity that is needed to tackle a challenge of this magnitude.

This combination of applying intuition to Becoming and reason to Being eventually led to a meteoric rise in technology and economy.

Side Note: Although an Abrahamic religion itself, Islam did not fuel a similar meteoric rise, because it was practiced more dogmatically. Christianity, on the other hand, reformed itself into a myriad of sub-religions. Although not vast, there was enough intellectual freedom to allow people to seek unchanging patterns in reality, signs of Being within Becoming. Islam persecuted any such aspirations. Even allegorical paintings about Being were not allowed.

The East did the opposite, applying reason to Becoming and intuition to Being.

  • Becoming. The East based its religion on Becoming, and this instilled a fundamental suspicion of any attempt to mathematically model the unfolding reality or seek absolute knowledge. Of course, reasoning about Becoming without an implicit belief in unchanging absolutes is not an easy task. In fact, it is so hard that one has no choice but to be imprecise and poetic, and of course that is exactly what Eastern religions were. (Think of Taoism.)

  • Being. How about applying intuition to Being? How can you go about experiencing Being directly, through the “heart” so to speak? Well, through non-verbal silent meditation of course! That is exactly what Eastern religions did. (Think of Buddhism.)

Why could the East not reason directly about Becoming in a formal fashion, like the West reasoned directly about Being using mathematics? Remember Galileo saying "Mathematics is the language in which God has written the universe." What would have been the corresponding statement for the East? In other words, what is the formal language of Becoming? It is computer science of course, which was born out of mathematics in the West around the 1930s.

Now you understand why the West was so lucky. Even if the East had managed to discover computer science first, it would have been useless for understanding Becoming, because without the actual hardware to run simulations, you cannot create computational models. A model needs to be run on something. It is not like a math theory in a book, waiting for you to play with it. Historically speaking, mathematics had to come first, because it is the cheaper, more basic technology. All you need is literally a pen, a paper and a trash bin.

Side Note: Here is a nerdy joke for you… The dean asks the head of the physics department to see him. “Why are you using so many resources? All those labs and experiments and whatnot; this is getting expensive! Why can’t you be more like mathematicians – they only need pens, paper, and a trash bin. Or philosophers – they only need pens and paper!”

But now is different. We have tremendous amounts of cheap computation and storage at our disposal, allowing us to finally crack the language of Becoming. Our entire economy is shifting from physical to digital, and our entire culture is shifting from space to time. An extraordinary period indeed!

It was never a coincidence that Chinese mathematicians chose to work in (and subsequently dominated) statistics, the most practical field within mathematics. (They are culturally oriented toward Becoming.) Now all these statisticians are turning into artificial intelligence experts, while the West is still being paranoid about the oncoming Singularity, the exponential rise of AI.

Why have the Japanese always loved robots while the West has always been afraid of them? Why is the adoption of digital technologies happening faster in the East? Why are the kids and their parents in the East less worried about being locked into digital screens? As we elaborated above, the answer is metaphysical. Differences in metaphysical frameworks (often inherited from religions) are akin to the hard-to-notice (but exceptionally consequential) differences in the low-level code sitting right above the hardware.

Now guess who will dominate the new digital era? Think of the big picture. Do not extrapolate from the recent past; think of the vast historical patterns.

I believe that people are made equal everywhere, and in the long run whoever is more zealous wins. The East is more zealous about Becoming than the West, and will therefore sooner or later dominate the digital era. Our kids will learn its languages and find its religious practices more attractive. (Meditation is already spreading like wildfire.) What is “cool” will change, and all these things will happen effortlessly, in a mindless fashion, due to the fundamental shift in Zeitgeist and the strong structural forces of economics.

Side Note: Remember, in Duality 4, we said that the East has an intrinsic tendency to regulate digital businesses rather than physical businesses. And here we just claimed that the East has an intrinsic passion for building digital businesses rather than physical businesses. Combining these two observations, we can predict that the East will unleash both greater energy and greater restraint in the digital domain. This actually makes a lot of sense, and is in line with the famous marketing slogan of the tyre manufacturer Pirelli: “Power is Nothing Without Control”

Will the pendulum eventually swing back? Will the cover pages again feature physical businesses as they used to do a decade ago? The answer is no. Virtualization is one of the main trends in evolution. Units of evolution are getting smarter and becoming increasingly more governed by information dynamics rather than energy dynamics. (Information is substrate independent. Hence the term “virtualization”.) Nothing can stop this trend, barring some temporary setbacks here and there.

It seems like the West has only two choices in the long run:

  1. It can go through a major religious overhaul and adopt a Becoming-oriented interpretation of Christianity, like that of Teilhard de Chardin.

  2. It can continue as is, and be remembered as the civilization that dominated the short intermediary period which began with the birth of mathematical modeling and ended with the birth of computational modeling. (Equivalently, one could say that the West dominated the industrial revolution and the East will dominate the digital revolution.)


If you liked this post, you will probably enjoy the older post Innovative vs Classical Businesses as well. (Note that digital does not mean innovative and physical does not mean classical. You can have a classical digital or an innovative physical business.)

hypothesis vs data driven science

Science progresses in a dualistic fashion. You can either generate a new hypothesis out of existing data and conduct science in a data-driven way, or generate new data for an existing hypothesis and conduct science in a hypothesis-driven way. For instance, when Kepler was looking at the astronomical data sets to come up with his laws of planetary motion, he was doing data-driven science. When Einstein came up with his theory of General Relativity and asked experimenters to verify the theory’s prediction for the anomalous rate of precession of the perihelion of Mercury's orbit, he was doing hypothesis-driven science.

Similarly, technology can be problem-driven (the counterpart of “hypothesis-driven” in science) or tool-driven (the counterpart of “data-driven” in science). When you start with a problem, you look for what kind of (existing or not-yet-existing) tools you can throw at the problem, in what kind of a combination. (This is similar to thinking about what kind of experiments you can do to generate relevant data to support a hypothesis.) Conversely, when you start with a tool, you try to find a use case where you can deploy it. (This is similar to starting off with a data set and digging around to see what kind of hypotheses you can extract out of it.) Tool-driven technology development is much more risky and stochastic. It is taboo for most technology companies, since investors do not like random tinkering and prefer funding problems with high potential economic value and entrepreneurs who “know” what they are doing.

Of course, new tools allow you to ask new kinds of questions of existing data sets. Hence, problem-driven technology (by developing new tools) leads to more data-driven science. And this is exactly what is happening now, at a massive scale. With the development of cheap cloud computing (and storage) and deep learning algorithms, scientists are equipped with some very powerful tools to attack old data sets, especially in complex domains like biology.


Higher Levels of Serendipity

One great advantage of data-driven science is that it involves tinkering and “not really knowing what you are doing”. This leads to fewer biases and more serendipitous connections, and thereby to the discovery of more transformative ideas and hitherto unknown interesting patterns.

Hypothesis-driven science has a direction from the beginning. Hence surprises are hard to come by, unless you have exceptionally creative intuitive capabilities. For instance, the theory of General Relativity was based on one such intuitive leap by Einstein. (There has not been such a great leap since then. It is extremely rare.) Quantum Mechanics on the other hand was literally forced on us by experimental data. It was so counterintuitive that people refused to believe it. All they could do was turn their intuition off and listen to the data.

Previously data sets were not huge, so scientists could literally eyeball them. Today this is no longer possible. That is why scientists now need computers, algorithms and statistical tools to help them decipher new patterns.

Governments do not give money to scientists so that they can tinker around and do whatever they want. So a scientist applying for a grant needs to know what he is doing. This forces everyone into a hypothesis-driven mode from the beginning and thereby leads to fewer transformative ideas in the long run. (Hat tip to Mehmet Toner for this point.)

Science and technology are polar opposite endeavors. Governments funding science the way investors fund technology is a major mistake, and also an important reason why today some of the most exciting science is being done inside closed private companies rather than open academic communities.


Less Democratic Landscape

There is another good reason why the best scientists are leaving academia. You need good quality data to do science within the data-driven paradigm, and since data is so easily monetizable, the largest data sets are being generated by private companies. So it is not surprising that the most cutting-edge research in fields like AI is being done inside companies like Google and Facebook, which also provide the necessary compute power to play around with these data sets.

While hypothesis generation gets better when it is conducted in a decentralized, open manner, the natural tendency of data is to be centralized under one roof, where it can be harmonized and maintained consistently at a high quality. As they say, “data has gravity”. Once you pass certain critical thresholds, data starts generating strong positive feedback effects and thereby attracts even more data. That is why investors love it. Using smart data strategies, technology companies can build a moat around themselves and render their business models a lot more defensible.

In a typical private company, what data scientists do is throw thousands of different neural networks at some massive internal data sets and simply observe which one gets the job done best. This of course is empiricism in its purest form, no different than blindly screening millions of compounds during a drug development process. As they say, just throw it against a wall and see if it sticks.
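Here is a minimal sketch of that blind screening, assuming scikit-learn and a synthetic stand-in for the “massive internal data set” (both are my assumptions; a real pipeline would differ in scale, not in spirit):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# A synthetic stand-in for a massive internal data set.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
best_score, best_config = -np.inf, None

# Pure empiricism: sample architectures at random, keep whatever sticks.
for _ in range(20):  # in practice this loop runs thousands of times
    config = {
        "hidden_layer_sizes": tuple(int(n) for n in rng.integers(8, 128, size=rng.integers(1, 4))),
        "alpha": 10.0 ** rng.uniform(-5, -1),
    }
    model = MLPClassifier(max_iter=300, random_state=0, **config).fit(X_train, y_train)
    score = model.score(X_val, y_val)
    if score > best_score:
        best_score, best_config = score, config

print(best_config, best_score)  # no hypothesis, just selection on a validation score
```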

This brings us to a major problem about big-data-driven science.


Lack of Deep Understanding

There is now a better way. Petabytes allow us to say: "Correlation is enough." We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.

Chris Anderson - The End of Theory

We cannot understand the complex machine learning models we are building. In fact, we train them the same way one trains a dog. That is why they are called black-box models. For instance, when the stock market experiences a flash crash, we blame the algorithms for getting into a stupid loop, but we never really understand why they do so.

Is there any problem with this state of affairs if these models get the job done, make good predictions and (even better) earn us money? Can scientists not adopt the same pragmatic attitude as technologists, focus on results only, content themselves with the successful manipulation of nature, and leave true understanding aside? Are the data sizes not already too huge for human comprehension anyway? Why do we expect machines to be able to explain their thought processes to us? Perhaps they are the beginnings of the formation of a higher-level life form, and we should learn to trust them about the activities they are better at than us?

Perhaps we have been under an illusion all along and our analytical models have never really penetrated that deep into nature anyway?

Closed analytic solutions are nice, but they are applicable only for simple configurations of reality. At best, they are toy models of simple systems. Physicists have known for centuries that the three-body problem or three dimensional Navier Stokes do not afford a closed form analytic solutions. This is why all calculations about the movement of planets in our solar system or turbulence in a fluid are all performed by numerical methods using computers.

Carlos E. Perez - The Delusion of Infinite Precision Numbers

Is it a surprise that as our understanding gets more complete, our equations become harder to solve?

To illustrate this point of view, we can recall that as the equations of physics become more fundamental, they become more difficult to solve. Thus the two-body problem of gravity (that of the motion of a binary star) is simple in Newtonian theory, but unsolvable in an exact manner in Einstein’s Theory. One might imagine that if one day the equations of a totally unified field are written, even the one-body problem will no longer have an exact solution!

Laurent Nottale - The Relativity of All Things (Page 305)

It seems like the entire history of science is a progressive approximation to an immense computational complexity via increasingly sophisticated (but nevertheless quite simplistic) analytical models. This trend obviously is not sustainable. At some point we should perhaps just stop theorizing and let the machines figure out the rest:

In new research accepted for publication in Chaos, they showed that improved predictions of chaotic systems like the Kuramoto-Sivashinsky equation become possible by hybridizing the data-driven, machine-learning approach and traditional model-based prediction. Ott sees this as a more likely avenue for improving weather prediction and similar efforts, since we don’t always have complete high-resolution data or perfect physical models. “What we should do is use the good knowledge that we have where we have it,” he said, “and if we have ignorance we should use the machine learning to fill in the gaps where the ignorance resides.”

Natalie Wolchover - Machine Learning’s ‘Amazing’ Ability to Predict Chaos

Statistical approaches like machine learning have often been criticized for being dumb. Noam Chomsky has been especially vocal about this:

You can also collect butterflies and make many observations. If you like butterflies, that's fine; but such work must not be confounded with research, which is concerned to discover explanatory principles.

- Noam Chomsky as quoted in Colorless Green Ideas Learn Furiously

But these criticisms are akin to calling reality itself dumb, since what we feed into the statistical models are basically virtualized fragments of reality. Analytical models conjure up abstract epiphenomena to explain phenomena, while statistical models use phenomena to explain phenomena and turn reality directly onto itself. (The reason why deep learning is so much more effective than its peers among machine learning models is that it is hierarchical, just like reality is.)

This brings us to the old dichotomy between facts and theories.


Facts vs Theories

Long before the computer scientists came into the scene, there were prominent humanists (and historians) fiercely defending fact against theory.

The ultimate goal would be to grasp that everything in the realm of fact is already theory... Let us not seek for something beyond the phenomena - they themselves are the theory.

- Johann Wolfgang von Goethe

Reality possesses a pyramid-like hierarchical structure. It is governed from the top by a few deep high-level laws, and manifested in its utmost complexity at the lowest phenomenological level. This means that there are two strategies you can employ to model phenomena.

  • Seek the simple. Blow your brains out, discover some deep laws and run simulations that can be mapped back to phenomena.

  • Bend the complexity back onto itself. Labor hard to accumulate enough phenomenological data and let the machines do the rote work.

One approach is not inherently superior to the other, and both are hard in their own ways. Deep theories are hard to find, and good quality facts (data) are hard to collect and curate in large quantities. Similarly, a theory-driven (mathematical) simulation is cheap to set up but expensive to run, while a data-driven (computational) simulation (of the same phenomena) is cheap to run but expensive to set up. In other words, while a data-driven simulation is parsimonious in time, a theory-driven simulation is parsimonious in space. (Good computational models satisfy a dual version of Occam’s Razor. They are heavy in size, with millions of parameters, but light to run.)
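A toy contrast in Python (my own illustrative construction, not the author's example): the theory-driven side is a one-line law that is tiny to store but pays a full numerical integration per query, while the data-driven side is a bulky precomputed table that answers each query almost for free.

```python
import numpy as np

# Theory-driven: the "model" is one line (dx/dt = -x), parsimonious in space,
# but every query pays the full cost of integrating the dynamics from scratch.
def simulate(x0, t_end, dt=1e-3):
    """Euler-integrate dx/dt = -x, a stand-in for a deep law."""
    x, t = x0, 0.0
    while t < t_end:
        x += dt * (-x)
        t += dt
    return x

# Data-driven: an expensive one-off setup (many runs stored in a table),
# parsimonious in time afterwards -- each query is a cheap interpolation.
grid = np.linspace(0.0, 5.0, 200)                    # costly setup...
table = np.array([simulate(1.0, t) for t in grid])   # ...done once, up front

def predict(t_end):
    return float(np.interp(t_end, grid, table))      # near-instant per query

print(simulate(1.0, 3.0), predict(3.0))  # both approximate e**-3 ~ 0.0498
```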

Some people try to mix the two philosophies, injecting our causal models into the machines to enjoy the best of both worlds. I believe that this approach is fundamentally mistaken, even if it proves to be fruitful in the short run. Rather than biasing the machines with our theories, we should just ask them to economize their own thought processes and thereby come up with their own internal causal models and theories. After all, abstraction is just a form of compression, and when we talk about causality we (in practice) mean causality as it fits into the human brain. In the actual universe, everything is completely interlinked with everything else, and causality diagrams are unfathomably complicated. Hence, we should be wary of pre-imposing our theories on machines whose intuitive powers will soon surpass ours.

Remember that, in biological evolution, the development of unconscious (intuitive) thought processes came before the development of conscious (rational) thought processes. It should be no different for the digital evolution.

Side Note: We suffered an AI winter for mistakenly trying to flip this order and asking machines to develop rational capabilities before developing intuitive capabilities. When a scientist comes up with a hypothesis, it is a simple, effable distillation of an unconscious intuition which is of an ineffable, complex statistical form. In other words, it is always “statistics first”. Sometimes the progression from the statistical to the causal takes place out in the open among a community of scientists (as happened in the smoking-causes-cancer research), but more often it just takes place inside the mind of a single scientist.


Continuing Role of the Scientist

Mohammed AlQuraishi, a researcher who studies protein folding, wrote an essay exploring a recent development in his field: the creation of a machine-learning model that can predict protein folds far more accurately than human researchers. AlQuraishi found himself lamenting the loss of theory over data, even as he sought to reconcile himself to it. “There’s far less prestige associated with conceptual papers or papers that provide some new analytical insight,” he said, in an interview. As machines make discovery faster, people may come to see theoreticians as extraneous, superfluous, and hopelessly behind the times. Knowledge about a particular area will be less treasured than expertise in the creation of machine-learning models that produce answers on that subject.

Jonathan Zittrain - The Hidden Costs of Automated Thinking

The role of scientists in the data-driven paradigm will obviously be different but not trivial. Today’s world-champions in chess are computer-human hybrids. We should expect the situation for science to be no different. AI is complementary to human intelligence and in some sense only amplifies the already existing IQ differences. After all, a machine-learning model is only as good as the intelligence of its creator.

He who loves practice without theory is like the sailor who boards ship without a rudder and compass and never knows where he may cast.

- Leonardo da Vinci

Artificial intelligence (at least in its today’s form) is like a baby. Either it can be spoon-fed data or it gorges on everything. But, as we know, what makes great minds great is what they choose not to consume. This is where the scientists come in.

Deciding what experiments to conduct and what data sets to use are no trivial tasks. Choosing which portion of reality to “virtualize” is an important judgment call. Hence all data efforts are inevitably hypothesis-laden and therefore non-trivially involve the scientist.

For 30 years quantitative investing started with a hypothesis, says a quant investor. Investors would test it against historical data and make a judgment as to whether it would continue to be useful. Now the order has been reversed. “We start with the data and look for a hypothesis,” he says.

Humans are not out of the picture entirely. Their role is to pick and choose which data to feed into the machine. “You have to tell the algorithm what data to look at,” says the same investor. “If you apply a machine-learning algorithm to too large a dataset often it tends to revert to a very simple strategy, like momentum.”

The Economist - March of the Machines

True, each data generation effort is hypothesis-laden and each scientist comes with a unique set of biases generating a unique set of judgment calls, but at the level of the society, these biases get eventually washed out through (structured) randomization via sociological mechanisms and historical contingencies. In other words, unlike the individual, the society as a whole operates in a non-hypothesis-laden fashion, and eventually figures out the right angle. The role (and the responsibility) of the scientist (and the scientific institutions) is to cut the length of this search period as short as possible by simply being smart about it, in a fashion that is not too different from how enzymes speed up chemical reactions by lowering activation energy costs. (A scientist’s biases are actually his strengths since they implicitly contain lessons from eons of evolutionary learning. See the side note below.)

Side Note: There is a huge misunderstanding that evolution progresses via chance alone. Pure randomization is a sign of zero learning. Evolution on the other hand learns over time and embeds this knowledge in all complexity levels, ranging all the way from genetic to cultural forms. As evolutionary entities become more complex, the search becomes smarter and the progress becomes faster. (This is how protein synthesis and folding happen incredibly fast within cells.) Only at the very beginning, in its simplest form, does evolution try out everything blindly. (Physics is so successful because its entities are so stupid and comparatively much easier to model.) In other words, the commonly raised argument against the possibility of evolution achieving so much via pure chance alone is correct. As mathematician Gregory Chaitin points out, “real evolution is not at all ergodic, since the space of all possible designs is much too immense for exhaustive search”.

Another venue where the scientists keep playing an important role is in transferring knowledge from one domain to another. Remember that there are two ways of solving hard problems: Diving into the vertical (technical) depths and venturing across horizontal (analogical) spaces. Machines are horrible at venturing horizontally precisely because they do not get to the gist of things. (This was the criticism of Noam Chomsky quoted above.)

Deep learning is kind of a turbocharged version of memorization. If you can memorize all that you need to know, that’s fine. But if you need to generalize to unusual circumstances, it’s not very good. Our view is that a lot of the field is selling a single hammer as if everything around it is a nail. People are trying to take deep learning, which is a perfectly fine tool, and use it for everything, which is perfectly inappropriate.

- Gary Marcus as quoted in Warning of an AI Winter


Trends Come and Go

Generally speaking, there is always a greater appetite for digging deeper for data when there is a dearth of ideas. (Extraction becomes more expensive as you dig deeper, as in mining operations.) Hence, the current trend of data-driven science is partially due to the fact that scientists themselves have run out of sensible falsifiable hypotheses. Once the hypothesis space becomes rich again, the pendulum will inevitably swing back. (Of course, who will be doing the exploration is another question. Perhaps it will be the machines, and we will be doing the dirty work of data collection for them.)

As mentioned before, data-driven science operates stochastically in a serendipitous fashion, and hypothesis-driven science operates deterministically in a directed fashion. Nature on the other hand loves to use both stochasticity and determinism together, since optimal dynamics reside - as usual - somewhere in the middle. (That is why there are tons of natural examples of structured randomness, such as Lévy flights. A minimal sketch follows below.) Hence we should learn to appreciate the complementarity between data-drivenness and hypothesis-drivenness, and embrace the duality as a whole rather than trying to break it.
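For concreteness, here is a minimal Python sketch of structured randomness (my own illustrative construction, with assumed parameters): a Lévy flight mixes many small local steps with rare long jumps, sitting between pure stochasticity and pure determinism.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps = 10_000

# Pure randomness: Gaussian steps (Brownian motion) -- no structure at all.
brownian = np.cumsum(rng.normal(size=(n_steps, 2)), axis=0)

# Structured randomness: heavy-tailed (Pareto) step lengths produce a Levy
# flight -- mostly small local moves, punctuated by rare long-range jumps.
angles = rng.uniform(0.0, 2.0 * np.pi, n_steps)
lengths = 1.0 + rng.pareto(1.5, n_steps)
steps = lengths[:, None] * np.column_stack([np.cos(angles), np.sin(angles)])
levy = np.cumsum(steps, axis=0)

# For the same number of steps, the Levy walker covers far more territory,
# which is why foraging animals (and good search strategies) exploit it.
print(np.linalg.norm(brownian[-1]), np.linalg.norm(levy[-1]))
```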


If you liked this post, you will also enjoy the older post Genius vs Wisdom where genius and wisdom are framed respectively as hypothesis-driven and data-driven concepts.

physics as study of ignorance

Contemporary physics is based on the following three main sets of principles:

  1. Variational Principles

  2. Statistical Principles

  3. Symmetry Principles

Various combinations of these principles led to the birth of the following fields:

  • Study of Classical Mechanics (1)

  • Study of Statistical Mechanics (2)

  • Study of Group and Representation Theory (3)

  • Study of Path Integrals (1 + 2)

  • Study of Gauge Theory (1 + 3)

  • Study of Critical Phenomena (2 + 3)

  • Study of Quantum Field Theory (1 + 2 + 3)
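For concreteness, here is one canonical textbook formula per principle (standard physics, not something derived in this post), plus the path integral as an example of a combination:

$$\delta S = 0, \qquad S = \int L(q, \dot{q}, t)\, dt \qquad \text{(1: variational)}$$

$$p_i = \frac{e^{-E_i / k_B T}}{Z}, \qquad Z = \sum_j e^{-E_j / k_B T} \qquad \text{(2: statistical)}$$

$$\text{continuous symmetry} \implies \partial_\mu j^\mu = 0 \qquad \text{(3: symmetry, via Noether's theorem)}$$

$$Z = \int \mathcal{D}[q]\, e^{\,i S[q]/\hbar} \qquad \text{(1 + 2: the path integral)}$$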

Notice that all three sets of principles are based on ignorances that arise from us being inside the structure we are trying to describe. 

  1. Variational Principles arise due to our inability to experience time as a continuum. (Path information is inaccessible.)

  2. Statistical Principles arise due to our inability to experience space as a continuum. (Coarse graining is inevitable.)

  3. Symmetry Principles arise due to our inability to experience spacetime as a whole.  (Transformations are undetectable.)

Since Quantum Field Theory is based on all three principles, it seems like most of the structure we see arises from these sets of ignorances themselves. From the hypothetical outside point of view of God, none of these ignorances is present, and therefore none of the entailed structures is present either.

The study of physics is not complete yet, but its historical progression suggests that its future depends on us discovering new aspects of our ignorances:

  1. Variational Principles were discovered in the 18th Century.

  2. Statistical Principles were discovered in the 19th Century.

  3. Symmetry Principles were discovered in the 20th Century.

The million dollar question is what principle we will discover in the 21st Century. Will it help us merge General Relativity with Quantum Field Theory or simply lead to the birth of brand new fields of study?

appeal of the outrageous

We should perhaps also add to this list of criteria the response from the famous mathematician John Conway to the question of what makes a great conjecture: “It should be outrageous.” An appealing conjecture is also somewhat ridiculous or fantastic, with unforeseen range and consequences. Ideally it combines components from distant domains that haven’t met before in a single statement, like the surprising ingredients of a signature dish.

Robbert Dijkgraaf - The Subtle Art of the Mathematical Conjecture

We are used to click-bait news with outrageous titles that incite our curiosity. This may look like a one-off ugly phenomenon, but it is not. As consumers of information, we display the same behavior everywhere. This is forcing even scientists to produce counter-intuitive papers with outrageous titles so that they can attract the attention of the press. (No wonder most published research is false!)

Generally speaking, people do not immediately recognize the importance of an emerging matter. Even in mathematics, you need to induce a shock to spur activity and convince others to join you in the exploration of a new idea.

In 1872, Karl Weierstrass astounded the mathematical world by giving an example of a function that is continuous at every point but whose derivative does not exist anywhere. Such a function defied geometric intuition about curves and tangent lines, and consequently spurred much deeper investigations into the concepts of real analysis.

Robert G. Bartle &‎ Donald R. Sherbert - Introduction to Real Analysis (Page 163)

Similar to the above example, differential topology became a subject in its own right and attracted a lot of attention only after John Milnor shocked the world by showing that the 7-dimensional sphere admits exactly 28 oriented diffeomorphism classes of differentiable structures. (Why 28, right? It actually marks the beginning of one of the most amazing number sequences in mathematics.)

charisma and meaning as rapid expansions

Charisma is a geometric phenomenon, generated via a rapid spatiotemporal expansion of the self within physical space.

Next time you enter that Japanese restaurant enter the place as if you own it and eat that edamame like you have been eating it for the last one hundred years.


Meaning is a topological phenomenon, generated via a rapid spatiotemporal expansion of the self within the social graph.

The crusader's life gains purpose by suborning his heart and soul to a cause greater than himself; the traditionalist finds the transcendent by linking her life to traditions whose reach extend far past herself.

Tanner Greer - Questing for Transcendence

necessity of dualities

All truths lie between two opposite positions. All dramas unfold between two opposing forces. Dualities are both ubiquitous and fundamental. They shape both our mental and physical worlds.

Here are some examples:

Mental

objective | subjective
rational | emotional
conscious | unconscious
reductive | inductive
absolute | relative
positive | negative
good | evil
beautiful | ugly
masculine | feminine


Physical

deterministic | indeterministic
continuous | discrete
actual | potential
necessary | contingent
inside | outside
infinite | finite
global | local
stable | unstable
reversible | irreversible

Notice that even the above split between the two groups itself is an example of duality.

These dualities arise as an epistemological byproduct of the method of analytical inquiry. That is why they are so thoroughly infused into the languages we use to describe the world around us.

Each relatum constitutive of dipolar conceptual pairs is always contextualized by both the other relatum and the relation as a whole, such that neither the relata (the parts) nor the relation (the whole) can be adequately or meaningfully defined apart from their mutual reference. It is impossible, therefore, to conceptualize one principle in a dipolar pair in abstraction from its counterpart principle. Neither principle can be conceived as "more fundamental than," or "wholly derivative of" the other.

Mutually implicative fundamental principles always find their exemplification in both the conceptual and physical features of experience. One cannot, for example, define either positive or negative numbers apart from their mutual implication; nor can one characterize either pole of a magnet without necessary reference to both its counterpart and the two poles in relation - i.e. the magnet itself. Without this double reference, neither the definiendum nor the definiens relative to the definition of either pole can adequately signify its meaning; neither pole can be understood in complete abstraction from the other.

- Epperson & Zafiris - Foundations of Relational Realism (Page 4)


Various lines of Eastern religious and philosophical thinkers intuited how languages can hide underlying unity by artificially superimposing conceptual dualities (the primary one being the almighty object-subject duality) and posited the nondual wholeness of nature several thousand years before the advent of quantum mechanics. (The analytical route to enlightenment is always longer than the intuitive route.)

Western philosophy on the other hand

  • ignored the mutually implicative nature of all dualities and denied the inaccessibility of the wholeness of nature to analytical inquiry.

  • got fooled by the precision of mathematics which is after all just another language invented by human beings.

  • confused partial control with understanding and engineering success with ontological precision. (Understanding is a binary parameter, meaning that either you understand something or you do not. Control on the other hand is a continuous parameter, meaning that you can have partial control over something.)

As a result, Western philosophers mistook representation for reality and tried to confine truth to one end of each duality in order to create a unity of representation matching the unity of reality.

Side Note: Hegel was an exception. Like Buddha, he too saw dualities as artificial byproducts of analysis, but unlike him, he suggested that one should transcend them via synthesis. In other words, for Buddha unity resided below and for Hegel unity resided above. (Buddha wanted to peel away complexity to its simplest core, while Hegel wanted to embrace complexity in its entirety.) While Buddha stopped theorizing and started meditating instead, Hegel saw salvation in higher levels of abstraction via alternating chains of analyses and syntheses. (Buddha wanted to turn off cognition altogether, while Hegel wanted to turn up cognition full-blast.) Perhaps at the end of the day they were both preaching the same thing. After all, at the highest level of abstraction, thinking probably halts and emptiness reigns.

It was first the social thinkers who woke up and revolted against the grand narratives built on such discriminative pursuits of unity. There was just way too much politically and ethically at stake for them. The result was an overreaction, replacing unity with multiplicity and considering all points of views as valid. In other words, the pendulum swung the other way and Western philosophy jumped from one state of deep confusion into another. In fact, this time around the situation was even worse since there was an accompanying deep sense of insecurity as well.

The cacophony spread into hard sciences like physics too. Grand narratives got abandoned in favor of instrumental pragmatism. Generations of new physicists got raised as technicians who basically had no clue about the foundations of their discipline. The most prominent of them could even publicly make an incredibly naive claim such as “something can spontaneously arise from nothing through a quantum fluctuation” and position it as a non-philosophical and non-religious alternative to existing creation myths.

Just to be clear, I am not trying to argue here in favor of Eastern holistic philosophies over Western analytic philosophies. I am just saying that the analytic approach requires us to embrace dualities as two-sided entities, including the duality between the holistic and analytic approaches themselves.


Politics experienced a similar swing from conservatism (which hailed unity) towards liberalism (which hailed multiplicity). During this transition, all dualities and boundaries got dissolved in the name of more inclusion and equality. The everlasting dynamism (and the subsequent wisdom) of dipolar conceptual pairs (think of magnetic poles) got killed off in favor of an unsustainable burst in the number of ontologies.

Ironically, liberalism resulted in more sameness in the long run. For instance, the traditional assignment of roles and division of tasks between father and mother got replaced by equal parenting principles applied by genderless parents. Of course, upon the dissolution of the gender dipolarity, the number of parents one can have became flexible as well. Having one parent became as natural as having two, three or four. In other words, parenting became a community affair in its truest sense.

 
[Figure: Duality.png]
 

The even greater irony was that liberalism itself forgot that it represented one extreme end of another duality. It was in a sense a self-defeating doctrine that aimed to destroy all discriminative pursuits of unity except for that of itself. (The only way to “resolve” this paradox is to introduce a conceptual hierarchy among dualities where the higher ones can be used to destroy the lower ones, in a fashion that is similar to how mathematicians deal with Russell’s paradox in set theory.)


Of course, at some point the pendulum will swing back to the pursuit of unity again. But while we swing back and forth between unity and multiplicity, we keep skipping over the only sources of representational truth, namely the dualities themselves. For some reason we are extremely uncomfortable with the fact that the world can only be represented via mutually implicative principles. We find “one” and “infinity” tolerable but “two” arbitrary and therefore abhorrent. (The prevalence of “two” in mathematics and “three” in physics was mentioned in a previous blog post.)

I am personally obsessed with “two”. I look out for dualities everywhere and share the interesting finds here on my blog. In fact, I go even further and try to build my entire life on dualities whose two ends mutually enhance each other every time I visit them.

We should not collapse dualities into unities for the sake of satisfying our sense of belonging. We need to counteract this dangerous sociological tendency using our common sense at the individual level. Choosing one side and joining the groupthink is the easy way out. We should instead strive to carve out our identities by consciously sampling from both sides. In other words, when it comes to complex matters, we should embrace the dualities as a whole and not let them split us apart. (Remember, if something works very well, its dual should also work very well. However, if something is true, its dual has to be wrong. This is exactly what separates theory from reality.)

Of course, it is easy to talk about these matters, but who said that pursuit of truth would be easy?

Perhaps there is no pursuit to speak of unless one is pre-committed to a side, and perhaps swinging back and forth between the two ends of a dualism is the only way nature can maintain its neutrality without sacrificing its dynamism. (After all, there is no current without a polarity in the first place.)

Perhaps we should just model our logic after reality (like Hegel wanted to) rather than expect reality to conform to our logic. (In this way we can have our cake and eat it too!)

thoughts on cybersecurity business

  • Cybersecurity and drug development are similar in the sense that neither can harbor deep, long-lived productification processes. Problems are dynamic: enemies eventually evolve protection against productified attacks.

  • Cybersecurity and number theory are similar in the sense that they contain the hardest problems of their respective fields and are not built on a generally-agreed-upon core body of knowledge. Nothing meaningful is accessible to beginner-level students since all sorts of techniques from other subfields are utilized to crack problems.

Hence, in its essence, cybersecurity is an elite services business. Anyone claiming the opposite (that it is a product business, that it does not necessitate recruiting the best minds of the industry) is selling a sense of security, not real security.

thoughts on abstraction

Why is it always the case that the formulation of deeper physics requires more abstract mathematics? Why does understanding get better as it zooms out?

Side Note: Notice that there are two ways of zooming out. First, you can abstract by ignoring details. This is great for applications, but not good for understanding. It operates more like chunking, coarse-graining, forming equivalence classes etc. You end up sacrificing accuracy for the sake of practicality. Second, you can abstract in the sense of finding an underlying structure that allows you to see two phenomena as different manifestations of the same phenomenon. This is the meaning that we will be using throughout this blog post. While coarse-graining is easy, discovering an underlying structure is hard. You need to understand the specificity of a phenomenon which you normally consider to be general.
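To make the second sense concrete, here is a minimal sketch (my own toy example, not from the text): integer addition and string concatenation look like different phenomena, but both are manifestations of one underlying structure, namely a monoid (an associative operation with an identity element).

```python
# Toy example: one abstract structure, two concrete manifestations.

def fold(op, identity, items):
    """Combine items using any monoid (op, identity)."""
    total = identity
    for item in items:
        total = op(total, item)
    return total

combine = lambda a, b: a + b  # "+" is addition for ints, concatenation for strings

print(fold(combine, 0, [1, 2, 3, 4]))          # 10
print(fold(combine, "", ["ab", "cd", "ef"]))   # abcdef
```

The function `fold` knows nothing about numbers or strings; it only knows the underlying structure. That is what the second kind of abstraction buys you.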

For instance, a lot of people are dissatisfied with the current formulation of quantum physics, blaming it for being too instrumental. Yes, the math is powerful. Yes, the predictions turn out to be correct. But the mathematical machinery (function spaces etc.) feels alien, even after one gets used to it over time. Or compare the down-to-earth Feynman diagrams with the amplituhedron theory... Again, you have a case where a stronger and more abstract beast is posited to dethrone a multitude of earthlings.

Is the alienness a price we have to pay for digging deeper? The answer is unfortunately yes. But this should not be surprising at all:

  • We should not expect to be able to explain deeper physics (which is so removed from our daily lives) using basic mathematics inspired by mundane physical phenomena. Abstraction gives us the necessary elbow room to explore such far-removed realities.

  • You can use the abstract to explain the specific, but you cannot proceed the other way around. Hence, as you understand more, you inevitably need to go higher up in abstraction. For instance, you may hope that a concept as simple as the notion of a division algebra will be powerful enough to explain all of physics, but you will sooner or later be gravely disappointed. There is probably a deeper truth lurking behind such a concrete pattern.



Abstraction as Compression

The simplicities of natural laws arise through the complexities of the languages we use for their expression.

- Eugene Wigner

That the simplest theory is best, means that we should pick the smallest program that explains a given set of data. Furthermore, if the theory is the same size as the data, then it is useless, because there is always a theory that is the same size as the data that it explains. In other words, a theory must be a compression of the data, and the greater the compression, the better the theory. Explanations are compressions, comprehension is compression!

- Gregory Chaitin - Metaphysics, Metamathematics and Metabiology

We cannot encode more without going more abstract. This is a fundamental feature of the human brain. Either you have complex patterns based on basic math or you have simple patterns based on abstract math. In other words, complexity is either apparent or hidden, never gotten rid of. (i.e. There is no loss of information.) By replacing one source of cognitive strain (complexity) with another source of cognitive strain (abstraction), we can lift our analysis to higher-level complexities.
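As a toy rendering of Chaitin's point (my own example, not his), consider a short generating rule versus a verbatim copy of the data it generates:

```python
# "Comprehension is compression": a rule that generates the data is a
# better theory of it than a copy that merely restates it.

data = [n * n for n in range(1, 10001)]  # observations: 10,000 squares

verbatim_theory = repr(data)                            # as big as the data itself
compressed_theory = "[n * n for n in range(1, 10001)]"  # the generating rule

print(len(verbatim_theory))    # tens of thousands of characters
print(len(compressed_theory))  # a few dozen characters

# Both "theories" reproduce the observations; only the second compresses them.
assert eval(compressed_theory) == data
```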

In this sense, progress in physics is destined to be of an unsatisfactory nature. Our theories will keep getting more abstract (and difficult) at each successive information compression. 

Don't think of this as a human tragedy though! Even machines will need abstract mathematics to understand deeper physics, because they too will be working under resource constraints. No matter how much more energy and resources you summon, the task of simulating a faithful copy of the universe will always require more.

As Bransford points out, people rarely remember written or spoken material word for word. When asked to reproduce it, they resort to paraphrase, which suggests that they were able to store the meaning of the material rather than making a verbatim copy of each sentence in the mind. We forget the surface structure, but retain the abstract relationships contained in the deep structure.

- Jeremy Campbell - Grammatical Man (Page 219)

Depending on context, category-theoretical techniques can yield shorter proofs than set-theoretical techniques, and vice versa. Hence, a machine that can sense when to switch between these two languages can probe the vast space of all true theories faster. Of course, you will still need human aid (enhanced with machine learning algorithms) to discern which theories are interesting and which are not.

Abstraction is probably used by our minds as well, allowing them to decrease the number of neurons used without sacrificing explanatory power.

Rolnick and Max Tegmark of the Massachusetts Institute of Technology proved that by increasing depth and decreasing width, you can perform the same functions with exponentially fewer neurons. They showed that if the situation you’re modeling has 100 input variables, you can get the same reliability using either 2^100 neurons in one layer or just 2^10 neurons spread over two layers. They found that there is power in taking small pieces and combining them at greater levels of abstraction instead of attempting to capture all levels of abstraction at once.

“The notion of depth in a neural network is linked to the idea that you can express something complicated by doing many simple things in sequence,” Rolnick said. “It’s like an assembly line.”

- Foundations Built for a General Theory of Neural Networks (Kevin Hartnett)

In a way, the success of neural network models with increased depth reflects the hierarchical aspects of the phenomena themselves. We end up mirroring nature more closely as we try to economize our models.
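Here is the bookkeeping behind the quoted figures (just a tally that echoes the numbers above, not the Rolnick-Tegmark proof itself):

```python
# Depth vs width: the same 100-variable function, realized with
# exponentially fewer neurons when they are spread over two layers.

N_INPUTS = 100

shallow = 2 ** N_INPUTS  # neurons in a single hidden layer
deep = 2 ** 10           # neurons spread over two layers

print(f"one layer:  {shallow:.3e} neurons")   # ~1.268e+30
print(f"two layers: {deep} neurons")          # 1024
print(f"savings:    {shallow // deep:.3e}x")  # ~1.238e+27
```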


Abstraction as Unlearning

Abstraction is not hard for technical reasons. (On the contrary, abstract things are easier to manipulate due to their greater simplicity.) It is hard because it involves unlearning. (That is why people who are better at forgetting are also better at abstracting.)

Side Note: Originality of the generalist is artistic in nature and lies in the intuition of the right definitions. Originality of the specialist is technical in nature and lies in the invention of the right proof techniques.

Globally, unlearning can be viewed as the Herculean struggle to go back to the tabula rasa state of a beginner's mind. (In some sense, what takes a baby a few months to learn takes humanity hundreds of years to unlearn.) We discard one by one what has been useful in manipulating the world in favor of getting closer to the truth.

Here are some beautiful observations by a physicist about the cognitive development of his own child:

My 2-year old’s insight into quantum gravity. If relative realism is right then ‘physical reality’ is what we experience as a consequence of looking at the world in a certain way, probing deeper and deeper into more and more general theories of physics as we have done historically (arriving by now at two great theories, quantum and gravity) should be a matter of letting go of more and more assumptions about the physical world until we arrive at the most general theory possible. If so then we should also be able to study a single baby, born surely with very little by way of assumptions about physics, and see where and why each assumption is taken on. Although Piaget has itemized many key steps in child development, his analysis is surely not about the fundamental steps at the foundation of theoretical physics. Instead, I can only offer my own anecdotal observations.

Age 11 months: loves to empty a container, as soon as empty fills it, as soon as full empties it. This is the basic mechanism of waves (two competing urges out of phase leading to oscillation).

Age 12-17 months: puts something in drawer, closes it, opens it to see if it is still there. Does not assume it would still be there. This is a quantum way of thinking. It’s only after repeatedly finding it there that she eventually grows to accept classical logic as a useful shortcut (as it is in this situation).

Age 19 months: comes home every day with mother, waves up to dad cooking in the kitchen from the yard. One day dad is carrying her. Still points up to kitchen saying ‘daddy up there in the kitchen’. Dad says no, daddy is here. She says ‘another daddy’ and is quite content with that. Another occasion, her aunt Sarah sits in front of her and talks to her on my mobile. When asked, Juliette declares the person speaking to her ‘another auntie Sarah’. This means that at this age Juliette’s logic is still quantum logic in which someone can happily be in two places at the same time.

Age 15 months (until the present): completely unwilling to shortcut a lego construction by reusing a group of blocks, insists on taking the bits fully apart and then building from scratch. Likewise always insists to read a book from its very first page (including all the front matter). I see this as part of her taking a creative control over her world.

Age 20-22 months: very able to express herself in the third person ‘Juliette is holding a spoon’ but finds it very hard to learn about pronouns especially ‘I’. Masters ‘my’ first but overuses it: ‘my do it’. Takes a long time to master ‘I’ and ‘you’ correctly. This shows that an absolute coordinate-invariant world view is much more natural than a relative one based on coordinate system in which ‘I’ and ‘you’ change meaning depending on who is speaking. This is the key insight of General Relativity that coordinates depend on a coordinate system and carry no meaning of themselves, but they nevertheless refer to an absolute geometry independent of the coordinate system. Actually, once you get used to the absolute reference ‘Juliette is doing this, dad wants to do that etc’ it’s actually much more natural than the confusing ‘I’ and ‘you’ and as a parent I carried on using it far past the time that I needed to. In the same way it’s actually much easier to do and teach differential geometry in absolute coordinate-free terms than the way taught in most physics books.

Age 24 months: until this age she did not understand the concept of time. At least it was impossible to do a bargain with her like ‘if you do this now, we will go to the playground tomorrow’ (but you could bargain with something immediate). She understood ‘later’ as ‘now’.

Age 29 months: quite able to draw a minor squiggle on a bit of paper and say ‘look a face’ and then run with that in her game-play. In other words, very capable of abstract substitutions and accepting definitions as per pure mathematics. At the same time pedantic, does not accept metaphor (‘you are a lion’ elicits ‘no, I’m me’) but is fine with simile, ‘is like’, ‘is pretending to be’.

Age 31 months: understands letters and the concept of a word as a line of letters but sometimes refuses to read them from left to right, insisting on the other way. Also, for a time after one such occasion insisted on having her books read from last page back, turning back as the ‘next page’. I interpret this as her natural awareness of parity and her right to demand to do it her own way.

Age 33 months (current): Still totally blank on ‘why’ questions, does not understand this concept. ‘How’ and ‘what’ are no problem. Presumably this is because in childhood the focus is on building up a strong perception of reality, taking on assumptions without question and as quickly as possible, as it were drinking in the world.

... and just in the last few days: remarked ‘oh, going up’ for the deceleration at the end of going down in an elevator, ‘down and a little bit up’ as she explained. And pulling out of my parking spot insisted that ‘the other cars are going away’. Neither observation was prompted in any way. This tells me that relativity can be taught at preschool.

- Algebraic Approach to Quantum Gravity I: Relative Realism (S. Majid)


Abstraction for Survival

The idea, according to research in Psychology of Aesthetics, Creativity, and the Arts, is that thinking about the future encourages people to think more abstractly—presumably becoming more receptive to non-representational art.

- How to Choose Wisely (Tom Vanderbilt)

Why do some people (like me) get deeply attracted to abstract subjects (like Category Theory)?

One of the reasons could be related to the point made above. Abstract things have higher chances of survival and staying relevant because they are less likely to be affected by the changes unfolding through time. (Similarly, in the words of Morgan Housel, "the further back in history you look, the more general your takeaways should be.") Hence, if you have a hunger for timelessness or a worry about becoming outdated, then you will be naturally inclined to move up the abstraction chain. (No wonder I am also obsessed with the notion of time.)

Side Note: The more abstract the subject, the less the community around it is willing to let you attach your name to your new discoveries. Why? Because the half-life of discoveries at higher levels of abstraction is much longer and therefore your name will live on for a much longer period of time. (i.e. It makes sense to be prudent.) After being trained in mathematics for so many years, I was shocked to see how easily researchers in other fields could “arrogantly” attach their names to basic findings. Later I realized that this behavior was not out of arrogance. These fields were so far away from truth (i.e. operating at very low levels of abstraction) that the half-life of discoveries was very short. If you wanted to attach your name to a discovery, mathematics had a high-risk-high-return pay-off structure, while these other fields had a low-risk-low-return structure.

But the higher you move up in the abstraction chain, the harder it becomes for you to innovate usefully. There is less room to play around since the objects of study have much fewer properties. Most of the meaningful ideas have already been fleshed out by others who came before you.

In other words, in the realm of ideas, abstraction acts as a lever between the probability of longevity and the probability of success. If you aim for a higher probability of longevity, then you need to accept a lower probability of success.

That is why abstract subjects are unsuitable for university environments. The pressure of the "publish or perish" mentality pushes PhD students towards quick, riskless, incremental research. Abstract subjects, on the other hand, require risky, innovative research which may take a long time to unfold and result in nothing publishable.

Now you may be wondering whether the discussion in the previous section is in conflict with the discussion here. How can abstraction be both a process of unlearning and a means for survival? Is not the evolutionary purpose of learning to increase the probability of survival? I would say that it all depends on your time horizon. To survive the immediate future, you need to learn how your local environment operates and truth is not your primary concern. But as your time horizon expands into infinity, what is useful and what is true become indistinguishable, as your environment shuffles through all allowed possibilities.

biology as computation

If the 20th century was the century of physics, the 21st century will be the century of biology. While combustion, electricity and nuclear power defined scientific advance in the last century, the new biology of genome research - which will provide the complete genetic blueprint of a species, including the human species - will define the next.

- Craig Venter & Daniel Cohen - The Century of Biology

It took 15 years for technology to catch up with this audacious vision that was articulated in 2004. Investors who followed the pioneers got severely burned by the first hype cycle, just like those who got wiped out by the dot-com bubble.

But now the real cycle is kicking in. The cost of sequencing, storing and analyzing genomes has dropped dramatically. Nations are finally initiating population-wide genetics studies to jump-start their local genomic research programs. Regulatory bodies are embracing the new paradigm, changing their standards, approving new gene therapies, curating large public datasets and breaking data silos. Pharmaceutical companies and new biotech startups are flocking in droves to grab a piece of the action. Terminal patients are finding new hope in precision medicine. Consumers are getting accustomed to clinical genomic diagnostics. Popular culture is picking up as well. Our imagination is being rekindled. Skepticism from the first bust is wearing off as more and more success stories pile up.

There is something much deeper going on too. It is difficult to articulate, but let me give it a try.

Mathematics did a tremendous job at explaining physical phenomena. It did so well that all other academic disciplines are still burning with physics envy. As the dust settled and our understanding of physics got increasingly more abstract, we realized something more, something that is downright crazy: Physics seems to be just mathematics and nothing else. (This merits further elaboration of course, but I will refrain from doing so.)

What about biology? Mathematics could not even scratch its surface. Computer science on the other hand proved to be wondrously useful, especially after our data storage and analytics capabilities passed a certain threshold.

Although currently a gigantic subject on its own, at its foundations computer science is nothing but constructive mathematics with space and time constraints. Note that one cannot even formulate a well-defined notion of complexity without such constraints. For physics, complexity is a bug, not a feature, but for biology it is the most fundamental feature. Hence it is not a surprise that mathematics is so useless at explaining biological phenomena.
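One way to see why complexity needs such constraints (a sketch of mine, not from any source quoted here): the two procedures below compute exactly the same function, so only a resource count - steps taken - can distinguish them. Complexity is a property of procedures under a cost model, not of input-output behavior alone.

```python
# Two procedures for the same function: membership in a sorted list.
# Without a step count they are indistinguishable; with one, they
# occupy different complexity classes.

def linear_search(xs, target):
    """Scan left to right: O(n) comparisons."""
    steps = 0
    for x in xs:
        steps += 1
        if x == target:
            return True, steps
    return False, steps

def binary_search(xs, target):
    """Repeatedly halve a sorted list: O(log n) comparisons."""
    steps, lo, hi = 0, 0, len(xs) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return True, steps
        elif xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, steps

xs = list(range(1_000_000))
print(linear_search(xs, 999_999))  # (True, 1000000)
print(binary_search(xs, 999_999))  # (True, ~20)
```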

The fact that analogies between computer science and biology are piling up gives me the feeling that we will soon (within this century) realize that biology and computer science are really just the same subject.

This may sound outrageous today, but that is primarily because computer science is still such a young subject. Just like physics converged to mathematics over time, computer science will converge to biology. (The younger subject converges to the older subject. That is why you should always pay attention when a master of the older subject has something to say about the younger, converging subject.)

The breakthrough moment will happen when computer scientists become capable of exploiting the physicality of information itself, just like biology does. After all, hardware is just frozen software, and information itself is something physical that can change shape and exhibit structural functionalities. Today we freeze software into hardware because we do not have any other means of control. In the future, we will learn how to exert geometric control and thereby push evolution into a new phase that exhibits even more teleological tendencies.

[Figure: A visualization of the AlexNet deep neural network by Graphcore]


If physics is mathematics and biology is computer science, what is chemistry then?

Chemistry seems to be an ugly chimera. It can be thought of as the study of either complicated physical states or failed biological states. (Hat tip to Deniz Kural for the latter suggestion.) In other words, it is the collection of all the degenerate in-between phenomena. Perhaps this is the reason why it does not offer any deep insights, while physics and biology are philosophically so rich.

a visual affair

We vastly overvalue visual input over other sources of sensory input since most of our bandwidth is devoted to vision:

[Figure. Source: David McCandless - The Beauty of Data Visualization (The small white corner represents the total bandwidth that we can actually be aware of.)]

This bias infiltrates both aesthetics and science:

  • The set of people you find beautiful will change drastically if you lose your eyesight. (Get a full body massage and you will see what I mean.)

  • We explain auditory phenomena in terms of mathematical metaphors that burgeoned out of visual inputs. There are no mathematical metaphors with auditory origins, and therefore no scientific explanations of visual phenomena in terms of auditory expressions. Rationality is a strictly visual affair. In fact, the word "idea" has etymological roots going back to the Greek idein, "to see". (No wonder deep neural networks mimicking the structure of our visual system have become so successful in machine learning challenges.)