definitions as explanations

Of course, since we have a definition for truth and since every definition enables us to replace the definiendum by its definiens, an elimination of the term "true" in its semantic sense is always theoretically possible... Some people have therefore urged that the term "true" in the semantic sense can always be eliminated, and that for this reason the semantic conception of truth is altogether sterile and useless... If, however, anyone continues to urge that — because of the theoretical possibility of eliminating the word "true" on the basis of its definition — the concept of truth is sterile, he must accept the further conclusion that all defined notions are sterile. But this outcome is so absurd and so unsound historically that any comment on it is unnecessary. In fact, I am rather inclined to agree with those who maintain that the moments of greatest creative advancement in science frequently coincide with the introduction of new notions by means of definition.

Tarski - The Semantic Conception of Truth

Definitions form the building blocks of theorems. If you can pinpoint the correct ones, then you are more likely to get interesting deductions. That is why good definitions advance science.

Definitions by themselves have no intrinsic value. They are nothing but labels attached to some collection of mathematical sentences and physical phenomena. That is why a computer will never find the introduction of a new definition illuminating.

Definitions are not explanations, but the good ones can give the sensation of an explanation. For instance, more than half of this Wikipedia article consists of definitions. Nevertheless it is extremely illuminating!

ideal researcher

Publishing thousands of pages over one's lifetime in numerous books and technical articles should not be the aim. The goal should be coming up with an original, simple, beautiful and groundbreaking idea that can be explained to a smart high-school student in a couple of hours.

(Of course, in today's academic environment, which is ruled by the principle "publish or perish", this would amount to career suicide. You can not risk not publishing anything!)

Coming up with one such brilliant idea and then leaving the scene is even more ideal. There is something aesthetic about letting your name be historically associated with only a single idea. You have discovered the peak of a large mountain after a rough (but very enjoyable) climb. Now the other climbers (i.e. researchers) can explore the various pathways from this peak to other mountains. You have shown them the way to the top, and unfolding the implications of your finding should be courteously left to others. If you continue hiking around in the hope of another peak, you will be exhibiting the territorial hunger of a savage. It is better to be modest. Besides, there is always the risk of being forced to end your research career in an anticlimactic fashion. Legends are not made that way. You are better off leaving the stage while you are still at the top.

individualistic concepts

Some words are suitable for use only at the level of individuals. Their application at the societal level always involves an implicit personification of society.

Blame. You can blame somebody for an undesirable outcome. But blaming a class of people or a nation for an outcome does not work. In such cases we look for a leader with a there-must-be-somebody-behind-all-this attitude. You can not punish everyone, even if you know for sure that the undesired outcome was a spontaneous result of individual-level decision making. (i.e. There was no leader directing the crowd.)

Madness. As Nietzsche said, madness is the exception in individuals but the rule in crowds. Hence it does not make much sense to say that a crowd is behaving madly. You do not expect rationality and consistency from crowds anyway.

many-valued logic

In standard logic there are only two truth values: 0 and 1. Here 0 stands for "False" and 1 stands for "True".

In quantum mechanics, due to the prevailing uncertainty, the truthfulness of most statements is simply unknown. In computer science, not all programs terminate on every input. In other words, viewed as functions, programs are undefined at some points of their domain. Hence, again, some questions regarding the behaviour of your program may have no answer. In other words, some statements will be neither True nor False.

Among many other motivations, these are two reasons why a need arose for a three-valued logic whose truth values consist of 0, 1/2 and 1. Here 1/2 stands for "Undefined".

Now it takes only a small leap of imagination to jump to a more general logical system with uncountably many truth values: [0,1], namely all the real numbers between 0 and 1. How should these varying degrees of truth be interpreted?
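To make the jump concrete, here is a minimal sketch of how such truth values can be combined. The connectives follow the common min/max/1-x convention used in Kleene- and Łukasiewicz-style logics; other many-valued systems define them differently, so treat this as one illustrative choice rather than the definition.

```python
# Illustrative many-valued connectives (min/max/1-x convention); other
# many-valued logics define conjunction, disjunction and negation differently.
def NOT(p):
    return 1 - p

def AND(p, q):
    return min(p, q)

def OR(p, q):
    return max(p, q)

# Three-valued case: truth values restricted to {0, 1/2, 1}.
for p in (0, 0.5, 1):
    for q in (0, 0.5, 1):
        print(p, q, "AND:", AND(p, q), "OR:", OR(p, q), "NOT p:", NOT(p))

# Infinite-valued case: any truth value in [0, 1] is allowed.
print(AND(0.3, 0.8), OR(0.3, 0.8), NOT(0.3))
```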

One interpretation is as follows. If the truth value of a statement P is greater than 1/2, then P is more likely to be true than false. In other words, proximity to 0 or 1 indicates proximity to "being false" or "being true".

What is the problem with this interpretation?

Say you have done an experiment, but the results are somewhat conflicting and therefore do not allow you to conclude whether P is true or false. Nevertheless you claim that P is more likely to be true than false. If someone asks you why, you say that "there is more evidence in favour of the truth of P."

But what does it mean to have "more" evidence for something? How do you measure evidence? There is obviously also some evidence for ~P. (Otherwise you would have concluded that P is true.) What makes this evidence less worthwhile? Of course, the link between "evidence for P" and P, and the link between "evidence for ~P" and ~P, have to be tenuous. (Otherwise the totality of all the evidence would be inconsistent.) Hence one inevitably has to look at the contents of these argumentative links and decide which one has more strength. These measurements and decisions are obviously all subjective. Moreover, with the arrival of new evidence you may change your opinion regarding the truth value of P.

If the state of affairs is that dynamic and vague, why assign a truth value to P at all? The truth value should not be a malleable quantity. (The word itself has static underpinnings.) Until a decisive experiment is made, it should simply not exist (or be undefined, so to speak). Here is an exemplary attitude from an experimental physicist:

How has the universe’s expansion, and hence the influence of dark energy, changed since the Big Bang?

For cosmologists there’s this interesting moment in the very, very early universe—10⁻³⁵ seconds or so after the Big Bang—called the inflationary period. Inflation was another period of acceleration, and we don’t know what caused that acceleration, either. It’s possible that there was another kind of dark energy back then. After inflation there was so much mass so close together that gravity dominated and the expansion slowed. That lasted until about halfway through the life of the universe. It was some 7 billion years before the universe expanded to the point where matter was too scattered to keep the expansion slowing. At that point, dark energy’s power started to be felt and the universe started to accelerate again.

What does this discovery mean for the fate of the universe? Will dark energy ever let up?

Well, you can just take the naive approach of saying that the universe is accelerating now, so that means it will accelerate forever and lead to a very dark, empty, cold end, and that’s all we have to look forward to. However, we should remember that we don’t know what’s causing the current acceleration, and we don’t know what caused that acceleration during inflation at the very beginning of the universe. That inflation turned around—it stopped and the universe started to decelerate. Who knows whether we’re seeing something now that might also decay away, and then the universe could collapse. So I would say that the fate of the universe has to remain in the category of unknown until we have any clue as to why it is currently accelerating.

(Discover Interview with Saul Perlmutter)


There is also something disturbing about the plurality of logics. How do you select which one is more appropriate for a given decision process? Intuitionistic logic, classical logic, three-valued logic or infinite-valued logic? (There are many more.) Making such a selection necessitates a higher-level decision process. In other words the question becomes "Using which logic do you decide which logic is appropriate for the given decision process?" Of course, making this higher-level selection necessitates yet another higher-level decision process... Ad infinitum. In order to avoid this infinite regress, you somehow need to reduce the number of legitimate logics available. You do not need to reduce the number to one. You only have to ensure that, at some point in the chain of decision processes, the choice of logic (from the reduced set) will make no difference. (i.e. It will be unnecessary to go up another level.)

empty restaurants

Several reasons for not entering an empty restaurant:

- You will not be able to have anonymous conversations. Since there is nobody else in the restaurant besides you, there will inevitably be some eavesdropping by the waiters.

- If you are dressed to impress, there will be no one to impress.

- All of a sudden you will find yourself perceived as a prime customer by the waiters. (Remember you will probably be their only source of a tip for the whole day.) If too much attention turns you off, do not walk in.

- There may be a good reason why the restaurant is empty. (e.g. The quality of food or service may be pretty bad.)


Further points:

- Every once in a while you should take your chances. Otherwise you will never be able to discover a gem restaurant. (i.e. a restaurant that serves great food but is ignored by most people)

- If prices are not declared upfront, you should beware. The fewer the customers, the greater the share of fixed costs that needs to be reflected in each customer's bill. (Decent restaurants have fixed menus. So this advice is more for the sketchy ones.)

- In front of you are two restaurants that look exactly the same and serve exactly the same food. One is crowded while the other is empty. Which one will you choose to enter? The empty restaurant is stuck in a really bad equilibrium. (The crowd attracts more crowd, and the emptiness results in further emptiness.) What if it had a non-transparent façade? Most customers, once greeted with a warm welcome, would not have the nerve to leave the restaurant even if it is completely empty.

structured randomness

People tend to think that the evolution of a system is either deterministic or random. This dichotomy is mistaken. There are a lot of possible dynamics in between.


EXAMPLES

Physics: In quantum mechanics, the wave function evolves in time in a deterministic fashion, as dictated by the Schrödinger equation. In other words, the uncertainty is deterministically constrained.

Finance: In the Black-Scholes model, the volatility and drift of the underlying processes are deterministic functions of time. In other words, the time evolution of the random processes has a deterministic structure.
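As a rough sketch of what "deterministic structure on top of randomness" means here, the following simulates a price path whose only random ingredient is the Brownian increment, while the drift and volatility are fixed functions of time. The particular mu(t) and sigma(t) below are made-up illustrations, not calibrated parameters.

```python
# A minimal sketch (not the Black-Scholes pricing formula itself): simulate a
# geometric Brownian motion whose drift mu(t) and volatility sigma(t) are
# deterministic functions of time, so the randomness is "structured".
# The particular mu/sigma below are illustrative assumptions.
import numpy as np

def simulate_gbm(s0=100.0, T=1.0, steps=252, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / steps
    mu = lambda t: 0.05 + 0.02 * np.sin(2 * np.pi * t)   # deterministic drift
    sigma = lambda t: 0.2 * (1 + 0.5 * t)                # deterministic volatility
    s = np.empty(steps + 1)
    s[0] = s0
    for i in range(steps):
        t = i * dt
        dw = rng.normal(0.0, np.sqrt(dt))                # the only random ingredient
        s[i + 1] = s[i] + mu(t) * s[i] * dt + sigma(t) * s[i] * dw
    return s

path = simulate_gbm()
print(path[-1])
```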

Biology 1: Invasive bacteria employ controlled genetic mutation to evade the immune systems of host bodies:

Haemophilus influenzae frequently manages to evade its host's ever-changing defenses, and also copes with the varied environments it encounters in different parts of the host's body. It does so because it possesses what Richard Moxon has called "contingency genes." These are highly mutable genes that code for products that determine the surface structures of the bacteria. Because they are so mutable, subpopulations of bacteria can survive in the different microhabitats within their host by changing their surface structures. Moreover, by constantly presenting the host's immune system with new surface molecules that it has not encountered before and does not recognize, the bacteria may evade the host's defenses... What, then, is the basis for the enormous mutation rate in these contingency genes? Characteristically, the DNA of these genes contains short nucleotide sequences that are repeated again and again, one after the other. This leads to a lot of mistakes being made as the DNA is maintained and copied... It is difficult to find an appropriate term for the type of mutational process that occurs in contingency genes. Moxon refers to it as "discriminate mutation", and the term "targeted" mutation may also be appropriate. Whatever we call it, there is little doubt that it is a product of natural selection: lineages with DNA sequences that lead to a high mutation rate in the relevant genes survive better than those with less changeable sequences. Although the changes that occur in the DNA of the targeted region are random, there is adaptive specificity in targeting the mutations in the first place.

- Eva Jablonka - Evolution in Four Dimensions (Pages 95-96)

Of course, immune systems also work in a similar fashion to counter these attacks, resulting in a never-ending cat-and-mouse game. They can not store complete information about every single attack experienced in the past. Instead, they store the essential motifs and fill in the missing pieces via statistical honing-in processes to achieve a good fit. This approach, by the way, results in faster pattern recognition at the cost of some additional false positives.

Biology 2: “When sharks and other ocean predators can’t find food, they abandon Brownian motion, the random motion seen in swirling gas molecules, for Lévy flight — a mix of long trajectories and short, random movements found in turbulent fluids.” (Source)

[Figure: Brownian motion]

[Figure: Lévy flight]

Perhaps you should plan your career through a Lévy flight as well. Otherwise finding your passion within the abstract space of all possible endeavors will take a very long time! (Notice that the randomization periods during a Lévy flight have a natural association with mental stress, which is actually what leads to behavioral randomness in the first place.)
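A rough sketch of the contrast, assuming a normal step-length distribution for the Brownian-like walk and a Pareto (power-law) one for the Lévy flight; the exponent and step counts are arbitrary choices for illustration, not values from the quoted study.

```python
# Compare step lengths of a Brownian-like walk (normal steps) with a Lévy
# flight (heavy-tailed, power-law steps). Parameters are illustrative only.
import numpy as np

rng = np.random.default_rng(1)
n_steps = 10_000

# Brownian-like walk: step lengths concentrate around a typical scale.
brownian_steps = np.abs(rng.normal(0.0, 1.0, n_steps))

# Lévy flight: Pareto-distributed step lengths (alpha < 2 gives infinite variance),
# producing many short hops punctuated by rare, very long excursions.
alpha = 1.5
levy_steps = rng.pareto(alpha, n_steps) + 1.0

print("max/median step, Brownian:", brownian_steps.max() / np.median(brownian_steps))
print("max/median step, Lévy:    ", levy_steps.max() / np.median(levy_steps))
```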

Biology 3: Critically random geological structures give rise to the most diverse ecological systems:

We have built civilization’s cornerstones on amorphous, impermanent stuff. Coasts, rivers, deltas, and earthquake zones are places of dramatic upheaval. Shorelines are constantly being rewritten. Rivers fussily overtop their banks and reroute themselves. With one hand, earthquakes open the earth, and with the other they send it coursing down hillsides. We settled those places for good reason. What makes them attractive is the same thing that makes them dangerous. Periodic disruption and change is the progenitor of diversity, stability, and abundance. Where there is disaster, there is also opportunity. Ecologists call it the “intermediate disturbance hypothesis.”

The intermediate disturbance hypothesis is one answer to an existential ecological question: Why are there so many different types of plants and animals? The term was first coined by Joseph Connell, a professor at UC Santa Barbara, in 1978. Connell studied tropical forests and coral reefs, and during the course of his work, he noticed something peculiar. The places with the highest diversity of species were not the most stable. In fact, the most stable and least disturbed locations had relatively low biodiversity. The same was true of the places that suffered constant upheaval. But there, in the middle, was a level of disturbance that was just right. Not too frequent or too harsh, but also not too sparing or too light. Occasional disturbances that inflict moderate damage are, ecologically speaking, a good thing.

Tim de Chant - Why We Live in Dangerous Places

Computer Science: Evolution explores fitness landscapes in a pseudo-random fashion. We assist our neural networks so that they can do the same.

... backpropagation neural networks cannot always be trusted to find just the right combination of connection weights to make correct classifications because of the possibility of being trapped in local minima. The instruction procedure is one in which error is continually reduced, like walking down a hill, but there is no guarantee that continuous walking downhill will take you to the bottom of the hill. Instead, you may find yourself in a valley or small depression part way down the hill that requires you to climb up again before you can complete your descent. Similarly, if a backpropagation neural network finds itself in a local minimum of error, it may be unable to climb out and find an adequate solution that minimizes the errors of its classifications. In such cases, one may have to start over again with a new set of initial random connection weights, making this a selectionist procedure of blind variation and selection.

Gary Cziko - Without Miracles (Page 255)
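Here is a minimal illustration of the "start over with a new set of initial random connection weights" idea from the quote, using plain gradient descent on a bumpy one-dimensional error surface instead of an actual backpropagation network. The objective function and hyperparameters are made up for the sketch.

```python
# Random restarts as "blind variation and selection": run gradient descent from
# several random starting points and keep the best local minimum found.
import numpy as np

def f(x):                 # a bumpy 1-D "error surface" with many local minima
    return np.sin(3 * x) + 0.1 * x**2

def grad_f(x):
    return 3 * np.cos(3 * x) + 0.2 * x

def descend(x0, lr=0.01, steps=500):
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

rng = np.random.default_rng(0)
candidates = [descend(x0) for x0 in rng.uniform(-5, 5, size=20)]  # blind variation
best = min(candidates, key=f)                                     # selection
print(f"best x ≈ {best:.3f}, f(x) ≈ {f(best):.3f}")
```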

Mathematics: Various theorems hint at both the existence and lack of structure in the distribution of primes among the natural numbers.

There are two facts about the distribution of prime numbers of which I hope to convince you so overwhelmingly that they will be permanently engraved in your hearts. The first is that, despite their simple definition and role as the building blocks of the natural numbers, the prime numbers grow like weeds among the natural numbers, seeming to obey no other law than that of chance, and nobody can predict where the next one will sprout. The second fact is even more astonishing, for it states just the opposite: that the prime numbers exhibit stunning regularity, that there are laws governing their behavior, and that they obey these laws with almost military precision.

- Don Zagier (Source)

You can of course list all primes using a deterministic algorithm that goes through the positive integers one by one and checks whether each is divisible only by one and itself. But this algorithm does not tell you where the next prime will be. In order to know where the next prime is, you need to know where all the primes are! In other words, you need a deterministic description of the whole set of prime numbers at once. (Being able to list the primes does not help.) So far no such description has been found.

On the other hand, the set of prime numbers is not a completely random creature carved out of the natural numbers. For instance, we know that as you move farther away from zero, primes get sparser at a certain rate. (This is implied by the Prime Number Theorem.) Hence a pick among smaller positive integers is more likely to yield a prime.

(For an illustrative comparison consider "the set of primes" and "the set of all integers divisible by 3". You can generate each of these two sets using short algorithms. Hence, viewed as strings, they have similar algorithmic complexities. But generation of the former set will take place at a slower rate. Therefore, if one measures relative complexity by comparing the speeds of generating algorithms, primes look more complex.)
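A toy version of the comparison in the parenthetical remark: both sets below are produced by short programs, but the prime generator (naive trial division, the "divisible only by one and itself" check mentioned earlier) does noticeably more work per element than the multiples-of-3 generator.

```python
# Two short generating programs of comparable length, but very different
# generation speeds: primes via trial division vs. multiples of 3.
import itertools, time

def primes():
    n = 2
    while True:
        if all(n % d for d in range(2, int(n**0.5) + 1)):
            yield n
        n += 1

def multiples_of_three():
    n = 3
    while True:
        yield n
        n += 3

for name, gen in [("primes", primes()), ("multiples of 3", multiples_of_three())]:
    t0 = time.perf_counter()
    last = list(itertools.islice(gen, 10_000))[-1]
    print(f"{name}: 10,000th element = {last}, took {time.perf_counter() - t0:.3f}s")
```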

Here is how the first 14,683 primes look on the Ulam spiral:

And here is how 14,683 randomly selected odd numbers look on the Ulam spiral (recall that all primes except 2 are located among the odd numbers):

I leave it to your judgement whether the obvious qualitative differences hint at the existence of a "structured randomness" underlying the first picture.
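The original pictures are not reproduced here, but the construction is easy to sketch: walk the integers along a square spiral and mark the primes. The 41x41 text rendering below is an arbitrary small size chosen for illustration; the figures referred to above use far more points.

```python
# A small text-mode Ulam spiral: "*" marks primes, "." marks the rest.
def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n**0.5) + 1))

def ulam_spiral(size=41):
    grid = [["."] * size for _ in range(size)]
    x = y = size // 2
    n = 1
    directions = [(1, 0), (0, -1), (-1, 0), (0, 1)]   # right, up, left, down
    d, step_len = 0, 1
    while step_len <= size:
        for _ in range(2):                            # each step length is used twice
            for _ in range(step_len):
                if 0 <= x < size and 0 <= y < size:   # skip cells outside the grid
                    grid[y][x] = "*" if is_prime(n) else "."
                n += 1
                x += directions[d][0]
                y += directions[d][1]
            d = (d + 1) % 4
        step_len += 1
    return grid

for row in ulam_spiral():
    print("".join(row))
```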


CHIMERA STATES

It is nice to have total synchrony. But is it not better to have synchrony and noise coexisting so that the latter can enrich the evolution of the former?

Experimental observation of such states in natural systems, neural or not, would also be extremely informative. As a matter of fact, although chimera states do not need extra structure to exist, they are not destroyed by small disorder either, which certainly strengthens their prospects for real systems. More important, additional structure can lead to a myriad of other possible behaviours, including quasiperiodic chimeras and chimeras that “breathe”, in the sense that coherence in the desynchronized population cycles up and down.

Motter - Spontaneous Synchrony Breaking

This discussion is relevant even to wealth-redistribution debates. Should we let random market forces determine the distributive outcome, or should we interfere and equalize all incomes?

Differences in wealth can seem unfair. But as we increase our understanding of complex systems, we are discovering that diversity and its associated tensions are an essential fuel of the life of these systems. A moderate income disparity (Gini between 0.25 and 0.4) encourages entrepreneurship in the economy—much lower appears to stifle dynamism, but much higher appears to engender a negativity that is not productive. A certain degree of randomness is another necessary ingredient for the vitality of a system. In many sectors, a successful enterprise requires dynamics of increasing returns as well as a good dose of luck, in addition to skill and aptitude. These vital ingredients of diversity and randomness can often seem at odds with ideals of ‘fairness’. On the flip side, too much diversity and randomness elicit calls for regulation to control the excesses.

Kitterer - Die Ausgestaltung der Mittelzuweisungen im Solidarpakt II

Here the "synchronised" region corresponds to the stable population region where there is essentially no movement between income classes. The "de-synchronised" region, on the other hand, corresponds to the unstable population region where individuals move up and down the income classes. The existence of the unstable regions requires a prior "symmetry breakdown". (This, in the previous case, was called "synchrony breakdown".) For instance, upon witnessing the recycling of individuals from rags to riches and from riches to rags, people in the stable region will be more willing to believe in the possibility of a change and more ready to exhibit assertive and innovative behaviour. This will, in turn, cause further "symmetry breakdown".


CHAOS vs RANDOMNESS

Of course it is tough to prove that an experimental phenomenon is actually governed by a structured random process. From the final signal alone, it may be impossible to disentangle the deterministic component from the stochastic one. (The words "stochastic" and "random" are synonyms.)

Moreover, deterministic systems can get very chaotic, and can generate final signals that look stochastic. Also, even if the underlying dynamics is totally deterministic, the time series generated by the dynamics may exhibit some noise due to the deficiencies in the experimental set-up.

All methods for distinguishing deterministic and stochastic processes rely on the fact that a deterministic system always evolves in the same way from a given starting point. Thus, given a time series to test for determinism, one can:

1- Pick a test state;
2- Search the time series for a similar or 'nearby' state; and
3- Compare their respective time evolutions.

Define the error as the difference between the time evolution of the 'test' state and the time evolution of the nearby state. A deterministic system will have an error that either remains small (stable, regular solution) or increases exponentially with time (chaos). A stochastic system will have a randomly distributed error. (Source)
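A rough implementation of the three-step recipe quoted above, assuming a scalar time series and a naive nearest-neighbour search (no delay embedding). It compares a chaotic but deterministic logistic map with pure noise; in the first case the error starts small and grows, in the second it is scattered from the start.

```python
# Test for determinism: find a state similar to a chosen test state and compare
# how the two evolve afterwards.
import numpy as np

def divergence_profile(x, test_idx, horizon=20):
    """Error between the evolution of the test state and its nearest neighbour."""
    candidates = [j for j in range(len(x) - horizon) if abs(j - test_idx) > horizon]
    nearest = min(candidates, key=lambda j: abs(x[j] - x[test_idx]))
    return np.abs(x[test_idx:test_idx + horizon] - x[nearest:nearest + horizon])

rng = np.random.default_rng(0)

# Deterministic (chaotic) series: logistic map x_{n+1} = 4 x_n (1 - x_n).
logistic = np.empty(5000)
logistic[0] = 0.123
for i in range(1, len(logistic)):
    logistic[i] = 4 * logistic[i - 1] * (1 - logistic[i - 1])

noise = rng.random(5000)    # stochastic series

for name, series in [("logistic map", logistic), ("white noise", noise)]:
    err = divergence_profile(series, test_idx=1000)
    print(name, "error at steps 1 and 10:", err[1], err[10])
```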

Of course, the task gets a lot more complicated when there is interaction between the stochastic and deterministic components of the observed phenomenon:

When a non-linear deterministic system is attended by external fluctuations, its trajectories present serious and permanent distortions. Furthermore, the noise is amplified due to the inherent non-linearity and reveals totally new dynamical properties. Statistical tests attempting to separate noise from the deterministic skeleton or inversely isolate the deterministic part risk failure. Things become worse when the deterministic component is a non-linear feedback system. In presence of interactions between nonlinear deterministic components and noise, the resulting nonlinear series can display dynamics that traditional tests for nonlinearity are sometimes not able to capture. (Source)


KNIGHTIAN UNCERTAINTY

Even a purely random process in mathematics comes with structure. Although this is intuitively obvious, many students completely miss it.

Mathematics can not deal with absolute uncertainty. For instance, in order to define a stochastic process, one needs to know not just the distribution of its possible values at each point in time but also the statistical relationships between these distributions at different times. Most people on the street would not call such a process "random". Our intuitive notion of randomness is equivalent to "complete lack of structure". From this point of view, the only kind of randomness that mathematics can talk about is "structured randomness".
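As a sketch of how much structure even a "purely random" object demands, here is a discrete-time Gaussian process: one must supply a mean for every time point and a covariance for every pair of time points before a single sample path can be drawn. The particular mean and covariance functions are illustrative assumptions.

```python
# Defining a "random" process already requires a lot of structure: a mean at
# each time and a covariance for each pair of times.
import numpy as np

times = np.linspace(0.0, 1.0, 50)
mean = np.zeros_like(times)                                 # distribution at each time
cov = np.exp(-np.subtract.outer(times, times) ** 2 / 0.1)   # relationship between each pair of times
cov += 1e-6 * np.eye(len(times))                            # numerical jitter

rng = np.random.default_rng(0)
sample_path = rng.multivariate_normal(mean, cov)            # one realization of the process
print(sample_path[:5])
```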

There are knowns, known unknowns and unknown unknowns. Mathematics can only deal with knowns and known unknowns.

What about unknown unknowns? In other words, what if the phenomenon at hand has no structure at all? Then we are clueless. Then we have "chaos" as Dante would have used the term. Scary, but surprisingly common! In fact, economists gave it a special name: Knightian Uncertainty.

Uncertainty must be taken in a sense radically distinct from the familiar notion of Risk, from which it has never been properly separated.... The essential fact is that 'risk' means in some cases a quantity susceptible of measurement, while at other times it is something distinctly not of this character; and there are far-reaching and crucial differences in the bearings of the phenomena depending on which of the two is really present and operating.... It will appear that a measurable uncertainty, or 'risk' proper, as we shall use the term, is so far different from an unmeasurable one that it is not in effect an uncertainty at all.

Knight - Risk, Uncertainty and Profit (Page 19)

Economists do not like to publish research papers on Knightian Uncertainty because the subject does not lend itself to much analysis.

Paul Samuelson once went so far as to argue that economics must surrender its pretensions to science if it cannot assume the economy is “ergodic”, which is a fancy way of saying that Fortune’s wheel will spin tomorrow much as it did today (and that tomorrow's turn of the wheel is independent of today's). To relax that assumption, Mr Samuelson has argued, is to take the subject “out of the realm of science into the realm of genuine history”.

The scientific pose has great appeal. But this crisis is reminding us again of its intellectual costs. Knightian uncertainty may be fiendishly hard to fathom, but ignoring it, as economists tend to do, makes other phenomena devilishly hard to explain. The thirst for liquidity—the sudden surge in the propensity to hoard—is one example. If risks are calculable, then investors will place their bets and roll the dice. Only if they are incalculable will they try to take their chips off the table altogether, in a desperate scramble for cash (or near-cash). As Keynes put it, “our desire to hold money as a store of wealth is a barometer of the degree of our distrust of our own calculations and conventions concerning the future.”

Olivier Blanchard - (Nearly) Nothing to Fear But Fear Itself


MATHEMATICS ITSELF

What about mathematics itself? In some sense, mathematics is the study of a large set of artificially introduced structured randomnesses. Definitions are made to delineate the objects of study, and these definitions often lie on a subtle middle ground between generality and specificity. You do not want your object to be too general. Otherwise you will not be able to make numerous interesting deductions about it. You do not want your object to be too specific either. Otherwise the deduced results will have very little significance in terms of applicability and universality.

Now let's replace the word "specific" with "overly-determined" and the word "general" with "overly-random".

If your definition is too general, the possibility of a "randomly picked" structure (from the class of all possible mathematical structures) fitting your description is very high. In other words, a highly randomized mechanism will pick up satisfactory objects with a high probability of success. If your definition is too specific, then such a randomized mechanism will fail miserably. In that case, you will need a procedure with a deterministic component that can seek out the few satisfactory structures among the many.

It is tough to make these ideas mathematically precise. For example, what does it mean to say that there are fewer commutative rings than rings? Counting becomes a tricky business when you are dealing with different magnitudes of infinity. Recall Ostrowski's Theorem: "Up to isomorphisms which preserve absolute values, real numbers and complex numbers are the only fields that are complete with respect to an archimedean absolute value." Hence, intuitively it makes sense to say that there are more "fields" than "fields that are complete with respect to an archimedean absolute value." But again we are dealing with isomorphism classes here. In other words, we are facing collections that are too large to be considered sets! Even if we cap the collection by restricting ourselves to sets of cardinality less than some K, we still can not legitimately say that there are "more" fields.

One way to make our use of "more than" mathematically precise is as follows: If all instances of A are also instances of B, then we say there are more B's than A's. This will allow us to circumvent the size issues. (This is actually the perspective adopted in Model Theory: Theory T is nested inside Theory S if every model of S is also a model of T.) This approach has a serious drawback though: We can not compare objects that are not nested inside each other!

"One way to understand a complicated object is to study properties of a random thing which matches it in a few salient features,"
- Persi Diaconis

Diaconis said this in relation to physicists' use of random matrices as models of complex quantum systems. (By the way, the quantum phenomena referred to are mysteriously related to the distribution of prime numbers.) Nevertheless the remark suits our context as well. For instance, in order to understand the integers we study rings, integral domains etc. The set of integers is an "overly-determined" object, while a ring is a "critically-random" object that satisfies some important features of the integers. The integers form a ring, but not every ring is the set of integers.

perceived randomness

Tossing a fair coin is not really a random experiment. In fact, it is not even a well-defined experiment. Every time you toss the coin, you are actually conducting a different experiment. The coin you toss this round is never exactly the same one you tossed the previous round. (Perhaps now it is wetter and a little more damaged.) Environmental conditions have also changed since the last toss. (Perhaps now the air has less oxygen and the table has more scratches.) Moreover, it is practically impossible for you to toss the coin in exactly the same fashion as you did the last time.

Each round of tossing is a different experiment, each of which is governed by deterministic Newtonian laws. Hence the observed randomness is not due to the underlying dynamics. It is entirely due to false perception. (Even if you collate all these experiments under one heading, you will not be able to simulate randomness. Some regularity will inevitably pop up in the distribution. Here is a recent demonstration.)

In any case, calling coin flipping random amounts to a deliberate disregard of all relevant scientific knowledge:

The mathematician tends to think of a random experiment as an abstraction – really nothing more than a sequence of numbers. To define the ‘nature’ of the random experiment, he introduces statements – variously termed assumptions, postulates, or axioms – which specify the sample space and assert the existence, and certain other properties, of limiting frequencies. But, in the real world, a random experiment is not an abstraction whose properties can be defined at will. It is surely subject to the laws of physics; yet recognition of this is conspicuously missing from frequentist expositions of probability theory. Even the phrase ‘laws of physics’ is not to be found in them.

Jaynes - Probability Theory (Page 315)

A holdout can always claim that tossing the coin in any of the four specific ways described is ‘cheating’, and that there exists a ‘fair’ way of tossing it, such that the ‘true’ physical probabilities of the coin will emerge from the experiment. But again, the person who asserts this should be prepared to define precisely what this fair method is, otherwise the assertion is without content... It is difficult to see how one could define a ‘fair’ method of tossing except by the condition that it should result in a certain frequency of heads; and so we are involved in a circular argument... It is sufficiently clear already that analysis of coin and die tossing is not a problem of abstract statistics, in which one is free to introduce postulates about ‘physical probabilities’ which ignore the laws of physics. It is a problem of mechanics, highly complicated and irrelevant to probability theory except insofar as it forces us to think a little more carefully about how probability theory must be formulated if it is to be applicable to real situations.

Jaynes - Probability Theory (Page 321)

Brownian motion is also not truly random. It is just easier for us to model the phenomenon as if the underlying behaviour were random. Since there are so many particles involved, it is practically impossible to calculate the future behaviour of the whole system from first principles. Therefore we are forced to reason in a statistical way. We conjure up the notion of an "average" particle, and so on. (Of course the average particle never exists, for the same reason that there is often no person in a class whose height is exactly the class average.)

What about the randomness observed in quantum mechanics? Is it a "perceived" one as in coin flipping and Brownian motion? The Copenhagen interpretation claims that this is not the case, and asserts that the randomness observed here is a real part of the physical process.

Under the Copenhagen interpretation, quantum mechanics is nondeterministic, meaning that it generally does not predict the outcome of any measurement with certainty. Instead, it tells us what the probabilities of the outcomes are. This leads to the situation where measurements of a certain property done on two apparently identical systems can give different answers. The question arises whether there might be some deeper reality hidden beneath quantum mechanics, to be described by a more fundamental theory that can always predict the outcome of each measurement with certainty. In other words if the exact properties of every subatomic particle and smaller were known the entire system could be modeled exactly using deterministic physics similar to classical physics. In other words, the Copenhagen interpretation of quantum mechanics might be an incomplete description of reality. Physicists supporting the Bohmian interpretation of quantum mechanics maintain that underlying the probabilistic nature of the universe is an objective foundation/property — the hidden variable. Others, however, believe that there is no deeper reality in quantum mechanics — experiments have shown a vast class of hidden variable theories to be incompatible with observations. (Source)

Although such experiments prove that there can not be a local deterministic scheme behind the observed randomness, they have not so far disproved the possibility of a determinism of a non-local kind being at work. Since the Bohmian interpretation puts forth a non-local theory, it is not in the above-mentioned class (of hidden variable theories) that is ruled out by observations.

The Bohmian interpretation postulates that the randomness in quantum mechanics is just like the randomness in statistical mechanics (e.g. Brownian motion). Namely, it is due to our incomplete knowledge of the underlying deterministic factor(s). Hence, if the Bohmian interpretation is correct, then we can conclude that physical reality exhibits two layers of "perceived randomness":

Determinism in Bohmian mechanics -> Appearance of randomness in quantum mechanics -> Disappearance of randomness in Newtonian mechanics -> Reappearance of randomness in statistical mechanics

Important point: The disappearance of randomness at the Newtonian level happens naturally in the Bohmian interpretation. It is not an ad hoc imposition as in the Copenhagen interpretation. ("Classical limit emerges from the theory rather than having to be postulated. Classical domain is where wave component of matter is passive and exerts no influence on corpuscular component." - Source)

Here are a few explanatory excerpts from Wikipedia articles. Hopefully these will clarify some of the subtleties involved:

In physics, the principle of locality states that an object is influenced directly only by its immediate surroundings... Local realism is the combination of the principle of locality with the "realistic" assumption that all objects must objectively have pre-existing values for any possible measurement before these measurements are made. Einstein liked to say that the Moon is "out there" even when no one is observing it... Local realism is a significant feature of classical mechanics, general relativity and electrodynamics, but quantum mechanics largely rejects this principle due to the presence of distant quantum entanglements, most clearly demonstrated by the EPR paradox and quantified by Bell's inequalities.

In most of the conventional interpretations, such as the version of the Copenhagen interpretation, where the wavefunction is not assumed to have a direct physical interpretation of reality, it is realism that is rejected. The actual definite properties of a physical system "do not exist" prior to the measurement, and the wavefunction has a restricted interpretation as nothing more than a mathematical tool used to calculate the probabilities of experimental outcomes, in agreement with positivism in philosophy as the only topic that science should discuss.

In the version of the Copenhagen interpretation where the wavefunction is assumed to have a physical interpretation of reality (the nature of which is unspecified) the principle of locality is violated during the measurement process via wavefunction collapse. This is a non-local process because Born's Rule, when applied to the system's wave function, yields a probability density for all regions of space and time. Upon measurement of the physical system, the probability density vanishes everywhere instantaneously, except where (and when) the measured entity is found to exist. This "vanishing" would be a real physical process, and clearly non-local (faster than light) if the wave function is considered physically real and the probability density converged to zero at arbitrarily far distances during the finite time required for the measurement process.

The Bohm interpretation preserves realism, and it needs to violate the principle of locality to achieve the required correlations [of the EPR paradox]. (Source)

...Physicists such as Alain Aspect and Paul Kwiat have performed experiments that have found violations of these inequalities up to 242 standard deviations (excellent scientific certainty). This rules out local hidden variable theories, but does not rule out non-local ones.(Source)

The currently best-known hidden-variable theory, the Causal Interpretation, of the physicist and philosopher David Bohm, created in 1952, is a non-local hidden variable theory. (Source)

Bohmian mechanics is an interpretation of quantum theory. As in quantum theory, it contains a wavefunction - a function on the space of all possible configurations. Additionally, it also contains an actual configuration, even in situations where nobody observes it. The evolution over time of the configuration (that is, of the positions of all particles or the configuration of all fields) is defined by the wave function via a guiding equation. The evolution of the wavefunction over time is given by Schrödinger's equation. (Source)

The Bohm interpretation postulates that a guide wave exists connecting what are perceived as individual particles such that the supposed hidden variables are actually the particles themselves existing as functions of that wave. (Source)

(Actually a subclass of non-local hidden-variable theories has recently been disproved as well. Check this experiment out. Note that this subclass does not include the Bohmian interpretation.)

Here is Bell's own reaction to Bohm's discovery: (By "orthodox version" Bell is referring to the "conventional" version of the Copenhagen interpretation.)

But in 1952 I saw the impossible done. It was in papers by David Bohm. Bohm showed explicitly how parameters could indeed be introduced, into nonrelativistic wave mechanics, with the help of which the indeterministic description could be transformed into a deterministic one. More importantly, in my opinion, the subjectivity of the orthodox version, the necessary reference to the ‘observer,’ could be eliminated...

But why then had Born not told me of this ‘pilot wave’? If only to point out what was wrong with it? Why did von Neumann not consider it? More extraordinarily, why did people go on producing ‘‘impossibility’’ proofs, after 1952, and as recently as 1978? ... Why is the pilot wave picture ignored in text books? Should it not be taught, not as the only way, but as an antidote to the prevailing complacency? To show us that vagueness, subjectivity, and indeterminism, are not forced on us by experimental facts, but by deliberate theoretical choice? (Source)

Jaynes would have certainly concurred with Bell:

In classical statistical mechanics, probability distributions represented our ignorance of the true microscopic coordinates – ignorance that was avoidable in principle but unavoidable in practice, but which did not prevent us from predicting reproducible phenomena, just because those phenomena are independent of the microscopic details.

In current quantum theory, probabilities express our own ignorance due to our failure to search for the real causes of physical phenomena; and, worse, our failure even to think seriously about the problem. This ignorance may be unavoidable in practice, but in our present state of knowledge we do not know whether it is unavoidable in principle; the ‘central dogma’ simply asserts this, and draws the conclusion that belief in causes, and searching for them, is philosophically naïve. If everybody accepted this and abided by it, no further advances in understanding of physical law would ever be made; indeed, no such advance has been made since the 1927 Solvay Congress in which this mentality became solidified into physics. But it seems to us that this attitude places a premium on stupidity; to lack the ingenuity to think of a rational physical explanation is to support the supernatural view.

Jaynes - Probability Theory (Page 329)

There seems to be something fundamentally wrong with the Copenhagen interpretation. Even if there is absolute randomness in nature, we would not be able to detect it. All inferences about nature are bound to be subjective and data-dependent. Hence, as observers, we have fundamental epistemological limitations. In particular, we can not ascribe physicality to the notion of probability. This point is summarized succinctly in the slogan of the Bayesian interpretation of probability: "Probability is degree of belief."

Orthodox Bayesians in the style of de Finetti recognize no rational constraints on subjective probabilities beyond: (i) conformity to the probability calculus, and (ii) a rule for updating probabilities in the face of new evidence, known as conditioning. An agent with probability function P1, who becomes certain of a piece of evidence E, should shift to a new probability function P2 related to P1 by: (Conditioning) P2(X) = P1(X | E) (provided P1(E) > 0).

A Bayesian would not care whether the perceived randomness is due to lack of information or due to true indeterminism. He would not even find the question meaningful.
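A tiny worked example of the conditioning rule quoted above, on a discrete space where P2(X) = P1(X | E) = P1(X and E) / P1(E). The prior and the evidence are made up for illustration.

```python
# Bayesian conditioning on a discrete space of mutually exclusive hypotheses.
# Prior P1 over four hypotheses.
p1 = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}

# Evidence E: "the true hypothesis is not D".
E = {"A", "B", "C"}
p1_E = sum(p for h, p in p1.items() if h in E)

# Posterior P2 after conditioning on E: renormalize within E, zero outside.
p2 = {h: (p / p1_E if h in E else 0.0) for h, p in p1.items()}
print(p2)   # {'A': 0.444..., 'B': 0.333..., 'C': 0.222..., 'D': 0.0}
```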

P.S. There exist many interpretations of quantum mechanics. Each one is in agreement with known results and experiments. To learn more about the Bohmian interpretation read this and this. In order to see how quantum-level randomness appears theoretically in the Bohmian interpretation, read this.

subjective randomness

People are extremely good at finding structure embedded in noise. This sensitivity to patterns and regularities is at the heart of many of the inductive leaps characteristic of human cognition, such as identifying the words in a stream of sounds, or discovering the presence of a common cause underlying a set of events. These acts of everyday induction are quite different from the kind of inferences normally considered in machine learning and statistics: human cognition usually involves reaching strong conclusions on the basis of limited data, while many statistical analyses focus on the asymptotics of large samples. The ability to detect structure embedded in noise has a paradoxical character: while it is an excellent example of the kind of inference at which people excel but machines fail, it also seems to be the source of errors in tasks at which machines regularly succeed. For example, a common demonstration conducted in introductory psychology classes involves presenting students with two binary sequences of the same length, such as HHTHTHTT and HHHHHHHH, and asking them to judge which one seems more random. When students select the former, they are told that their judgments are irrational: the two sequences are equally random, since they have the same probability of being produced by a fair coin. In the real world, the sense that some random sequences seem more structured than others can lead people to a variety of erroneous inferences, whether in a casino or thinking about patterns of births and deaths in a hospital.

- Griffiths & Tenenbaum - From Algorithmic to Subjective Randomness

We perceive the more orderly pattern HHHHHHHH to be a less likely outcome of the coin-tossing experiment, while in reality it is as likely as the other pattern HHTHTHTT.

Why do we expect all random uniform processes (i.e. experiments with uniform probability distributions) to generate visually disordered outcomes? Recall that the second law of thermodynamics dictates that the entropy of an isolated system tends to increase over time. In other words, an isolated system constantly evolves towards the more-likely-to-happen states. Since such states are often the more-disorderly-looking ones, it is not surprising that we developed the just-mentioned expectation. Most of what is perceived to be random (i.e. entropy-driven) in nature does in fact result in visual disorder.

The paper (from which I extracted the above quotation) suggests that our subjective interpretation of randomness is more in line with what is called algorithmic complexity. (i.e. Greater complexity is equated with greater randomness.) This observation is not surprising either. Why? Because the more-disorderly-looking patterns tend to have higher algorithmic complexity. (I stress "tend" because a pattern may be algorithmically simple but nevertheless visually ugly.)
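A crude way to see the point numerically: under a fair coin both sequences below have exactly the same probability, yet a compressor (used here as a rough, computable stand-in for algorithmic complexity, which itself is uncomputable) needs far fewer bytes for the orderly one. The 64-flip length is an arbitrary choice.

```python
# Equal probability, different "complexity": compressed length as a crude proxy
# for algorithmic complexity.
import random, zlib

random.seed(0)
orderly = "H" * 64
disordered = "".join(random.choice("HT") for _ in range(64))

for name, s in [("orderly", orderly), ("disordered", disordered)]:
    prob = 0.5 ** len(s)                        # identical under a fair coin
    compressed = len(zlib.compress(s.encode()))
    print(f"{name}: P = {prob:.1e}, compressed length = {compressed} bytes")
```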

There is a small caveat though. In some rare cases, the more-likely-to-happen states that an isolated system evolves towards may not at all look disorderly. In fact, the final equilibrium stage may have a lot of visual structure. Here is a nice example:

Individual particles such as atoms often arrange into a crystal because their mutual attraction lowers their total energy. In contrast, entropy usually favors a disordered arrangement, like that of molecules in a liquid. But researchers long ago found, in simulations and experiments, that spheres without any attraction also crystallize when they are packed densely enough. This entropy-driven crystallization occurs because the crystal leaves each sphere with some space to rattle around. In contrast, a random arrangement becomes "jammed" into a rigid network of spheres in contact with their neighbors. The entropy of the few "rattlers" that are still free to move can't make up for the rigidity of the rest of the spheres.

Read this for further details.

structural information inside DNA

I had always thought that structural symmetry was strictly a product of evolution due to its phenotypical advantages. Most animals for example have bilateral symmetry. Plants on the other hand exhibit other types of symmetries. In nature one rarely encounters structures that are devoid of such geometrical patterns.

While I was reading an article on algorithmic complexity, it dawned on me that there may be another important reason why symmetry is so prevalent.

First, here is a short description of algorithmic complexity:

Given an entity (this could be a data set or an image, but the idea can be extended to material objects and also to life forms) the algorithmic complexity is defined as the length (in bits of information) of the shortest program (computer model) which can describe the entity. According to this definition a simple periodic object (a sine function for example) is not complex, since we can store a sample of the period and write a program which repeatedly outputs it, thereby reconstructing the original data set with a very small program.

Geometrical patterns allow economization. The presence of symmetries can drastically reduce the amount of information that needs to be encoded in the DNA to orchestrate the biochemical processes responsible for the structural development of the organism. The same may be true for more complicated morphological shapes that are still mathematically simple to describe. An example:

Researchers discovered a simple set of three equations that graphed a fern. This started a new idea - perhaps DNA encodes not exactly where the leaves grow, but a formula that controls their distribution. DNA, even though it holds an amazing amount of data, could not hold all of the data necessary to determine where every cell of the human body goes. However, by using fractal formulas to control how the blood vessels branch out and the nerve fibers get created, DNA has more than enough information.
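A sketch of the idea in the quote: a handful of affine maps, iterated with fixed probabilities, suffice to generate a convincing fern. The code below uses the classic Barnsley fern with its four standard maps (the article speaks of three equations; the exact count depends on the variant), so the coefficients are the textbook ones rather than anything DNA-specific.

```python
# Barnsley fern: a very short "program" that generates a complex shape.
import random

# Each entry: (a, b, c, d, e, f, probability) for the map (x, y) -> (ax + by + e, cx + dy + f).
MAPS = [
    (0.00,  0.00,  0.00, 0.16, 0.0, 0.00, 0.01),   # stem
    (0.85,  0.04, -0.04, 0.85, 0.0, 1.60, 0.85),   # successively smaller leaflets
    (0.20, -0.26,  0.23, 0.22, 0.0, 1.60, 0.07),   # largest left-hand leaflet
    (-0.15, 0.28,  0.26, 0.24, 0.0, 0.44, 0.07),   # largest right-hand leaflet
]
WEIGHTS = [m[-1] for m in MAPS]

def barnsley_fern(n_points=50_000, seed=0):
    random.seed(seed)
    x, y = 0.0, 0.0
    points = []
    for _ in range(n_points):
        a, b, c, d, e, f, _p = random.choices(MAPS, weights=WEIGHTS)[0]
        x, y = a * x + b * y + e, c * x + d * y + f
        points.append((x, y))
    return points

pts = barnsley_fern()
print(len(pts), "points generated; bounding box:",
      (min(p[0] for p in pts), max(p[0] for p in pts),
       min(p[1] for p in pts), max(p[1] for p in pts)))
```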

discoveries by chance

Most innovative scientific discoveries are hit upon in a random fashion. Insight is important of course. But maintaining an alert mind that is open to peripheral developments is even more important.

Deductive thinking is vital but it will probably not take you anywhere beyond the already-beaten paths.

Here is an interesting article containing examples from medical disciplines.

While I was producing electronic music with Umut Eldem, I sometimes had the feeling that we were tinkering like scientists do in their labs. Although there was a lot of trial and error, our searches were not aimless. In fact we always had some not-so-well-defined goals in our minds. But these goals often ended up being modified along the way. There was no method to our production. Instrumentation and composition took place simultaneously. We frequently worked on entirely different things. Most of our independent little findings would be discarded later on, but some would occasionally merge into beautiful and spontaneous pieces.

Our sessions were extremely fun to say the least. None lasted more than two days, and we turned out at least one complete song in each one of them. We connected and complemented each other well.

I remember one specific occasion when the importance of chance in compositions became really clear to us.

We had hit upon an incredible sound while playing around with an extremely complex synthesizer that had 50 continuous and 10 discrete variable parameters. (Please keep in mind that I am only an amateur bass player. So when I say "playing around" I really mean playing around.) The sound was a digital reconstruction of something that is familiar to all classical-concert-goers. Just before a concert begins, there is a brief, discordant period in which musicians settle into their seats, flip a couple of pages and do some final checks on their instruments. By turning only two knobs on our synthesizer we could literally recreate this "settling-down" period. It was simply unbelievable. We were shocked. We were awed.

Then my computer suddenly crashed. Despite all our efforts we could neither recover nor recreate the sound. There were a lot of non-linearities involved. A tiny push on a relevant parameter would drastically change the outcome. The sound was gone for good. It was very sad indeed.