(A code is uniquely decipherable if every finite sequence of code characters corresponds to at most one message. If no code word is a prefix of another code word then the code is also called instantaneous. Although a uniquely decipherable code is not necessarily instantaneous, the existence of such a code implies the existence of an instantaneous one.)
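The prefix condition in the parenthetical above is easy to check mechanically. Below is a minimal sketch of such a check; the code words in the examples are hypothetical, chosen only to illustrate the definition.

```python
def is_instantaneous(code_words):
    """Return True if no code word is a prefix of another (prefix-free code)."""
    for w in code_words:
        for v in code_words:
            if w != v and v.startswith(w):
                return False
    return True

print(is_instantaneous(["0", "10", "110"]))  # True: no word prefixes another
print(is_instantaneous(["0", "01", "11"]))   # False: "0" is a prefix of "01"
```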
Let's try to apply this theorem to genetics...
The genetic code alphabet consists of only four letters: A, C, G, U. Therefore we have D=4 in the above formula.
We need to assign a code word to each of 20 existing amino acids: Alanine, Arginine, Asparagine, Aspartic Acid, Cysteine, Glutamic Acid, Glutamine, Glycine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Proline, Serine, Threonine, Tryptophan, Tyrosine, Valine. Therefore we have M=20 in the above formula.
In vertebrates the relative observed frequencies of the above amino acids are respectively 7.4, 4.2, 4.4, 5.9, 3.3, 5.8, 3.7, 7.4, 2.9, 3.8, 7.6, 7.2, 1.8, 4.0, 5.0, 8.1, 6.2, 1.3, 3.3 and 6.8 percent. (Note that these numbers sum to 100, up to rounding.) In other words, 7.4 percent of the amino acids needed for the next typical protein synthesis will be Alanine, 4.2% of them will be Arginine, 4.4% will be Asparagine, etc. This vector of percentages will be our probability variables "p_i". (e.g. x_2 = Arginine and p_2 = 0.042)
Let's enforce the additional requirement that the length of each code-word is equal. In other words, for all "i" we set "n_i" equal to some "n". (There may be some structural justifications for this extra assumption. I will not make any speculations though since my knowledge of molecular biology is close to nil.)
After inserting the numbers into their appropriate places, the theorem reveals that the lower bound for the probability-weighted average of code-word lengths is 2.1. Since we set all "n_i" equal to "n", the probability-weighted average is simply "n". Note that the code-word length "n" has to be an integer. Hence what the noiseless coding theorem tells us is that the minimum value "n" can assume is 3.
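The 2.1 bound can be verified directly: it is the Shannon entropy of the frequency vector above, computed in base D = 4. Here is a short sketch of that calculation.

```python
import math

# Observed amino-acid frequencies (percent) quoted above, in the same order.
freqs = [7.4, 4.2, 4.4, 5.9, 3.3, 5.8, 3.7, 7.4, 2.9, 3.8,
         7.6, 7.2, 1.8, 4.0, 5.0, 8.1, 6.2, 1.3, 3.3, 6.8]
probs = [f / 100 for f in freqs]

# Entropy in base D = 4: the lower bound on the average code-word length.
H4 = -sum(p * math.log(p, 4) for p in probs)
print(round(H4, 1))   # 2.1
print(math.ceil(H4))  # 3: smallest admissible integer word length
```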
Guess the actual word-length that occurs in genetics! It is 3. If there were one more letter in the code alphabet, then "n" could be 2! (Due to the anti-parallel structure of DNA the alphabet size has to remain even. Therefore I should have probably written "If there were two more letters...".) However, with only four letters, the word length cannot theoretically be pushed below 3.
Given the size of the alphabet, nature is as efficient as it can theoretically get. It even opportunistically exploits the difference between 2.1 and 3 by assigning more than a single codon (namely a three-letter code word) to some of the amino acids. The number of code-representations belonging to each amino acid is respectively 4, 6, 2, 2, 2, 2, 2, 4, 2, 3, 6, 2, 1, 2, 4, 6, 4, 1, 2 and 4.
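As a sanity check on the counts above: they should total 61, the number of sense codons in the standard genetic code (4^3 = 64 possible triplets minus the 3 stop codons).

```python
# Codon counts per amino acid, in the same order as the frequency list.
codon_counts = [4, 6, 2, 2, 2, 2, 2, 4, 2, 3,
                6, 2, 1, 2, 4, 6, 4, 1, 2, 4]

print(sum(codon_counts))            # 61 sense codons
print(4 ** 3 - sum(codon_counts))   # 3 remaining triplets: the stop codons
```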
There are some suggestions that the genetic code has evolved in an error-minimizing fashion. When a single-nucleotide mutation transforms UUU into UUC, the codon still codes for Phenylalanine. Hence, in some sense, a greater number of representations entails less sensitivity to mutations and operational errors during the decoding process.
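The UUU example can be made concrete with a small excerpt of the standard genetic code table (only the four UU_ codons are included here, which is enough for this illustration):

```python
# Excerpt of the standard genetic code: the four codons beginning with "UU".
codon_table = {"UUU": "Phe", "UUC": "Phe", "UUA": "Leu", "UUG": "Leu"}

# Mutating the third position of UUU:
for base in "ACGU":
    mutant = "UU" + base
    print(mutant, "->", codon_table[mutant])
# UUU -> UUC is silent: both codons still code for Phenylalanine,
# while UUA and UUG switch the amino acid to Leucine.
```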
When you compare the number of representations against the observed frequency of occurrence, the following pattern emerges: