9 February 2012

A quick break from medicine... The infinite monkey theorem



"A monkey hitting keys at random on a typewriter for an infinite amount of time will almost surely type a given text, such as the complete works of William Shakespeare".

First we need to get a few things straight:

  • It doesn't have to be a monkey - it is just a metaphor for an abstract device that produces random sequence of letters and symbols.
  • 'Almost surely' has a specific mathematical meaning.
  • It doesn't matter if there's an infinite number of monkeys with typewriters, one monkey for an infinite amount of time, or an infinite number of monkeys for an infinite amount of time.

Almost surely?
The difference between an event being 'almost sure' and 'sure' is the same as the subtle difference between something happening with probability 1 and happening always. If an event is sure, then it will always happen. No other event can possibly occur even if the other event’s probability is 1. If an event is almost sure, then other events are theoretically possible in a given sample space (a sample space is a set of the possible outcomes), however as the size of the sample space increases, the probability of any other event nears zero.

The point is, that given an infinite amount of time, every single combination of letters, words, symbols and numbers ever, will be written. One monkey may type out the complete works of Shakespeare but misspell Romeo as Rameo, one monkey may be one full stop away from completion, and stop typing.

Anderson's model
Programmer Jesse Anderson wanted to give Amazon’s Web Services — cloud-based storage, data processing, virtual private servers, etc. — a try so he decided to test out the Infinite Monkey Theorem. The monkeys were replaced with software that outputs random letters and their writing is being combined to recreate the complete works of Shakespeare. 3,696,348 characters in total.
Using Anderson’s method each individual has to type just nine characters that match the source text before the program adds that string to the collected pool of Shakespeare text. This should make the project go much more quickly than if a single monkey had to accidentally recreate a work that is tens of thousands of characters long, which is how the theory generally goes.
This is an image of how much of the complete works of Shakespeare the virtual monkeys have covered so far (green).