« "You are a little dubious as to the veracity of the claims of this shifty, grinning merchant trying to sell you bottles of Auroran Drop Bear repellant [sic]." | Main | "I can not have an aide who will not look up. You will be forever walking into things." »
July 01, 2006
"A foolish expedient for making idle people believe they are doing something clever when they are only wasting their time."
En lieu of a real post with content, here're some more Wikipedia links. This time, the topic, rather than being research for a creative writing project, is the role of hard math in public-private key cryptography.
1. Alice and Bob agree on a finite cyclic group G and a generating element g in G. (This is usually done long before the rest of the protocol; g is assumed to be known by all attackers.) We will write the group G multiplicatively.
2. Alice picks a random natural number a and sends ga to Bob.
3. Bob picks a random natural number b and sends gb to Alice.
4. Alice computes (gb)a.
5. Bob computes (ga)b.
Both Alice and Bob are now in possession of the group element gab which can serve as the shared secret key. The values of (gb)a and (ga)b are the same because groups are power associative.
Jaina mathematicians in ancient India first conceived of logarithms from around the 2nd century BC. By the 2nd century AD, they performed a number of operations using logarithmic functions to base 2, and by the 8th century, Virasena described logarithms to bases 2, 3 and 4. By the 13th century, logarithmic tables were produced by Muslim mathematicians.
In the 17th century, Joost Bürgi, a Swiss clockmaker in the employ of the Duke of Hesse-Kassel, first discovered logarithms as a computational tool; however he did not publish his discovery until 1620. The method of logarithms was first publicly propounded in 1614, in a book entitled Mirifici Logarithmorum Canonis Descriptio, by John Napier, Baron of Merchiston in Scotland, four years after the publication of his memorable discovery. This method contributed to the advance of science, and especially of astronomy, by making some difficult calculations possible. Prior to the advent of calculators and computers, it was used constantly in surveying, navigation, and other branches of practical mathematics. It supplanted the more involved prosthaphaeresis, which relied on trigonometric identities, as a quick method of computing products. Besides their usefulness in computation, logarithms also fill an important place in the higher theoretical mathematics.
At first, Napier called logarithms "artificial numbers" and antilogarithms "natural numbers". Later, Napier formed the word logarithm, a portmanteau, to mean a number that indicates a ratio: λόγος (logos) meaning proportion, and αριθμoς (arithmos) meaning number. Napier chose that because the difference of two logarithms determines the ratio of the numbers for which they stand, so that an arithmetic series of logarithms corresponds to a geometric series of numbers.
In public key cryptography, the private key is kept secret, while the public key may be widely distributed. In a sense, one key "locks" a lock; while the other is required to unlock it. It should not be possible to deduce the private key of a pair given the public key, and in high quality algorithms no such technique is known.
An analogy which can be used to understand the advantages of an asymmetric system is to imagine two people, Alice and Bob, sending a secret message through the public mail. In this example, Alice has the secret message and wants to send it to Bob, after which Bob sends a secret reply.
With a symmetric key system, Alice first puts the secret message in a box, and then locks the box using a padlock to which she has a key. She then sends the box to Bob through regular mail. When Bob receives the box, he uses an identical copy of Alice's key (which he has somehow obtained previously, maybe by a face-to-face meeting) to open the box, and reads the message. Bob can then use the same padlock to send his secret reply.
In an asymmetric key system, Bob and Alice have separate padlocks. First, Alice asks Bob to send his open padlock to her through regular mail, keeping his key to himself. When Alice receives it she uses it to lock a box containing her message, and sends the locked box to Bob. Bob can then unlock the box with his key and read the message from Alice. To reply, Bob must similarly get Alice's open padlock to lock the box before sending it back to her.
The critical advantage in an asymmetric key system is that Bob and Alice never need send a copy of their keys to each other. This substantially reduces the chance that a third party (perhaps, in the example, a corrupt postal worker) will copy a key while it is in transit, allowing said third party to spy on all future messages sent between Alice and Bob. In addition, if Bob were to be careless and allow someone else to copy his key, Alice's messages to Bob would be compromised, but Alice's messages to other people would remain secret, since the other people would be providing different padlocks for Alice to use.
Mathematicians sometimes use the notion of "polynomial time on the length of the input" as a definition of a "fast" computation, as opposed to "super-polynomial time", which is anything slower than that. Exponential time is one example of a super-polynomial time.
The complexity class of decision problems that can be solved on a deterministic sequential machine in polynomial time is known as P. The class of decision problems that can be verified in polynomial time is known as NP. Equivalently, NP is the class of decision problems that can be solved in polynomial time on a non-deterministic Turing machine (NP stands for Nondeterministic Polynomial time).
Computational complexity theory
A single "problem" is an entire set of related questions, where each question is a finite-length string. For example, the problem FACTORIZE is: given an integer written in binary, return all of the prime factors of that number. A particular question is called an instance. For example, "give the factors of the number 15" is one instance of the FACTORIZE problem.
The time complexity of a problem is the number of steps that it takes to solve an instance of the problem as a function of the size of the input (usually measured in bits), using the most efficient algorithm. To understand this intuitively, consider the example of an instance that is n bits long that can be solved in n² steps. In this example we say the problem has a time complexity of n². Of course, the exact number of steps will depend on exactly what machine or language is being used. To avoid that problem, we generally use Big O notation. If a problem has time complexity O(n²) on one typical computer, then it will also have complexity O(n²p(n)) on most other computers for some polynomial p(n), so this notation allows us to generalize away from the details of a particular computer.
Example: Mowing grass has linear complexity because it takes double the time to mow double the area. However, looking up something in a dictionary has only logarithmic complexity because a double sized dictionary only has to be opened one time more (e.g. exactly in the middle - then the problem is reduced to the half).
Problems that are solvable in theory, but cannot be solved in practice, are called intractable. What can be solved "in practice" is open to debate, but in general only problems that have polynomial-time solutions are solvable for more than the smallest inputs. Problems that are known to be intractable include those that are EXPTIME-complete. If NP is not the same as P, then the NP-complete problems are also intractable.
To see why exponential-time solutions are not usable in practice, consider a problem that requires 2n operations to solve (n is the size of the input). For a relatively small input size of n=100, and assuming a computer that can perform 1010 (10 giga) operations per second, a solution would take about 4*1012 years, much longer than the current age of the universe.
An example of an NP-hard problem is the decision problem SUBSET-SUM which is this: given a set of integers, does any non empty subset of them add up to zero? That is a yes/no question, and happens to be NP-complete.
There are also decision problems that are NP-hard but not NP-complete, for example the halting problem. This is the problem "given a program and its input, will it run forever?" That's a yes/no question, so this is a decision problem. It is easy to prove that the halting problem is NP-hard but not NP-complete. For example the Boolean satisfiability problem can be reduced to the halting problem by transforming it to the description of a Turing machine that tries all truth value assignments and when it finds one that satisfies the formula it halts and otherwise it goes into an infinite loop. It is also easy to see that the halting problem is not in NP since all problems in NP are decidable and the halting problem is not.
The complexity class EXPTIME-complete is also a set of decision problems. A decision problem is in EXPTIME-complete if it is in EXPTIME, and every problem in EXPTIME has a polynomial-time many-one reduction to it. In other words, there is a polynomial-time algorithm that transforms instances of one to instances of the other with the same answer. EXPTIME-complete might be thought of as the hardest problems in EXPTIME. Notice that although we don't know if NP-complete is a subset of P or not, we do know that EXPTIME-complete lies outside P; none of these problems can possibly be solved in polynomial time.
In computability theory, one of the basic undecidable problems is that of deciding whether a deterministic Turing machine (DTM) accepts a particular input. One of the most fundamental EXPTIME-complete problems is a simpler version of this which asks if a DTM accepts an input in at most k steps. It is in EXPTIME because a trivial simulation requires O(k) time, and the input k is encoded using O(log k) bits.4 It is EXPTIME-complete because, roughly speaking, we can use it to determine if a machine solving an EXPTIME problem accepts in an exponential number of steps; it will not use more.
Other examples of EXPTIME-complete problems include the problem of looking at a generalized Chess, Checkers, or Go position, and determining whether the first player can force a win. These games are EXPTIME-complete because games can last for a number of moves that is exponential in the size of the board. By contrast, generalized games that can last for a number of moves that is polynomial in the size of the board are often PSPACE-complete.
Notice that the NP-complete problem resembles a typical puzzle: is there some way to plug in values that solves the problem? The PSPACE-complete problem resembles a game: is there some move I can make, such that for all moves my opponent might make, there will then be some move I can make to win? The question alternates existential and universal quantifiers. Not surprisingly, many puzzles turn out to be NP-complete, and many games turn out to be PSPACE-complete.
Examples of games that are PSPACE-complete (when generalized so that they can be played on an n × n board) are the games hex and Reversi and the solitaire games Rush Hour, mahjong, Atomix and Sokoban. Some other generalized games, such as chess, checkers (draughts), and go are EXPTIME-complete because a game between two perfect players can be very long, so they are unlikely to be in PSPACE.
Note that the definition of PSPACE-complete is based on asymptotic complexity: the time it takes to solve a problem of size n, in the limit as n grows without bound. That means a game like checkers (which is played on an 8 × 8 board) could never be PSPACE-complete (in fact, they can be solved in constant time and space using a very large lookup table). That is why all the games were modified by playing them on an n × n board instead; in some cases, such as for Chess, these extensions are somewhat artificial and subjective.
Posted by Jon Rubin at July 1, 2006 10:57 PM
Trackback Pings
TrackBack URL for this entry:
http://www.ubiquit.us/movabletype/mt-tb.cgi/196