Introduction to Computational Theory: Areas of Research

In my last post I started explaining the theory of computation, starting with its central principle: the Church-Turing Thesis. In this post, I’m going to explain several areas of research in computational theory that, as per the Church-Turing Thesis, are based on the realization that all (full-featured) computers are equivalent.

Turing Machines as Simplified Computers

Since Turing Machines are known to be equivalent in expressive power to modern computers, a Turing Machine can serve as a very simplified stand-in for a modern computer, or indeed for any conceivable computer!

This makes Turing Machines quite useful for exploring the Theory of Computation. Mathematicians have been able to write ‘programs’ for Turing Machines and then – because Turing Machines are so simple – devise consistent ways to measure how fast a program runs for any number of inputs.

So let’s say you wanted to write a program that sorted a list of integers. There are, in fact, several ways to write such a program, and not all of them are equally fast. To determine which is the fastest, you could ‘write’ the program on a simple Turing Machine and then count how many ‘steps’ the Turing Machine has to take to complete the sorting process for a given number of integers. Because we now know that Turing Machines are equivalent to all computational machines (including modern digital computers), the result of an analysis on a Turing Machine will also be approximately true for a modern computer.

An Example of Using a Turing Machine to Measure the Speed of a Program

Let’s try a real-life example. Pretend like you are going to write such a program. Further, pretend that the list of integers you wish to sort is:

5, 3, 4, 8, 1, 2, 9, 6, 7, 0

How would you go about sorting such a list into a list like this: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9?

So let’s say that you decide that to sort the list you are going to take the first two numbers (5 and 3) and compare them. You know that 5 is greater than 3 and thus 3 needs to come first. So you switch the order on those two. The result is that the list now looks like this:

3, 5, 4, 8, 1, 2, 9, 6, 7, 0

The list is still nowhere close to sorted, but it’s a bit more sorted than a moment ago!

Now imagine comparing each pair of adjacent integers and swapping them if they are out of order. So we next compare 5 and 4 and do a swap. The result is:

3, 4, 5, 8, 1, 2, 9, 6, 7, 0

Now here’s the problem. Once you go through the whole list you still have not sorted the list entirely. In fact it looks like this:

3, 4, 5, 1, 2, 8, 6, 7, 0, 9

That’s okay, it’s still looking a lot better than a moment ago. So just start over and try again. This time the list will look even more sorted. In fact, just keep starting over, sweeping through the list pair by pair, until you no longer find anything to swap! Once there are no more swaps, the list is sorted.

This little program for sorting integers will actually work for any type of sorting where two things can be compared and one will always be greater than or equal to the other. We can imagine it also working for alphabetical ordering, for example.
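
To make the procedure concrete, here is one minimal way to write it in Python. (This swap-adjacent-pairs approach is the classic ‘bubble sort’; the code is my own illustrative sketch.)

    def bubble_sort(items):
        """Repeatedly sweep the list, swapping adjacent out-of-order pairs,
        until a full sweep finds nothing left to swap."""
        swapped = True
        while swapped:
            swapped = False
            # Compare each adjacent pair, left to right.
            for i in range(len(items) - 1):
                if items[i] > items[i + 1]:
                    items[i], items[i + 1] = items[i + 1], items[i]
                    swapped = True
        return items

    print(bubble_sort([5, 3, 4, 8, 1, 2, 9, 6, 7, 0]))
    # -> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]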

Now here is the question: how many ‘comparisons’ between two integers would you have to make before the list is sorted? Think about that question for a moment. Doesn’t the answer depend on how sorted the list was to begin with? If the list actually started out already sorted, it would take 9 comparisons before we’d realize the list is already sorted. Not bad, really, considering how fast a computer is.

However, what if the list had started out like this?

9, 8, 7, 6, 5, 4, 3, 2, 1, 0

Now how many comparisons would it take to sort the list? Use your imagination or mathematical abilities a bit and you’ll see that you’d have to run through the list 9 times, doing 9 comparisons each time, before the list would be sorted.

It’s possible to generalize this result now. If we have 10 integers to sort, it will take (10-1)² = 81 comparisons to sort the list. Or, to put this algebraically, it takes:

(X-1)² comparisons (where “X” is the number of integers to sort)

to sort the list using this program – in the worst-case scenario, that is. On average it probably takes about half that.
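
If you’d rather not count by hand, here is a small sketch that instruments the same procedure. (Note that the ‘keep sweeping until nothing swaps’ version spends one extra confirming sweep, so the fully reversed list costs 90 comparisons rather than exactly 81 – the same ballpark as the (X-1)² estimate.)

    def count_comparisons(items):
        """Run the repeated-sweep sort and count adjacent comparisons."""
        comparisons = 0
        swapped = True
        while swapped:
            swapped = False
            for i in range(len(items) - 1):
                comparisons += 1
                if items[i] > items[i + 1]:
                    items[i], items[i + 1] = items[i + 1], items[i]
                    swapped = True
        return comparisons

    print(count_comparisons([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))  # 9: one pass confirms it
    print(count_comparisons([9, 8, 7, 6, 5, 4, 3, 2, 1, 0]))  # 90: near (X-1)^2 = 81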

Now it’s not hard to see that this program gets worse quite quickly as we add integers to the sort list – the number of comparisons grows as the square of the list’s length. That means that no matter how fast our computer is, eventually it won’t be able to sort a list of integers in a reasonable amount of time if the list gets too long. So we have found a sort of limitation to how many integers we can realistically sort using this program.

As it turns out, there are far more efficient ways to sort a list of integers. Some of these ways, even in the worst-case scenario, grow far more slowly as the list of integers expands – merge sort, for example, needs only on the order of X·log₂(X) comparisons.
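
A quick sketch of just how differently X² and X·log₂(X) grow (a minimal illustration, not a benchmark):

    import math

    # How X**2 compares to X * log2(X) as the list grows:
    for x in (10, 1_000, 1_000_000):
        print(x, x ** 2, round(x * math.log2(x)))
    # 10        100              33
    # 1000      1000000          9966
    # 1000000   1000000000000    19931569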

However, as bad as quadratic growth sounds, it honestly could have been a lot worse.

Time Complexity

An area of research in Computational Theory is ‘time complexity.’ In a nutshell, there are certain types of programs that can be written on a Turing Machine but will take so long to run that the sun will explode before the computation is completed.

In our simple example above, we found that it would take (X-1)² comparisons to sort a list of X integers (at least using our rather stupid program, that is). In computational-theory-ese we say that the program takes X² time to run (I’m approximating).

But as I said, it could have been much worse. X² is actually a lot better than a program that takes 2^X comparisons to run. And there are such programs!

Just to check my numbers on this: where X is 10, 10² = 100 while 2^10 = 1,024!

Okay, even 1024 isn’t a bad number for a modern computer. It would still finish in the blink of an eye. But, just for a moment, pretend like we’re trying to sort a list of a million integers. Now compare the two types of programs:

1,000,000² = 1,000,000,000,000

Whereas

2^1,000,000 = a number so big that my spreadsheet won’t compute it!
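
(For what it’s worth, Python’s arbitrary-precision integers will compute it; here’s a quick sketch just to gauge its size:)

    # A spreadsheet overflows, but we can at least count the digits
    # of 2**1_000_000:
    print(len(str(2 ** 1_000_000)))  # 301030 digits (a trillion has only 13)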

Computational Theorists have created two important classifications for programs that represent whether the program can run in a reasonable amount of time or not. Our X² program falls into a category called “Class P,” which stands for “Class Polynomial Time.” (If you remember way back to high school algebra, a “polynomial” is just an expression built from variables raised to fixed powers. So a polynomial would include equations like X³ + Y² = Z or something like that. It excludes equations like 3^X + 2^Y = Z, where the variable sits up in the exponent.)

Now we saw that a “polynomial time” program might take a very long time to run. Our simple X² program might run for a very long time for some very large X. But we also saw that a Class-P (i.e. “polynomial time”) program is going to run a lot faster than a program that isn’t in Class-P because it has X in its exponent.

Just to drive this point home, imagine a program that takes 0.01 seconds for each comparison but has to run 3^X comparisons. That means for 100 items we want to compute, it will take 0.01 seconds times 3^100. We’re talking about 5,153,775,207,320,110,000,000,000,000,000,000,000,000,000,000 seconds to run a mere 100 items!

Granted, if you only needed to run, say, 10 items, you’d still get a reasonable result (about 590 seconds). But the amount of time required grows so exponentially fast that even 100 items is too much. You’re essentially limited by the very nature of the program to never having more than 10 or so items that you can run it on!
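
Those figures are easy to reproduce; a two-line sketch using the made-up 0.01-seconds-per-comparison rate:

    # 0.01 seconds per comparison, with 3**x comparisons to perform:
    for x in (10, 100):
        print(x, 0.01 * 3 ** x, "seconds")
    # 10  -> 590.49 seconds
    # 100 -> about 5.15e+45 seconds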

Is there a name for the class of programs that run with X in the exponent? Maybe a “Class-E” for “Exponential Time”? Well, not exactly. There are actually different sorts of exponential-time programs, so theorists don’t ever talk about a “Class-E” per se. Instead they break it down into a bunch of subclasses (of our hypothetical and never-used “Class-E”) and talk about those classes instead.

So for the moment, imagine two classes of programs. The first class is Class-P and the other is everything that is not Class-P.

Time Classification: Class NP

One common other classification is Class NP. This class contains many sorts of exponential-time programs. (However, it doesn’t include all such exponential programs, and so is not the same as our hypothetical “Class-E.”) I don’t have space here to fully define Class NP. Essentially, a Class-NP program is one that can run in Polynomial Time, just as a Class-P program does on a normal Turing Machine, but only on a special sort of non-deterministic Turing Machine that allows a kind of massive parallel processing. If it were physically possible to build such a machine in real life (it’s not!) then it would be possible to run Class-NP programs in a reasonable amount of time, just like Class-P programs.

But for the moment, we’ll pretend like our “Non-Class-P” class is roughly the same as “Class-NP.” (For those interested, the NP stands for Non-Deterministic Polynomial Time.)
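
For the curious, there is an equivalent way to characterize Class NP that is easier to picture than a non-deterministic machine: a problem is in NP if a proposed answer can be *checked* in polynomial time, even when *finding* one seems to require exponential search. A minimal sketch using the classic ‘subset sum’ problem (the function and example values are my own, purely for illustration):

    def verify_subset_sum(numbers, target, indices):
        """Checking a proposed solution is fast: just add up the chosen
        entries and compare against the target."""
        return sum(numbers[i] for i in indices) == target

    # Finding a solution naively means trying up to 2**X subsets,
    # but verifying a proposed one takes only a single pass:
    print(verify_subset_sum([3, 34, 4, 12, 5, 2], 9, [2, 4]))  # True: 4 + 5 = 9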

Decidability and The Halting Problem

Once the Church-Turing Thesis was accepted, it became possible to study programs written for theoretical Turing machines and then generalize to all possible types of computers. One of the first areas for research (by Turing himself, as well as others) was into what is called “The Halting Problem.”

Essentially, this problem asks if it is possible to build a computer program (we’ll call it “program 1”) that can read another computer program (“program 2”) and then determine whether program 2 will run to completion or get stuck in an infinite loop.

Anyone that has played with computer programs, even if just in high school using BASIC, knows that it’s easy to write a program that loops forever. That is to say, the program never terminates and gives a final result. It just keeps running and running forever. This is generally considered a bad thing. It’s often what happens when your computer “locks up” and you have to restart the program or, worse, reboot.

Now imagine a program that calculates something really complex. It might take, say, 10 years to compute all the variations of clothes Lady Gaga uses. When we start running the program, we might not realize that it’s going to take 10 years to run. We might run it, wait all day, get frustrated, and think we programmed it wrong. Maybe the program has a bug and got stuck in an infinite loop? How would you tell whether the program is stuck in an infinite loop or just taking a very long time to compute the result? It would be very useful to have a computer program that could read another program and tell the programmer whether that program, in its current state, will run forever or not.

As it turns out, computational theoreticians were able to prove that such a program is impossible to write on a Turing Machine. Since a Turing Machine is equivalent to all types of computers, that means no computer can – or, barring the discovery of a new theory of computation, ever will be able to – run such a program!

So the question of “will this program run forever or not” is one that no computer program can decide – at least in the general case. [1]
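
The proof sketch is famous and short enough to write out. Suppose, for contradiction, that a function halts(program, input) existed. Then we could write the following (a sketch of the standard diagonal argument, not genuinely runnable, since halts can never actually be supplied):

    # Suppose, purely hypothetically, this oracle existed (it can't):
    #   halts(f, x) -> True if f(x) eventually stops, False if it loops forever.

    def paradox(f):
        if halts(f, f):    # the hypothetical call
            while True:    # ...then deliberately loop forever
                pass
        else:
            return         # ...otherwise stop immediately

    # Now ask: does paradox(paradox) halt?
    # If halts() answers True, paradox loops forever; if it answers False,
    # paradox halts. Either way halts() was wrong, so it cannot exist.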

This means that there is a logical upper limit to computations. There is at least one question that computers can’t ever ‘decide’ the answer to. This is a piece of knowledge that is ‘unknowable’ to any sort of computation.

Reducibility

Researchers have discovered that it’s possible to prove that there is a whole class of such undecidable questions. They did this by showing that, in some cases, one problem is mathematically ‘reducible’ to another.

Imagine that you are a researcher who works with both Turing Machines and Finite Automata. (See here for an explanation of what those are.) For some problems you use Finite Automata and for some you use Turing Machines. One day you realize that you could save yourself a lot of time if you could write a program that could decide for you whether a certain Turing Machine has an equivalent Finite Automaton.

Remember, Finite Automata can’t do everything a Turing Machine can do. But clearly some specific Turing Machines do have an equivalent Finite Automaton. So all you need to do is write a program that makes that determination, right?

Except that, as it turns out, this program is just as impossible to write as the halting program was. The way Computational Researchers were able to prove this was rather clever.

Essentially, they started with the assumption that there existed a program that could determine whether a Turing Machine is equivalent to some Finite Automaton. Having made that assumption, they then proved that such a program would be sufficient to go on and solve the Halting Problem! But since the Halting Problem is provably unsolvable, they could conclude for certain that there is no such thing as a ‘Turing Machine to Finite Automaton Equivalence Program’! Via reductio ad absurdum they had proven that their assumption that such a program exists must be false. Clever indeed!

This is the essence of the concept of reducibility. If you can show that a program solving one problem could be used to solve the Halting Problem, then you have ‘reduced’ the Halting Problem to that problem – and since the Halting Problem is provably unsolvable, the new program must be impossible too.
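
In code form, the shape of that argument looks something like this. Everything below is hypothetical pseudocode in Python dress – build_machine_from is an assumed construction, and the point is only the structure of the proof:

    # Everything here is hypothetical; the point is the *shape* of the proof.

    def equivalent_to_some_fa(machine):
        ...  # assumed to exist, for the sake of contradiction

    def halts(program, x):
        # Build a machine that first simulates program(x); only if that
        # simulation finishes does it go on to recognize a language no
        # Finite Automaton can handle (a hypothetical construction).
        machine = build_machine_from(program, x)
        return not equivalent_to_some_fa(machine)

    # halts() is known to be impossible, so equivalent_to_some_fa()
    # must be impossible too.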

Class NP-Complete – Trying to Prove Class-P does not equal Class-NP

One thing that has eluded Computational Theorists is a proof that Class-P and Class-NP aren’t the same class.

This seems strange at first. Consider our example above. Clearly the 3^X program is not in Class-P, right? Or is it? What if some clever programmer comes up with a way to take the same computation we are doing in 3^100 steps and figures out how to do it in 100³ steps instead?

This is a problem with computational theory. One never knows what programs we might come up with or discover in the future. So we can’t say for sure that our 3^X program isn’t really a computation in Class-P – maybe we just don’t yet know how to make the program run better. (Remember that our example of sorting integers was a bit contrived, because it’s actually possible to sort integers faster than X² time.)

Given that this is true, how can we ever know that any program is not really in Class-P? Maybe all programs are really in Class-P and it’s just our ignorance that keeps a given problem in Class-NP as of today.

Computational Theorists are not comfortable with this state of affairs. So they produced ‘observational evidence’ (not a proof) that Class-P and Class-NP are not equivalent. (This is a great example of how Computational Theory, and therefore mathematics, is actually an empirical science, by the way! The Church-Turing Thesis was also an example of how Computational Theory is an empirical science.)

Essentially, what they did is come up with a program that is known to be in Class-NP and to which, it is also known, any other program in Class-NP can be ‘reduced.’ So if some program out there allows that problem to be reclassified into Class-P, that would mean all Class-NP programs are really in Class-P!

This seems incredibly unlikely. Despite not being a ‘proof,’ this satisfies most computational theorists that Class-P and Class-NP must not be equivalent.

Any program that has both of those attributes (i.e. it’s in Class-NP and any other Class-NP program can be reduced to it) is considered to be in a special sub-class of Class-NP called Class-NP-Complete.

Conclusions

This post gave a very quick overview of areas of research and some discoveries in Computational Theory. Computational Theory is built on the back of the Church-Turing Thesis, which states that a Turing Machine has the maximum possible logical power available: it is impossible to create a computational machine that can run programs a Turing Machine can’t run.

Based on that (unproven but well observationally tested) thesis Computational Theorists research the following areas:

  1. Time Complexity is the study of how long it takes for a program to run.
  2. Decidability is the study of whether or not it is even possible to write a program that can compute a certain result.
  3. Reducibility is a key tool for determining Decidability because it allows us to judge whether the Halting Problem (or some other problem already known not to be Decidable) can be reduced to a given problem.
  4. NP-Complete is a special class of programs that any Class-NP program can be reduced to. If any NP-Complete program could be translated to a Class-P program, then all Class-NP programs could be translated to Class-P and we would have proven that Class-P is equivalent to Class-NP. This seems unlikely, and so most researchers believe this means that Class-P is not equivalent to Class-NP.

Notes

[1] General case. Obviously it is possible to write a program that can decide whether certain specific types of computer programs will run forever or not. It might even be possible to write a program that can evaluate lots of different types of programs as to whether or not they will halt. But to write one that can evaluate any program is not possible.
