The Efficiency Of Algorithms

Скачать: The Efficiency Of Algorithms


Some mathematical problems can be solved only by methods too
slow for even the fastest computers. More efficient methods have not been
found, but neither has it been proved, that there are no better methods by
Harry R.Lewis and Christos H. Paradimitrou
Suppose you were asked to plan an itinerary for a traveling
salesman who must visit a number of cities. You are given a map on which the
distances between the cities are marked and you are asked to find the shortest
route that passes through all the cities and returns to the starting point. An
approach to this problem that is certain to give the correct answer is to trace
all the possible routes, measure their length and pick the shortest one. If the
tour included more than a few cities, however, hundreds or thousands of routes
would have to be checked. If there were 100 cities, then even the fastest
computers would require weeks of calculation to find the shortest path.
In the search for a quicker solution you might try some less
rigorous methods. One idea that seems reasonable is always to visit nearby
cities before going on to those farther away. You would soon discover, however,
that this procedure does not invariably yield the correct answer. Other
shortcuts also fail. In fact, the best methods known for solving the problem
are not much better than the obvious but laborious procedure of checking all
the possible itineraries. Mathematicians now suspect that this problem and many
others like it may forever remain beyond our ability to solve in any efficient
way. That speculation itself, however, is unconfirmed; although no faster
methods of solution have been found, neither has it been proved that faster
methods do not exist.
In the problem of the traveling salesman's tour it is not
the solution for a particular set of cities that is of the greatest importance
but a general method for finding the solution for any cities. Such a method is
called an algorithm: it is a precisely stated procedure or set of instructions
that can be applied in the same way to all instances of a problem. If the
problem is to be solved with the aid of a computer, an algorithm is
indispensable, because only those procedures that can be stated in the explicit
and unambiguous form of an algorithm can be presented to a computer.
Instructions that are vague or that rely on intuition are unacceptable.
An example of an algorithm is the procedure taught in the
schools for the subtraction of whole numbers. If each of the steps in this
procedure is applied correctly one at a time, the algorithm will always yield
the correct result. What is more, once the algorithm has been learned or stored
in the memory of a computer or embodied in the circuitry of an electronic
calculator, it can be applied to an infinite set of subtraction problems. With
this one algorithm the difference between any two whole numbers can be determined.
In principle any problem for which an algorithm can be
devised can be solved mechanically. It may therefore seem surprising that there
are problems for which algorithms exist but for which we so far have no
practical general solution. The algorithms for solving these problems always
give a correct answer, but they often require an inordinate amount of time. The
problem of the traveling salesman's tour is among these intractable tasks.
The efficiency of computer algorithms is a topic of obvious
practical importance. It is also of interest in more formal areas of
mathematics. There are some problems in mathematics and logic for which no
algorithm can ever be written, and there are many others for which efficient,
fast algorithms are already known. Between these two groups is a third class of
problems that can always be solved in principle, but for which there are only
inefficient (and therefore largely unusable) algorithms. For some of these
difficult problems mathematicians have been able to demonstrate that efficient
algorithms can never be designed. For many of the most important problems,
however, there is only the suspicion that good algorithms are impossible.
A given problem can have more than one algorithm for its
solution. For example, children in Europe learn a procedure for subtraction
slightly different from the one taught in the U.S. Both of the subtraction
algorithms, however, give the same result in the same amount of time. That is
not invariably the case with different algorithms for solving the same problem.
One celebrated problem that can be solved by either a "fast"
algorithm or a "slow" one is the problem of the Koenigsberg bridges.
In the I8thcentury German city of Koenigsberg (which is now
the Russian city of Kaliningrad) a park was built on the banks of the river
Pregel and on two islands in the river. Within the park seven bridges connected
the islands and the riverbanks. A popular puzzle of the time asked if it was
possible to walk through the park by a route that crossed each of the bridges
once and only once.
For the solution of the problem the size and shape of the
islands and the length of the bridges are immaterial: the only essential
information is the pattern of interconnections. This information can be
presented compactly in the mathematical structure known as a graph, which is
merely a set of points with lines drawn to join them. In the case of the
Koenigsberg park each of the riverbanks, and each of the islands is condensed
to a single point, and each of the bridges is represented by a line between two
points. Thus the graph consists of four points and seven lines. If the lines
are labeled, any path through the park can be specified by a simple listing of
labels.
The obvious approach to the problem is to list all the paths
that cross all the bridges and to eliminate from consideration those that cross
any bridge more than once. This is the technique of exhaustive search, similar
to the one employed in the problem of the traveling salesman. When the
mathematician Leonard Euler was presented with the problem of the Koenigsberg
bridges, he recognized the limitations of the technique and found another
method. In recognition of his contribution a path that traverses each line of a
graph exactly once is now called an Eulerian path.
Euler wrote: "The particular problem of the seven
bridges of Koenigsberg could be solved by carefully tabulating all possible
paths, thereby ascertaining by inspection which of them, if any, met the
requirement. This method of solution, however, is too tedious and too difficult
because of the large number of possible combinations, and in other problems
where many more bridges are involved it could not be used at all."
Euler's alternative method is much simpler. He showed that a
tour of the kind sought must exist if the graph meets two conditions. First, it
must be possible to go from any point in the graph to any other point by
following the lines of the graph: in other words, the graph may not be
disconnected. Second, every point of the graph, with two possible exceptions,
must be at the junction of an even number of lines.
It is not hard to understand why a graph cannot have an
Eulerian path unless it meets these conditions. All regions of the graph must
be connected to one another if there is to be any path that traverses all the
lines. Each point must have an even number of lines because half of them are
required to reach the point and the other half to leave it. Two points with an
odd number of lines can be allowed if they are chosen as the starting and
finishing points of the path. Demonstrating that any graph that meets these
conditions actually has an Eulerian path requires a somewhat more complicated
argument, which we shall not present here, but Euler was able to give a
rigorous proof.
It is an easy matter to express Euler's solution to this
problem in an algorithm that could be executed by a computer. The first
requirement, connectivity, can be established by marking some point of the
graph, then similarly marking all points connected to it by lines, then all the
points connected to the newly marked points, and so on. The graph is connected
if at the end all points have been marked. The second requirement is just as
easily tested: the machine is instructed to examine each point of the graph and
count the number of lines that terminate at that point. If no more than two of
the points have an odd number of lines, the graph has an Eulerian path. The
park at Koenigsberg met the first condition but not the second, and so there
was no Eulerian tour of the seven bridges.
Euler's method is unquestionably the more economical
approach to the problem of Koenigsberg bridges: it requires that each point and
line of the graph be listed just once, whereas the exhaustive search is not
completed until every path that crosses all the bridges has been listed. The
number of such paths is much larger than the number of points and lines in the
graph. In that sense Euler's method is the better algorithm, but how much
better? How can the difference be measured and how can one fell if the
difference is significant?
For a graph with only four points and seven lines both
techniques are fast enough to be considered practical. Suppose, however, that
more islands and bridges were added to the park, or in other words that more
points and lines were added to graph. If the problem is being solved by Euler's
method, each new point merely adds one item to the list of points that must be
checked. If the paths are to be examined by exhaustive search, on the other
hand, then with each new point and line the size of the list is multiplied by
some factor. A moderate increase in the size of the graph results in an
explosive increase in the number of paths. Ultimately the list of paths 'must
become prohibitively long.
In this comparison of the two solutions to Euler's problem
there is the basis for a completely general method of evaluating the speed or
practicality or efficiency of any algorithm. We imagine that the algorithm is
supplied with larger and larger inputs, and we note the rate at which the
execution time of the algorithm increases. In this way it is possible to make
unequivocal judgments of algorithms. Exhaustive search not only is a slower
method; in general it is too slow to be of any value. Euler's method remains
practical for problems of essentially unlimited size.
As the size of the graphs being examined increases, the
lists produced by the method of exhaustive search grow exponentially. Each time
some fixed number of points and lines are added to the graph the size of the
list doubles. Growth of this kind can be described by a mathematical function
such as 2^n, where n is some measure of the size of the graph. Many other
functions have similar or even higher rates of growth. Among them are n^n and n!
( which is read as "n factorial" and signifies n multiplied by all
the integers between 1 and n). For the purposes of this discussion all these
functions can be regarded as having the same property of exponential growth.
Mathematical functions of another kind are known as
polynomials. The simplest members of this class are linear functions, such as 3n,
which designate a simple relation of proportionality. The time needed to solve
the problem of Koenigsberg bridges by Euler's method increases as a linear
function of the size of the graph. Other polynomials are n^2, n^3 and so on,
and the sums of such functions. What distinguishes polynomials from exponential
functions is that n never appears in an exponent.
For sufficiently large values of n any exponential function
will overtake and exceed any polynomial function. It was the certainty of this
result that dismayed 'Ihomas Malthus when he compared the exponential rate of
population growth with the polynomial rate of increase in the food supply. For
small values of n a given polynomial function may well exceed a given
exponential one, but there is always a value of n beyond which the exponential
function is the greater. The exact form of the polynomial makes little
difference, except in changing the point at which the polynomial function is
overtaken.
It is now generally agreed by computer scientists that
algorithms whose execution time increases exponentially as a function of the
size of the input are not of practical value. We shall call algorithms of this
kind "exponential time" algorithms, or simply inefficient algorithms.
The only algorithms that are considered fast or efficient enough for general
application are "polynomial time" algorithms.
Of course, even among efficient algorithms some are faster
than others, but for the purposes of this discussion it is important only to
distinguish polynomialtime algorithms as a class from exponentialtime
algorithms. Moreover, this system of classification has the advantage that it makes
the speed of an algorithm a property of the algorithm itself and independent of
the machine on which it is executed. For sufficiently large problems a
polynomialtime algorithm executed on even the slowest machine will find an
answer sooner than an exponentialtime algorithm on the fastest computer.
The mathematical properties of algorithms were studied in
the 1930's by the British mathematician A. M. Turing, the inventor of the
imaginary computer, now called a Turing machine. The Turing machine was conceived
to be an automaton equipped with an infinite supply of paper tape marked off in
square regions. The machine was capable of just four actions: it could move the
tape one square, it could place a mark in a square, it could erase a mark
already present and at the end of a calculation it could halt. These operations
were to be performed according to a sequence of instructions built into the
internal mechanism. Of course, Turing never built such a machine; it was merely
a conceptual device for automatically solving problems in mathematics and
logic. Indeed, Turing was interested not in actually solving problems with the
machine but rather in investigating what kinds of problems it could solve and
what kinds it could not.
Turing discovered that even a machine as simple as this one
could solve any problem for which an algorithm could be devised. The
computation might be laborious and indirect, but given enough time and paper
tape the machine would eventually find the solution and halt. Reduced to its
essentials the Turing machine is a language for stating algorithms, as powerful
in principle as the more sophisticated languages now employed for communicating
with computers.
In addition to conceiving, these machines Turing
demonstrated, their limitations. In 1936 he showed that there are problems that
cannot be solved by Turing machines, and it follows that these problems cannot
be solved by any automatic computer. They are problems for which algorithms
cannot be written, even in principle. The example first studied by Turing is
the problem of predicting whether a particular Turing machine, once it is set
in motion, will ever finish its calculation and halt. Through an analysis of
this problem he was able to show that there can be no general procedure for
telling whether mathematical propositions are true or false. Since then a
variety of other problems with the same properties have been proposed.
One result of Turing's work was to divide all imaginable
problems in mathematics into two classes. Those problems for which algorithms
can never be written are in a formal sense permanently unsolvable. Some
instances of these problems may be solved by rare perception or by luck, but a
general method for their solution will never be found.
All other problems in mathematics and logic can be solved by
algorithms. As we have seen, however, some algorithms are more useful than
others. The class of solvable problems can therefore be divided into two
subgroups: those for which there are efficient, polynomialtime algorithms and
those for which there are only exponentialtime algorithms. Euler's problem is
a member of the class with polynomialtime solutions, since Euler's method is
itself a polynomialtime algorithm. Problems that can be proved to have only
exponentialtime algorithms are also known, although they are rather obscure.
Although these two groups of problems are quite distinct, it
is not always a straightforward task to assign a problem to one group or the
other. Indeed, a very interesting class of problems seems to fall somewhere
between them. For these problems no efficient algorithms are known and the best
available solutions require exponentially increasing time, yet no one has been
able to prove that the problems do not have polynomialtime solutions.
One such problem was considered in the 19th century by the
Irish mathematician William Rowan Hamilton. Superficially Hamilton's problem is
much like Euler's .The problem is to decide whether a given graph has a path
that takes in each point exactly once (whereas Euler looked for a path that
traversed each line once). Actually the tasks are quite different, and Euler's
method cannot be applied to Hamilton's problem. The graph derived from the plan
of the park at Koenigsberg has a Hamiltonian path, although, as we have seen,
it has no Eulerian path. On the other hand, removing two lines results in a
graph that has an Eulerian path but not a Hamiltonian one. Many other graphs
have neither kind of path.
Hamilton's problem can be solved by exhaustive search:
indeed, the procedure is not substantially different from that employed in
listing all the possible paths that might have the Eulerian property. For
Hamilton's problem, however, no efficient algorithm comparable to Euler's
method has been found. The problem has been pondered by many of the best
mathematicians of the past century, but the most efficient methods available
today are fundamentally no better than exhaustive tabulation. On the other
hand, all attempts to prove that there is no better method have also failed,
and it must be considered a possibility that an efficient algorithm will be
discovered tomorrow.
Problems that are known to have polynomialtime solutions,
such as Euler's problem, are said to be members of the class P (for polynomial).
Hamilton's problem is a member of another class, designated by the letters NP,
signifying "nondeterministic polynomial." The class NP encompasses
all the problems in P, or in other words P is a subset of NP. In addition NP
includes problems whose status is less certain. They are all solvable problems
in principle: they have algorithms, but for now only exponentialtime
algorithms are known. They may also have polynomialtime algorithms (in which
case NP and P are identical) or they may prove to be permanently intractable,
with only inefficient solutions.
The problems considered here, and all problems classified in
this way, can be described as an infinite set of similar questions each of
which can be Another way of defining NP is as the class of yesorno problems
that can be solved by guessing certificates. If one is given an instance of a
problem in NP for which the answer happens to be yes, then with luck one may
discover the required certificate fairly quickly by making a sequence of
guesses; if the answer is no, guessing cannot possibly yield an answer any
faster than an exhaustive search could. For example, in solving Hamilton's
problem one might find a correct path (if there is one) on the first try by
tracing small portions of the path and guessing at each stage how to proceed.
Such a procedure, it should be emphasized, is not an algorithm. It could be
made into an algorithm only by crossing off each trial path as it is tested and
checking all possible paths, but that is equivalent to the method of exhaustive
search.
A mathematical procedure defined in terms of luckly guesses
may seem bizarre, but it is a quite legitimate approach to defining the
problems in the class NP. In principle the procedure could even be mechanized by
building a device called a nondeterministic Turing machine. This device can do
all that an ordinary Turing machine can do; in addition, at some points in its
operation it may have more than one choice of what to do next. Such a machine
would be considered to answer yes to a question if there were some sequence of
choices that could lead it to a yes conclusion. NP, the class of
nondeterministic polynomialtime problems, consists of precisely those problems
whose yes instances can be identified by, machines making comparatively short
guessing computations.
The inclusion of guessing in the definition of these
problems suggests strongly to many mathematicians that P and NP are not the
same set and hence that efficient algorithms can never be found for the intractable
problems in the class NP. If every problem in NP were actually in P, then all
the guesswork and luck could be replaced by some systematic procedure without
great sacrifice in time. It is hard to believe the ability to guess and be
lucky could win so little.
The class NP includes a variety of commonly encountered
problems that seem to defy efficient solution. We have already mentioned
Hamilton's problem and the problem of composite numbers. Another example is
known as the matching problem. It can be considered in terms of the task faced
by the colleges every September, when a new class of freshmen must be assigned
to shared dormitory rooms.
For the sake of simplicity let us assume that all
the information gathered about the students' smoking habits, bed time hours,
taste in music and so forth results in a single yesorno decision as to the
compatibility of each possible pair of students. The entire class can then be
represented as a graph in which the points correspond to students and a line is
drawn answered yes or no. For problems that are formally unsolvable, such as
the problem of predicting whether a Turing machine will halt, these questions
simply cannot be answered by any algorithmic procedure. For problems of the
class P the questions can invariably be answered, whether the answer turns out
to be yes or no, by an efficient procedure. In order for a problem to qualify
for the class NP there need not be an efficient means of answering the
yesorno questions. What is required is that whenever the answer is yes there
be a short and convincing argument proving it.
Hamilton's problem, for example, meets this condition. It is
not possible to tell by any efficient means known today whether a graph has a
Hamiltonian path, but if it does, then the path itself can be exhibited. Hence
for every Hamiltonian graph it is possible to issue a "certificate"
that proves its membership in this special class of graphs. Such a certificate
would name the lines in the graph in the order the path traverses them. Finding
the path might require weeks of tabulation, but once it has been found it can
easily be exhibited. Another problem that belongs to the class NP is the
question of whether a whole number is composite, that is, whether it can be
written as the product of two other numbers. Again there is no known efficient
procedure for answering the question, but if the number is indeed composite
there is a succinct proof of that fact, namely a correctly workedout
multiplication with the number on the bottom line.
Care must be taken when asking the yesorno question
of a problem in the class NP, since the complementary nooryes problem might
not be in the same class. For example, the complement of Hamilton's problem, in
which one is asked to show that a graph does not have a path passing once
through each point, may well not be in the class NP. For now the only way to
demonstrate the absence of such a path is to list all possible paths, and such
a proof is too lengthy to qualify as a certificate of membership in NP. On the
other hand, the complement of the compositenumber problem, which asks if a
number is prime, turns out to be in the class NP. The reason, which is far from
obvious, is that relatively short proofs demonstrating that a number has no factors
other than 1 and itself were discovered in 1975 by Vaughan Pratt of the
Massachusetts Institute of Technology. Still, it is not known whether the
compositenumber problem and its complement are in the class P.
It is easy to show that every problem in the class P
is also in the class NP. If a problem is in P, then by definition there is an
efficient algorithm for it. To produce a short and convincing proof that the
answer to some instance of the problem is yes, all we need to do is follow the
algorithm: a record of its operation constitutes the required certificate.
Another way of defining NP is as the class of yesorno
problems that can be solved by guessing certificates. If one is given an
instance of a problem in NP for which the answer happens to be yes, then with
luck one may discover the required certificate fairly quickly by making a
sequence of guesses; if the answer is no, guessing cannot possibly yield an
answer any faster than an exhaustive search could. For example, in solving
Hamilton's problem one might find a correct path (if there is one) on the first
try by tracing small portions of the path and guessing at each stage how to
proceed. Such a procedure, it should be emphasized, is not an algorithm. It
could be made into an algorithm only by crossing off each trial path as it is
tested and checking all possible paths, but that is equivalent to the method of
exhaustive search.
A mathematical procedure defined in terms of luckly guesses
may seem bizarre, but it is a quite legitimate approach to defining the
problems in the class NP. In principle the procedure could even be mechanized
by building a device called a nondeterministic Turing machine. This device can
do all that an ordinary Turing machine can do; in addition, at some points in
its operation it may have more than one choice of what to do next. Such a
machine would be considered to answer yes to a question if there were some
sequence of choices that could lead it to a yes conclusion. NP, the class of
nondeterministic polynomialtime problems, consists of precisely those problems
whose yes instances can, be identified by machines making comparatively short
guessing computations.
The inclusion of guessing in the definition of these
problems suggests strongly to many mathematicians that P and NP are not the
same set and hence that efficient algorithms can never be found for the
intractable problems in the class NP. If every problem in NP were actually in P,
then all the guesswork and luck could be replaced by some systematic procedure
without great sacrifice in time. It is hard to believe the ability to guess and
be lucky could win so little.
The class NP includes a variety of commonly encountered
problems that seem to defy efficient solution. We have already mentioned
Hamilton's problem and the problem of composite numbers. Another example is
known as the matching problem. It can be considered in terms of the task faced
by the colleges every September, when a new class of freshmen must be assigned
to shared dormitory rooms.
For the sake of simplicity let us assume that all the
information gathered about the students' smoking habits, bed time hours, taste
in music and so forth results in a single yesorno decision as to the
compatibility of each possible pair of students. The entire class can then be
represented as a graph in which the points correspond to students and a line is
drawn connecting every two students who can be placed in the same room. If each
room holds just two students, the assignment can be made efficiently by a clever
polynomialtime algorithm discovered by Jack Edmonds of the University of
Waterloo. If each room is to be shared by three students, however, there is no
known efficient algorithm. The problem is in the class NP, since all yes
instances have succinct certificates: an acceptable room assignment, once it is
discovered, can easily be exhibited. Of course, a solution could be found by
exhaustive search, albeit inefficiently. With luck a suitable assignment, if
there is one, can be guessed quickly.
Map coloring is a problem in the class NP that concerns
mathematicians more than it does cartographers. The question is whether the
countries on a given map can be colored with a given number of colors so that
no two countries that share a border have the same color. It is easy to find
out if a map can be colored with two colors, it can be if there are no places
on the map where an odd number of countries meet at one point. It is even
easier to tell if a map can be colored with four colors; indeed, there is no
need even to look at the map, since Kenneth Appel and Wolfgang Haken of the
University of Illinois proved in 1975 that four colours suffice for any map.
Surprisingly, however, no efficient algorithm is known for determining whether
three colors are enough for a given map. The problem is in the class NP, since
a correctly colored map canserve to certify a yes answer.
Map coloring can be regarded as a special case of another
problem called graph coloring. Any map can be converted into a graph by
reducing each country to a point and drawing a line between two points if the
corresponding countries share a border. Coloring the graph is then equivalent
to coloring the map, subject to the rule that two points connected by a line
cannot have the same color. Graph coloring, however, is a more general problem,
with applications outside graph theory. For example, a graph can represent the
scheduling of work in a factory. Each point of the graph stands for some job to
be done, and two points are connected by a line, if the jobs cannot be done
concurrently, perhaps because they require the same piece of machinery. A
coloring of the graph with three colors would then supply a schedule dividing
the work of the factory into three shifts. Like map coloring, the
graphcoloring problem is in the class NP.
It often happens that if one problem can be solved
efficiently, so can many others. For example, if an efficient algorithm could
be found for the problem of graph coloring, it could be applied with only minor
modifications to the problems of map coloring and factory scheduling. Map
coloring and factory scheduling are therefore said to be efficiently reducible
to graph coloring. In the past several years it has become apparent that some
of the problems in the class NP have a remarkable property: all the problems in
NP are efficiently reducible to them. These elite problems within the class NP
are called NPcomplete. If any one of them has an efficient algorithm, then
every problem in NP can be solved efficiently.
The first proof that a problem is NPcomplete was presented
in 1971 by Stephen A. Cook of the University of Toronto. His reasoning follows
a path essentially parallel to the path of Turing's earlier work on
mathematical machines and their relation to problems of formal logic. Cook
stated his proof in terms of the prepositional calculus, the formal language in
which separate logical statements, which individually may be either true or
false, are joined together by the lexical elements "and," "or"
and "not." In general a sentence in the propositional calculus can be
shown to be either true or false depending on which of its component statements
are assumed to be true or false. Certain sentences, however, cannot be true
under any interpretation because they are selfcontradictory. Sentences that
cannot be made true are said to be unsatisfactory.
Cook employed the propositional calculus to describe the
operation of the nondeterministic Turing machines, the mechanized guessing
devices essential to the definition of the class NP. He showed that the
calculations of any such machine can be described succinctly by sentences of
the propositional calculus. When the machine is given a yes insfance of a problem
in NP, its operation is described by a satisfiable sentence, whereas the
operation of a machine given a no instance is described by a sentence that
cannot be satisfied.
It follows from Cook's proof that if one could efficiently
determine whether a sentence in the propositional calculus can be satisfied,
one could also determine efficiently in advance whether the problem presented
to a nondeterministic Turing machine will be answered yes or no. Since the
problems in the class NP are by definition all those that can be solved by
nondeterministic Turing machines, one would then have an efficient method for
solving all those problems. The catch, of course, is that there is no known
efficient method of determining whether a sentence in the propositional
calculus can be satisfied.
Cook's argument states in essence that the propositional
calculus is a universal language for describing problems in the class NP. Every
instance of such a problem corresponds to a sentence in that language, and if
the sentence is satisfiable, the instance has a yes answer. Many other problems
have since been shown to be NPcomplete because the satisfiability problem can
efficiently be reduced to them. Hamilton's problem, the problem of matching
groups of three roommates and the problem of coloring graphs with three colors
are all NPcomplete. The first to point out the broad applicability of this
theory was Richard M. Karp of the University of California at Berkeley. Similar
investigations were independently conducted by the Russian mathematician P. A.
Levin. Since NPcomplete problems capture the difficulty of all other problems
in NP, it is widely thought today that all NPcomplete problems are
computationally intractable. A proof that a problem is NPcomplete is usually
considered a strong argument for abandoning further efforts to devise an efficient
algorithm for its solution.
Even the assumption that all NPcomplete problems are
intractable would not settle all questions about the class NP. In addition to
the mystery of the NPcomplete problems there is an even more obscure area:
problems in NP for which no efficient algorithms are known but which have not
been proved to be NPcomplete either. The problem of composite numbers is one
of these.
Not all problems that can be solved by a computer are of the
yesorno. type. Another common type is the optimization problem. For example,
suppose one is given the positions of some cities on a map and asked to find
the shortest possible railroad network connecting them. In one version of this
problem one is allowed to lay down a straight section of track between any two
cities, but one is not allowed to install isolated junction points; tracks can
be joined only at cities. One property of the solution to this problem is
immediately apparent: the optimum network can never include a closed circuit,
because if it did, the network could be made shorter simply by omitting one
link in the circuit. Thus the best network always branches like a tree, and
the problem itself is called the spanningtree problem.
The spanningtree problem can be solved correctly and quite
efficiently by a method called the greedy algorithm, devised by Joseph B.
Kruskal of Bell Laboratories. The procedure is simply to connect the closest
pair of cities, then the nextclosest and so on without adding any superfluous
lines (lines joining cities that are already linked indirectly). It is far from
obvious that this method always yields the shortest network, but it does, and
it has the pleasant property of requiring no foresight and no reconsideration
of earlier decisions.
The greedy algorithm can be relied on to find the shortest
network between cities under the rules specified, but in general that network
will not be the shortest possible one. Further savings can be achieved by
establishing junction points between cities. The properties of networks with
such junctions were studied by the Swiss mathematician Jakob Steiner. It can be
shown that any shortest network must be arranged so that each junction point is
made up of three lines that meet at angles of 120 degrees. This rule provides
some guidance in evaluating networks, but there are many possible networks with
Steiner junction points. No algorithm has been discovered that finds the best
network quickly.
The problem of the traveling salesman's tour is closely
related. Again one is given a set of cities, but now one is asked to find the
shortest roundtrip tour. As a first guess the greedy algorithm suggests that
perhaps the salesman should always go to the nearest city he has not yet
visited, but this procedure does not work. Indeed, the problem is notorious for
having resisted all attempts to find an efficient solution.
Optimization problems are not fundamentally different from
those that ask yesorno questions; in fact, every optimization problem can be
rewritten in a yesorno form. In the traveling salesman problem, for example,
we might be given a length along with a set of cities and asked to state
whether a tour can be constructed that does not exceed the specified length.
This yesorno problem cannot be harder than the associated optimization
problem, because if the optimum tour could be found by some efficient method,
it would be a trivial task to determine whether it exceeds a given number.
Hence if the yesorno version is computationally intractable, one cannot hope
to solve the optimization problem itself efficiently. For this reason certain
optimization problems, such as that of the traveling salesman's tour and that
of placing Steiner junction points, are said to be
NPcomplete.
Optimization problems are encountered often in engineering,
economics, operations research and other fields. The discovery that at least
some of these problems are NPcomplete is therefore of considerable practical
interest. Since the NPcomplete problems probably have no efficient algorithms,
there would seem to be little point in expending further effort in seeking
optimum solutions. An alternative that has recently been adopted is to seek
approximate solutions that are good even if they are not precisely optimal.
One technique that has been applied to the traveling
salesman problem offers a solution that may not be optimum but is guaranteed to
be no worse than twice the optimum path. The procedure starts with the shortest
spanning tree, which can be generated efficiently by the greedy algorithm. This
network can be converted into a tour of the cities by traversing each line
twice and returning to the origin. It is known that the optimum spanning tree
must be shorter than any possible tour of the cities, since a tour can be
converted into a spanning tree (albeit one without any branches) by simply
omitting one segment. Thus twice the length of the optimum spanning tree cannot
be longer than twice the optimum tour. The method is a polynomialtime
algorithm. Recently Nicos Christofides of the Imperial College of Science and
Technology in London has found a way to improve the algorithm so that it yields
a tour guaranteed to be no more than half again as long as the optimum.
A more profound compromise gives up not only the requirement
that a solution be optimal but also the insistence that a less than optimum
solution be guaranteed to fall within a specified range. Instead the assurance
is given that the solution will not often deviate far from the optimum. An underlying
assumption of such techniques is that the maps encountered in practice are not
concocted to confound basically plausible techniques; such maps are encountered
frequently only when they are constructed by computer scientists to reveal the
flaws in methods proposed by their colleagues. Indeed, if the salesman's tour
algorithms discussed above are applied to "natural" maps, they
deliver far more than they promise. The resulting tours are not 100 percent or
50 percent longer than the optimum but closer to 5 percent.
A reasonable assumption about the properties of many maps is
that cities are randomly placed. A theorem describing the statistical
properties of optimum tours through such randomly distributed points was proved
in 1958 by Jillian Beardwood, J. H. Halton and John M. Hammersley of the
University of Oxford. Relying on that theorem, Karp has shown that a simple
method of constructing tours almost always yields nearoptimum results, when it
is applied to maps with many cities.
Karp begins by dividing the map into many small regions.
Within each of these regions the cities are sufficiently few to find the
optimum tour by exhaustive search, even though that method involves an
exponentialtime algorithm. The tours of the small areas are then linked by a
variant of the greedy algorithm. Perhaps significantly, the method is not very
different from the method usually adopted by people solving the problem
manually.
Efficient but approximate solutions can be found for many
NPcomplete optimization problems. From the standpoint of mathematics, however,
the important question is whether NP is identical with P. The repeated failure
of attempts to find an efficient algorithm for the NPcomplete problems has
created considerable confidence that NP and P are not the same. There is now
suspicion that they are not identical, but the proof of their distinctness may
be beyond present mathematical capabilities. The question may join that select
group of mathematical enigmas that remain unresolved for decades, and the solution
may have to await the development of new methods in mathematics.
© Реферат плюс
