
Satisfiability

Cook's Theorem
Michael Albert
malbert@cs.otago.ac.nz

7/5/2009 and 12/5/2009

Lectures 18 and 19: Theory of Computing


COSC341 7/5/2009

Satisfiability
SATISFIABILITY
Instance: A CNF formula over a set of variables V.
Problem: Does the formula have a satisfying assignment?

We aim to prove:

Cook's Theorem (1971): SATISFIABILITY is NP-complete.

The proof we give follows that of the text (pages 481-492), with names of variables changed in many instances to improve readability. It is not so very different from Cook's original proof.

First we'll do the easy bit:

Lemma: SATISFIABILITY is in NP.

If a formula is satisfiable then a non-deterministic Turing machine is entitled to guess a satisfying assignment and then check that it works; alternatively, we can write a satisfying assignment on an oracle tape. The input size is roughly the number of clauses times the number of variables. Checking whether a particular clause is satisfied requires no more steps than the number of variables, so the total processing time is also roughly the number of clauses times the number of variables (plus a bit of overhead). This is polynomial in the input size, so we've confirmed that SATISFIABILITY is in NP.
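The checking step of the lemma can be sketched in a few lines. This is only an illustration: the integer encoding of literals (k for a variable, -k for its negation) and the name `satisfies` are assumptions of the sketch, not something taken from the text.

```python
from typing import Dict, List

# Assumed encoding: a literal is a nonzero integer, k for variable k and
# -k for its negation. A clause is a list of literals.
Clause = List[int]


def satisfies(formula: List[Clause], assignment: Dict[int, bool]) -> bool:
    """Return True if every clause contains at least one true literal.

    Each clause is scanned once, so the work is bounded by
    (number of clauses) x (literals per clause).
    """
    for clause in formula:
        if not any(assignment[abs(lit)] == (lit > 0) for lit in clause):
            return False
    return True


# (x1 or not x2) and (x2 or x3)
formula = [[1, -2], [2, 3]]
print(satisfies(formula, {1: True, 2: True, 3: False}))   # True
print(satisfies(formula, {1: False, 2: True, 3: False}))  # False
```

The guessed assignment plays the role of the certificate; the function above is the deterministic polynomial time check.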


Motivation
The hard part is proving that SATISFIABILITY is NP-hard. To do this we must establish a polynomial time reduction from any problem in NP to SATISFIABILITY. Where can we start? What's the handle? How do we turn the crank?

We have a language $L \in NP$. With this language comes a non-deterministic Turing machine M that accepts L and has a time bound $An^c$ for inputs of length n.

That's all we know!

What if M were deterministic? We could take a snapshot of its configuration at each time step. This would require variables like:

STATE(q, t)       M is in state q at time t.
HEAD(j, t)        The read/write head is at position j at time t.
SYMBOL(c, j, t)   Symbol c is on the tape at position j at time t.

A computation of M would be represented by a truth assignment to these variables. The fact that it was a real computation could be enforced by clauses defining the restrictions imposed on the operation of M.

Non-determinism muddies this slightly, but not too much, as we will see.
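To get a feel for how many snapshot variables there are, here is a minimal sketch that hands out one integer per variable. The function name `make_variables`, the tuple keys, and the parameter `steps` (standing in for the time bound $An^c$) are all hypothetical choices made for this illustration.

```python
from itertools import count


def make_variables(states, alphabet, steps):
    """Assign a distinct positive integer to every snapshot variable.

    Times and tape positions both range over 0..steps, where `steps`
    stands in for the time bound (an assumption of this sketch).
    """
    ids = count(1)
    var = {}
    for t in range(steps + 1):
        for q in states:
            var[("STATE", q, t)] = next(ids)
        for j in range(steps + 1):
            var[("HEAD", j, t)] = next(ids)
            for c in alphabet:
                var[("SYMBOL", c, j, t)] = next(ids)
    return var


var = make_variables(states=["q0", "qa", "qr"], alphabet=["0", "1", "B"], steps=4)
print(len(var))  # 5 * (3 + 5 * (1 + 3)) = 115
```

The count grows like the square of `steps`, with the number of states and the alphabet size as constant factors, so the number of variables stays polynomial in n.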


Getting started
Let's just look at a few examples before we try to carefully pin down all the details.

How do we express the fact "Position 3 of the tape contains exactly one symbol at time 23"?

What about "If position 7 is not under the read/write head at time 12 then the symbol at that position is the same at time 13"?

Or "The read/write head is in some specific place at time 17"?


Groups of clauses
Our reducer will convert M and the relevant part of the input w (i.e. only the first $An^c$ characters) into an instance of SATISFIABILITY. There are several groups of clauses in this instance which play different roles:

- A group dealing with the representation being valid at each time step, i.e. things like:
    - There is a unique symbol at each point on the tape.
    - The read/write head is in a specific position.
    - The machine is in some state.
- A group representing the initial configuration.
- A group representing the final configuration.
- A group representing consistency between one frame (time t) and the next (time t + 1).

At all times we need to check that everything is bounded by a polynomial in the original input size (n), and there will be one minor technical issue that arises with the consistency group.

We assume that M has states $q_0$ through $q_m$, which include special states denoted $q_a$ and $q_r$, the accepting and rejecting states respectively. These are modeled as loop states, i.e. the machine simply idles in these states once it reaches one. So we can assume that a computation on input of length n is run for exactly $An^c$ steps.


Valid tape
There is a symbol at each point on the tape at each time.

For $0 \le t \le An^c$ and $0 \le j \le An^c$:

$$\bigvee_{c} \mathrm{SYMBOL}(c, j, t)$$

A total of $O(n^{2c})$ clauses, each of the same size as the alphabet. So polynomial.

There are never two symbols at the same point of the tape at the same time. In other words: for any two symbols, at least one of them is not on a particular point of the tape at a particular time.

For $0 \le t \le An^c$ and $0 \le j \le An^c$:

$$\bigwedge_{c, c' \,:\, c \ne c'} \bigl( \neg\mathrm{SYMBOL}(c, j, t) \lor \neg\mathrm{SYMBOL}(c', j, t) \bigr)$$

A total of $O(n^{2c})$ more clauses.
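As a small sketch of the two families above for a single tape cell, the following builds the "at least one symbol" clause and the "no two symbols" clauses. The representation of a literal as a pair (positive, variable) and the name `symbol_clauses` are assumptions of this sketch. It uses unordered pairs of symbols, which gives the same constraints as ranging over all ordered pairs with c different from c', just without duplicates.

```python
from itertools import combinations


def symbol_clauses(alphabet, j, t):
    """Clauses forcing exactly one symbol at tape position j at time t.

    A literal is a pair (positive, variable); a clause is a list of literals.
    """
    def sym(c):
        return ("SYMBOL", c, j, t)

    # One clause: some symbol is present.
    at_least_one = [[(True, sym(c)) for c in alphabet]]
    # One clause per pair of distinct symbols: they are not both present.
    at_most_one = [[(False, sym(c)), (False, sym(d))]
                   for c, d in combinations(alphabet, 2)]
    return at_least_one + at_most_one


clauses = symbol_clauses(["0", "1", "B"], j=3, t=23)
print(len(clauses))  # 1 + C(3, 2) = 4
```

For a fixed alphabet this is a constant number of clauses per cell and time step, which is where the $O(n^{2c})$ totals come from.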


Valid machine
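At each time step the tape head is somewhere.

For $0 \le t \le An^c$:

$$\bigvee_{0 \le j \le An^c} \mathrm{HEAD}(j, t)$$

And it's not in two places at once:

$$\bigwedge_{0 \le j < j' \le An^c} \bigl( \neg\mathrm{HEAD}(j, t) \lor \neg\mathrm{HEAD}(j', t) \bigr)$$

At each time step the machine is in some state.

For $0 \le t \le An^c$:

$$\bigvee_{0 \le i \le m} \mathrm{STATE}(q_i, t)$$

And it's not in two states at once: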

...


Initial/Final
The initial state is $q_0$, the read/write head is at position 0 and the first $An^c$ positions of the tape match w:

$$\mathrm{STATE}(q_0, 0) \land \mathrm{HEAD}(0, 0) \land \bigwedge_{0 \le j \le An^c} \mathrm{SYMBOL}(w_j, j, 0).$$

The final state is $q_a$:

$$\mathrm{STATE}(q_a, An^c).$$

At this point we have a static representation of a machine which is in the correct initial configuration and the correct final configuration, and such that the intermediate configurations are all valid ones. What we don't have is any link between configurations at successive time steps.
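The initial configuration group is just a list of unit clauses, one per fact that must hold at time 0. The sketch below makes one assumption that the slides do not spell out: positions past the end of w are taken to hold a blank symbol, here written "B". The function name and the (positive, variable) literal representation are likewise hypothetical.

```python
def initial_clauses(w, steps, blank="B"):
    """Unit clauses fixing the configuration at time 0.

    `steps` stands in for the time bound. Assumption of this sketch:
    tape positions beyond the end of w hold the blank symbol.
    """
    clauses = [
        [(True, ("STATE", "q0", 0))],  # the machine starts in q0
        [(True, ("HEAD", 0, 0))],      # the head starts at position 0
    ]
    for j in range(steps + 1):
        c = w[j] if j < len(w) else blank
        clauses.append([(True, ("SYMBOL", c, j, 0))])
    return clauses


for clause in initial_clauses("01", steps=3):
    print(clause)
```

The final configuration contributes a single further unit clause saying that the state at the last time step is the accepting one.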


Transitions
Only the symbol under the read/write head can change

or

If the head is somewhere, then no symbol anywhere else can change.

For $0 \le t < An^c$, $0 \le j, j' \le An^c$ with $j \ne j'$, and all $c, c'$ with $c \ne c'$:

$$\mathrm{HEAD}(j, t) \to \bigl( \mathrm{SYMBOL}(c, j', t) \to \neg\mathrm{SYMBOL}(c', j', t + 1) \bigr)$$

But $a \to (b \to c)$ is logically equivalent to $\neg a \lor \neg b \lor c$, so this is a family of $O(n^{3c})$ clauses.
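The equivalence used here can be checked by brute force over the eight truth assignments, and the resulting clause family is easy to generate. In this sketch the names `frame_clauses` and `steps` (standing in for the time bound) and the (positive, variable) literal representation are assumptions made for illustration.

```python
from itertools import product


def implies(p, q):
    return (not p) or q


# Check a -> (b -> c) against (not a) or (not b) or c on all 8 rows.
equivalent = all(
    implies(a, implies(b, c)) == ((not a) or (not b) or c)
    for a, b, c in product([False, True], repeat=3)
)
print(equivalent)  # True


def frame_clauses(alphabet, steps):
    """Clauses saying a cell away from the head keeps its symbol."""
    clauses = []
    for t in range(steps):
        for j, k in product(range(steps + 1), repeat=2):
            if j == k:
                continue  # only cells k other than the head position j
            for c, d in product(alphabet, repeat=2):
                if c != d:
                    # not HEAD(j,t) or not SYMBOL(c,k,t) or not SYMBOL(d,k,t+1)
                    clauses.append([(False, ("HEAD", j, t)),
                                    (False, ("SYMBOL", c, k, t)),
                                    (False, ("SYMBOL", d, k, t + 1))])
    return clauses


print(len(frame_clauses(["0", "1"], steps=2)))  # 2 * 6 * 2 = 24
```

The three nested ranges over t, j and the other position are what give the cubic dependence on the time bound.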

If we are in state q at position j reading symbol c then we must move to a state which is allowed by a transition from q, c.

Suppose that we had only one transition of this sort, to state $q'$, writing symbol $c'$ and moving by $\delta \in \{-1, 1\}$:

$$\mathrm{STATE}(q, t) \land \mathrm{HEAD}(j, t) \land \mathrm{SYMBOL}(c, j, t) \to \mathrm{STATE}(q', t + 1) \land \mathrm{SYMBOL}(c', j, t + 1) \land \mathrm{HEAD}(j + \delta, t + 1)$$
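An implication whose right-hand side is a conjunction splits into one clause per conjunct, each consisting of the negated premises plus one consequence. The sketch below generates these clauses for a single transition. The function name, the parameter `steps` (standing in for the time bound), and the decision to skip moves that would leave the represented stretch of tape are all assumptions of this illustration rather than details from the text.

```python
def transition_clauses(q, c, q2, c2, delta, steps):
    """Clauses for one transition: in state q reading c, go to state q2,
    write c2 and move the head by delta (+1 or -1).

    A literal is a pair (positive, variable); a clause is a list of literals.
    """
    clauses = []
    for t in range(steps):
        for j in range(steps + 1):
            if not 0 <= j + delta <= steps:
                continue  # simplification: ignore moves off the represented tape
            # Negated premises: not STATE(q,t), not HEAD(j,t), not SYMBOL(c,j,t).
            premise = [(False, ("STATE", q, t)),
                       (False, ("HEAD", j, t)),
                       (False, ("SYMBOL", c, j, t))]
            effects = [("STATE", q2, t + 1),
                       ("SYMBOL", c2, j, t + 1),
                       ("HEAD", j + delta, t + 1)]
            for effect in effects:
                clauses.append(premise + [(True, effect)])
    return clauses


cl = transition_clauses("q1", "0", "q2", "1", +1, steps=2)
print(len(cl))  # 2 times * 2 positions * 3 effects = 12
```

Each clause has four literals, and there are three clauses per choice of t and j.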


Transitions (cont'd)
Again propositional logic tells us that $a \to (b \land c)$ is equivalent to $(a \to b) \land (a \to c)$, so we can translate the above into $O(n^{2c})$ clauses. However . . .

How do we allow for multiple possible transitions?

The easy way is to use a deterministic Turing machine with an oracle tape (and add enough extra clauses to specify that something is written on the first $An^c$ positions of the oracle tape initially).

Alternatively, note that allowing for multiple transitions gives us pieces of the eventual formula of the form:

$$\bigvee_{\text{possible transitions}} (\text{CNF formula for each transition})$$

In the disjunction there are at most T terms, each in CNF, where T is the maximum size of the set of allowed transitions for any particular state/symbol combination in M. This is a fixed value. It turns out that by adding some extra bookkeeping variables we can prove (Lemma 15.8.4 in the text) that such a formula can be converted into an equivalent CNF formula with $O(mT^2)$ terms, where m is the maximum number of clauses of one of the terms.
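One common way to turn a disjunction of CNF formulas into a single CNF formula is to add a "selector" variable per alternative. This is offered only as a sketch of the bookkeeping idea; it is not claimed to be the construction used in the lemma from the text, and the names `or_of_cnfs`, `holds` and `SELECT` are hypothetical. The result is satisfiable together with the selectors exactly when some alternative holds.

```python
from itertools import product


def or_of_cnfs(cnfs):
    """Combine alternatives, each a CNF (list of clauses), into one CNF.

    A literal is a pair (positive, variable). Selector i being true means
    "alternative i was chosen".
    """
    selectors = [("SELECT", i) for i in range(len(cnfs))]
    result = [[(True, s) for s in selectors]]       # some alternative is chosen
    for s, cnf in zip(selectors, cnfs):
        for clause in cnf:
            result.append([(False, s)] + clause)    # chosen => its clause holds
    return result


def holds(cnf, assignment):
    """Evaluate a CNF under a dict mapping variables to booleans."""
    return all(any(assignment[v] == pos for pos, v in clause) for clause in cnf)


# (x and y) or (not x), written as two alternatives in CNF.
cnfs = [[[(True, "x")], [(True, "y")]], [[(False, "x")]]]
combined = or_of_cnfs(cnfs)
print(len(combined))  # 1 + 2 + 1 = 4
```

With T alternatives of at most m clauses each, this particular sketch produces at most 1 + mT clauses, each one literal longer than the clause it came from.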


Overview
We have described a mechanical procedure for converting, in polynomial time, a non-deterministic Turing machine M with polynomial time complexity and an input word w into a formula $\varphi(M, w)$ in CNF, i.e. an instance of SATISFIABILITY. This translation is such that if $\varphi(M, w)$ is satisfiable, then any satisfying assignment represents a series of snapshots of a valid accepting computation of M on input w.

Moreover, if such a computation exists, we can use it to set the variables of $\varphi(M, w)$ to produce a satisfying assignment.

So:

M accepts w if and only if $\varphi(M, w)$ is satisfiable.

That is, we have a polynomial time reduction from the language accepted by M to SATISFIABILITY. Since M was an arbitrary non-deterministic Turing machine with polynomial time complexity, this means that SATISFIABILITY is NP-hard, and therefore NP-complete (since we already know that it is in NP).

Phew.
