You are on page 1of 4

Algorithms: CSE 202 — Homework IV

Solve problems 2, 4, 6, 7, and 8


Problem 1: Graphical Steiner tree (KT 8.38)
Consider the following version of the Steiner Tree Problem, which well refer to as Graphical Steiner
Tree. You are given an undirected graph G = (V, E), a set X ⊆ V of vertices, and a number k. You
want to decide whether there is a set F ⊆ E of at most k edges so that in the graph (V, F ), X
belongs to a single connected component. Show that Graphical Steiner Tree is NP-complete.

Problem 2: Perfect assembly (KT 8.32)


The mapping of genomes involves a large array of difficult computational problems. At the most
basic level, each of an organism’s chromosomes can be viewed as an extremely long string (generally
containing millions of symbols) over the four-letter alphabet {A, C, G, T }. One family of approaches
to genome mapping is to generate a large number of short, overlapping snippets from a chromosome,
and then to infer the full long string representing the chromosome from this set of overlapping
substrings.
While we won’t go into these string assembly problems in full detail, here’s a simplified problem
that suggests some of the computational difficulty one encounters in this area. Suppose we have
a set S = {s1 , s2 , . . . , sn } of short DNA strings over a q-letter alphabet; and each string si has
length 2`, for some number ` ≥ 1. We also have a library of additional strings T = {t1 , t2 , . . . , tm }
over the same alphabet; each of these also has length 2`. In trying to assess whether the string sb
might come directly after the string sa in the chromosome, we will look to see whether the library
T contains a string tk so that the first ` symbols in tk are equal to the last ` symbols in sa , and
the last ` symbols in tk are equal to the first ` symbols in sb . If this is possible, we will say that
tk corroborates the pair (sa , sb ). (In other words, tk could be a snippet of DNA that straddled the
region in which sb directly followed sa .)
Now we’d like to concatenate all the strings in S in some order, one after the other with no
overlaps, so that each consecutive pair is corroborated by some string in the library T . That is, we’d
like to order the strings in S as si1 , si2 , . . . , sin , where i1 , i2 , . . . , in is a permutation of {1, 2, . . . , n},
so that for each j = 1, 2, . . . , n − 1, there is a string tk that corroborates the pair (sij , sij+1 ). (The
same string tk can be used for more than one consecutive pair in the concatenation.) If this is
possible, we will say that the set S has a perfect assembly.
Given sets S and T , the Perfect Assembly Problem asks: Does S have a perfect assembly with
respect to T ? Prove that Perfect Assembly is NP- complete.
Example. Suppose the alphabet is {A, C, G, T }, the set S = {AG, T C, T A}, and the set
T = {AC, CA, GC, GT } (so each string has length 2` = 2). Then the answer to this instance of
Perfect Assembly is yes: We can concatenate the three strings in S in the order TCAGTA (so
si1 = s2 , si2 = s1 , and si3 = s3 ). In this order, the pair (si1 , si2 ) is corroborated by the string CA
in the library T , and the pair (si2 , si3 ) is corroborated by the string GT in the library T .

1
Problem 3: Special case of 3-sat (KT 10.8)
Consider the class of 3-sat instances in which each of the n variables occurs — counting positive
and negated appearances combined — in exactly three clauses. Show that any such instance of
3-sat is in fact satisfiable, and that a satisfying assignment can be found in polynomial time.

Problem 4: MaxCut for Trees (KT 10.9)


Give a polynomial-time algorithm for the following problem. We are given a binary tree T = (V, E)
with an even number of nodes, and a nonnegative weight on each edge. We wish to find a partition
of the nodes V into two sets of equal size so that the weight of the cut between the two sets is as
large as possible (i.e., the total weight of edges with one end in each set is as large as possible).
Note that the restriction that the graph is a tree is crucial here, but the assumption that the tree is
binary is not. The problem is NP-hard in general graphs.

Problem 5: Hamiltonian path (KT 10.3)


Suppose we are given a directed graph G = (V, E), with V = {v1 , v2 , . . . , vn }, and we want to
decide whether G has a Hamiltonian path from v1 to vn . (That is, is there a path in G that goes
from v1 to vn , passing through every other vertex exactly once?)
Since the Hamiltonian Path Problem is NP-complete, we do not expect that there is a polynomial-
time solution for this problem. However, this does not mean that all nonpolynomial-time algorithms
are equally “bad.” For example, here’s the simplest brute-force approach: For each permutation of
the vertices, see if it forms a Hamiltonian path from v1 to vn . This takes time roughly proportional
to n!, which is about 3 × 1017 when n = 20.
Show that the Hamiltonian Path Problem can in fact be solved in time O(2n · p(n)), where p(n)
is a polynomial function of n. This is a much better algorithm for moderate values of n; for example,
2n is only about a million when n = 20.
In addition, show that the Hamiltonian Path problem can be solved in time O(2n · p(n)) and in
polynomial space.

Problem 6: Spread maximization (KT 11.7)


You’re consulting for an e-commerce site that receives a large number of visitors each day. For
each visitor i, where i ∈ {1, 2, . . . , n}, the site has assigned a value vi , representing the expected
revenue that can be obtained from this customer.
Each visitor i is shown one of m possible ads A1 , A2 , . . . , Am as they enter the site. The site
wants a selection of one ad for each customer so that each ad is seen, overall, by a set of customers of
reasonably large total weight. Thus, given a selection of one ad for each customer, we will define the
spread of this selection to be the minimum, over j = 1, 2, . . . , m, of the total weight of all customers
who were shown ad Aj .
Example Suppose there are six customers with values 3, 4, 12, 2, 4, 6, and there are m = 3 ads.
Then, in this instance, one could achieve a spread of 9 by showing ad A1 to customers 1, 2, 4, ad
A2 to customer 3, and ad A3 to customers 5 and 6.
The ultimate goal is to find a selection of an ad for each customer that maximizes the spread.
Unfortunately, this optimization problem is NP-hard (you don’t have to prove this). So instead, we
will try to approximate it.

(a) Give a polynomial-time algorithm that approximates the maximum spread to within a factor
of 2. That is, if the maximum spread is s, then your algorithm should produce a selection of

2
one ad for each customer that has spread at least s/2. In designing your algorithm, you may
assumeP that no single customer has a value that is significantly above the average; specifically,
if v = ni=1 vi denotes the total value of all customers, then you may assume that no single
customer has a value exceeding v/(2m).

(b) Give an example of an instance on which the algorithm you designed in part (a) does not find
an optimal solution (that is, one of maximum spread). Say what the optimal solution is in your
sample instance, and what your algorithm finds.

Problem 7: Heaviest first (KT 11.10)


Suppose you are given an n × n grid graph G, as in Figure ?? Associated with each node v is

Figure 1: A grid graph


a weight w(v), which is a nonnegative integer. You may assume that the weights of all nodes are
distinct. Your goal is to choose an independent set S of nodes of the grid, so that the sum of the
weights of the nodes in S is as large as possible. (The sum of the weights of the nodes in S will be
called its total weight.)
Consider the following greedy algorithm for this problem.
Algorithm 1: The “heaviest-first” greedy algorithm
Start with S equal to the empty set
while some node remains in G do
Pick a node vi of maximum weight
add vi to S
Delete vi and its neighbors from G
end while
return S

(a) Let S be the independent set returned by the “heaviest-first” greedy algorithm, and let T be
any other independent set in G. Show that, for each node v ∈ T ,either v ∈ S,or there is a node
v 0 ∈ S so that w(v) ≤ w(v 0 ) and (v, v 0 ) is an edge of G.

(b) Show that the “heaviest-first” greedy algorithm returns an independent set of total weight at
least 41 times the maximum total weight of any independent set in the grid graph G.

3
Problem 8: Knapsack (KT 11.11)
Recall that in the Knapsack Problem, we have n items, each with a weight wi and a value vi . We
also have a weight bound W , and the problem is to select a set of items S of P highest possible value
subject to the condition that the total weight does not exceed W –that is, i∈S wi ≤ W . Here’s
one way to look at the approximation algorithm P that we designed in this chapter. P If we are told
there exists a subset O whose total weight is i∈Owi ≤W and whose total value is P i∈O vi = V for
some V , then our approximation
P algorithm can find a set A with total weight i∈A wi ≤ W and
total value at least i∈A vi ≥ V /(1 + ). Thus the algorithm approximates the best value, while
keeping the weights strictly under W . (Of course, returning the set O is always a valid solution, but
since the problem is NP-hard, we don’t expect to always be able to find O itself; the approximation
bound of 1 +  means that other sets A, with slightly less value, can be valid answers as well.)
Now, as is well known, you can always pack a little bit more for a trip just by “sitting on your
suitcase”-in other words, by slightly overflowing the allowed weight limit. This too suggests a way of
formalizing the approximation question for the Knapsack Problem, but it’s the following, different,
formalization.
Suppose, as before, that you’re given n items with weights and values, P as well as parameters
W and V ; andPyou’re told that there is a subset O whose total weight is i∈O wi ≤ W and whose
total value is i∈O vi = V for some V . ForPa given fixed  > 0, design Pa polynomial-time algorithm
that finds a subset of items A such that i∈A wi ≤ (1 + )W and i∈A vi ≥ V . In other words,
you want A to achieve at least as high a total value as the given bound V , but you’re allowed to
exceed the weight limit W by a factor of 1 + .
Example. Suppose you’re given four items, with weights and values as follows:

(w1 , v1 ) = (5, 3), (w2 , v2 ) = (4, 6)

(w3 , v3 ) = (1, 4), (w4 , v4 ) = (6, 11)


You’re also given W = 10 andV = 13 (since, indeed, the subset consisting of the first three items
has total weight at most 10 and has value 13). Finally, you’re given  = .1. This means you need to
find (via your approximation algorithm) a subset of weight at most (1 + .1) ∗ 10 = 11 and value at
least 13. One valid solution would be the subset consisting of the first and fourth items, with value
14 ≤ 13. (Note that this is a case where you’re able to achieve a value strictly greater than V , since
you’re allowed to slightly overfill the knapsack.)

You might also like