
Minimax search algorithm

For now, we assume that exhaustive search is possible.


1. Generate the entire game tree. Assume it has depth d.
2. For each terminal state, apply the payoff function to get its
score.
3. Back up the scores at level d to assign a score to each
node at level d-1: if the node at level d-1 belongs to MAX,
select the maximum (best) score among its children; if it
belongs to MIN, select the minimum (worst) score.
4. Back up the scores all the way up the tree, until the root
node chooses the maximum score among its children.
This is the minimax decision that determines the best
move to make!

The Evaluation Function


If we do not reach the end of the game, how do we
evaluate the payoff of the leaf states?
Use a static evaluation function.
A heuristic function that estimates the utility of board positions.
Desirable properties
Must agree with the utility function
Must not take too long to evaluate
Must accurately reflect the chance of winning

If the evaluation function were perfect, it could be applied
directly to the current board position.
Since it is only an estimate, it is better to apply it as many
levels down in the game tree as time permits.

Evaluation Function for Chess


Relative material value:
Pawn = 1, knight = 3, bishop = 3, rook = 5, queen = 9
Good pawn structure
King safety
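
As a rough illustration, here is a minimal material-only evaluator in Python. The board representation (a collection of piece letters, uppercase for MAX's pieces) is an assumption made for this sketch, and the pawn-structure and king-safety terms are omitted.

# Minimal material-count evaluator. Assumes a board given as an iterable of
# piece letters: uppercase = MAX's pieces, lowercase = MIN's pieces.
PIECE_VALUES = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9}

def evaluate(board):
    """Return the material balance from MAX's point of view."""
    score = 0
    for piece in board:
        value = PIECE_VALUES.get(piece.upper(), 0)  # the king has no material value
        score += value if piece.isupper() else -value
    return score

# Example: White is a knight up, so the score is +3.
print(evaluate(['K', 'Q', 'R', 'N', 'P', 'k', 'q', 'r', 'p']))  # 3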

Revised Minimax Algorithm


For the MAX player:
1. Generate the game tree as deep as time permits
2. Apply the evaluation function to the leaf states
3. Back up values:
At a MIN ply, assign the minimum of the children's payoffs
At a MAX ply, assign the maximum of the children's payoffs
4. At the root, MAX chooses the operator that led to the
highest payoff

Minimax Procedure
minimax(board, depth, type)
  if depth = 0 return Eval-Fn(board)
  else if type = max
    cur-max = -inf
    loop for b in succ(board)
      b-val = minimax(b, depth-1, min)
      cur-max = max(b-val, cur-max)
    return cur-max
  else (type = min)
    cur-min = +inf
    loop for b in succ(board)
      b-val = minimax(b, depth-1, max)
      cur-min = min(b-val, cur-min)
    return cur-min
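
The pseudocode above, rendered as runnable Python. The succ and eval_fn helpers are assumed to be supplied by the particular game; the toy tree at the bottom is only for illustration.

def minimax(board, depth, is_max, succ, eval_fn):
    """Depth-limited minimax. succ(board) yields successor boards,
    eval_fn(board) is the static evaluation function."""
    successors = list(succ(board))
    if depth == 0 or not successors:        # leaf: depth cutoff or no legal moves
        return eval_fn(board)
    if is_max:
        return max(minimax(b, depth - 1, False, succ, eval_fn) for b in successors)
    return min(minimax(b, depth - 1, True, succ, eval_fn) for b in successors)

# Toy game: each "board" is a node in a hand-built tree.
tree = {'A': ['B', 'C'], 'B': [], 'C': []}
values = {'B': 3, 'C': 7}
best = minimax('A', 2, True, lambda n: tree[n], lambda n: values.get(n, 0))
print(best)  # 7 -- MAX picks the child with the larger backed-up value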

Minimax

[Figure: a four-ply game tree alternating max, min, max, min levels. Leaf values such as 10, 9, 14, 13, 2, 1, and 24 are backed up level by level: each min node takes the minimum of its children and each max node the maximum, until the root's minimax value is determined.]
Note: exact values do not matter

Problems with fixed depth search


Most interesting games cannot be searched
exhaustively, so a fixed depth cutoff must be applied.
But this can cause problems ...
Quiescence: If you arbitrarily apply the evaluation
function at a fixed depth, you might miss a huge
swing that is about to happen. The evaluation
function should only be applied to quiescent (stable)
positions. (Requires game knowledge!)
The Horizon Effect: Search has to stop somewhere,
but a huge change might be lurking just over the
horizon. There is no general fix but heuristics can
sometimes help.
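
One common remedy for the quiescence problem (a standard technique, sketched here rather than taken from these slides) is a quiescence search: at the depth cutoff, instead of evaluating immediately, keep expanding only the "noisy" moves (e.g. captures) until the position settles. The is_quiescent and noisy_moves helpers below are assumed game-specific predicates.

def quiescence(board, is_max, eval_fn, is_quiescent, noisy_moves):
    """Evaluate only stable positions: at the depth cutoff, keep searching
    'noisy' moves (e.g. captures) until the position is quiescent."""
    if is_quiescent(board):
        return eval_fn(board)
    children = [quiescence(b, not is_max, eval_fn, is_quiescent, noisy_moves)
                for b in noisy_moves(board)]
    if not children:                 # no noisy moves left: treat the position as stable
        return eval_fn(board)
    return max(children) if is_max else min(children)

In the minimax sketch above, one would call quiescence(...) in place of eval_fn(board) at the depth cutoff.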

Pruning
Suppose your program can search 1000
positions/second.
In chess, you get roughly 150 seconds per move, so you
can search about 150,000 positions.
Since chess has a branching factor of about 35, your
program can only search 3-4 ply (35^3 is about 43,000
positions, while 35^4 is about 1.5 million).
An average human plans 6-8 moves ahead, so your
program will act like a novice.
Fortunately, we can often avoid searching parts of the
game tree by keeping track of the best and worst
alternatives at each point. This is called pruning the
search tree.

Alpha-Beta Pruning
Alpha-beta pruning is used on top of minimax search to
detect paths that do not need to be explored. The intuition
is:
The MAX player is always trying to maximize the score. Call
this α.
The MIN player is always trying to minimize the score. Call
this β.
When a MIN node's β is <= the α of one of its MAX ancestors,
this path will never be taken. (MAX has a better
option.) This is called an α-cutoff.
When a MAX node's α is >= the β of one of its MIN ancestors,
this path will never be taken. (MIN has a better option.) This
is called a β-cutoff.

Bounding Search
The minimax procedure explores every path of length
depth. Can we do less work?

[Figure: a two-ply tree with a MAX root, MIN children including B, and leaves including E, G, and H; the following slides evaluate it left to right.]

Bounding Search

[Figure: B's children E (3), F (12), G (8) have been evaluated, so MIN node B takes the value 3 and the MAX root now has a lower bound of 3.]

Bounding Search

[Figure: at MIN node C, the first child H evaluates to -5, so C can be worth at most -5; since the root already has 3 available, the remaining child I need not be examined.]

Bounding Search

[Figure: MIN node D backs up 2 from its children J (15), K (5), L (2), so the root A takes the minimax value 3 and MAX moves to B.]

2. α-β pruning: search cutoff

Pruning: eliminating a branch of the search tree from
consideration without exhaustively examining each of its
nodes.
α-β pruning: the basic idea is to prune portions of the
search tree that cannot improve the utility value of the
MAX or MIN node, using only the values of nodes
seen so far.
Does it work? Yes, it roughly cuts the branching factor
from b to sqrt(b), allowing roughly twice the look-ahead
depth of pure minimax.

α-β pruning: example

[Figure: four slides step α-β pruning through a small MAX/MIN tree; the visible leaf values include 6 and 12, and the final slide marks the selected move at the MAX root.]

α-β pruning: general principle

[Figure: the Player is to move at the top of the tree; an alternative m with value α is available higher up, while a node n with value v lies below an Opponent node.]

If α > v, then MAX will choose m, so the tree under n can be pruned.
A similar argument with β applies for MIN.

Properties of Alpha-Beta Pruning

Alpha-beta pruning is guaranteed to find the same best move as
the minimax algorithm by itself, but it can drastically reduce the
number of nodes that need to be explored.
The order in which successors are explored can make a
dramatic difference!
In the optimal situation, alpha-beta pruning only needs to
explore O(b^(d/2)) nodes.
Minimax search explores O(b^d) nodes, so alpha-beta pruning
can afford to double the search depth!
If successors are explored in random order, alpha-beta explores
about O(b^(3d/4)) nodes.
In practice, heuristics often allow performance to be closer to
the best-case scenario.

AlphaBeta Pruning Algorithm


procedure alpha-beta-max(node, α, β)
  if leaf-node(node) then return evaluation(node)
  foreach (successor s of node)
    α := max(α, alpha-beta-min(s, α, β))
    if α >= β then return β
  return α

procedure alpha-beta-min(node, α, β)
  if leaf-node(node) then return evaluation(node)
  foreach (successor s of node)
    β := min(β, alpha-beta-max(s, α, β))
    if β <= α then return α
  return β

To begin, we invoke: alpha-beta-max(root, -∞, +∞)
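
A runnable Python rendering of the two procedures, under the same assumptions as the earlier minimax sketch (game-supplied succ and eval_fn helpers, with a depth limit standing in for the leaf test).

import math

def alpha_beta_max(board, alpha, beta, depth, succ, eval_fn):
    successors = list(succ(board))
    if depth == 0 or not successors:
        return eval_fn(board)
    for s in successors:
        alpha = max(alpha, alpha_beta_min(s, alpha, beta, depth - 1, succ, eval_fn))
        if alpha >= beta:      # beta cutoff: a MIN ancestor already has a better option
            return beta
    return alpha

def alpha_beta_min(board, alpha, beta, depth, succ, eval_fn):
    successors = list(succ(board))
    if depth == 0 or not successors:
        return eval_fn(board)
    for s in successors:
        beta = min(beta, alpha_beta_max(s, alpha, beta, depth - 1, succ, eval_fn))
        if beta <= alpha:      # alpha cutoff: a MAX ancestor already has a better option
            return alpha
    return beta

def alpha_beta_search(root, depth, succ, eval_fn):
    """Initial call from the root, which is a MAX node."""
    return alpha_beta_max(root, -math.inf, math.inf, depth, succ, eval_fn)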

Alpha-beta pruning

Pruning does not affect the final result.
Asymptotic time complexity: O((b / log b)^d).
With perfect ordering, the time complexity is O(b^(d/2)),
which means we go from an effective branching factor of
b to sqrt(b) (e.g. 35 -> 6).

Procedure
minimax-αβ(board, depth, type, α, β)
  if depth = 0 return Eval-Fn(board)
  else if type = max
    cur-max = -inf
    loop for b in succ(board)
      b-val = minimax-αβ(b, depth-1, min, α, β)
      cur-max = max(b-val, cur-max)
      α = max(cur-max, α)
      if cur-max >= β finish loop
    return cur-max
  else (type = min)
    cur-min = +inf
    loop for b in succ(board)
      b-val = minimax-αβ(b, depth-1, max, α, β)
      cur-min = min(b-val, cur-min)
      β = min(cur-min, β)
      if cur-min <= α finish loop
    return cur-min

Pruning Example

[Figure: a four-ply max/min/max/min tree, initially unevaluated.]

Pruning Example

[Figure: the same tree with its leftmost subtrees evaluated; leaf values shown include 10, 9, and 14, and backed-up values 10 and 14 appear at internal nodes. Branches that cannot affect the result are cut off.]
Now, you do it!

[Figure: an unevaluated four-ply Max/Min/Max/Min tree, left as an exercise in α-β pruning.]

Move Ordering Heuristics


Good move ordering improves effectiveness of pruning
[Figure, Original Ordering: MAX root A (3) with MIN children B (3), C (<= -5), D (2); B's leaves are E (3), F (12), G (8); C's leaves are H (-5) and I; D's leaves are J (15), K (5), L (2). Only the branch under C is cut off.]

[Figure, Better Ordering: the same tree with D's leaves visited in the order L (2), K (5), J (15); D can now be cut off at <= 2 after its first leaf, so K and J need not be examined.]
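
One simple way to approximate good ordering (an illustrative sketch, not prescribed by the slides) is to sort successors by the static evaluation before recursing, so that the moves most likely to cause cutoffs are searched first.

def ordered_successors(board, is_max, succ, eval_fn):
    """Sort successor boards by static evaluation so the most promising
    moves are searched first (best-first for MAX, worst-first for MIN)."""
    return sorted(succ(board), key=eval_fn, reverse=is_max)

Looping over ordered_successors(...) instead of succ(board) inside alpha-beta tends to push performance toward the O(b^(d/2)) best case.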

Using Book Moves

Use a catalogue of solved positions to extract the
correct move.
For complicated games, such catalogues are not
available for all positions.
Often, though, sections of the game are well understood
and catalogued, e.g. openings and endings in chess.
Combine knowledge (book moves) with search
(minimax) to produce better results, as sketched below.
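
A minimal sketch of combining the two: look the position up in a hypothetical opening_book dictionary first, and fall back to the alpha-beta routine sketched earlier only when the position is not catalogued. The legal_moves and apply_move helpers are assumed game-specific.

import math

def choose_move(board, opening_book, legal_moves, apply_move, succ, eval_fn, depth=4):
    """Play a catalogued move if the position is in the book,
    otherwise fall back to alpha-beta search."""
    if board in opening_book:
        return opening_book[board]
    best_move, best_val = None, -math.inf
    for move in legal_moves(board):
        val = alpha_beta_min(apply_move(board, move), -math.inf, math.inf,
                             depth - 1, succ, eval_fn)
        if val > best_val:
            best_move, best_val = move, val
    return best_move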

http://www.gametheory.net/applets/

Games with Chance

How do we include chance? Add chance nodes to the game tree.

Decision Making in Games of Chance

Chance nodes:
Branches leading from each chance node denote the
possible dice rolls.
Each branch is labeled with the roll and the chance that it will occur.

Replace the MAX/MIN values in minimax with expected
MAX/MIN payoffs. For a chance node C with possible rolls d_i,
where S(C, d_i) is the set of positions reachable from C under
roll d_i:

expectimax(C) = Σ_i P(d_i) · max_{s ∈ S(C, d_i)} utility(s)

expectimin(C) = Σ_i P(d_i) · min_{s ∈ S(C, d_i)} utility(s)
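
A direct Python transcription of the two formulas. P is assumed to map each distinct roll d_i to its probability, and S(C, d) to return the positions reachable from chance node C under roll d; both are game-specific assumptions.

def expectimax(C, P, S, utility):
    """expectimax(C) = sum_i P(d_i) * max over s in S(C, d_i) of utility(s)."""
    return sum(p * max(utility(s) for s in S(C, d)) for d, p in P.items())

def expectimin(C, P, S, utility):
    """expectimin(C) = sum_i P(d_i) * min over s in S(C, d_i) of utility(s)."""
    return sum(p * min(utility(s) for s in S(C, d)) for d, p in P.items())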

Position evaluation in games with chance nodes

For minimax, any order-preserving transformation of the
leaf values does not affect the choice of move.
With chance nodes, some order-preserving transformations
of the leaf values do affect the choice of move.

Position evaluation in games with chance nodes (contd)

The behavior of the algorithm is sensitive even to a linear
transformation of the evaluation function.

Another Example of expectimax

Complexity of expectiminimax
Expectiminimax considers all the possible dice-roll
sequences.
It takes O(b^m n^m) time,
where n is the number of distinct rolls and m is the search depth.
Minimax, by contrast, takes O(b^m).

Problems
The extra cost compared to minimax is very high
Alpha-beta pruning is more difficult to apply

Games in real life

Context
Terrorists can mount many different kinds of attacks.
The U.S. can try to anticipate all of these attacks and
invest in defenses against them.
If the U.S. fails to invest wisely, then we
lose important battles.

A Smallpox Exercise
The U.S. government is concerned about
the possibility of smallpox bioterrorism.
Terrorists could make no smallpox attack, a
small attack on a single city, or
coordinated attacks on multiple cities (or
do other things).

The U.S. has four defense strategies:
Stockpiling vaccine
Stockpiling and increasing bio-surveillance
Stockpiling and inoculating first responders
and/or key personnel
Inoculating all consenting people with
healthy immune systems

Using Game Theory to make a decision

Classical game theory uses a matrix of
costs to determine optimal play.
Optimal play is usually defined as a
minimax strategy, but sometimes one can
minimize expected loss instead.
Both methods are unreliable guides to
human behavior.

Game Theory Matrix

                     No Attack   Small Attack   Big Attack
Stockpile               C11          C12           C13
Surveillance            C21          C22           C23
First Responders        C31          C32           C33
Mass Inoculation        C41          C42           C43

Minimax Strategy
The U.S. should choose the defense with the smallest row-wise maximum cost.
The terrorist should choose the attack with the largest column-wise minimum cost.
If these two values are not equal, then a randomized strategy is better.
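
A small Python sketch of both rules applied to the 4x3 cost matrix; the numeric costs are made-up placeholders, not elicited values.

# Rows: defenses; columns: No Attack, Small Attack, Big Attack. Costs are illustrative only.
costs = [
    [1,  50, 400],   # Stockpile
    [2,  40, 350],   # Surveillance
    [3,  30, 300],   # First Responders
    [20, 25, 200],   # Mass Inoculation
]
defenses = ["Stockpile", "Surveillance", "First Responders", "Mass Inoculation"]
attacks = ["No Attack", "Small Attack", "Big Attack"]

# U.S.: pick the defense whose worst-case (row-wise maximum) cost is smallest.
us_choice = min(range(len(costs)), key=lambda i: max(costs[i]))
# Terrorist: pick the attack whose column-wise minimum cost is largest.
attacker_choice = max(range(len(attacks)), key=lambda j: min(row[j] for row in costs))

print(defenses[us_choice], attacks[attacker_choice])
# If the two resulting values differ, a randomized (mixed) strategy beats either pure choice.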

Extensive-form game theory invites decision-theoretic
criteria based upon minimum expected loss.
In our smallpox exercise, we shall implement
this by assuming that the U.S. decisions are
known to the terrorists, and that this affects
their probabilities of using certain kinds of
attacks.

Game Theory Critique


Game theory does not take account of
resource limitations.
It assumes that both players have the
same cost matrix.
It assumes both players act in synchrony
(or in strict alternation).
It assumes all costs are measured without
error.

Adding Risk Analysis


Statistical risk analysis makes probabilistic
statements about specific kinds of
threats.
It also treats the costs associated with
threats as random variables. The total
random cost is developed by analysis of
component costs.

Cost Example
To illustrate a key idea, consider the
problem of estimating the cost C11 in the
game theory matrix. This is the cost
associated with stockpiling vaccine when
no smallpox attack occurs.
Some components of the cost are fixed,
others are random.

C11 = cost to test diluted Dryvax
    + cost to test Aventis vaccine
    + cost to make 209 × 10^6 doses
    + cost to produce VIG
    + logistic/storage/device costs

The other costs in the matrix are also random
variables, and their distributions can be
estimated in similar ways.
Note that different matrix costs are not
independent; they often have components in
common across rows and columns.

More examples
Cost to treat one smallpox case; this is normal with
mean $200,000 and s.d. $50,000.
Cost to inoculate 25,000 people; this is normal with
mean $60,000 and s.d. $10,000.
Economic costs of a single attack; this is gamma
with mean $5 billion and s.d. $10 billion.

Games + Risk
Game theory and statistical risk analysis can
be combined to give arguably useful
guidance in threat management.
We generate many random tables,
according to the risk analysis, and find
which defenses are best.
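
A rough sketch of that Monte Carlo idea: draw many random cost tables and count how often each defense comes out minimax-best. The cost distributions and parameters below are placeholders standing in for the real risk analysis.

import random
from collections import Counter

DEFENSES = ["Stockpile", "Surveillance", "First Responders", "Mass Inoculation"]

def random_cost_table():
    """Draw one random 4x3 cost table. The means and spreads here are
    placeholders, not elicited values from the risk analysis."""
    return [[max(0.0, random.gauss(mu, mu / 4)) for mu in row]
            for row in [[1, 50, 400], [2, 40, 350], [3, 30, 300], [20, 25, 200]]]

wins = Counter()
for _ in range(10_000):
    table = random_cost_table()
    best = min(range(len(DEFENSES)), key=lambda i: max(table[i]))  # minimax-best defense
    wins[DEFENSES[best]] += 1

print(wins.most_common())  # how often each defense is minimax-optimal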

Minimum Expected Loss

The table in the lower right shows the
elicited probabilities of each kind of attack,
given that the corresponding defense has
been adopted.
These probabilities are used to weight the
costs when calculating the expected loss, as
sketched below.
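
A sketch of the expected-loss rule, reusing the placeholder cost layout from the minimax sketch above; the attack_probs values stand in for the elicited probabilities P(attack | defense).

# Placeholder cost matrix: rows = defenses, columns = No Attack / Small Attack / Big Attack.
defenses = ["Stockpile", "Surveillance", "First Responders", "Mass Inoculation"]
costs = [[1, 50, 400], [2, 40, 350], [3, 30, 300], [20, 25, 200]]

# attack_probs[i][j]: assumed probability of attack j given defense i (each row sums to 1).
attack_probs = [
    [0.95, 0.04, 0.01],
    [0.92, 0.06, 0.02],
    [0.93, 0.05, 0.02],
    [0.97, 0.02, 0.01],
]

# Expected loss of each defense: sum_j P(attack j | defense i) * cost[i][j].
expected_loss = [sum(p * c for p, c in zip(probs, row))
                 for probs, row in zip(attack_probs, costs)]
best = min(range(len(defenses)), key=lambda i: expected_loss[i])
print(defenses[best], expected_loss[best])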

Conclusions
For our rough risk analysis, minimax favors universal
inoculation, minimum expected loss favors stockpiling.
This accords with the public and federal thinking on threat
preparedness.
