
Adversarial Search

Chapter 5

Outline

Optimal decisions
α-β pruning
Imperfect, real-time decisions
Stochastic Games

Games vs. search problems

"Unpredictable" opponent → solution is a strategy specifying a
move for every possible opponent reply
Time limits → unlikely to find the
goal, so the agent must approximate

Exercise
1. Tough question. Do you
recognize this?
2. Name the game.
3. Is this an interesting or a dull
game? Why?
4. Characterize the game:
Performance Measure,
Environment, Actuators and
Sensors (PEAS)

Game tree (2-player, deterministic, turns)

Exercise in pairs
Can you think of a heuristic function for the
Tic-Tac-Toe game?
Using the heuristic function, devise a
strategy to play tic-tac-toe.
If both players play their best, what is the
depth of the tree (how many moves)?
(One move in this game corresponds to
two plies, where each ply is one player's
turn)

Tic-Tac-Toe heuristic function
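
One possible heuristic for the exercise above, offered only as a sketch: count the lines (rows, columns, diagonals) still open for X and subtract the lines still open for O. The board encoding and the names `LINES`, `open_lines` and `heuristic` are assumptions made for this illustration, not something the slides define.

```python
from typing import List, Optional

# Assumed encoding: 9 cells in row-major order, each "X", "O" or None.
Board = List[Optional[str]]

# The 8 winning lines: 3 rows, 3 columns, 2 diagonals.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def open_lines(board: Board, player: str) -> int:
    """Number of lines that contain no piece of the opponent."""
    opponent = "O" if player == "X" else "X"
    return sum(1 for line in LINES
               if all(board[i] != opponent for i in line))


def heuristic(board: Board) -> int:
    """Lines open for X minus lines open for O (higher favours X)."""
    return open_lines(board, "X") - open_lines(board, "O")


empty: Board = [None] * 9
centre = empty.copy()
centre[4] = "X"   # X opens in the centre
corner = empty.copy()
corner[0] = "X"   # X opens in a corner

print(heuristic(empty), heuristic(centre), heuristic(corner))  # 0 4 3
```

Under this heuristic the centre opening scores higher than a corner, which suggests a simple one-ply strategy: play the move whose resulting board has the best score.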

Exercise in pairs
Compute the average branching factor.
Could the branching factor be reduced?
How?
What would the reduced branching factor
be?
You just prune the search tree.

Minimax
Perfect play for deterministic games
Idea: choose move to position with highest minimax
value
= best achievable payoff against best play
E.g., 2-ply game:


Minimax algorithm
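
A minimal sketch of minimax on an explicit game tree, assuming a tree is represented as nested lists whose integer leaves are utilities for MAX. The representation, the function names and the leaf values are illustrative assumptions.

```python
from typing import List, Union

# Assumed representation: a leaf is an int utility (from MAX's point of
# view); an internal node is a list of child subtrees.
Tree = Union[int, List["Tree"]]


def minimax(node: Tree, maximizing: bool) -> int:
    """Minimax value of `node` when it is MAX's (or MIN's) turn to move."""
    if isinstance(node, int):          # terminal state: return its utility
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_move(root: List[Tree]) -> int:
    """Index of the root move leading to the highest minimax value."""
    values = [minimax(child, False) for child in root]
    return max(range(len(root)), key=values.__getitem__)


# A 2-ply game: MAX chooses one of three moves, then MIN replies.
tree: Tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

print(minimax(tree, True))  # 3
print(best_move(tree))      # 0
```

MIN's best replies give the three moves the values 3, 2 and 2, so MAX plays the first move.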


Exercise in pairs
Analyze the minimax algorithm
Compare your strategy for Tic-Tac-Toe
with the minimax algorithm


Properties of minimax

Complete? Yes (if tree is finite)


Optimal? Yes (against an optimal opponent)
Time complexity? O(b^m)
Space complexity? O(bm) (depth-first exploration)

For chess, b ≈ 35, m ≈ 100 for "reasonable" games
→ exact solution completely infeasible
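
As a rough check of why the exact solution is out of reach, the chess figures can be plugged into the exponential time bound (a back-of-the-envelope estimate, not a precise node count):

$$
b^m \approx 35^{100} = 10^{100 \log_{10} 35} \approx 10^{154}
$$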


Optimal decisions in multiplayer games

α-β pruning example

Properties of α-β
Pruning does not affect final result
Good move ordering improves effectiveness of pruning
With "perfect ordering," time complexity = O(b^(m/2))
→ doubles depth of search

A simple example of the value of reasoning about which
computations are relevant (a form of metareasoning)

α-β pruning
α is the value of the best (i.e., highest-value) choice found
so far at any choice point along the path for max
If v is worse than α, max will avoid it
→ prune that branch

Define β similarly for min

The α-β algorithm
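
A minimal sketch of α-β search, assuming a game tree given as nested lists with integer leaf utilities for MAX. The optional `visited` list is a hypothetical addition that records which leaves are examined, so the effect of pruning is visible.

```python
import math
from typing import List, Optional, Union

# Assumed representation: int leaves are utilities for MAX, lists are nodes.
Tree = Union[int, List["Tree"]]


def alphabeta(node: Tree, maximizing: bool,
              alpha: float = -math.inf, beta: float = math.inf,
              visited: Optional[List[int]] = None) -> float:
    """Minimax value of `node`, skipping branches that cannot matter.

    alpha: best value found so far for MAX along the path to `node`.
    beta:  best value found so far for MIN along the path to `node`.
    """
    if isinstance(node, int):
        if visited is not None:
            visited.append(node)       # record every leaf actually examined
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            if value >= beta:          # MIN above will never allow this
                break
            alpha = max(alpha, value)
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        if value <= alpha:             # MAX above already has better
            break
        beta = min(beta, value)
    return value


tree: Tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited: List[int] = []
result = alphabeta(tree, True, visited=visited)
print(result)   # 3
print(visited)  # [3, 12, 8, 2, 14, 5, 2]
```

The result matches plain minimax, but leaves 4 and 6 are never examined: once MIN's second option is known to be worth at most 2, it cannot beat the 3 that MAX already has.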

Imperfect Real-Time Decisions


Suppose we have 100 secs, explore 10^4
nodes/sec
→ 10^6 nodes per move

Standard approach:
cutoff test:
e.g., depth limit (perhaps add quiescence search)

evaluation function
= estimated desirability of position

Evaluation functions
For chess, typically linear weighted sum of features

$$
\mathrm{EVAL}(s) = w_1 f_1(s) + w_2 f_2(s) + \cdots + w_n f_n(s) = \sum_{i=1}^{n} w_i f_i(s)
$$

where w_i is a weight, e.g. the value of a piece (9 for a queen, 3 for a
bishop and 1 for a pawn), and f_i(s) is a feature of the position s, e.g.
the number of pieces of each kind.
It assumes independence between features
Could be modified to include nonlinear combinations of
features
w_i could be estimated using machine learning
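
The weighted sum can be sketched in a few lines. The feature encoding used here (own piece count minus the opponent's, per piece type) and the names `WEIGHTS` and `evaluate` are assumptions for illustration. Only the piece values come from the slide.

```python
from typing import Dict

# Piece values quoted on the slide; other piece types are omitted here.
WEIGHTS: Dict[str, float] = {"queen": 9.0, "bishop": 3.0, "pawn": 1.0}


def evaluate(features: Dict[str, float],
             weights: Dict[str, float] = WEIGHTS) -> float:
    """Linear evaluation: sum over i of w_i * f_i(s).

    `features` maps a feature name to its value f_i(s); a missing feature
    counts as 0.
    """
    return sum(w * features.get(name, 0.0) for name, w in weights.items())


# Hypothetical position: one queen up, bishops level, two pawns down.
position = {"queen": 1, "bishop": 0, "pawn": -2}
print(evaluate(position))  # 7.0
```

Because the sum treats every feature independently, it cannot express ideas such as "two bishops together are worth more than twice one bishop". That is the independence assumption noted above.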

Cutting off search


MinimaxCutoff is identical to MinimaxValue except
1. Terminal? is replaced by Cutoff?
2. Utility is replaced by Eval
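
A sketch of the two substitutions, assuming a nested-list game tree with integer leaf utilities. The cutoff test used here is a plain depth limit, and `mean_of_leaves` is a toy stand-in for Eval that is only possible because the whole tree is in memory. A real evaluation function would estimate from features of the position instead.

```python
from typing import Callable, List, Union

Tree = Union[int, List["Tree"]]


def minimax_cutoff(node: Tree, depth: int, maximizing: bool,
                   evaluate: Callable[[Tree], float]) -> float:
    """Minimax with Terminal? replaced by a cutoff and Utility by Eval."""
    if isinstance(node, int):      # genuine terminal state: exact utility
        return float(node)
    if depth == 0:                 # cutoff reached: estimate instead
        return evaluate(node)
    values = [minimax_cutoff(child, depth - 1, not maximizing, evaluate)
              for child in node]
    return max(values) if maximizing else min(values)


def mean_of_leaves(node: Tree) -> float:
    """Toy Eval (assumption): average of all leaf utilities below `node`."""
    leaves, stack = [], [node]
    while stack:
        current = stack.pop()
        if isinstance(current, int):
            leaves.append(current)
        else:
            stack.extend(current)
    return sum(leaves) / len(leaves)


tree: Tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
full = minimax_cutoff(tree, 2, True, mean_of_leaves)     # reaches the leaves
shallow = minimax_cutoff(tree, 1, True, mean_of_leaves)  # cut off after 1 ply
print(full, shallow)
```

With a depth limit of 2 the search reaches every leaf and returns the exact minimax value. With a limit of 1 it returns the largest estimate among the three moves, which differs from the exact value.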

Does it work in practice?


b^m = 10^6, b = 35 → m = 4
4-ply lookahead is a hopeless chess player!

4-ply ≈ human novice
8-ply ≈ typical PC, human master
12-ply ≈ Deep Blue, Kasparov
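
The depth figure follows from solving the node budget for m. The second line is an assumption-laden extension: it applies the perfect-ordering bound for α-β, O(b^(m/2)), to the same budget, which is the best case rather than what a real program achieves.

$$
35^{m} = 10^{6} \;\Rightarrow\; m = \frac{6}{\log_{10} 35} \approx 3.9
$$

$$
35^{m/2} = 10^{6} \;\Rightarrow\; m = \frac{12}{\log_{10} 35} \approx 7.8
$$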

Kasparov-level play: m ≈ 12. Exact minimax would need m ≈ 100.

Stochastic Games
[Figure: game tree with levels labelled White movement, Element of chance, Black movement]

Stochastic Games
Characterize the environment for
Backgammon
Deterministic/Stochastic
Continuous/Discrete
Dynamic/Semidynamic/Static
Fully observable/Partially observable
Episodic/Sequential


Stochastic Games
Positions do not have definite minimax
values; compute expected value of a
position
Generalize the minimax value to the
expectiminimax value
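$$
\text{EXPECTIMINIMAX}(s) =
\begin{cases}
\text{UTILITY}(s) & \text{if } \text{TERMINAL-TEST}(s) \\
\max_a \text{EXPECTIMINIMAX}(\text{RESULT}(s, a)) & \text{if } \text{PLAYER}(s) = \text{MAX} \\
\min_a \text{EXPECTIMINIMAX}(\text{RESULT}(s, a)) & \text{if } \text{PLAYER}(s) = \text{MIN} \\
\sum_r P(r)\, \text{EXPECTIMINIMAX}(\text{RESULT}(s, r)) & \text{if } \text{PLAYER}(s) = \text{CHANCE}
\end{cases}
$$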

r represents a possible dice roll
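
A minimal sketch of expectiminimax, assuming a tree built from tagged tuples: `("max", children)`, `("min", children)` and `("chance", [(probability, child), ...])`, with plain numbers as terminal utilities. The encoding and the toy numbers are illustrative assumptions, and a fair coin stands in for the dice.

```python
from typing import Tuple, Union

# Assumed encoding: a number is a terminal utility; otherwise a node is
# (kind, children) with kind in {"max", "min", "chance"}.
Node = Union[float, Tuple[str, list]]


def expectiminimax(node: Node) -> float:
    if isinstance(node, (int, float)):        # terminal: utility
        return float(node)
    kind, children = node
    if kind == "max":
        return max(expectiminimax(child) for child in children)
    if kind == "min":
        return min(expectiminimax(child) for child in children)
    if kind == "chance":                      # expectation over outcomes r
        return sum(p * expectiminimax(child) for p, child in children)
    raise ValueError(f"unknown node kind: {kind!r}")


# MAX picks a move, a fair coin is flipped, then MIN replies.
tree: Node = ("max", [
    ("chance", [(0.5, ("min", [2, 4])), (0.5, ("min", [6, 8]))]),
    ("chance", [(0.5, ("min", [0, 10])), (0.5, ("min", [5, 7]))]),
])

print(expectiminimax(tree))  # 4.0
```

The first move is worth 0.5·2 + 0.5·6 = 4.0 and the second 0.5·0 + 0.5·5 = 2.5, so MAX prefers the first.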



Stochastic Games
EXPECTIMINIMAX, in addition to MIN and MAX,
must also consider the possible dice rolls
Time complexity: O(b^m n^m), where b = branching factor,
m = depth of search tree, and n is the
number of distinct rolls.
Alternative: Monte Carlo simulation
From start position simulate thousands of
games against itself using random dice rolls

Partially Observable Games


Battleship
Kriegspiel: variant of chess in which each player sees only
his/her own pieces on the board. The referee sees all the
pieces
Each player in his/her turn announces a move to the referee; the
opponent does not hear the move
The referee announces whether the move is legal or illegal; if illegal,
the player may keep proposing moves until a legal one is found
The referee announces, e.g., "Capture on square X" or "Check by D",
where D is the direction of the check.
The referee also announces checkmate or stalemate.

Card games
Question:
Why are most card games different from dice
games?

Is Domino similar to a card game?


Domino
Possible algorithm:
Consider all possible deals of the invisible dominoes
Solve each one as if it were a fully observable game
Choose the move that has the best outcome averaged
over all the deals, i.e., if each deal s occurs with
probability P(s), then the move to choose is:

$$
\operatorname*{argmax}_a \sum_s P(s)\, \text{MINIMAX}(\text{RESULT}(s, a))
$$

Run MINIMAX if computationally feasible; or H-MINIMAX
otherwise
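
The averaging step can be sketched as follows, assuming the per-deal minimax values are supplied by some solver. Here `minimax_value` is a hypothetical callback backed by a made-up lookup table, and the deals, moves and probabilities are toy numbers.

```python
from typing import Callable, Dict, Hashable, Sequence


def best_move_over_deals(
        moves: Sequence[Hashable],
        deal_probs: Dict[Hashable, float],
        minimax_value: Callable[[Hashable, Hashable], float]) -> Hashable:
    """argmax over moves a of sum over deals s of P(s) * MINIMAX(RESULT(s, a))."""
    def expected(move: Hashable) -> float:
        return sum(p * minimax_value(deal, move)
                   for deal, p in deal_probs.items())
    return max(moves, key=expected)


# Toy data (assumed): three possible deals and two candidate moves.
deal_probs = {"deal_1": 0.5, "deal_2": 0.3, "deal_3": 0.2}
values = {
    ("deal_1", "a"): 1.0, ("deal_1", "b"): 0.0,
    ("deal_2", "a"): -1.0, ("deal_2", "b"): 1.0,
    ("deal_3", "a"): 0.0, ("deal_3", "b"): 1.0,
}

choice = best_move_over_deals(["a", "b"], deal_probs,
                              lambda deal, move: values[(deal, move)])
print(choice)  # b
```

Move a is best in the single most likely deal, but b has the higher average (0.5 against 0.2), so b is chosen.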

Domino
Is it computationally feasible to consider all
possible deals of the invisible dominoes?
Alternative: Monte Carlo simulation
Take a random sample of N deals where the
probability of deal s appearing in the sample is
proportional to P(s):

$$
\operatorname*{argmax}_a \frac{1}{N} \sum_{i=1}^{N} \text{MINIMAX}(\text{RESULT}(s_i, a))
$$

This method is called averaging over
clairvoyance
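
A sketch of the sampled version, assuming deals can be drawn with the standard-library `random.Random.choices` and that per-deal minimax values come from a hypothetical `minimax_value` callback. All deals, moves and numbers below are made up for illustration.

```python
import random
from typing import Callable, Hashable, Sequence


def monte_carlo_best_move(
        moves: Sequence[Hashable],
        deals: Sequence[Hashable],
        probs: Sequence[float],
        minimax_value: Callable[[Hashable, Hashable], float],
        n_samples: int,
        rng: random.Random) -> Hashable:
    """argmax over moves a of (1/N) * sum over i of MINIMAX(RESULT(s_i, a))."""
    # Draw N deals; each deal s appears with probability proportional to P(s).
    sample = rng.choices(deals, weights=probs, k=n_samples)

    def average(move: Hashable) -> float:
        return sum(minimax_value(deal, move) for deal in sample) / n_samples

    return max(moves, key=average)


# Toy data (assumed): three possible deals and two candidate moves.
deals = ["deal_1", "deal_2", "deal_3"]
probs = [0.5, 0.3, 0.2]
values = {
    ("deal_1", "a"): 1.0, ("deal_1", "b"): 0.0,
    ("deal_2", "a"): -1.0, ("deal_2", "b"): 1.0,
    ("deal_3", "a"): 0.0, ("deal_3", "b"): 1.0,
}

choice = monte_carlo_best_move(["a", "b"], deals, probs,
                               lambda deal, move: values[(deal, move)],
                               n_samples=2000, rng=random.Random(0))
print(choice)
```

The same sample is reused for every move, so the comparison between moves is not disturbed by different random draws. With enough samples the choice should agree with the exact average over all deals.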

Games in practice
Checkers: Chinook uses alpha-beta search with a database of 39 ×
10^12 precomputed endgame positions.
Chess: Deep Blue uses 30 IBM RS/6000 processors doing alpha-beta
search, plus 480 custom VLSI chess processors. It searched up
to 30 × 10^9 positions per move with an evaluation function with over
8000 features.
Othello: human champions refuse to compete against computers,
which are too good.
Go: computer programs on the 19×19 board play at advanced amateur
level. In Go, b > 300, so most programs use Monte Carlo simulation
and pattern knowledge bases to suggest plausible moves.
Bridge: the Bridge Baron program won the 1997 computer bridge
championship. It uses complex hierarchical plans involving high-level
ideas, but is not optimal.
Scrabble: using a dictionary, a program chooses the highest-scoring move. Good
but not expert player, since the game is partially observable and
stochastic. Quackle defeated world champion David Boys 3-2 in
2006.

Summary
Games are fun to work on!
They illustrate several important points
about AI
perfection is unattainable → must
approximate
good idea to think about what to think
about (metareasoning)
