Why Do AI Researchers Study Game Playing?

Game Playing
Why do AI researchers study game playing?

1. It's a good reasoning problem: formal and nontrivial.
2. Direct comparison with humans and other computer
programs is easy.
What Kinds of Games?

Mainly games of strategy with the following
characteristics:
1. Sequence of moves to play
2. Rules that specify possible moves
3. Rules that specify a payment for each
move
4. Objective is to maximize your payment
Games vs. Search Problems

Unpredictable opponent: the solution is a strategy specifying a
move for every possible opponent reply.
Time limits: unlikely to find the goal, so we must
approximate.
Two-Player Game
1. Opponent's move
2. Generate new position
3. Game over? If yes, stop.
4. Generate successors
5. Evaluate successors
6. Move to highest-valued successor
7. Game over? If yes, stop; if no, return to step 1.
Game Tree (2-player, Deterministic, Turns)
[Figure not preserved: tree levels alternate between the computer's turn and the opponent's turn; the leaf nodes are evaluated.]
The computer is Max.
The opponent is Min.
At the leaf nodes, the utility function is employed. A big value
means good; a small value is bad.
Mini-Max Terminology
utility function: the function applied to leaf nodes
backed-up value:
  of a max-position: the value of its largest successor
  of a min-position: the value of its smallest successor
minimax procedure: search down several levels;
at the bottom level apply the utility function,
back up the values all the way to the root node,
and that node selects the move.
Minimax
Perfect play for deterministic games
Idea: choose move to position with highest minimax
value
= best achievable payoff against best play
E.g., 2-ply game: [figure not preserved]
Minimax Strategy
Why do we take the min value at every other
level of the tree?
These nodes represent the opponent's
choice of move.
The computer assumes that the human will
choose the move that is of least value to
the computer.
Minimax algorithm
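The procedure described in the preceding slides can be sketched as follows. This is a minimal illustration, not the slide's original pseudocode: the game tree is assumed to be a nested Python list whose leaves are utility values, and the names `minimax` and `best_move` are hypothetical.

```python
def minimax(node, maximizing=True):
    """Return the backed-up value of a node.

    Assumed representation: a leaf is a number (its utility);
    an internal node is a list of child nodes.
    """
    if not isinstance(node, list):
        return node  # leaf: apply the utility function
    values = [minimax(child, not maximizing) for child in node]
    # Max backs up its largest successor, Min its smallest.
    return max(values) if maximizing else min(values)


def best_move(root):
    """Index of the root's successor with the highest minimax value."""
    values = [minimax(child, maximizing=False) for child in root]
    return max(range(len(values)), key=values.__getitem__)


# Illustrative 2-ply tree: Max moves at the root, Min replies.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree), best_move(tree))
```

The recursion is depth-first, so only the current path and its siblings are held in memory at any time.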
Tic Tac Toe

Let p be a position in the game.
Define the utility function f(p) by
f(p) = largest positive number, if p is a win for the computer
f(p) = smallest (most negative) number, if p is a win for the opponent
f(p) = RCDC - RCDO, otherwise
where RCDC is the number of rows, columns, and
diagonals in which the computer could still win,
and RCDO is the number of rows, columns, and diagonals
in which the opponent could still win.
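A small sketch of this utility function, under stated assumptions: the board is three strings of three characters, "X" is the computer, "O" is the opponent, a space is an empty square, and infinity stands in for the "largest" and "smallest" numbers. The name `evaluate` is hypothetical.

```python
import math

# All eight winning lines as lists of (row, column) squares.
LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]        # rows
    + [[(r, c) for r in range(3)] for c in range(3)]      # columns
    + [[(i, i) for i in range(3)],                        # main diagonal
       [(i, 2 - i) for i in range(3)]]                    # anti-diagonal
)


def evaluate(board, computer="X", opponent="O"):
    """f(p): +inf / -inf for a win, otherwise RCDC - RCDO."""
    marks = [[board[r][c] for r, c in line] for line in LINES]
    if any(all(m == computer for m in line) for line in marks):
        return math.inf
    if any(all(m == opponent for m in line) for line in marks):
        return -math.inf
    # A line is still winnable for a player if the other player
    # has no mark on it.
    rcdc = sum(1 for line in marks if opponent not in line)
    rcdo = sum(1 for line in marks if computer not in line)
    return rcdc - rcdo


# X in the center, O in a corner: 5 lines open to X, 4 open to O.
print(evaluate(["O  ", " X ", "   "]))
```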
Sample Evaluations
X = Computer; O = Opponent
[Figure not preserved: two sample boards, each scored by counting rows, cols, and diags for X and for O.]
Minimax is done depth-first
[Figure not preserved: tree with levels max, min, max, leaf.]
Properties of Minimax
Complete? Yes (if tree is finite)
Optimal? Yes (against an optimal opponent)
Time complexity? O(b^m), for branching factor b and maximum depth m
Space complexity? O(bm) (depth-first exploration)
For chess, b ≈ 35, m ≈ 100 for "reasonable" games,
so an exact solution is completely infeasible.
Need to speed it up.
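To make the infeasibility concrete, the chess figures above give roughly the following node count. This is a back-of-the-envelope estimate using b = 35 and m = 100, not an exact count.

```latex
b^m = 35^{100} = 10^{100 \log_{10} 35} \approx 10^{100 \times 1.544} \approx 10^{154}
```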

Alpha-Beta Procedure
The alpha-beta procedure can speed up a
depth-first minimax search.
Alpha (α): a lower bound on the value v that a
max node may ultimately be assigned: v ≥ α
Beta (β): an upper bound on the value v that a
minimizing node may ultimately be assigned: v ≤ β
α-β pruning example
[Figures not preserved: five slides stepping through one pruning example. Surviving labels: "= 3" and "alpha cutoff".]
Alpha Cutoff
[Figure not preserved. Surviving labels: "= 3", "> 3", and the values 3 and 10.]
What happens here? Is there an alpha cutoff?
Beta Cutoff
[Figure not preserved. Surviving labels: "= 4", "< 4", "> 8", and "cutoff".]
Alpha-Beta Pruning
[Figure not preserved: tree with levels max, min, max, eval. Surviving leaf values: 2, 10, 11, 1, 2, 2, 12, 25.]
Properties of α-β
Pruning does not affect the final result: α-β returns
exactly the same value as full minimax.
Good move ordering improves the effectiveness of pruning.
With "perfect ordering," time complexity = O(b^(m/2)),
which doubles the depth of search reachable in the same time.
A simple example of the value of reasoning about which
computations are relevant (a form of metareasoning).
The α-β algorithm
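The original pseudocode did not survive extraction, so the following is only a sketch of a common formulation. It assumes the game tree is a nested list with numeric leaves; the function name `alphabeta` and the optional `visited` list, which records the leaves actually evaluated, are hypothetical.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf,
              maximizing=True, visited=None):
    """Minimax value of node, skipping branches that cannot matter.

    alpha: lower bound on what Max can already guarantee.
    beta:  upper bound on what Min can already guarantee.
    """
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)  # record each evaluated leaf
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            if value >= beta:
                break  # beta cutoff: Min above will never allow this
            alpha = max(alpha, value)
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        if value <= alpha:
            break  # alpha cutoff: Max above already has something better
        beta = min(beta, value)
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
print(alphabeta(tree, visited=seen), seen)
```

On this illustrative tree the second min node is abandoned after its first leaf, because that leaf is already no better than the 3 that Max has secured.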
When do we get alpha cutoffs?
[Figure not preserved. Surviving labels: 100, followed by "< 100 ... < 100".]
Shallow Search Techniques

1. Do a limited search for a few levels.
2. Reorder the level-1 successors.
3. Proceed with the α-β minimax search.
Additional Refinements
Waiting for Quiescence: continue the search
until no drastic change occurs from one level to
the next.
Secondary Search: after choosing a move,
search a few more levels beneath it to be sure it
still looks good.
Book Moves: for some parts of the game
(especially initial and end moves), keep a
catalog of best moves to make.
Evaluation functions
For chess/checkers, typically a linear weighted sum of
features:
Eval(s) = w1 f1(s) + w2 f2(s) + ... + wn fn(s)
e.g., w1 = 9 with
f1(s) = (number of white queens) - (number of black
queens), etc.
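As a minimal sketch of such a weighted sum: only the queen weight of 9 comes from the slide. The other piece weights, the piece letters, and the function names are assumptions chosen for illustration.

```python
def linear_eval(weights, features):
    """Eval(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s)."""
    return sum(w * f for w, f in zip(weights, features))


def material_features(white, black, pieces=("Q", "R", "B", "N", "P")):
    """One feature per piece type: white count minus black count."""
    return [white.get(p, 0) - black.get(p, 0) for p in pieces]


# Queen = 9 is from the slide; the remaining weights are assumed
# conventional material values.
WEIGHTS = [9, 5, 3, 3, 1]

white = {"Q": 1, "P": 8}
black = {"R": 1, "P": 8}
print(linear_eval(WEIGHTS, material_features(white, black)))  # 9 - 5 = 4
```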
Example: Samuel's Checker-Playing Program

It uses a linear evaluation function
f(n) = a1 x1(n) + a2 x2(n) + ... + am xm(n)
For example: f = 6K + 4M + U
K = King Advantage
M = Man Advantage
U = Undenied Mobility Advantage (number of
moves that Max has that Min can't jump after)
Samuel's Checker Player

In learning mode
Computer acts as 2 players: A and B
A adjusts its coefficients after every move
B uses the static utility function
If A wins, its function is given to B
How does A change its function?

1. Coefficient replacement
Δ(node) = backed-up value(node) - initial value(node)
If Δ > 0, then terms that contributed positively are
given more weight and terms that contributed
negatively get less weight.
If Δ < 0, then terms that contributed negatively are
given more weight and terms that contributed
positively get less weight.
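The slide gives only the direction of the adjustment, not its size. One simple rule that matches that direction is to shift each coefficient in proportion to Δ times its term value. The sketch below uses that rule as an assumption; it is not a reconstruction of Samuel's actual procedure, and the `rate` parameter and function names are hypothetical.

```python
def evaluate(coeffs, terms):
    """f(n) = a1*x1(n) + a2*x2(n) + ... + am*xm(n)."""
    return sum(a * x for a, x in zip(coeffs, terms))


def adjust_coefficients(coeffs, terms, backed_up_value, rate=0.1):
    """Nudge coefficients so f(n) moves toward the backed-up value.

    delta > 0: terms that contributed positively gain weight,
               terms that contributed negatively lose weight.
    delta < 0: the reverse.
    The proportional step (rate * delta * x) is an assumed rule.
    """
    delta = backed_up_value - evaluate(coeffs, terms)
    return [a + rate * delta * x for a, x in zip(coeffs, terms)]


# f = 6K + 4M + U with K = 1, M = -1, U = 2, so f = 4.
coeffs = [6.0, 4.0, 1.0]
terms = [1, -1, 2]
print(adjust_coefficients(coeffs, terms, backed_up_value=6.0))
```

Here Δ = 2, so the K and U coefficients grow while the M coefficient, whose term contributed negatively, shrinks.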
How does A change its function?

2. Term Replacement
38 terms altogether
16 used in the utility function at any one time
Terms that consistently correlate low with the
function value are removed and added to the end of
the term queue.
They are replaced by terms from the front of the
term queue.
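The queue mechanics can be sketched as below. How "consistently correlate low" is measured is not stated on the slide, so the sketch assumes a correlation score per term is already available; the function and term names are hypothetical.

```python
from collections import deque


def replace_weakest_term(active, queue, correlation):
    """Swap the lowest-correlation active term with the front of the queue.

    active:      list of terms currently in the utility function
    queue:       deque of reserve terms (front = next to be tried)
    correlation: dict mapping each active term to its assumed score
    Returns (removed_term, added_term), or None if the queue is empty.
    """
    if not queue:
        return None
    weakest = min(active, key=lambda term: correlation[term])
    incoming = queue.popleft()                 # take from the front
    active[active.index(weakest)] = incoming
    queue.append(weakest)                      # removed term goes to the end
    return weakest, incoming


active = ["t0", "t1", "t2"]
queue = deque(["t3", "t4"])
scores = {"t0": 0.9, "t1": 0.1, "t2": 0.5}
print(replace_weakest_term(active, queue, scores), active, list(queue))
```

With 38 terms and 16 in use, the queue in this picture would hold the other 22.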
Kalah
[Figure not preserved: board showing P's holes, p's holes, the Kalahs KP and Kp, and the counterclockwise direction of play.]
To move, pick up all the stones in one of your holes, and
put one stone in each hole, starting at the next one,
including your Kalah and skipping the opponent's Kalah.
Kalah
If the last stone lands in your Kalah, you get
another turn.
If the last stone lands in your empty hole, take all
the stones from your opponent's hole directly
across from it and put them in your Kalah.
If all of your holes become empty, the opponent
keeps the rest of the stones.
The winner is the player who has the most
stones in his Kalah at the end of the game.
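The move, extra-turn, and capture rules above can be sketched as a single function. The board layout is an assumption: a list of 14 counts, with player 0 owning holes 0 to 5 and Kalah 6, and player 1 owning holes 7 to 12 and Kalah 13, so that counterclockwise play is increasing index. The end-of-game rule is not covered, and the name `sow` is hypothetical.

```python
def sow(board, player, hole):
    """Play one move; return (new_board, extra_turn).

    hole is 0..5, counted along the moving player's own side.
    """
    board = list(board)  # leave the caller's board untouched
    my_holes = range(0, 6) if player == 0 else range(7, 13)
    my_kalah, opp_kalah = (6, 13) if player == 0 else (13, 6)
    pos = my_holes[hole]
    stones = board[pos]
    if stones == 0:
        raise ValueError("cannot move from an empty hole")
    board[pos] = 0
    while stones > 0:
        pos = (pos + 1) % 14
        if pos == opp_kalah:
            continue  # skip the opponent's Kalah
        board[pos] += 1
        stones -= 1
    # Last stone in one of your own holes that was empty: capture
    # the stones in the opponent's hole directly across.
    if pos in my_holes and board[pos] == 1:
        opposite = 12 - pos
        board[my_kalah] += board[opposite]
        board[opposite] = 0
    # Last stone in your own Kalah earns another turn.
    return board, pos == my_kalah


# Assumed starting position with 4 stones per hole.
start = [4] * 6 + [0] + [4] * 6 + [0]
print(sow(start, player=0, hole=2))
```

The capture follows the wording above and moves only the opponent's stones; some rule sets also move the capturing stone, which this sketch does not do.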
Cutting off Search

MinimaxCutoff is identical to MinimaxValue except
1. Terminal? is replaced by Cutoff?
2. Utility is replaced by Eval
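The two substitutions can be sketched as follows. This is an illustrative version, not the MinimaxCutoff pseudocode itself: the cutoff test is assumed to be a simple depth limit, and the toy pile game (take 1 to 3 stones; whoever takes the last stone wins) with its `pile_successors` and `pile_eval` helpers is hypothetical.

```python
def minimax_cutoff(state, depth, maximizing, successors, evaluate, limit):
    """Minimax with Cutoff? in place of Terminal? and Eval in place of Utility."""
    children = successors(state)
    if depth >= limit or not children:        # Cutoff? (assumed: depth limit)
        return evaluate(state, maximizing)    # Eval instead of Utility
    values = [
        minimax_cutoff(child, depth + 1, not maximizing,
                       successors, evaluate, limit)
        for child in children
    ]
    return max(values) if maximizing else min(values)


def pile_successors(stones):
    """Toy game: remove 1, 2, or 3 stones from the pile."""
    return [stones - k for k in (1, 2, 3) if stones - k >= 0]


def pile_eval(stones, max_to_move):
    """Exact at the end of the game, a neutral guess of 0 elsewhere."""
    if stones == 0:
        # The player to move found an empty pile, so the other
        # player took the last stone and won.
        return -1 if max_to_move else 1
    return 0


print(minimax_cutoff(5, 0, True, pile_successors, pile_eval, limit=10))
print(minimax_cutoff(5, 0, True, pile_successors, pile_eval, limit=1))
```

With a generous limit the search sees that 5 stones is a win for Max; with a limit of 1 it can only report the neutral estimate, which is the price of cutting off early.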
Does it work in practice?
b^m = 10^6, b = 35 ⇒ m = 4
4-ply lookahead is a hopeless chess player!
4-ply ≈ human novice
8-ply ≈ typical PC, human master
12-ply ≈ Deep Blue, Kasparov
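The depth figure follows from solving b^m = N for m. A quick check, assuming a budget of N = 10^6 nodes per move and b = 35 as on the slide (the function name is hypothetical):

```python
import math


def reachable_depth(node_budget, branching_factor):
    """Depth m at which branching_factor ** m equals node_budget."""
    return math.log(node_budget) / math.log(branching_factor)


# Comes out just under 4 plies for a million nodes at b = 35.
print(reachable_depth(10**6, 35))
```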
Deterministic Games in Practice
Checkers: Chinook ended the 40-year reign of human world champion
Marion Tinsley in 1994. It used a precomputed endgame database
defining perfect play for all positions involving 8 or fewer pieces on
the board, a total of 444 billion positions.
Chess: Deep Blue defeated human world champion Garry Kasparov
in a six-game match in 1997. Deep Blue searches 200 million
positions per second, uses very sophisticated evaluation, and uses
undisclosed methods for extending some lines of search up to 40
ply.
Othello: human champions refuse to compete against computers,
which are too good.
Go: human champions refuse to compete against computers, which
are too bad. In Go, b > 300, so most programs use pattern
knowledge bases to suggest plausible moves.
Games of Chance
What about games that involve chance,
such as
rolling dice
picking a card
Use three kinds of nodes:
max nodes
min nodes
chance nodes
[Figure not preserved: tree with min, chance, and max levels.]
Games of Chance
[Figure not preserved: a chance node c with max children, branches d1, ..., di, ..., dk, and successor sets S(c,di).]
expectimax(c) = Σ_i P(di) · max over s in S(c,di) of backed-up-value(s)
expectimin(c) = Σ_i P(di) · min over s in S(c,di) of backed-up-value(s)
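A direct transcription of the two formulas, as a sketch. It assumes each chance outcome di is supplied as a pair of its probability P(di) and the list of backed-up values of the positions in S(c,di); that input format is an assumption made for illustration.

```python
def expectimax(outcomes):
    """Sum over outcomes of P(di) * max backed-up value in S(c, di).

    outcomes: list of (probability, [backed-up values]) pairs.
    """
    return sum(p * max(values) for p, values in outcomes)


def expectimin(outcomes):
    """Sum over outcomes of P(di) * min backed-up value in S(c, di)."""
    return sum(p * min(values) for p, values in outcomes)


# Two chance outcomes with probabilities .4 and .6 (illustrative values).
outcomes = [(0.4, [3, 5]), (0.6, [1, 4])]
print(expectimax(outcomes))  # 0.4*5 + 0.6*4
print(expectimin(outcomes))  # 0.4*3 + 0.6*1
```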
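Example Tree with Chance
[Figure not preserved: tree with levels max, chance, min, chance, max, leaf; chance branches are labeled .4 and .6; one backed-up value shown is 1.2; leaf values are 3 5 1 4 1 2 4 5.]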
Complexity
Instead of O(b^m), it is O(b^m n^m), where n is
the number of chance outcomes.
Since the complexity is higher (both time
and space), we cannot search as deeply.
Pruning algorithms may be applied.
Summary
Games are fun to work on!
They illustrate several important points about AI.
Perfection is unattainable, so we must approximate.
Game playing programs have shown the world
what AI can do.