Unit 2c: Game Playing

The document discusses games and adversarial search as a domain for exploring machine intelligence. It provides two key reasons why games were initially seen as a good domain: 1) success or failure can be easily measured, and 2) they did not require large amounts of knowledge and were thought to be solvable through search. However, the document notes that for all but the simplest games, large amounts of knowledge and sophisticated search techniques are actually required due to huge branching factors and search spaces. It introduces the concept of heuristic functions, minimax search, alpha-beta pruning, and discusses how these techniques allow computers to evaluate positions and select optimal moves in games like chess despite immense search spaces.


Games and adversarial search

World Champion chess player Garry Kasparov is defeated by IBM's Deep Blue chess-playing computer in a six-game match in May 1997. (Photo © Telegraph Group Unlimited 1997)
There were two reasons that games appeared to be a
good domain in which to explore machine intelligence:
1. They provide a structured task in which it is very
easy to measure success or failure.
2. They did not obviously require large amounts of
knowledge. They were thought to be solvable by
straightforward search from the starting state to a
winning position.

The first of these reasons remains valid and accounts for the continued interest in game playing by machines. The second is not true for any but the simplest games.

For example, consider chess.

• The average branching factor is around 35.
• In an average game, each player might make 50 moves.
• So to examine the complete game tree we would have to look at about 35^100 positions.
Games vs. single-agent search
• We don’t know how the opponent will act
– The solution is not a fixed sequence of actions from
start state to goal state, but a strategy or policy
(a mapping from state to best move in that state)
• Efficiency is critical to playing well
– The time to make a move is limited
– The branching factor, search depth, and number of
terminal configurations are huge
• In chess, branching factor ≈ 35 and depth ≈ 100, giving a search tree of about 10^154 nodes
– Number of atoms in the observable universe ≈ 10^80

– This rules out searching all the way to the end of the game
Types of game environments
• Perfect information (fully observable), deterministic: chess, checkers, go
• Perfect information (fully observable), stochastic: backgammon, Monopoly
• Imperfect information (partially observable), deterministic: Battleships
• Imperfect information (partially observable), stochastic: Scrabble, poker, bridge
Game tree
• A game of tic-tac-toe between two players, “max” and “min”
http://xkcd.com/832/
Heuristic function
For chess, Turing proposed a very simple static evaluation function based on piece advantage: add the values of white's pieces (W), add the values of black's pieces (B), and then compute the quotient W/B.
• A more sophisticated approach was that taken in
Samuel's checkers program, in which the static
evaluation function was a linear combination of several
simple functions, each of which appeared as though it
might be significant.
• In addition to the obvious one, piece advantage, Samuel's functions included such things as capability for advancement, control of the center, threat of a fork, and mobility.
These factors were then combined by
attaching to each an appropriate weight
and then adding the terms together. The
complete evaluation function had the
form:

c1 * pieceadvantage + c2 * advancement + c3 * centercontrol + …
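A rough Python sketch of such a weighted linear combination (illustrative only — the state format, feature functions, and weights below are invented, not Samuel's):

# Sketch of a Samuel-style linear evaluation function (illustrative only;
# the features, weights, and state format are made up, not Samuel's).
def evaluate(state, features, weights):
    """Weighted sum c1*f1(s) + c2*f2(s) + ... over the given features."""
    return sum(w * f(state) for w, f in zip(weights, features))

def piece_advantage(state):
    return len(state["us"]) - len(state["them"])

def center_control(state):
    center = {(3, 3), (3, 4), (4, 3), (4, 4)}
    return (sum(p in center for p in state["us"])
            - sum(p in center for p in state["them"]))

state = {"us": [(3, 3), (5, 1), (2, 6)], "them": [(4, 4), (0, 2)]}
print(evaluate(state, [piece_advantage, center_control], [1.0, 0.5]))  # 1.0*1 + 0.5*0 = 1.0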
THE MINIMAX SEARCH
PROCEDURE
• The minimax search procedure is a depth-first,
depth-limited search procedure.
• The idea is to start at the current position and
use the plausible-move generator to generate
the set of possible successor positions.
• Apply the static evaluation function to those positions
and simply choose the best one.
• After doing so, we could back that value up to the
starting position to represent our evaluation of it.
• The starting position is exactly as good for us as the position
generated by the best move we can make next.
• Here we assume that the static evaluation function returns large
values to indicate good situations for us, so our goal is to
maximize the value of the static evaluation function of the next
board position.
This example assumes a static evaluation function that returns
values ranging from -10 to 10, with 10 indicating a win for us,
-10 a win for the opponent, and 0 an even match.
• At the level representing the opponent's choice, the
minimum value was chosen and backed up. At the level
representing our choice, the maximum value was
chosen.
• Once the values from the second ply are backed up, it becomes
clear that the correct move for us to make at the first level,
given the information we have available, is C, since there is
nothing the opponent can do from there to produce a value
worse than -2.
• This process can be repeated for as many ply as time allows,
and the more accurate evaluations that are produced can be
used to choose the correct move at the top level.
• The alternation of maximizing and minimizing at alternate ply
when evaluations are being pushed back up corresponds to the
opposing strategies of the two players and gives this method
the name minimax.
A more abstract game tree
[Figure: a two-ply game tree with terminal utilities (for MAX).]

Game tree search
[Figure: the same two-ply tree with backed-up values; the MIN nodes take the values 3, 2, and 2, and the MAX root takes the value 3.]

• Minimax value of a node: the utility (for MAX) of being in the corresponding state, assuming perfect play on both sides
• Minimax strategy: Choose the move that gives the best worst-case payoff
Computing the minimax value of a node
• Minimax(node) =
– Utility(node) if node is terminal
– max over actions of Minimax(Succ(node, action)) if player = MAX
– min over actions of Minimax(Succ(node, action)) if player = MIN
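A minimal runnable sketch of this recursion, assuming the game tree is given explicitly as nested lists with terminal utilities (for MAX) at the leaves; the leaf values are chosen to be consistent with the backed-up values 3, 2, 2 in the figures above:

# Minimax over an explicit game tree (illustrative sketch).
# A node is either a number (terminal utility for MAX) or a list of child nodes.
def minimax(node, is_max):
    if isinstance(node, (int, float)):
        return node                         # terminal: return its utility
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

# A two-ply tree: the MIN nodes back up 3, 2, 2, so the MAX root's value is 3.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, is_max=True))  # -> 3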
Optimality of minimax

• The minimax strategy is optimal against an optimal opponent
• What if your opponent is
suboptimal?
– Your utility can only be higher than if
you were playing an optimal opponent!
– A different strategy may work better for a sub-optimal opponent, but it will necessarily be worse against an optimal opponent

Example from D. Klein and P. Abbeel


More general games
[Figure: a three-player game tree; the leaf-level utility tuples shown are (4,3,2), (7,4,1), (1,5,2), (7,7,1), the next level backs up (4,3,2) and (1,5,2), and the root receives (4,3,2).]

• More than two players, non-zero-sum


• Utilities are now tuples
• Each player maximizes their own utility at their node
• Utilities get propagated (backed up) from children to parents
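A small sketch of how such tuple utilities can be backed up (sometimes called the max^n generalization of minimax); the tree, player numbering, and utilities below are invented for illustration and do not reproduce the figure above:

# Minimax generalized to n players: utilities are tuples, and the player to
# move at each node picks the child whose backed-up tuple is best for them.
def maxn(node, player, num_players):
    if isinstance(node, tuple):             # terminal: utility tuple
        return node
    nxt = (player + 1) % num_players
    children = [maxn(child, nxt, num_players) for child in node]
    return max(children, key=lambda u: u[player])

# Player 0 moves at the root, player 1 at the next level; terminals are tuples.
tree = [[(1, 2, 6), (4, 3, 3)], [(6, 1, 2), (7, 4, 1)]]
print(maxn(tree, player=0, num_players=3))  # -> (7, 4, 1)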
ADDING ALPHA-BETA CUTOFFS
The minimax procedure is a depth-first process.

One path is explored as far as time allows, the static evaluation function is applied to the game positions at the last step of the path, and the value can then be passed up the path one level at a time.
One of the good things about depth-first procedures is that their
efficiency can often be improved by using branch-and-bound
techniques in which partial solutions that are clearly worse
than known solutions can be abandoned early.
Applying this idea to minimax requires the maintenance of two threshold values: one representing a lower bound on the value that a maximizing node may ultimately be assigned (we call this alpha), and another representing an upper bound on the value that a minimizing node may be assigned (this we call beta). This modified strategy is called alpha-beta pruning.
• At maximizing levels, we can rule out a move early if it
becomes clear that its value will be less than the current
threshold, while at minimizing levels, search will be
terminated if values that are greater than the current
threshold are discovered.

• Equivalently, search at a minimizing level can be terminated when a value less than alpha is discovered, while search at a maximizing level can be terminated when a value greater than beta has been found.

http://web.cs.ucla.edu/~rosen/161/notes/alphabeta.html
http://people.cs.pitt.edu/~litman/courses/cs2710/lectures/pruningReview.pdf
Alpha-beta pruning
• It is possible to compute the exact minimax decision without expanding every node in the game tree

[Figure, shown step by step over several slides: alpha-beta search of the two-ply example tree; the backed-up values displayed at successive steps are 3; then 3, 2; then 3, 2, 14; then 3, 2, 5; and finally 3, 2, 2.]
Alpha-beta pruning

• α is the value of the best choice for the MAX player found so far at any choice point above node n
• We want to compute the MIN-value at n
• As we loop over n’s children, the MIN-value decreases
• If it drops below α, MAX will never choose n, so we can ignore n’s remaining children
• Analogously, β is the value of the lowest-utility choice found so far for the MIN player
Alpha-beta pruning

Function action = Alpha-Beta-Search(node)
   v = Max-Value(node, −∞, +∞)
   return the action from node with value v

α: best alternative available to the Max player
β: best alternative available to the Min player

Function v = Min-Value(node, α, β)
   if Terminal(node) return Utility(node)
   v = +∞
   for each action from node
      v = Min(v, Max-Value(Succ(node, action), α, β))
      if v ≤ α return v
      β = Min(β, v)
   end for
   return v
Alpha-beta pruning

Function action = Alpha-Beta-Search(node)
   v = Max-Value(node, −∞, +∞)
   return the action from node with value v

α: best alternative available to the Max player
β: best alternative available to the Min player

Function v = Max-Value(node, α, β)
   if Terminal(node) return Utility(node)
   v = −∞
   for each action from node
      v = Max(v, Min-Value(Succ(node, action), α, β))
      if v ≥ β return v
      α = Max(α, v)
   end for
   return v
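The same procedure as a runnable Python sketch (illustrative; it operates on the explicit nested-list game tree used in the earlier minimax sketch instead of Succ and Utility functions):

import math

# Alpha-beta search over an explicit game tree (illustrative sketch).
def max_value(node, alpha, beta):
    if isinstance(node, (int, float)):
        return node
    v = -math.inf
    for child in node:
        v = max(v, min_value(child, alpha, beta))
        if v >= beta:                # MIN already has a better alternative: prune
            return v
        alpha = max(alpha, v)
    return v

def min_value(node, alpha, beta):
    if isinstance(node, (int, float)):
        return node
    v = math.inf
    for child in node:
        v = min(v, max_value(child, alpha, beta))
        if v <= alpha:               # MAX already has a better alternative: prune
            return v
        beta = min(beta, v)
    return v

# Same two-ply tree as before: once the leaf 2 is seen in the middle branch,
# its remaining leaves (4 and 6) are never examined.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(max_value(tree, -math.inf, math.inf))  # -> 3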
Alpha-beta pruning
• Pruning does not affect final result
• Amount of pruning depends on move ordering
– Should start with the “best” moves (highest-value for
MAX or lowest-value for MIN)
– For chess, can try captures first, then threats, then
forward moves, then backward moves
– Can also try to remember “killer moves” from other
branches of the tree
• With perfect ordering, the time to find the best move is reduced from O(b^m) to O(b^(m/2))
– Depth of search is effectively doubled
Evaluation function
• Cut off search at a certain depth and compute the value of an
evaluation function for a state instead of its minimax value
– The evaluation function may be thought of as the probability of winning
from a given state or the expected value of that state
• A common evaluation function is a weighted sum of features (sketched after this list):
Eval(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s)
– For chess, wk may be the material value of a piece (pawn = 1,
knight = 3, rook = 5, queen = 9) and fk(s) may be the advantage in
terms of that piece
• Evaluation functions may be learned from game databases or
by having the program play many games against itself
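For instance, a bare material-count evaluation in this weighted-sum form might look like the following sketch (the state format is invented; piece values follow the slide, with the conventional value 3 for a bishop added):

# Sketch of Eval(s) = w1*f1(s) + ... using material features only (illustrative).
PIECE_VALUES = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def material_eval(state):
    """Sum of piece value times (white count minus black count)."""
    return sum(
        w * (state["white"].get(piece, 0) - state["black"].get(piece, 0))
        for piece, w in PIECE_VALUES.items()
    )

state = {"white": {"pawn": 6, "rook": 2, "queen": 1},
         "black": {"pawn": 7, "rook": 1, "knight": 1}}
print(material_eval(state))  # (6+10+9) - (7+5+3) = 10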
Cutting off search
• Horizon effect: you may incorrectly estimate the
value of a state by overlooking an event that is
just beyond the depth limit
– For example, a damaging move by the opponent that
can be delayed but not avoided
• Possible remedies
– Quiescence search: do not cut off search at positions that are unstable – for example, are you about to lose an important piece? (sketched after this list)
– Singular extension: a strong move that should be
tried when the normal depth limit is reached
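A rough sketch of the quiescence idea on a toy tree where each node carries a static evaluation and a "quiet" flag (the node format and all names are invented for this sketch):

# Depth-limited minimax that refuses to cut off at "noisy" positions (illustrative).
# A node is a tuple (static_eval, quiet, children); children is empty for terminals.
def search(node, depth, maximizing):
    static_eval, quiet, children = node
    if not children:                     # terminal position
        return static_eval
    if depth <= 0 and quiet:             # cut off only at quiet positions
        return static_eval
    values = [search(c, depth - 1, not maximizing) for c in children]
    return max(values) if maximizing else min(values)

# A position that looks fine statically (eval 5) but is not quiet: the opponent
# has a capture just beyond the depth limit that drops its true value to -4.
noisy = (5, False, [(-4, True, []), (5, True, [])])
root = (0, True, [noisy, (3, True, [])])
print(search(root, depth=1, maximizing=True))  # -> 3 (a blind cutoff would pick 5)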
Advanced techniques
• Transposition table to store previously expanded states (see the sketch below)
• Forward pruning to avoid considering all possible
moves
• Lookup tables for opening moves and endgames
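A minimal sketch of the transposition-table idea: memoize the minimax values of already-expanded positions so that a position reached by different move orders is searched only once (the explicit-tree representation below is invented; real chess programs key the table on a hash of the board position):

# Minimax with a transposition table (illustrative sketch).
# Internal nodes and terminals are tuples/numbers so positions can be dict keys.
def minimax_tt(node, is_max, table):
    if isinstance(node, (int, float)):
        return node
    key = (node, is_max)
    if key in table:                     # position already expanded elsewhere
        return table[key]
    values = [minimax_tt(child, not is_max, table) for child in node]
    value = max(values) if is_max else min(values)
    table[key] = value
    return value

# The subtree `shared` is reachable through two different lines of play,
# but its value is computed only once and then reused from the table.
shared = ((2, 4), (6, 1))
tree = ((shared, (3, 5)), ((7, 0), shared))
print(minimax_tt(tree, True, {}))  # -> 2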
Chess playing systems
• Baseline system: 200 million node evaluations per move (3 min), minimax with a decent evaluation function and quiescence search
– 5-ply ≈ human novice
• Add alpha-beta pruning
– 10-ply ≈ typical PC, experienced player
• Deep Blue: 30 billion evaluations per move, singular
extensions, evaluation function with 8000 features,
large databases of opening and endgame moves
– 14-ply ≈ Garry Kasparov
• More recent state of the art (Hydra, ca. 2006): 36 billion
evaluations per second, advanced pruning techniques
– 18-ply ≈ better than any human alive?
