SET394 - AI - Lecture 06 - Adversarial Search
Artificial Intelligence
Lecture Six
Game Playing: Adversarial Search
Associate Prof. Dr. Ayman Elshenawy
eaymanelshenawy@azhar.edu.eg
Adversarial Search
• A solution for problems in which:
• Other agents are planning against us in the environment (multi-agent).
• There is an "enemy" or "opponent" changing the state of the problem at every step in a direction you do not want.
• You change the state, but you do not control the next state.
• Examples: board games such as chess, tic-tac-toe, and checkers.
• The outcome is unpredictable (it depends on the opponent's moves), so the solution is a strategy specifying a move for every possible opponent reply.
• Adversarial games, while much studied in AI, are a small part of game theory in economics.
Difference between Search & Games
• Search – no adversary
• Solution is (heuristic) method for finding goal
• Heuristic techniques can find optimal solution
• Evaluation function: estimate of cost from start to goal through given node
• Examples: path planning, scheduling activities
• Games – adversary
• Solution is a strategy that specifies a move for every possible opponent reply.
• Optimality depends on opponent.
• Time limits force an approximate solution
• Evaluation function: evaluate “goodness” of game position
• Examples: chess, checkers, Othello, backgammon
Two Players Zero-Sum Games - AI assumptions
• Two agents MAX and MIN whose actions alternate
• MAX moves first and they take turns until the game is over
• Winner gets a reward, loser gets a penalty.
• Utility values for each agent are the opposite of the other
• creates the adversarial situation
• (Perfect Information) ➔ Fully observable environments
• Zero-sum games: what is good for one player is just as bad for the other (win/lose, never win-win)
• Action ➔ move.
• State ➔ Position
A Game defined as a search problem
A game formally defined with following elements:
• Initial state S0: Specifies how the game started. (current board position/configuration).
• To-move(s): Defines which player has the turn to move in a state. (P={1...N} usually
take turns)
• Action(s): Returns the set of legal moves in a state (depend on player / state)
• Result(s, a): The transition model, which defines the result of a move a on the state S.
• Terminal-Test(s): is true when the game is over, false otherwise.
• Utility(s, p): A utility function (objective function) defines the final numeric value for a
game that ends in terminal state s for a player p.
• Example: in tic-tac-toe (as in chess), the utility is +1, 0, or -1 for a win, draw, or loss.
• Search objective:
• Find the sequence of player’s decisions (moves) maximizing its utility
• Consider the opponent’s moves and their utility
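As a concrete illustration, the formal elements above can be sketched for tic-tac-toe in Python. The class and method names (`initial_state`, `to_move`, `actions`, `result`, `is_terminal`, `utility`) are hypothetical, chosen only to mirror the slide's definitions, not a fixed library API.

```python
class TicTacToe:
    """Tic-tac-toe as a search problem: a state is a (board, to_move) pair."""
    WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
                 (0, 3, 6), (1, 4, 7), (2, 5, 8),
                 (0, 4, 8), (2, 4, 6)]

    def initial_state(self):            # S0: empty board, X moves first
        return ((".",) * 9, "X")

    def to_move(self, state):           # To-Move(s)
        return state[1]

    def actions(self, state):           # Action(s): indices of empty squares
        board, _ = state
        return [i for i, c in enumerate(board) if c == "."]

    def result(self, state, a):         # Result(s, a): transition model
        board, player = state
        new_board = board[:a] + (player,) + board[a + 1:]
        return (new_board, "O" if player == "X" else "X")

    def _winner(self, board):
        for i, j, k in self.WIN_LINES:
            if board[i] != "." and board[i] == board[j] == board[k]:
                return board[i]
        return None

    def is_terminal(self, state):       # Terminal-Test(s)
        return self._winner(state[0]) is not None or not self.actions(state)

    def utility(self, state, player):   # Utility(s, p): +1 win, -1 loss, 0 draw
        w = self._winner(state[0])
        if w is None:
            return 0
        return 1 if w == player else -1
```

The same interface shape works for any two-player turn-taking game; only the board representation and win test change.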
A Game defined as a search problem
• State space graph:
• Defined by moves (Actions) function, and RESULT function.
• A graph where the vertices are states, the edges are moves and a state
might be reached by multiple paths.
• Search Tree
• A tree superimposed over part of the state space graph to determine what move to make.
• Complete game tree
• A search tree that follows every sequence of moves all the way to a
terminal state.
• May be infinite if the state space itself is unbounded or if the rules of the
game allow for infinitely repeating positions.
Partial Game Trees for the initial state
Tic-Tac-Toe
◼ Alternating moves by MIN (O) and MAX (X), until we eventually reach terminal states, which can be assigned utilities according to the rules of the game.
Optimal Decisions in Games
• MAX wants to find a sequence of actions leading to a win, but
MIN has something to say about it.
• This means that MAX’s strategy must be a conditional plan a
contingent strategy specifying a response to each of MIN’s
possible moves.
• In games that have a binary outcome (win or lose), we could use
AND–OR, the desirable outcome must be guaranteed no matter
what the “other side” does.
• For games with multiple outcome scores, we need a slightly more
general algorithm called minimax search.
Minimax Search Strategy
• Basic idea: choose move with highest minimax value
• Max’s goal: get to 1
• Min’s goal: get to -1
• Given a game tree, the optimal strategy can be
determined by working out the minimax value of each
state in the tree, which we write as MINIMAX(s).
• Minimax value of a node:
• If N is terminal, use the utility value
• If N is a Max move, take max of successors
• If N is a Min move, take min of successors
The minimax algorithm
1. Generate game tree completely
2. Determine utility of each terminal
state
3. Propagate the utility values upward
in the tree by applying MIN and
MAX operators on the nodes in
the current level
4. At the root node use minimax
decision to select the move with
the max (of the min) utility value
• Steps 2 and 3 in the algorithm
assume that the opponent will play
perfectly.
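The four steps above can be sketched as a recursive Python function. The game interface (`is_terminal`, `utility`, `actions`, `result`, `to_move`) is a hypothetical convention, and `ToyGame` is an invented two-ply tree used only to exercise the code.

```python
class ToyGame:
    """Hypothetical 2-ply tree: MAX moves at root A, MIN at B, C, D."""
    tree = {"A": ["B", "C", "D"],
            "B": [3, 12, 8], "C": [2, 4, 6], "D": [14, 5, 2]}

    def is_terminal(self, s): return isinstance(s, int)
    def utility(self, s, player): return s      # leaves store MAX's utility
    def actions(self, s): return range(len(self.tree[s]))
    def result(self, s, a): return self.tree[s][a]
    def to_move(self, s): return "MAX" if s == "A" else "MIN"

def minimax_value(game, state):
    """MINIMAX(s): utility at terminals, else max/min over successors."""
    if game.is_terminal(state):
        return game.utility(state, "MAX")
    values = [minimax_value(game, game.result(state, a))
              for a in game.actions(state)]
    return max(values) if game.to_move(state) == "MAX" else min(values)

def minimax_decision(game, state):
    """At the root, choose the move with the highest minimax value."""
    return max(game.actions(state),
               key=lambda a: minimax_value(game, game.result(state, a)))
```

On `ToyGame`, the three MIN nodes back up the values 3, 2, and 2, so MAX at the root chooses the first move, worth 3, exactly as the slide's propagation steps describe.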
Minimax Search: Example 1
Minimax Search
Exercise 1
Exercise 2
Exercise 3
Minimax for Multi-Player Games
• Replace the single value for each node with a vector of values.
• For example, in a three-player game with players A, B, and C, a vector {VA, VB, VC} is
associated with each node.
• For terminal states, this vector gives the utility of the state from each player’s viewpoint.
• The simplest way to implement this is to have the UTILITY function return a vector of utilities.
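A minimal sketch of this vector-valued variant, under the same hypothetical game interface as before; here `to_move` returns the index of the player whose component is maximized, and `ThreePlayerToy` is an invented example tree.

```python
def multiplayer_value(game, state):
    """Back up utility vectors: the player to move maximizes their own component."""
    if game.is_terminal(state):
        return game.utility_vector(state)        # e.g. (vA, vB, vC)
    p = game.to_move(state)                      # index of the player to move
    vectors = [multiplayer_value(game, game.result(state, a))
               for a in game.actions(state)]
    return max(vectors, key=lambda v: v[p])

class ThreePlayerToy:
    """Invented 2-ply tree: player 0 moves at the root, player 1 below."""
    tree = {"root": ["L", "R"],
            "L": [(1, 2, 6), (4, 3, 3)], "R": [(6, 1, 2), (7, 4, 1)]}
    mover = {"root": 0, "L": 1, "R": 1}

    def is_terminal(self, s): return isinstance(s, tuple)
    def utility_vector(self, s): return s
    def to_move(self, s): return self.mover[s]
    def actions(self, s): return range(len(self.tree[s]))
    def result(self, s, a): return self.tree[s][a]
```

At "L" player 1 prefers (4, 3, 3), at "R" they prefer (7, 4, 1), and player 0 at the root then picks "R" because 7 > 4 in their own component.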
Properties of Minimax
• Complete: Yes , for finite trees.
• Optimal : Yes, against an optimal opponent
(perfect player).
• Time Complexity: O(b^m)
• Space Complexity: O(bm) with depth-first exploration of the state space
▪ For chess, b ≈ 35, m ≈ 80 (plies)
▪ Exact solution is completely infeasible
▪ But do we need to explore the whole tree?
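A quick back-of-the-envelope check of why the exact solution is infeasible, using the b and m values quoted for chess:

```python
import math

# Branching factor and typical game length (in plies) quoted for chess.
b, m = 35, 80

# The complete game tree has on the order of b**m leaves.
digits = len(str(b ** m))
print(f"35^80 is a number with {digits} digits")
print(f"log10(35^80) = {m * math.log10(b):.1f}")
```

A tree with over 10^123 leaves cannot be enumerated by any computer, which motivates pruning.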
Alpha-Beta Algorithm
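A sketch of alpha-beta search in Python, assuming the same hypothetical game interface used for minimax; `Toy2Ply` is an invented tree for illustration. Alpha tracks the best value MAX can guarantee so far, beta the best MIN can guarantee.

```python
INF = float("inf")

class Toy2Ply:
    """Invented 2-ply tree: MAX at root A, MIN nodes B, C, D below."""
    tree = {"A": ["B", "C", "D"],
            "B": [3, 12, 8], "C": [2, 4, 6], "D": [14, 5, 2]}

    def is_terminal(self, s): return isinstance(s, int)
    def utility(self, s, player): return s
    def actions(self, s): return range(len(self.tree[s]))
    def result(self, s, a): return self.tree[s][a]
    def to_move(self, s): return "MAX" if s == "A" else "MIN"

def alphabeta(game, state, alpha=-INF, beta=INF):
    """Minimax value of state, pruning branches that cannot affect the result."""
    if game.is_terminal(state):
        return game.utility(state, "MAX")
    if game.to_move(state) == "MAX":
        v = -INF
        for a in game.actions(state):
            v = max(v, alphabeta(game, game.result(state, a), alpha, beta))
            alpha = max(alpha, v)
            if alpha >= beta:       # MIN above would never allow this branch
                break               # beta cutoff: prune remaining successors
        return v
    else:
        v = INF
        for a in game.actions(state):
            v = min(v, alphabeta(game, game.result(state, a), alpha, beta))
            beta = min(beta, v)
            if beta <= alpha:       # MAX above would never allow this branch
                break               # alpha cutoff: prune remaining successors
        return v
```

It returns the same value as plain minimax (3 on this tree) while skipping leaves that cannot change the decision, e.g. the last two leaves of node C.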
Alpha-Beta Example
Do depth-first search until the first leaf
Alpha-Beta Example (continued)
Example 1: Alpha-beta pruning
Example 2: Alpha-beta pruning
An example of Alpha-beta pruning
A Few Notes on Alpha-Beta
• Effectiveness depends on order of successors (middle vs. last node of
2-ply example)
• If we can evaluate the best successor first, search is O(b^(d/2)) instead of O(b^d)
• This means that in the same amount of time, alpha-beta search can
search twice as deep!
• Pruning does not affect the final result
• Good move ordering improves effectiveness of pruning
• With “perfect ordering”, time complexity is O(b^(m/2))
• doubles the depth of search
• can easily reach depth of 8 and play good chess (branching factor of 6 instead of 35)
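A small experiment (on an invented 2-ply tree) makes the ordering effect concrete: we count leaf evaluations for two root orderings of the same three MIN nodes. All names and tree values are hypothetical.

```python
def ab_count(node, alpha, beta, maximizing, counter):
    """Alpha-beta over a nested-list tree; counter[0] tallies leaf evaluations."""
    if isinstance(node, int):       # leaf: evaluate it
        counter[0] += 1
        return node
    v = float("-inf") if maximizing else float("inf")
    for child in node:
        cv = ab_count(child, alpha, beta, not maximizing, counter)
        if maximizing:
            v = max(v, cv)
            alpha = max(alpha, v)
        else:
            v = min(v, cv)
            beta = min(beta, v)
        if alpha >= beta:           # cutoff: skip remaining children
            break
    return v

# Same three MIN nodes, two root orderings:
good = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # best MIN node (value 3) first
bad  = [[2, 4, 6], [14, 5, 2], [3, 12, 8]]   # best MIN node last

c_good, c_bad = [0], [0]
v1 = ab_count(good, float("-inf"), float("inf"), True, c_good)
v2 = ab_count(bad,  float("-inf"), float("inf"), True, c_bad)
print(v1, c_good[0])   # same value either way, fewer leaves with good ordering
print(v2, c_bad[0])
```

Both orderings return the same minimax value, but the good ordering evaluates 7 of the 9 leaves while the bad one evaluates all 9; on deep trees this gap compounds to the O(b^(m/2)) versus O(b^m) difference above.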
The End ☺