Informed Search Sudoku

Making Sudoku commit Seppuku
Eric McCann Department of Computer Science University of Massachusetts Lowell 1 University Drive, Lowell, MA 01854 emccann@cs.uml.edu 978-934-3385
ABSTRACT
In this paper, an application designed to solve Sudoku using a fast and reasonably effective A* implementation that ends up at the goal state by taking the same series of steps a human would likely take to solve the same puzzle. The heuristic and path cost functions are an extremely important part of the overall functioning of the search; therefore, their development will be described. Due to the astronomically sized state space, and variable branching factor, the implementation of the fringe will also be described in relative depth.
Author Keywords
Seppuku emulates, making it far more than a grid coverage brute-force algorithm. When Making Sudoku Commit Seppuku finishes a puzzle, that final state can be traced all the way back to the starting state, and the actions taken to get from the start state to the end state, are the same ones that a human player would probably take. A very important aspect of this projects functioning properly was correctly defining the path cost from a state to its successor, and defining the heuristic function, used in conjunction with that path cost function, to place the successor state in the correct slot of the priority queue.
Sudoku, Artificial Intelligence, Heuristic Search, Informed search, Search algorithms

INTRODUCTION
Sudoku, the popular 9x9 number puzzle, often found near the crossword puzzle in a newspaper, comes in a wide variety of difficulties, where higher difficulties begin with less numbers filled in, and the logical process that can be used to solve simpler Sudoku puzzles is not usable in increasingly numerous phases throughout the problem solving. In addition to its popularity, the enormous state space and varying branching factor make this problem a very interesting problem to solve using informed search.
PROJECT DESCRIPTION
g(s) = #Possible_Numbers_In_All_Empty_Boxes(s.Parent) - #Possible_Numbers_In_All_Empty_Boxes(s) 1. Path Cost Function
h(s) = #Empty_Boxes(s) 2. The Heuristic
Writing a program to solve Sudoku is borderline trivial, in that a Google search can yield a wide array of brute force algorithms to solve it, but casting a number puzzle like Sudoku into an A* search, and successfully solving many different Sudoku puzzles using a step-by-step human-like process is nothing less than pretty sweet. The relationship between Sudoku solutions and solving a Sudoku is a fallacy, in that a Sudoku solution is merely a number grid, covered in accordance with the rules of Sudoku as far as mutual exclusion in its various dimensions, and, of course, a solution has to have all of the original puzzles numbers in the same locations. When a person solves a Sudoku, they do not simply generate the solution, as a brute force algorithm does. They use a systematic process to fill in the grid, reevaluating what values could possibly be placed in each cell after each action. It is this process that Making Sudoku Commit
Efficiently calculating the list of possible values was a significant step in both, enumerating a states successors, and in computing its estimated distance from the state being expanded to the goal. Optimizing these calculations was also an extremely significant step in both, maximizing the rate states can be expanded, and minimizing the memory foot print of the program, especially in cases like, an empty puzzle, where expanding the start state will yield 93 successor states, each of them of equal quality as far as the A* search was concerned. In order to optimize the programs speed, a method of memorization is used, so that a states possible values lists is computed only when the list is requested for any one of the cells in its grid. After that initial computation, the lists are stored for future requests. This significantly improves the speed over an implementation where they are recomputed for both the state, and its parent, every time they are referred to for either the purposes of enumerating successors, or calculating the states path cost from its parent state.
Given the definition of the heuristic, definition of the path cost function, and the huge branching factor of Sudoku puzzles, there are many ties in the fringes priority queue. To break these ties, I experimented with FIFO, LIFO, and random dequeueing, none of which was clearly better than any of the others as far as overall speed, so I simply settled on FIFO to make the runtime between attempts on the same puzzle dependent on other factors in the program, rather than dependent on the random number generation. One of the first hurdles that came up, when the program was still slow and lacked much of the visualization was the problem of ensuring progress. Any dead-end state, where all successors would indubitably fail the Sudoku validity test is identified based on there being an empty cell with no possible values. Such states are not added to the Fringe when their parent state is being expanded. A fairly time consuming aspect of this project, was optimizing many aspects of the search and all of the underlying iterative processes in an effort to maximize its speed. As the number of empty slots decreases, the puzzle's branching factor decreases drastically. On an empty puzzle (worst case for this program), there's so much work being done that only 5 states are expanded per second, resulting in 44,000 states remaining in the queue after 60 seconds. After the puzzle is approximately 3/4 filled in, the program can fly through states at 150 states/second. This is nontrivial, because a state expansion entails the following steps: 1. Instantiating the successor state copying all of the relevant list items into the umpteen different data structures in the successor. Applying the "action" in all of those data structures. Computing all of the possible values for every empty cell in the successor's grid (NOR test on 3 9-element lists for each one of the 81 cells) Estimating that new state's value in terms of the cost to get to it from the current state, and its distance to the goal. Putting it on the priority queue.
The user interface is a very important part of this project. The visualizations it displays not only show its progress in filling the cells, but also clearly show the possible numbers for each cell, which is used to describe a states path cost from its predecessor. Another helpful visualization the interface presents is a histogram of the number of ties in the A* priority queue.
3. Sudoku Grid Visualization. The numbers the puzzle started with are shown in red, the numbers filled in by the search are shown in black, and the possible values for a given cell are shown as the small green squares
2. 3.
4.
5.
The single change that led to the greatest difference in the performance of the program was optimizing the visited list. Originally, it was merely a dynamic-length array. During expansion, every successor for a state was tested against every element in that list to see if it had been visited before. I replaced that list with a seven dimensional list of Booleans, indexed by the seven parameters: States value, ActionFromParents Row, Column, and Number, and the parent.ActionFromParents Row, Column, and Number.
4.
The A* Tie Histogram. The black line shows the lowest key list in the priority queue, and the small yellow bar is the length of that list. When a state is expanded that puts it at a lower estimated distance from the goal, the black line moves to it. When a given values list is emptied, the bar moves to the left to start expanding the next-cheapest list. The Checkered line on the right corresponds to finding a solution, so much like slot cars at a carnival, when the black line hits the finish line, the search is complete.
ANALYSIS OF RESULTS
It may not quickly spit out a solution, but when/if it finds a solution, it's more than a "valid 9x9 number grid", it's a solution arrived at after following a series of steps, those being the ordered steps a human would probably take to arrive at the solution. It can finish simple puzzles quickly; occasionally doing so in less than a second, but it varies greatly depending on the puzzle. Harder puzzles that it has actually finished, took 510 minutes. On extremely difficult projects, the program may never finish because of memory constraints, even on a system with 8 gigabytes of memory, but that can be fixed in the future.
DISCUSSION
CONCLUSION
Sudoku can be solved using a heuristic search, where each puzzle snapshot is a state, and every number entered into a specific cell is an action. It may not be as fast brute-force, but it emulates a human process for solving Sudoku, which is not an accomplishment brute force algorithms can shake a stick at. Whenever I open the project in Visual Studio, I end up tweaking it for about 4 hours, trying to improve it slightly, which is indicative of the fact there is plenty of room for future work in the development of this project.
ACKNOWLEDGMENTS
I had originally begun approaching Sudoku using an A* strategy in the hopes of simply solving it faster than bruteforce algorithms, which, given a very hard Sudoku, this algorithm might struggle with until the computers resources have been exhausted. However, an easy Sudoku can be solved by this algorithm in less than a second, and that solutions significance is derived not solely from its speed, but from the fact that it solves the problem more humanly than any brute-force algorithm could. After the hundreds of hours of development this project entailed, the results value is two-fold. It is both an extremely fast and efficient A* implementation, and an attention-grabbing, intuitive visualization of, not only, an A* search, but an A* search of a problem that is not easily cast into a problem that lends itself easily to A* search.
The work described in this paper was conducted as part of a fall 2010 Artificial Intelligence course, taught in the Computer Science department of the University of Massachusetts Lowell by Prof. Fred Martin. This project could not have been accomplished without the help of Monster, energy drink. If somebody knows a Public Relations representative at Monster, and wants to try to get me hooked up with a sponsorship, it would be very much appreciated.
REFERENCES
Puzzles for testing were manually entered from puzzles generated on http://www.websudoku.com/

Informed Search Sudoku

Uploaded by

Copyright:

Available Formats

Informed Search Sudoku

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Informed Search Sudoku

Uploaded by

Copyright:

Available Formats

Making Sudoku commit Seppuku

Sudoku, Artificial Intelligence, Heuristic Search, Informed search, Search algorithms

g(s) = #Possible_Numbers_In_All_Empty_Boxes(s.Parent) - #Possible_Numbers_In_All_Empty_Boxes(s) 1. Path Cost Function

h(s) = #Empty_Boxes(s) 2. The Heuristic

You might also like