This is a follow up post to SudokuH – Fast Sudoku Solver in Haskell.
Let’s see how SudokuH is made. I will try to keep it as non-technical as possible, but it may still be a long read.
The first surprise! You open up the source code, and what do you see? Pure awesomeness. This has little to do with my pro skills (that’s short for programming skills, if you were wondering), and more to do with Haskell. Let me tell you, I love Haskell. My feelings towards it deserve a separate post, so I will just say that Functional Programming is powerful and elegant. It does take some time to get used to, but after that, writing in imperative languages (like C or Java) feels… obsolete. If I had to implement the algorithms in SudokuSolver.hs in, say, Java, it might well have become 500+ lines of code (compared to around 150 right now, not counting comments). So… if you don’t know Haskell, why not give it a try?
Now, I will not dwell on the implementation any more than that. Instead, let me introduce you to the way SudokuH does its magic. As I said, it relies on a nicely optimized CSP solver to do the hard work. CSPs (Constraint Satisfaction Problems) are problems that can be defined as a set of variables together with constraints on them. In the case of Sudoku, the variables are the cells: each cell can hold a value from 1 to 9, and all the cells in each row, column, and 3×3 box must be different. Imagine the solver as a black box, for now. The idea is that you feed the set of variables and constraints into the solver, it creaks, and spits out an assignment of your variables which satisfies the constraints. Or it fails miserably, if (and only if) they can’t be satisfied.
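To make the CSP view of Sudoku concrete, here is a tiny Haskell sketch of a board as variables with candidate-value domains, plus the "must differ" relation between cells. The names here (`Var`, `Domain`, `constrained`, `emptyBoard`) are my own illustration, not the actual SudokuH types:

```haskell
import qualified Data.Map as Map

-- A cell is addressed by (row, column); its domain is its candidate values.
type Var    = (Int, Int)
type Domain = [Int]

-- Two distinct cells constrain each other when they share a row, a column,
-- or a 3x3 box: they must end up with different values.
constrained :: Var -> Var -> Bool
constrained a@(r1, c1) b@(r2, c2) =
  a /= b && (r1 == r2 || c1 == c2 || sameBox)
  where sameBox = (r1 `div` 3, c1 `div` 3) == (r2 `div` 3, c2 `div` 3)

-- The initial problem: every empty cell may hold any digit from 1 to 9.
emptyBoard :: Map.Map Var Domain
emptyBoard = Map.fromList [ ((r, c), [1 .. 9]) | r <- [0 .. 8], c <- [0 .. 8] ]
```

Feeding a puzzle into a solver then amounts to shrinking some cells' domains to a single given digit and asking for an assignment in which no two `constrained` cells agree.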
Complete state-space search
So far so good. Now, what happens in the black box? The general procedure my algorithm follows is a simple recursion over all the variables and their values. At each step, search for a value of the given variable that satisfies all the constraints. If such a value is found, assign it to this variable and go to the next. If not, go back to the previously assigned variable and choose a different value for it. Do this until either all the variables are assigned a value, in which case you have found a valid assignment for the problem, or you run out of values for some variable, in which case a valid assignment does not exist. This algorithm is also known as complete state-space search, and is no faster than the average snail. The reason? Sudoku has 81 squares (variables) and 9 possible values for each. This means that in the worst case the recursion would run for around 9^81 ≈ 1.97 × 10^77 steps. And each step checks that all the constraints are satisfied. Nowadays, that’s just too slow.
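The recursion above fits in a few lines of Haskell. This is a toy sketch of backtracking search in general, not SudokuH's actual solver; the names `solve` and `ok` are mine:

```haskell
-- Recurse over the variables, trying each candidate value in turn and
-- backtracking when no value is consistent with the assignment so far.
solve :: [(var, [v])]                       -- variables with candidate values
      -> (var -> v -> [(var, v)] -> Bool)   -- is this assignment consistent?
      -> [(var, v)]                         -- partial assignment so far
      -> Maybe [(var, v)]
solve [] _ assigned = Just assigned         -- every variable assigned: done
solve ((x, vals) : rest) ok assigned =
  case [ sol | v <- vals
             , ok x v assigned              -- value passes the constraints
             , Just sol <- [solve rest ok ((x, v) : assigned)] ] of
    (sol : _) -> Just sol                   -- first consistent branch wins
    []        -> Nothing                    -- no value fits: backtrack
```

For example, `solve [("a",[1,2]),("b",[1,2])] (\_ v as -> v `notElem` map snd as) []` assigns two variables so that they differ. With 81 variables and 9 values each, though, this blind enumeration is exactly the 9^81-step worst case described above.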
MAC and MRV
Turns out, it’s easy to improve the performance significantly by using two heuristics – MAC and MRV. And they are nothing complex, either. Let me explain:
- MAC (Maintaining Arc Consistency) is an improved version of AC-3 (Arc Consistency Algorithm #3). What it does, in a nutshell, is keep track of the valid values for each variable which are guaranteed to satisfy all the constraints. In the case of our Sudoku problem, each cell starts with a list containing the numbers 1 to 9. As soon as we assign a value to one of the squares, say seven, the lists are updated in the following way – we remove seven from the lists of all the variables in the same row, column, and box as the assigned square (because they can’t become seven anyway, remember the constraints). Now, every time we assign a variable, we just pick a value from its list of valid values and remove it from the lists of the variables in the same row, column, and box. Easy peasy.
- MRV (Minimum Remaining Values) optimizes the order in which we process the variables. We follow the strategy that we always pick the most constrained variable, that is, the one which has the fewest possible choices for values. If you think about it, we do the same thing when we solve Sudoku by hand!
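Both heuristics are a few lines each in Haskell. The helpers below (`pickMRV`, `prune`, and the `neighbours` predicate) are hypothetical illustrations of the two ideas, not the actual SudokuH code:

```haskell
import Data.List (minimumBy)
import Data.Ord  (comparing)
import qualified Data.Map as Map

-- MRV: pick the unassigned variable with the fewest remaining candidates.
pickMRV :: Map.Map k [v] -> (k, [v])
pickMRV = minimumBy (comparing (length . snd)) . Map.toList

-- Pruning in the spirit of the MAC step described above: once a variable x
-- is assigned the value v, delete v from the candidate lists of every
-- neighbouring variable (same row, column, or box, as `neighbours` decides).
prune :: Eq v => (k -> k -> Bool) -> k -> v -> Map.Map k [v] -> Map.Map k [v]
prune neighbours x v =
  Map.mapWithKey (\y vals -> if neighbours x y then filter (/= v) vals else vals)
```

Plugging these into the backtracking recursion means each step considers far fewer values, and hopeless branches are cut off as soon as some candidate list runs empty.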
As you might have realized, these heuristics make the naive algorithm more intelligent, so in the end it solves Sudoku very much like humans do. It is difficult to quantify the improvement theoretically, but the difference is quite easy to see in practice. If you were to try the naive algorithm on a sample problem, you’d have to wait for minutes to get an answer. The improved version takes no more than 3.5 seconds, usually less. Win! ^^
If you have any questions, I’ll be happy to elaborate! Just drop me a line below.