The Challenge
Since I came across the first sudoku I was wondering if it could by solved by SQL using queries that select just the right value for a cell.
Well, some day I tried, here is what I have got now.
Previous Part
Rules
Single Candidate
CREATEPROCEDURE ApplyRuleSingleCandidate (@rows SMALLINT OUTPUT) ASBEGIN-- if there is only one candidate for a cell, the cell has this valueINSERT INTO Cell (CellID, [Rule], Value)SELECT t.CellID, t.[Rule], t.ValueFROM (SELECT c.CellID, 'Single Candidate' AS[Rule], MAX(c.Value) AS ValueFROMCandidate cGROUP BY c.CellIDHAVING COUNT(*)=1) tSET @rows=@@ROWCOUNTIF EXISTS(SELECT 2 FROM Violation)THROW 50001, 'Invalid value in cells', 1END;
The main point is COUNT(*)=1. There is only one candidate for a cell. As aggregation I choose MAX, as there is only one row, MIN would do the same. For traceability I use the string 'Single Candidate' as the rule. After inserting I check if any violation exists.
How can violations happen?
Single Token In Group
CREATEPROCEDURE ApplyRuleSingleTokenInGroup (@rows SMALLINT OUTPUT) ASBEGIN-- if, for a given value, there is only one cell in a row, column or square as a candidate for this value, it has to be thereINSERT INTO Cell (CellID, [Rule], Value)SELECT CellID, [Rule], ValueFROM (SELECT MAX(c.CellID) AS CellID, 'Single Token in ' + CASE WHEN l.GroupType='C' THEN 'Column' WHEN l.GroupType='R' then 'Row' ELSE 'Square' END AS [Rule], c.Value, ROW_NUMBER() OVER(PARTITION BY MAX(c.CellID) ORDER BY l.GroupType) AS RowNumberFROMCandidate cINNER JOIN Location l ON c.CellID=l.CellIDGROUP BY l.GroupType, l.GroupValue, c.ValueHAVING COUNT(*)=1) tWHERERowNumber=1 -- we need this to avoid duplicate inserts due to more than one rule that appliesSET @rows=@@ROWCOUNTIF EXISTS(SELECT 2 FROM Violation)THROW 50001, 'Invalid value in cells', 1END;
Again, the main point ist COUNT(*)=1. But grouping is now by value and unit. Unit means row, column or square. The encapsulation in a subquery and restriction to RowNumber=1 is only to avoid duplicate key violation if a value unique in more then one unit, e.g. a row and a column.
Eliminating candidates
Complete Match
CREATEPROCEDURE RemoveCandidateCompleteMatch (@rows SMALLINT OUTPUT) ASBEGINDELETE cFROMCandidate cINNER JOIN CandidateMask m ONc.CellID=m.CellIDINNER JOIN Location l on m.CellID=l.CellIDINNER JOIN (SELECT l1.GroupType, l1.GroupValue, s.SetValue, s.SizeFROM SetAll sINNER JOIN CandidateMask c1 ONs.SetValue = c1.MaskINNER JOIN Location l1 ONc1.CellID=l1.CellIDWHERE s.Size NOT IN (1, 9)GROUP BY l1.GroupType, l1.GroupValue, s.SetValue, s.SizeHAVING COUNT(*) = s.Size) h ON h.GroupType=l.GroupType AND h.GroupValue=l.GroupValueWHERE m.Mask != h.SetValueAND c.Mask & h.SetValue != 0SET @rows=@@ROWCOUNTEND;
If we find a set of candidate values, lets say a set with two members, in a unit exactly two times, the values of this set cant be anywhere else in the unit.
Example: candidate set 1,3 in row 4 two times => 1 and 3 can be removed as candidates from all other cells in row 4.
Partial Match
CREATEPROCEDURE RemoveCandidatePartialSet (@rows SMALLINT OUTPUT) ASBEGINDELETE cFROMCandidate cINNER JOIN CandidateMask m onc.CellID=m.CellIDINNER JOIN Location l on m.CellID=l.CellIDINNER JOIN (SELECT l1.GroupType, l1.GroupValue, s.SetValue, s.SizeFROM SetAll sINNER JOIN CandidateMask c1 ONs.SetValue &c1.Mask = c1.Mask AND ~s.SetValue & c1.Mask = 0INNER JOIN Location l1 ONc1.CellID=l1.CellIDWHERE s.Size NOT IN (1, 9)GROUP BY l1.GroupType, l1.GroupValue, s.SetValue, s.SizeHAVING COUNT(*) = s.Size) h ON h.GroupType=l.GroupType AND h.GroupValue=l.GroupValueWHERE ~h.SetValue & m.Mask != 0 AND c.Mask & h.SetValue != 0SET @rows=@@ROWCOUNTEND;
If a part of a candidate set, lets say with 3 members, are the only candidate members for 3 cells in a unit, we can remove the members of the set from all other candidate values for the unit.
Example: 1,3 is in cell 11, 1,4 is in cell 13 and 3,4 is in cell 22. Our set is 1,3,4, our square is 1, we can remove the values 1,3 and 4 from all candidates from square 1.
Same Row Or Column And Square
CREATEPROCEDURERemoveCandidateValueInSameRowOrColumnAndSquare (@rows SMALLINT OUTPUT) ASBEGINDELETE FROM cFROMCandidate cINNER JOIN Location l ON c.CellID=l.CellIDINNER JOINLocationWide s ON c.CellID=s.CellIDINNER JOIN (SELECT c.Value, ls.[Square], lrc.GroupType, MAX(lrc.GroupValue) as GroupValueFROMCandidate cINNER JOIN LocationWide ls ONc.CellID=ls.CellIDINNER JOIN Location lrc ONc.CellID=lrc.CellID AND lrc.GroupType IN ('C','R')GROUP BY c.Value, ls.[Square], lrc.GroupTypeHAVING COUNT(DISTINCT ls.[Square])=1 AND COUNT(DISTINCT lrc.GroupValue)=1) a ON a.Value=c.Value AND l.GroupType=a.GroupType AND l.GroupValue=a.GroupValueWHERE s.[Square] != a.[Square]SET @rows=@@ROWCOUNTEND;
If a candidate exists within a square multiple times, but only in one row or column, the candidate value can be removed for cells in this row but in another square.
Example: candidate value 3 exists in cell 12 and in cell 13 but nowhere else in square 1. candidate value 3 can be removed for cells 14, 15, 16, 17, 18, 19.
Executing Rules
DECLARE@cells INTDECLARE@affected INTSET @affected=0SET @cells=1WHILE @cells>0BEGINWHILE@cells>0BEGINEXECApplyRuleSingleCandidate @cells OUTPUTprint SPACE(@@nestlevel) + 'Iteration ' + CAST(@iteration AS VARCHAR(5)) + ' ApplyRuleSingleCandidate solves ' + CAST(@cells AS VARCHAR(5)) + ' cells.'EXECApplyRuleSingleTokenInGroup @affected OUTPUTprint SPACE(@@nestlevel) + 'Iteration ' + CAST(@iteration AS VARCHAR(5)) + ' ApplyRuleSingleTokenInGroup solves ' + CAST(@affected AS VARCHAR(5)) + ' cells.'SET@cells=@cells+@affectedSET@iteration=@iteration+1END-- execution of rules that give values directly ends here-- now we apply rules that remove candidatesEXECRemoveCandidateCompleteMatch @cells OUTPUTPRINT SPACE(@@nestlevel) + 'Removed ' + CAST(@cells AS VARCHAR(10)) + ' candidates due to complete match in same group'EXECRemoveCandidatePartialSet @affected OUTPUTPRINT SPACE(@@nestlevel) + 'Removed ' + CAST(@affected AS VARCHAR(10)) + ' candidates due to exclusive partial sets in same group'SET @cells=@cells+@affectedEXECRemoveCandidateValueInSameRowOrColumnAndSquare @affected OUTPUTPRINT SPACE(@@nestlevel) + 'Removed ' + CAST(@affected AS VARCHAR(10)) + ' candidates due to candidate values in a square in only one row or column'SET @cells=@cells+@affectedEND
So that is all we can do right now. If you know more rules, let me know.
Guessing
- Try solving by rules
- If that does not work out, make up a list of choices. Take the cell with most candidates, fill in the first choice. Try step 1. If it does not solve the sudoku, try the next choice.
You see, it is going to be recursive. Here is the code:
DECLARE@guess TABLE (CellID SMALLINT, Value SMALLINT)INSERT INTO @guess (CellID, Value)SELECT c.CellID, c.ValueFROMCandidate cWHERE c.CellID = (SELECT TOP 1 cm.CellIDFROMCandidateMask cmINNERJOIN SetAll s ONs.SetValue=cm.MaskORDERBY s.Size DESC, cm.CellID )ORDER BY c.ValueDECLARE@cellID SMALLINT,@value SMALLINT,@count SMALLINTWHILE (SELECT COUNT(*) FROM Cell)<81BEGINSET @count=(SELECT COUNT(*) FROM @guess)IF (@count=0)THROW50002, 'Error in guessing, all candidates used but not solved.',1IF NOT EXISTS(SELECT 2 FROMCandidate)THROW50002, 'Error in guessing, no candidates.', 1SELECT TOP 1 @cellID=g.CellID, @value=g.ValueFROM @guess gORDER BY g.CellID, g.ValueDELETE FROM @guessWHERECellID=@cellID ANDValue=@valuePRINT SPACE(@@nestlevel) + 'nestlevel ' + CAST(@@NESTLEVEL as VARCHAR(10)) +': guessing value ' + CAST(@value as CHAR(1)) + ' in cell ' + CAST(@cellID as VARCHAR(5))BEGIN TRYSAVE TRANSACTION GuessingINSERT INTO Cell(CellID, Value, [Rule])VALUES (@cellID, @value, 'Guess')EXECTrySolve @iterationEND TRYBEGIN CATCHPRINT SPACE(@@nestlevel) + 'rolling back transaction due to ' + ERROR_MESSAGE()ROLLBACK TRANSACTION GuessingIF (ERROR_NUMBER() NOT IN (50001, 50002))THROWEND CATCHENDEND;
This code together with the code above make up the procedure TrySolve, with is called recursively at EXEC TrySolve @iteration.
Next Part
In the next part I will introduce some convenience features for entering start data and executing the solver.