Sudoku by SQL Part 1 of 3

The Challenge

Since I came across the first sudoku I was wondering if it could by solved by SQL using queries that select just the right value for a cell.

Well, some day I tried, here is what I have got now.

Domain Tables

I decided to set up a few tables to represent information about the game without clearly knowing in advance how I am going to use them in queries. But I was quite sure that I was going to need this data somehow.

Values

CREATETABLE Token(
Character CHAR(1) NOT NULL,
Value SMALLINTNULL,
Mask INT NULL,
CONSTRAINTPK_Token PRIMARY KEYCLUSTERED (
    Character
)
);
INSERTINTO Token(Character, Value, Mask)
VALUES
('.', NULL,NULL)
,('1', 1,   1)
,('2', 2,   2)
,('3', 3,   4)
,('4', 4,   8)
,('5', 5, 16)
,('6', 6, 32)
,('7', 7, 64)
,('8', 8, 128)
,('9', 9, 256)
GO
CREATEVIEW Value AS
SELECT Value, Mask
FROM Token
WHERE Value IS NOT NULL;

The period I intended to use for representing an empty cell. The column Mask was introduced later, it turned out to be helpful.

Location

CREATETABLE Location(
CellID SMALLINTNOT NULL,
GroupType CHAR(1) NOT NULL, -- 'C', 'R', 'S' meaning column, row and square
GroupValue SMALLINTNOT NULL,
CONSTRAINTPK_Location PRIMARY KEYCLUSTERED (
    CellID,
    GroupType
)
);
INSERTINTO Location (CellID, GroupType, GroupValue)
SELECTr.Value*10 + c.Value AS CellID
, gt.GroupType
, CASE WHEN gt.GroupType='C' THEN c.Value WHEN gt.GroupType='R' THEN r.Value ELSE (c.Value-1)/3 + ((r.Value-1)/3)*3 + 1 END AS GroupValue
FROM (VALUES ('C'), ('R'), ('S') )gt(GroupType)
crossjoin Value r
crossjoin Value c
CREATETABLE LocationWide(
CellID SMALLINTNOT NULL,
[Column] SMALLINTNOT NULL,
[Row] SMALLINTNOT NULL,
[Square] SMALLINTNOT NULL,
CONSTRAINTPK_LocationWide PRIMARY KEY CLUSTERED (
    CellID
)
);
INSERTINTO LocationWide (CellID, [Column], [Row], [Square])
SELECTp.CellID, p.C AS [Column], p.R AS [Row], p.S AS [Square]
FROM (
SELECT CellID, GroupType,GroupValue FROM Location
) l
PIVOT (MAX(GroupValue) FOR GroupType IN ([C], [R], )
) p
;

I was going to need information about the location of cells, for grouping to rows, colums and squares. Because I was not sure if it is better to model the location "deep" or "wide", I made both. This data is static, so there is no problem with duplication.
First 20 rows of LocationWide:

CellID	Column	Row	Square
11	1	1	1
12	2	1	1
13	3	1	1
14	4	1	2
15	5	1	2
16	6	1	2
17	7	1	3
18	8	1	3
19	9	1	3
21	1	2	1
22	2	2	1
23	3	2	1
24	4	2	2
25	5	2	2
26	6	2	2
27	7	2	3
28	8	2	3
29	9	2	3
31	1	3	1
32	2	3	1

Cell

CREATETABLE Cell(
CellID SMALLINTNOT NULL,
Value SMALLINTNOT NULL,
Step INT IDENTITY(1,1) NOT NULL,
[Rule] nvarchar(100) NULL,
CONSTRAINTPK_Cell PRIMARY KEYCLUSTERED (
CellID
)
);

The columns Step and Rule are for traceability, I want to know why and when a cell was fillled.

Candidate

CREATETABLE Candidate(
CellID SMALLINTNOT NULL,
Value SMALLINTNOT NULL,
Mask INT NOT NULL,
CONSTRAINTPK_Candidate PRIMARY KEYCLUSTERED (
    CellID,
    Value
)
);
CREATETABLE CandidateMask (
CellID SMALLINTNOT NULL,
Mask INT NOT NULL,
CONSTRAINTPK_CandidateMask PRIMARY KEY (
    CellID
)
);

The table Candidate holds the values for a cell that are considered possible at the moment. The idea is to remove rows from the table candidate by rules. When there is only one candidate left its value is assigned to the cell, which removes further candidates for other cells.
The table CandidateMask was introduced later as ist turned out to be convenient to operate not only on certain values, but on sets of values. That leads us to sets.

Sets

CREATETABLE Set1(
SetValue SMALLINTNOT NULL,
Value1 SMALLINTNOT NULL,
DisplayValue VARCHAR(10) NOT NULL,
CONSTRAINTPK_Set1 PRIMARY KEYCLUSTERED (
SetValue
)
);
INSERTINTO Set1 (SetValue, Value1, DisplayValue)
SELECTv1.Mask, v1.Value, STUFF('.........', v1.Value, 1, CAST(v1.Value AS CHAR(1)))
FROMValue v1
;
GO
CREATETABLE Set2(
SetValue SMALLINTNOT NULL,
Value1 SMALLINTNOT NULL,
Value2 SMALLINTNOT NULL,
DisplayValue VARCHAR(10) NOT NULL,
CONSTRAINTPK_Set2 PRIMARY KEYCLUSTERED (
SetValue
)
);
INSERTINTO Set2 (SetValue, Value1, Value2,DisplayValue)
SELECTs.SetValue+v.Mask, s.Value1, v.Value, STUFF(s.DisplayValue, v.Value, 1, CAST(v.Value AS CHAR(1)))
FROMSet1 s
INNERJOIN Value v ONv.Value>s.Value1
;

The SetN tables hold the combinations of N values. Here are listed only Set1 and Set2, the download links in part 3 give you the complete source. What we have now is, showing the first 20 rows,:

SELECT* FROM SetAll ORDER BY Size, SetValue

SetValue	Size	DisplayValue
1	1	1........
2	1	.2.......
4	1	..3......
8	1	...4.....
16	1	....5....
32	1	.....6...
64	1	......7..
128	1	.......8.
256	1	........9
3	2	12.......
5	2	1.3......
6	2	.23......
9	2	1..4.....
10	2	.2.4.....
12	2	..34.....
17	2	1...5....
18	2	.2..5....
20	2	..3.5....

Domain Logic

Validity

CREATEVIEW Violation AS
SELECT l.GroupType, l.GroupValue, c.Value, COUNT(*) AS Multiplicity
FROM Cell c
INNER JOIN Location l ON c.CellID = l.CellID
GROUP BY l.GroupType, l.GroupValue, c.Value
HAVING COUNT(*) > 1
;

If we get 0 rows from this view, the cell values are valid.

Consistency

CREATETRIGGER TrimCandidates ONCell AFTER INSERTAS
BEGIN
DELETE FROM Candidate WHERECellID IN (SELECT CellID FROM inserted)
DELETE FROM c
FROM inserted i
INNER JOIN Location l1 ONl1.CellID=i.CellID
INNER JOIN Location l2 ONl1.GroupType=l2.GroupType AND l1.GroupValue=l2.GroupValue
INNER JOIN Candidate c ONl2.CellID=c.CellID AND c.Value=i.Value
END
;
GO
CREATETRIGGER TrimCandidateMask ON Candidate AFTER DELETE AS
BEGIN
UPDATE cm
SET cm.Mask=cm.Mask & (~d.Mask)
FROMCandidateMask cm
INNER JOIN (
    SELECT d.CellID, SUM(d.Mask) AS Mask
    FROMdeleted d
    GROUP BY d.CellID) d ON d.CellID=cm.CellID

DELETE FROM CandidateMask
WHERE Mask=0
END
;

If rows are added to table Cell, candidates have to be removed. When removing candidates, table CandidateMask has to be updated.

Initialisation

CREATEPROCEDURE FillCandidates AS
BEGIN
INSERT INTO Candidate (CellID, Value, Mask)
SELECT l.CellID, v.Value, v.Mask
FROM Location l
CROSS JOIN Value v
WHERE l.GroupType='C'
AND NOT EXISTS (SELECT 2 FROM Cell c WHERE c.CellID=l.CellID) -- any CellID that does not have a value yet
AND NOT EXISTS (                                              -- any value that is not present in same group
    SELECT 2 FROM Location lc
    INNER JOIN Cell c
      INNER JOIN Location l2 ON c.CellID=l2.CellID
    ON lc.GroupType=l2.GroupType AND lc.GroupValue=l2.GroupValue
    WHERE lc.CellID=l.CellID AND c.Value=v.Value)
INSERT INTO CandidateMask (CellID, Mask)
SELECT c.CellID, SUM(c.Mask) AS Mask
FROMCandidate c
GROUP BY c.CellID
END;

As a first step, all possible candidates are inserted. It is assumed that there are some rows in table Cell that represent the "start position".

What we have right now

Domain Tables

Next Part

In the next part I will introduce the queries and logic for solving.

The Challenge

Domain Tables

Values

Location

Cell

Candidate

Sets

Domain Logic

Validity

Consistency

Initialisation

What we have right now

Next Part

Rate

Share

Share

Rate

Sudoku by SQL Part 1 of 3

The Challenge

Domain Tables

Values

Location

Cell

Candidate

Sets

Domain Logic

Validity

Consistency

Initialisation

What we have right now

Next Part

Rate

Share

Share

Rate

Related content

Book Review: Big Red - Voyage of a Trident Submarine

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

Inserting Markup into a String with SQL

Networking - Part 4

Speaking at Community Events - More Thoughts