T-SQL puzzler - given a cell in an N-dimensional space, return the cells that are inline with the given cell along each axis

Question

Post reply

T-SQL puzzler - given a cell in an N-dimensional space, return the cells that are inline with the given cell along each axis

Laurence Neville

SSC Eights!

Points: 889
More actions
November 14, 2013 at 9:33 am

#280203

The cells in an N-dimensional space are modelled with the 2 tables below. A script is needed that takes a single cell (by CellID) and returns all the other cells that are "inline" with the given cell along each axis (including itself).
For example, suppose the space has 3 dimensions (X, Y, Z) and X has 2 positions, Y has 2 and Z has 3. If the cell with coordinates {1,1,1} is given, the result should be:
[font="Courier New"] +----------+---------+
| AxisCode | Cell |
+----------+---------+
| X | {1,1,1} | <- showing coordinates for clarity, but should be CellID
| X | {2,1,1} |
| Y | {1,1,1} |
| Y | {1,2,1} |
| Z | {1,1,1} |
| Z | {1,1,2} |
| Z | {1,1,3} |
+----------+---------+[/font]
I have spent hours on this and only come up with queries that are hard-coded for a specific number of dimensions ....
Please note:
**Changing the schema of the 2 tables is not an option!
The script has to work for N dimensions, and should not involve loops or cursors.**
Compatibility must be MS SQL 2008 R2
Any ideas gratefully received!
create table dbo.Cells(
CellID int not null,
CellValue int not null,
constraint PK_Cells primary key (CellID)
)
create table dbo.AxisPositions(
AxisCode char(1) not null,-- X, Y, Z etc
PositionOnAxis int not null,-- 1, 2, 3, 4 etc
constraint PK_AxisPositions primary key (AxisCode, PositionOnAxis)
)
create table dbo.CellAxes(
CellID int not null,
AxisCode char(1) not null,-- X, Y, Z etc
PositionOnAxis int not null,-- 1, 2, 3, 4 etc
constraint PK_CellAxes primary key (CellID, AxisCode),
constraint FK_CellAxes_Cells foreign key (CellID) references Cells(CellID),
constraint FK_CellAxes_AxisPositions foreign key (AxisCode, PositionOnAxis) references AxisPositions(AxisCode, PositionOnAxis)
)
-- Example data
insert Cells (CellID, CellValue)
values (1, 67), (2, 45), (3, 0), (4, 4), (5, 78), (6, 213), (7, 546), (8, 455), (9, 12), (10, 67), (11, 4), (12, 5)
insert AxisPositions (AxisCode, PositionOnAxis)
values ('X', 1), ('X', 2), ('Y', 1), ('Y', 2), ('Z', 1), ('Z', 2), ('Z', 3)
insert CellAxes (CellID, AxisCode, PositionOnAxis)
values (1, 'X', 1), (1, 'Y', 1), (1, 'Z', 1),
(2, 'X', 2), (2, 'Y', 1), (2, 'Z', 1),
(3, 'X', 1), (3, 'Y', 2), (3, 'Z', 1),
(4, 'X', 2), (4, 'Y', 2), (4, 'Z', 1),
(5, 'X', 1), (5, 'Y', 1), (5, 'Z', 2),
(6, 'X', 2), (6, 'Y', 1), (6, 'Z', 2),
(7, 'X', 1), (7, 'Y', 2), (7, 'Z', 2),
(8, 'X', 2), (8, 'Y', 2), (8, 'Z', 2),
(9, 'X', 1), (9, 'Y', 1), (9, 'Z', 3),
(10, 'X', 2), (10, 'Y', 1), (10, 'Z', 3),
(11, 'X', 1), (11, 'Y', 2), (11, 'Z', 3),
(12, 'X', 2), (12, 'Y', 2), (12, 'Z', 3)

Viewing 15 posts - 1 through 15 (of 24 total)

You must be logged in to reply to this topic. Login to reply

LutzM SSC Guru Points: 107049 More actions · Answer 1

I'm not sure if I fully understood the requirement (especially for N>3)...

My assumption: a cell is "inline" if all but one value match the cell coordinates in question.

"All but one" is used to meet the criteria "N-dimensional" .

Your expected output shows 7 rows, but the cell (1,1,1) is displayed three times, so in fact there are only 5 matching cells.

Here's what I came up with:

WITH sample_cell AS

(

SELECT 'X' AS AxisCode, 1 AS Pos UNION ALL

SELECT 'Y' AS AxisCode, 1 AS Pos UNION ALL

SELECT 'Z' AS AxisCode, 1 AS Pos

)

SELECT

cellid

FROM CellAxes ca

INNER JOIN sample_cell on ca.AxisCode = sample_cell.AxisCode AND sample_cell.Pos=ca.PositionOnAxis

GROUP BY cellid

HAVING COUNT(*) >= (SELECT COUNT(*) -1 AS sub FROM sample_cell)

It's basically the SQL logic format of the above "business logic". The cte sample_cell is used to describe the given cell. Please not that there's one row per axis.

This basic solution does not include any test whether all axis are provided or if the given values are not valid as per the rows in AxisPositions. But this check could be added, if required...

Edit: if you want to remove the original cell, just change the comparison from >= to =

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Laurence Neville SSC Eights! Points: 889 More actions · Answer 2

Wow! That is pretty close. However, what I needed was this result set:

+----------+---------+

| AxisCode | CellID |

+----------+---------+

| X | 1 |

| X | 2 |

| Y | 1 |

| Y | 3 |

| Z | 1 |

| Z | 5 |

| Z | 9 |

+----------+---------+

i.e. "if you walk the X dimension you get cells 1 & 2; walk Y you get 1 & 3; walk Z you get 1, 5 & 9. I do actually need to get cell 1 in each dimension.

Laurence Neville SSC Eights! Points: 889 More actions · Answer 3

I now have the result I need. I combined your approach with someone else's help on StackOverflow and came up with this:

declare @CellID int = 1

;with SelectedCell as

(

select CellID, AxisCode, PositionOnAxis

from CellAxes

where CellID = @CellID

)

select x.AxisCode, a2.CellID

from SelectedCell a1

join CellAxes a2 on a2.AxisCode = a1.AxisCode

join CellAxes x on x.CellID = a1.CellID

where (a1.AxisCode = x.AxisCode or a1.PositionOnAxis = a2.PositionOnAxis)

group by x.AxisCode, a2.CellID

having count(*) = (select count(AxisCode) from SelectedCell)

order by x.AxisCode, a2.CellID

Thanks for your help

LutzM SSC Guru Points: 107049 More actions · Answer 4

ok, it's getting slightly more complicated:

WITH sample_cell AS

(

SELECT 'X' AS AxisCode, 1 AS Pos UNION ALL

SELECT 'Y' AS AxisCode, 1 AS Pos UNION ALL

SELECT 'Z' AS AxisCode, 1 AS Pos

), cte_almost as

(

SELECT

cellid

FROM CellAxes ca

INNER JOIN sample_cell on ca.AxisCode = sample_cell.AxisCode AND sample_cell.Pos=ca.PositionOnAxis

GROUP BY cellid

HAVING COUNT(*) = (SELECT COUNT(*) -1 AS sub FROM sample_cell)

), cte_unsortet as

(

SELECT

MAX(CASE WHEN sc.pos <> ca.PositionOnAxis THEN ca.AxisCode ELSE NULL END) as Axis,

cte_almost.cellid

FROM cte_almost

INNER JOIN CellAxes ca ON cte_almost.cellid = ca.Cellid

LEFT OUTER JOIN sample_cell sc ON ca.axisCode=sc.AxisCode

GROUP BY cte_almost.cellid

UNION ALL

SELECT AxisCode, Pos

FROM sample_cell

)

SELECT * FROM cte_unsortet ORDER BY Axis, cellid

The cellid extracted earlier now are used to join them ack to the original table as well as the source cell. Please note that I changed it from COUNT(*) >= (SELECT ... to COUNT(*) = (SELECT to eliminate the cource cell id.

Then a CROSS TAB (or pivot) is performed where the axis name is derived from the axis position that does not match the source cell together with the cellid. (Almost) Finally, the source cell is simply added to the result set using a UNION ALL query.

At the very end, the result set is sorted as per the requirement.

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

mickyT SSChampion Points: 10360 More actions · Answer 5

I'm not sure if I am understanding the requirements for this correctly, but would this work?

declare @cellid int = 1;

select *

from CellAxes c

inner join (

select cellid

from (select axiscode, positiononaxis from CellAxes where cellid = @cellid) a

inner join CellAxes b on a.axisCode = b.axisCode

group by cellid having sum(case when a.positiononaxis = b.positiononaxis then 0 else 1 end) <= 1

) il on c.cellid = il.cellid

Dwain Camps SSC Guru Points: 86978 More actions · Answer 6

Laurence Neville (11/14/2013)
I now have the result I need. I combined your approach with someone else's help on StackOverflow and came up with this:
declare @CellID int = 1
;with SelectedCell as
(
select CellID, AxisCode, PositionOnAxis
from CellAxes
where CellID = @CellID
)
select x.AxisCode, a2.CellID
from SelectedCell a1
join CellAxes a2 on a2.AxisCode = a1.AxisCode
join CellAxes x on x.CellID = a1.CellID
where (a1.AxisCode = x.AxisCode or a1.PositionOnAxis = a2.PositionOnAxis)
group by x.AxisCode, a2.CellID
having count(*) = (select count(AxisCode) from SelectedCell)
order by x.AxisCode, a2.CellID
Thanks for your help

This may be an alternate solution:

WITH RowsToAxes AS

(

SELECT CellID

,X=MAX(CASE WHEN AxisCode = 'X' THEN PositionOnAxis END)

,Y=MAX(CASE WHEN AxisCode = 'Y' THEN PositionOnAxis END)

,Z=MAX(CASE WHEN AxisCode = 'Z' THEN PositionOnAxis END)

FROM CellAxes

GROUP BY CellID

)

SELECT b.AxisCode, b.CellID

FROM

(

SELECT CellID, a.X, a.Y, a.Z, X1, Y1, Z1

,C=X1 + Y1 + Z1

FROM RowsToAxes a

CROSS APPLY

(

SELECT X1=CASE X WHEN a.X THEN 1 ELSE 0 END

,Y1=CASE Y WHEN a.Y THEN 1 ELSE 0 END

,Z1=CASE Z WHEN a.Z THEN 1 ELSE 0 END

FROM RowsToAxes

WHERE CellID = @CellID

) b

) a

CROSS APPLY

(

VALUES (CASE C WHEN 3 THEN CellID ELSE (1-X1)*CellID END, 'X')

,(CASE C WHEN 3 THEN CellID ELSE (1-Y1)*CellID END, 'Y')

,(CASE C WHEN 3 THEN CellID ELSE (1-Z1)*CellID END, 'Z')

) b (CellID, AxisCode)

WHERE C IN (2,3) AND b.CellID <> 0;

Not sure if it is 100% correct but it does return the desired results for this case. Compared to your original solution, it looks more complicated, but the execution plan gives me:

Original: Cost=41%, Clustered INDEX Scans=1, Clustered INDEX Seeks=3

MickyT: Cost=18%, Clustered INDEX Scans=1, Clustered INDEX Seeks=2

Mine: Cost=8%, Clustered INDEX Scans=1, Clustered INDEX Seeks=1

Note that Cost cannot always be relied upon, just included for reference.

IO Counts -

Original: Scan count 38, logical reads 148, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

MickyT: Scan count 6, logical reads 84, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Mine: Scan count 2, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Edit1: Added MickyT to the comparison.

Edit2: Slightly simplified my suggested solution (no impact to Execution Plan or IO stats).

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86978 More actions · Answer 7

mickyT (11/14/2013)
I'm not sure if I am understanding the requirements for this correctly, but would this work?
declare @cellid int = 1;
select *
from CellAxes c
inner join (
select cellid
from (select axiscode, positiononaxis from CellAxes where cellid = @cellid) a
inner join CellAxes b on a.axisCode = b.axisCode
group by cellid having sum(case when a.positiononaxis = b.positiononaxis then 0 else 1 end) <= 1
) il on c.cellid = il.cellid

Micky - Tried to include this in my comparison but got this:

Each GROUP BY expression must contain at least one column that is not an outer reference.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86978 More actions · Answer 8

Please ignore the error I reported for MickyT's solution. Copy/paste error.

I'm going back to my previous post to include MickyT's results.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

mickyT SSChampion Points: 10360 More actions · Answer 9

Laurence Neville (11/14/2013)
Wow! That is pretty close. However, what I needed was this result set:
+----------+---------+
| AxisCode | CellID |
+----------+---------+
| X | 1 |
| X | 2 |
| Y | 1 |
| Y | 3 |
| Z | 1 |
| Z | 5 |
| Z | 9 |
+----------+---------+
i.e. "if you walk the X dimension you get cells 1 & 2; walk Y you get 1 & 3; walk Z you get 1, 5 & 9. I do actually need to get cell 1 in each dimension.

I've been trying to get my result to look like his while allowing for extra dimensions, time to get my vpn going so I can work on it over the weekend. My solution should work over additional dimensions (as far as cellids go), but I'm having problems getting it to output in the format above.

Laurence Neville SSC Eights! Points: 889 More actions · Answer 10

mickeyT - your solution returns too many rows unfortunately.

dwain.c - your solution returns the right rows for 3 dimensions (X,Y,Z) and the query plan is by far the best, but it is hard-coded to assume those specific dimensions and I need a generic N-dimensional solution.

LutzM - your 2nd query returns the right rows for cell ID 1 {1,1,1}. However for other cell IDs it returns the wrong rows.

Thanks for everyone's contributions. For now the best solution is:

declare @CellID int = 3

;with SelectedCell as

(

select CellID, AxisCode, PositionOnAxis

from CellAxes

where CellID = @CellID

)

select x.AxisCode, a2.CellID

from SelectedCell a1

join CellAxes a2 on a2.AxisCode = a1.AxisCode

join CellAxes x on x.CellID = a1.CellID

where (a1.AxisCode = x.AxisCode or a1.PositionOnAxis = a2.PositionOnAxis)

group by x.AxisCode, a2.CellID

having count(*) = (select count(AxisCode) from SelectedCell)

order by x.AxisCode, a2.CellID

Dwain Camps SSC Guru Points: 86978 More actions · Answer 11

Laurence Neville (11/15/2013)
dwain.c - your solution returns the right rows for 3 dimensions (X,Y,Z) and the query plan is by far the best, but it is hard-coded to assume those specific dimensions and I need a generic N-dimensional solution.

I suppose next you'll tell me I can't use dynamic SQL.

DECLARE @SQL NVARCHAR(MAX)

,@SQLParms NVARCHAR(MAX) = N'@Cellid INT'

,@CellID INT = 1;

WITH Axes AS

(

SELECT DISTINCT AxisCode

FROM CellAxes

)

,CountAxes AS

(

SELECT AxesCount=COUNT(*)

FROM Axes

)

SELECT @SQL = N'WITH RowsToAxes AS

(

SELECT CellID ' + CHAR(10) +

(

SELECT (

SELECT ',' + AxisCode + '=MAX(CASE WHEN AxisCode = ''' + AxisCode + ''' THEN PositionOnAxis END)' + CHAR(10)

FROM Axes

FOR XML PATH(''))

) + CHAR(10) +

N' FROM CellAxes

GROUP BY CellID

)

SELECT b.AxisCode, b.CellID

FROM

(

SELECT CellID' + CHAR(10) +

(

SELECT

(

SELECT ',a.' + AxisCode

FROM Axes

FOR XML PATH('')

)

) + CHAR(10) +

(

SELECT

(

SELECT ',' + AxisCode + '1'

FROM Axes

FOR XML PATH('')

)

) + CHAR(10) +',C=' +

(

SELECT STUFF(

(

SELECT '+' + AxisCode + '1'

FROM Axes

FOR XML PATH('')

),1,1,'')

) + CHAR(10) +

' FROM RowsToAxes a

CROSS APPLY

(

SELECT ' + CHAR(10) +

(

SELECT STUFF(

(

SELECT ',' + AxisCode + '1=CASE ' + AxisCode + ' WHEN a.' + AxisCode + ' THEN 1 ELSE 0 END' + CHAR(10)

FROM Axes

FOR XML PATH('')

),1,1,'')

) + CHAR(10) +

' FROM RowsToAxes

WHERE CellID = @CellID

) b

) a

CROSS APPLY

(

VALUES' + CHAR(10) +

(

SELECT STUFF(

(

SELECT ',(CASE C WHEN ' + CAST((SELECT AxesCount FROM CountAxes) AS VARCHAR(5)) +

' THEN CellID ELSE (1-' + AxisCode + '1)*Cellid END, ''' + AxisCode + ''')' + CHAR(10)

FROM Axes

FOR XML PATH('')

),1,1,'')

) + CHAR(10) +

') b (CellID, AxisCode)

WHERE C IN (' + CAST((SELECT AxesCount-1 FROM CountAxes) AS VARCHAR(5)) +

',' + CAST((SELECT AxesCount FROM CountAxes) AS VARCHAR(5)) +

') AND b.CellID <> 0;';

PRINT @SQL

EXEC sp_executesql @SQL, @SQLParms, @CellID = @CellID

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

mickyT SSChampion Points: 10360 More actions · Answer 12

OK , so I think I have got there, but it has made my query real ugly

declare @cellid int = 1;

select axiscode, cellid

from (

select c.cellid,

case when c.cellid = @cellid then c.axiscode else

(select c.axiscode

from #CellAxes x

where c.axiscode = x.axiscode and x.cellid = @cellid and c.positiononaxis <> x.positiononaxis)

end axiscode

from #CellAxes c

inner join (

select cellid

from (select axiscode, positiononaxis from #CellAxes where cellid = @cellid) a

inner join #CellAxes b on a.axisCode = b.axisCode

group by cellid having sum(case when a.positiononaxis = b.positiononaxis then 0 else 1 end) <= 1

) il on c.cellid = il.cellid

) z

where axiscode is not null

order by axiscode

Laurence Neville SSC Eights! Points: 889 More actions · Answer 13

dwain.c - nope, I have nothing against dynamic SQL!

I have to award the prize to mickeyT's solution - it is the fastest of all the contributions.

Thanks everyone ... I was really banging my head with this one, and when I posted on StackOverflow most responses were "it can't be done in SQL". I guess I was asking in the wrong place!

Dwain Camps SSC Guru Points: 86978 More actions · Answer 14

Laurence Neville (11/18/2013)
dwain.c - nope, I have nothing against dynamic SQL!
I have to award the prize to mickeyT's solution - it is the fastest of all the contributions.
Thanks everyone ... I was really banging my head with this one, and when I posted on StackOverflow most responses were "it can't be done in SQL". I guess I was asking in the wrong place!

Even though you chose MickyT's solution over mine (congratulations Micky!), I am still gratified whenever I hear someone say "it can't be done in SQL" and yet here are multiple solutions showing how it can.

At least you now know where to come next time.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St