November 14, 2013 at 9:33 am
The cells in an N-dimensional space are modelled with the 2 tables below. A script is needed that takes a single cell (by CellID) and returns all the other cells that are "inline" with the given cell along each axis (including itself).
For example, suppose the space has 3 dimensions (X, Y, Z) and X has 2 positions, Y has 2 and Z has 3. If the cell with coordinates {1,1,1} is given, the result should be:
[font="Courier New"] +----------+---------+
| AxisCode | Cell |
+----------+---------+
| X | {1,1,1} | <- showing coordinates for clarity, but should be CellID
| X | {2,1,1} |
| Y | {1,1,1} |
| Y | {1,2,1} |
| Z | {1,1,1} |
| Z | {1,1,2} |
| Z | {1,1,3} |
+----------+---------+[/font]
I have spent hours on this and only come up with queries that are hard-coded for a specific number of dimensions ....
Please note:
**Changing the schema of the 2 tables is not an option!
The script has to work for N dimensions, and should not involve loops or cursors.**
Compatibility must be MS SQL 2008 R2
Any ideas gratefully received!
create table dbo.Cells(
CellID int not null,
CellValue int not null,
constraint PK_Cells primary key (CellID)
)
create table dbo.AxisPositions(
AxisCode char(1) not null,-- X, Y, Z etc
PositionOnAxis int not null,-- 1, 2, 3, 4 etc
constraint PK_AxisPositions primary key (AxisCode, PositionOnAxis)
)
create table dbo.CellAxes(
CellID int not null,
AxisCode char(1) not null,-- X, Y, Z etc
PositionOnAxis int not null,-- 1, 2, 3, 4 etc
constraint PK_CellAxes primary key (CellID, AxisCode),
constraint FK_CellAxes_Cells foreign key (CellID) references Cells(CellID),
constraint FK_CellAxes_AxisPositions foreign key (AxisCode, PositionOnAxis) references AxisPositions(AxisCode, PositionOnAxis)
)
-- Example data
insert Cells (CellID, CellValue)
values (1, 67), (2, 45), (3, 0), (4, 4), (5, 78), (6, 213), (7, 546), (8, 455), (9, 12), (10, 67), (11, 4), (12, 5)
insert AxisPositions (AxisCode, PositionOnAxis)
values ('X', 1), ('X', 2), ('Y', 1), ('Y', 2), ('Z', 1), ('Z', 2), ('Z', 3)
insert CellAxes (CellID, AxisCode, PositionOnAxis)
values (1, 'X', 1), (1, 'Y', 1), (1, 'Z', 1),
(2, 'X', 2), (2, 'Y', 1), (2, 'Z', 1),
(3, 'X', 1), (3, 'Y', 2), (3, 'Z', 1),
(4, 'X', 2), (4, 'Y', 2), (4, 'Z', 1),
(5, 'X', 1), (5, 'Y', 1), (5, 'Z', 2),
(6, 'X', 2), (6, 'Y', 1), (6, 'Z', 2),
(7, 'X', 1), (7, 'Y', 2), (7, 'Z', 2),
(8, 'X', 2), (8, 'Y', 2), (8, 'Z', 2),
(9, 'X', 1), (9, 'Y', 1), (9, 'Z', 3),
(10, 'X', 2), (10, 'Y', 1), (10, 'Z', 3),
(11, 'X', 1), (11, 'Y', 2), (11, 'Z', 3),
(12, 'X', 2), (12, 'Y', 2), (12, 'Z', 3)
November 14, 2013 at 10:19 am
I'm not sure if I fully understood the requirement (especially for N>3)...
My assumption: a cell is "inline" if all but one value match the cell coordinates in question.
"All but one" is used to meet the criteria "N-dimensional" .
Your expected output shows 7 rows, but the cell (1,1,1) is displayed three times, so in fact there are only 5 matching cells.
Here's what I came up with:
WITH sample_cell AS
(
SELECT 'X' AS AxisCode, 1 AS Pos UNION ALL
SELECT 'Y' AS AxisCode, 1 AS Pos UNION ALL
SELECT 'Z' AS AxisCode, 1 AS Pos
)
SELECT
cellid
FROM CellAxes ca
INNER JOIN sample_cell on ca.AxisCode = sample_cell.AxisCode AND sample_cell.Pos=ca.PositionOnAxis
GROUP BY cellid
HAVING COUNT(*) >= (SELECT COUNT(*) -1 AS sub FROM sample_cell)
It's basically the SQL logic format of the above "business logic". The cte sample_cell is used to describe the given cell. Please not that there's one row per axis.
This basic solution does not include any test whether all axis are provided or if the given values are not valid as per the rows in AxisPositions. But this check could be added, if required...
Edit: if you want to remove the original cell, just change the comparison from >= to =
November 14, 2013 at 12:37 pm
Wow! That is pretty close. However, what I needed was this result set:
+----------+---------+
| AxisCode | CellID |
+----------+---------+
| X | 1 |
| X | 2 |
| Y | 1 |
| Y | 3 |
| Z | 1 |
| Z | 5 |
| Z | 9 |
+----------+---------+
i.e. "if you walk the X dimension you get cells 1 & 2; walk Y you get 1 & 3; walk Z you get 1, 5 & 9. I do actually need to get cell 1 in each dimension.
November 14, 2013 at 1:07 pm
I now have the result I need. I combined your approach with someone else's help on StackOverflow and came up with this:
declare @CellID int = 1
;with SelectedCell as
(
select CellID, AxisCode, PositionOnAxis
from CellAxes
where CellID = @CellID
)
select x.AxisCode, a2.CellID
from SelectedCell a1
join CellAxes a2 on a2.AxisCode = a1.AxisCode
join CellAxes x on x.CellID = a1.CellID
where (a1.AxisCode = x.AxisCode or a1.PositionOnAxis = a2.PositionOnAxis)
group by x.AxisCode, a2.CellID
having count(*) = (select count(AxisCode) from SelectedCell)
order by x.AxisCode, a2.CellID
Thanks for your help
November 14, 2013 at 1:11 pm
ok, it's getting slightly more complicated:
WITH sample_cell AS
(
SELECT 'X' AS AxisCode, 1 AS Pos UNION ALL
SELECT 'Y' AS AxisCode, 1 AS Pos UNION ALL
SELECT 'Z' AS AxisCode, 1 AS Pos
), cte_almost as
(
SELECT
cellid
FROM CellAxes ca
INNER JOIN sample_cell on ca.AxisCode = sample_cell.AxisCode AND sample_cell.Pos=ca.PositionOnAxis
GROUP BY cellid
HAVING COUNT(*) = (SELECT COUNT(*) -1 AS sub FROM sample_cell)
), cte_unsortet as
(
SELECT
MAX(CASE WHEN sc.pos <> ca.PositionOnAxis THEN ca.AxisCode ELSE NULL END) as Axis,
cte_almost.cellid
FROM cte_almost
INNER JOIN CellAxes ca ON cte_almost.cellid = ca.Cellid
LEFT OUTER JOIN sample_cell sc ON ca.axisCode=sc.AxisCode
GROUP BY cte_almost.cellid
UNION ALL
SELECT AxisCode, Pos
FROM sample_cell
)
SELECT * FROM cte_unsortet ORDER BY Axis, cellid
The cellid extracted earlier now are used to join them ack to the original table as well as the source cell. Please note that I changed it from COUNT(*) >= (SELECT ... to COUNT(*) = (SELECT to eliminate the cource cell id.
Then a CROSS TAB (or pivot) is performed where the axis name is derived from the axis position that does not match the source cell together with the cellid. (Almost) Finally, the source cell is simply added to the result set using a UNION ALL query.
At the very end, the result set is sorted as per the requirement.
November 14, 2013 at 5:50 pm
I'm not sure if I am understanding the requirements for this correctly, but would this work?
declare @cellid int = 1;
select *
from CellAxes c
inner join (
select cellid
from (select axiscode, positiononaxis from CellAxes where cellid = @cellid) a
inner join CellAxes b on a.axisCode = b.axisCode
group by cellid having sum(case when a.positiononaxis = b.positiononaxis then 0 else 1 end) <= 1
) il on c.cellid = il.cellid
November 14, 2013 at 7:23 pm
Laurence Neville (11/14/2013)
I now have the result I need. I combined your approach with someone else's help on StackOverflow and came up with this:declare @CellID int = 1
;with SelectedCell as
(
select CellID, AxisCode, PositionOnAxis
from CellAxes
where CellID = @CellID
)
select x.AxisCode, a2.CellID
from SelectedCell a1
join CellAxes a2 on a2.AxisCode = a1.AxisCode
join CellAxes x on x.CellID = a1.CellID
where (a1.AxisCode = x.AxisCode or a1.PositionOnAxis = a2.PositionOnAxis)
group by x.AxisCode, a2.CellID
having count(*) = (select count(AxisCode) from SelectedCell)
order by x.AxisCode, a2.CellID
Thanks for your help
This may be an alternate solution:
WITH RowsToAxes AS
(
SELECT CellID
,X=MAX(CASE WHEN AxisCode = 'X' THEN PositionOnAxis END)
,Y=MAX(CASE WHEN AxisCode = 'Y' THEN PositionOnAxis END)
,Z=MAX(CASE WHEN AxisCode = 'Z' THEN PositionOnAxis END)
FROM CellAxes
GROUP BY CellID
)
SELECT b.AxisCode, b.CellID
FROM
(
SELECT CellID, a.X, a.Y, a.Z, X1, Y1, Z1
,C=X1 + Y1 + Z1
FROM RowsToAxes a
CROSS APPLY
(
SELECT X1=CASE X WHEN a.X THEN 1 ELSE 0 END
,Y1=CASE Y WHEN a.Y THEN 1 ELSE 0 END
,Z1=CASE Z WHEN a.Z THEN 1 ELSE 0 END
FROM RowsToAxes
WHERE CellID = @CellID
) b
) a
CROSS APPLY
(
VALUES (CASE C WHEN 3 THEN CellID ELSE (1-X1)*CellID END, 'X')
,(CASE C WHEN 3 THEN CellID ELSE (1-Y1)*CellID END, 'Y')
,(CASE C WHEN 3 THEN CellID ELSE (1-Z1)*CellID END, 'Z')
) b (CellID, AxisCode)
WHERE C IN (2,3) AND b.CellID <> 0;
Not sure if it is 100% correct but it does return the desired results for this case. Compared to your original solution, it looks more complicated, but the execution plan gives me:
Original: Cost=41%, Clustered INDEX Scans=1, Clustered INDEX Seeks=3
MickyT: Cost=18%, Clustered INDEX Scans=1, Clustered INDEX Seeks=2
Mine: Cost=8%, Clustered INDEX Scans=1, Clustered INDEX Seeks=1
Note that Cost cannot always be relied upon, just included for reference.
IO Counts -
Original: Scan count 38, logical reads 148, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
MickyT: Scan count 6, logical reads 84, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Mine: Scan count 2, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Edit1: Added MickyT to the comparison.
Edit2: Slightly simplified my suggested solution (no impact to Execution Plan or IO stats).
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
November 14, 2013 at 7:25 pm
mickyT (11/14/2013)
I'm not sure if I am understanding the requirements for this correctly, but would this work?
declare @cellid int = 1;
select *
from CellAxes c
inner join (
select cellid
from (select axiscode, positiononaxis from CellAxes where cellid = @cellid) a
inner join CellAxes b on a.axisCode = b.axisCode
group by cellid having sum(case when a.positiononaxis = b.positiononaxis then 0 else 1 end) <= 1
) il on c.cellid = il.cellid
Micky - Tried to include this in my comparison but got this:
Each GROUP BY expression must contain at least one column that is not an outer reference.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
November 14, 2013 at 7:30 pm
Please ignore the error I reported for MickyT's solution. Copy/paste error.
I'm going back to my previous post to include MickyT's results.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
November 14, 2013 at 11:13 pm
Laurence Neville (11/14/2013)
Wow! That is pretty close. However, what I needed was this result set:+----------+---------+
| AxisCode | CellID |
+----------+---------+
| X | 1 |
| X | 2 |
| Y | 1 |
| Y | 3 |
| Z | 1 |
| Z | 5 |
| Z | 9 |
+----------+---------+
i.e. "if you walk the X dimension you get cells 1 & 2; walk Y you get 1 & 3; walk Z you get 1, 5 & 9. I do actually need to get cell 1 in each dimension.
I've been trying to get my result to look like his while allowing for extra dimensions, time to get my vpn going so I can work on it over the weekend. My solution should work over additional dimensions (as far as cellids go), but I'm having problems getting it to output in the format above.
November 15, 2013 at 2:43 am
mickeyT - your solution returns too many rows unfortunately.
dwain.c - your solution returns the right rows for 3 dimensions (X,Y,Z) and the query plan is by far the best, but it is hard-coded to assume those specific dimensions and I need a generic N-dimensional solution.
LutzM - your 2nd query returns the right rows for cell ID 1 {1,1,1}. However for other cell IDs it returns the wrong rows.
Thanks for everyone's contributions. For now the best solution is:
declare @CellID int = 3
;with SelectedCell as
(
select CellID, AxisCode, PositionOnAxis
from CellAxes
where CellID = @CellID
)
select x.AxisCode, a2.CellID
from SelectedCell a1
join CellAxes a2 on a2.AxisCode = a1.AxisCode
join CellAxes x on x.CellID = a1.CellID
where (a1.AxisCode = x.AxisCode or a1.PositionOnAxis = a2.PositionOnAxis)
group by x.AxisCode, a2.CellID
having count(*) = (select count(AxisCode) from SelectedCell)
order by x.AxisCode, a2.CellID
November 15, 2013 at 3:47 am
Laurence Neville (11/15/2013)
dwain.c - your solution returns the right rows for 3 dimensions (X,Y,Z) and the query plan is by far the best, but it is hard-coded to assume those specific dimensions and I need a generic N-dimensional solution.
I suppose next you'll tell me I can't use dynamic SQL.
DECLARE @sql NVARCHAR(MAX)
,@SQLParms NVARCHAR(MAX) = N'@Cellid INT'
,@CellID INT = 1;
WITH Axes AS
(
SELECT DISTINCT AxisCode
FROM CellAxes
)
,CountAxes AS
(
SELECT AxesCount=COUNT(*)
FROM Axes
)
SELECT @sql = N'WITH RowsToAxes AS
(
SELECT CellID ' + CHAR(10) +
(
SELECT (
SELECT ',' + AxisCode + '=MAX(CASE WHEN AxisCode = ''' + AxisCode + ''' THEN PositionOnAxis END)' + CHAR(10)
FROM Axes
FOR XML PATH(''))
) + CHAR(10) +
N' FROM CellAxes
GROUP BY CellID
)
SELECT b.AxisCode, b.CellID
FROM
(
SELECT CellID' + CHAR(10) +
(
SELECT
(
SELECT ',a.' + AxisCode
FROM Axes
FOR XML PATH('')
)
) + CHAR(10) +
(
SELECT
(
SELECT ',' + AxisCode + '1'
FROM Axes
FOR XML PATH('')
)
) + CHAR(10) +',C=' +
(
SELECT STUFF(
(
SELECT '+' + AxisCode + '1'
FROM Axes
FOR XML PATH('')
),1,1,'')
) + CHAR(10) +
' FROM RowsToAxes a
CROSS APPLY
(
SELECT ' + CHAR(10) +
(
SELECT STUFF(
(
SELECT ',' + AxisCode + '1=CASE ' + AxisCode + ' WHEN a.' + AxisCode + ' THEN 1 ELSE 0 END' + CHAR(10)
FROM Axes
FOR XML PATH('')
),1,1,'')
) + CHAR(10) +
' FROM RowsToAxes
WHERE CellID = @CellID
) b
) a
CROSS APPLY
(
VALUES' + CHAR(10) +
(
SELECT STUFF(
(
SELECT ',(CASE C WHEN ' + CAST((SELECT AxesCount FROM CountAxes) AS VARCHAR(5)) +
' THEN CellID ELSE (1-' + AxisCode + '1)*Cellid END, ''' + AxisCode + ''')' + CHAR(10)
FROM Axes
FOR XML PATH('')
),1,1,'')
) + CHAR(10) +
') b (CellID, AxisCode)
WHERE C IN (' + CAST((SELECT AxesCount-1 FROM CountAxes) AS VARCHAR(5)) +
',' + CAST((SELECT AxesCount FROM CountAxes) AS VARCHAR(5)) +
') AND b.CellID <> 0;';
PRINT @sql
EXEC sp_executesql @sql, @SQLParms, @CellID = @CellID
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
November 15, 2013 at 1:34 pm
OK , so I think I have got there, but it has made my query real ugly
declare @cellid int = 1;
select axiscode, cellid
from (
select c.cellid,
case when c.cellid = @cellid then c.axiscode else
(select c.axiscode
from #CellAxes x
where c.axiscode = x.axiscode and x.cellid = @cellid and c.positiononaxis <> x.positiononaxis)
end axiscode
from #CellAxes c
inner join (
select cellid
from (select axiscode, positiononaxis from #CellAxes where cellid = @cellid) a
inner join #CellAxes b on a.axisCode = b.axisCode
group by cellid having sum(case when a.positiononaxis = b.positiononaxis then 0 else 1 end) <= 1
) il on c.cellid = il.cellid
) z
where axiscode is not null
order by axiscode
November 18, 2013 at 8:20 am
dwain.c - nope, I have nothing against dynamic SQL!
I have to award the prize to mickeyT's solution - it is the fastest of all the contributions.
Thanks everyone ... I was really banging my head with this one, and when I posted on StackOverflow most responses were "it can't be done in SQL". I guess I was asking in the wrong place!
November 18, 2013 at 6:05 pm
Laurence Neville (11/18/2013)
dwain.c - nope, I have nothing against dynamic SQL!I have to award the prize to mickeyT's solution - it is the fastest of all the contributions.
Thanks everyone ... I was really banging my head with this one, and when I posted on StackOverflow most responses were "it can't be done in SQL". I guess I was asking in the wrong place!
Even though you chose MickyT's solution over mine (congratulations Micky!), I am still gratified whenever I hear someone say "it can't be done in SQL" and yet here are multiple solutions showing how it can.
At least you now know where to come next time.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 15 posts - 1 through 15 (of 23 total)
You must be logged in to reply to this topic. Login to reply