May 20, 2012 at 3:35 pm
Dev (5/20/2012)
Nice Article Jeff!!!
Thanks, Dev. I appreciate both the read and the feedback.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 1, 2012 at 5:34 pm
Great article Jeff.
One thing to mention... in the article, you pointed out that you cannot directly add / subtract a number to the new DATE data time. This applies to all of the new date data types: DATETIME2, DATETIMEOFFSET and TIME as well as DATE.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
December 16, 2012 at 7:15 pm
Jeff, thanks for the great articles (part 1 and part 2 and look forward to part3).
Is there a way to pick a random value (like a color) from a subquery? here is what I tried but getting the same value on every run:
SELECT TOP (100)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) as ID,--Sequential number from 1 to ..
(SELECT top 1 color FROM ( VALUES (0,'Red'),(1,'Green'),(2,'Yellow') ) colors(id,color) ORDER BY NEWID()) as RandomColor,
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
but if I run this piece alone, I am getting different colors:
SELECT top 1 color FROM ( VALUES (0,'Red'),(1,'Green'),(2,'Yellow') ) colors(id,color) ORDER BY NEWID()
December 16, 2012 at 9:09 pm
No... not from a sub-query. At least I haven't figured out a way to do it with a TOP 1/ORDER BY like that.
You could do it using a CASE function based on ABS(CHECKSUM(NEWID()))%3 but that will actually come up with 4 values because, since NEWID() isn't deterministic, the CASE function will recalculate each and every WHEN even if you use CASE ABS(CHECKSUM(NEWID()))%3. You could then change the formula to ABS(CHECKSUM(NEWID()))%2 and use ELSE but that will give you an uneven distribution.
So, the only thing to do is to gen the numbers that control the color separately and then CASE that number. Here's one way of doing that in a single query.
DROP TABLE #MyHead;
WITH
cteRandom AS
( --=== Generate the random number first
SELECT TOP (1000)
ID = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
RandomColor# = ABS(CHECKSUM(NEWID()))%3
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
) --=== Now, pick the color according to the generated number
SELECT ID,
RandomColor = CASE RandomColor#
WHEN 1 THEN 'Red'
WHEN 2 THEN 'Green'
ELSE 'Yellow'
END
INTO #MyHead
FROM cteRandom
;
--===== Show the distribution
SELECT RandomColor, COUNT(*)
FROM #MyHead
GROUP BY RandomColor
ORDER BY RandomColor
--Jeff Moden
Change is inevitable... Change for the better is not.
December 16, 2012 at 9:13 pm
Here's another way... same idea, though.
DROP TABLE #MyHead;
SELECT TOP (1000)
ID = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
RandomColor = CASE RandomColor#
WHEN 1 THEN 'Red'
WHEN 2 THEN 'Green'
ELSE 'Yellow'
END
INTO #MyHead
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
CROSS APPLY (SELECT ABS(CHECKSUM(NEWID()))%3) ca (RandomColor#)
;
--===== Show the distribution
SELECT RandomColor, COUNT(*)
FROM #MyHead
GROUP BY RandomColor
ORDER BY RandomColor
;
--Jeff Moden
Change is inevitable... Change for the better is not.
December 17, 2012 at 7:31 am
Jeff Moden (12/16/2012)
Here's another way... same idea, though.
awesome, it will work! thanks for the quick response, Jeff! and I am going to use this opportunity to say BIG THANK YOU for all your articles and knowledge you shared with us!
December 17, 2012 at 6:13 pm
Thanks, Boriskey :blush: I really appreciate the feedback. I aim to please... I sometimes miss but I'm always aiming. π
--Jeff Moden
Change is inevitable... Change for the better is not.
August 28, 2013 at 12:31 pm
Jeff, moving the (SELECT ABS(CHECKSUM(NEWID()))%3) from the CROSS APPLY into the CASE seems to work as well. When should CROSS APPLY be used, or is it equivalent in this case?
SELECT TOP (1000)
ID = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
RandomColor = CASE (SELECT ABS(CHECKSUM(NEWID()))%3)
WHEN 1 THEN 'Red'
WHEN 2 THEN 'Green'
ELSE 'Yellow'
END
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
;
Thanks much. I always enjoy reading your articles and comments. And I always learn or re-learn something.
August 28, 2013 at 9:25 pm
Sorry... I didn't see the new code correctly. I'll be back.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 28, 2013 at 9:47 pm
John Rempel (8/28/2013)
Jeff, moving the (SELECT ABS(CHECKSUM(NEWID()))%3) from the CROSS APPLY into the CASE seems to work as well. When should CROSS APPLY be used, or is it equivalent in this case?
SELECT TOP (1000)
ID = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
RandomColor = CASE (SELECT ABS(CHECKSUM(NEWID()))%3)
WHEN 1 THEN 'Red'
WHEN 2 THEN 'Green'
ELSE 'Yellow'
END
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
;
If you look deep in the properties of some of the compute scalars in the actual execution plan, you see that this has the same problem I spoke of before. Here's where...
Scalar Operator(CASE WHEN [Expr1006]=(1) THEN 'Red' ELSE CASE WHEN [Expr1007]=(2) THEN 'Green' ELSE 'Yellow' END END)
Expr1006 and Expr1007 are separate copies of the Scalar Operator(abs(checksum(newid()))%(3)). What ends up happening is that Yellow will turn up more than the other two colors because if the first random forumula isn't a 1, then it calculates a new random formula. Each random number generator only has a 1 out of 3 chance of finding its mark. That means that 2/3rds of the time, each will miss its mark. I don't know what that works out to odds wise (no coffee in the last 5 hours), but it means that Yellow will always come out with the most hits followed by Red, followed by Green. The CROSS APPLY doesn't have the same problem.
Thanks much. I always enjoy reading your articles and comments. And I always learn or re-learn something.
Thank you very much for the feedback. I'm glad that I can return something to the community that has helped me so much.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 29, 2013 at 4:35 am
boriskey (12/16/2012)
Jeff, thanks for the great articles (part 1 and part 2 and look forward to part3).Is there a way to pick a random value (like a color) from a subquery? here is what I tried but getting the same value on every run:
SELECT TOP (100)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) as ID,--Sequential number from 1 to ..
(SELECT top 1 color FROM ( VALUES (0,'Red'),(1,'Green'),(2,'Yellow') ) colors(id,color) ORDER BY NEWID()) as RandomColor,
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
but if I run this piece alone, I am getting different colors:
SELECT top 1 color FROM ( VALUES (0,'Red'),(1,'Green'),(2,'Yellow') ) colors(id,color) ORDER BY NEWID()
You have to make two changes to your subquery to make this work:
1. Correlate it to the outer query, or it will run once per query rather than once per row
2. Make it do something early on, or it will still get "optimized leftwards" (what the hell am I talking about? Is there a proper term for this?)
What you are looking for in the constant scan operator at the rightmost end of the subquery branch is 'Number of Executions = 1000' and 'Actual number of Rows' = 3000.
Something like this:
SELECT TOP 1000 ID = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
x.*
FROM sys.all_columns ac1
OUTER APPLY (
SELECT TOP 1
RandomColor = color,
IDi = ac1.object_id + ac1.column_id + ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM (VALUES (0,'Red'),(1,'Green'),(2,'Yellow')) colors (id,color)
ORDER BY NEWID()
) x
If you remove IDi from the outer SELECT, the whole optimiser spoof collapses π
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Viewing 11 posts - 31 through 40 (of 40 total)
You must be logged in to reply to this topic. Login to reply