May 24, 2011 at 12:57 pm
Hi All,
I am trying to populate a column in my table with data from a set of potential values. To do that I am using the following as part of the INSERT statement:
SELECT
CASE (ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10)
WHEN 0 THEN 22000
WHEN 1 THEN 10600
WHEN 2 THEN 10300
WHEN 3 THEN 19300
WHEN 4 THEN 11700
WHEN 5 THEN 10600
WHEN 6 THEN 19500
WHEN 7 THEN 10000
WHEN 8 THEN 15000
WHEN 9 THEN 900
WHEN 10 THEN 2000
ELSE 666
END
For some reason the 'ELSE' value (666) is by far the most common value inserted. My expectation was that the ELSE value would never have been inserted. In theory, the '% 10' should return values for 0- 9, and I have each of those CASES coded.
Any idea what I am missing? Thank you.
BTW: SQL Server 2008 R2
May 24, 2011 at 1:23 pm
I have not encountered a single else case with this statement listed below
Declare @myvarint
set @myvar= CAST(CAST(NEWID() AS VARBINARY) AS INT)
SELECT @myvar,
CASE (ABS(@myvar) % 10)
WHEN 0 THEN 22000
WHEN 1 THEN 10600
WHEN 2 THEN 10300
WHEN 3 THEN 19300
WHEN 4 THEN 11700
WHEN 5 THEN 10600
WHEN 6 THEN 19500
WHEN 7 THEN 10000
WHEN 8 THEN 15000
WHEN 9 THEN 900
WHEN 10 THEN 2000
ELSE 666
END
---------------------------------------------------------------------------------------
It begins by taking the first step.
May 24, 2011 at 1:30 pm
Thanks for your quick reply.
I see the exact same behavior you describe when I set "(ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10)" to a parameter and then test the parameter.
However, if i dont use the parameter, then i get the unexpected 'ELSE' values.
What happens if yout dont use the paramenter and instead use "(ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10)" as the test expression?
thanks again.
May 24, 2011 at 1:32 pm
I was kind of intrigued by what you say happened so I had to run your code and discovered that you are correct with the way are doing it. I changed you query a little to get 1,000 records and see what you are seeing. So i tossed it into a ***cough***loop***cough*** to make it easy to test and got exactly none to hit the else.
first is your select to return 1,000 records.
SELECT top 1000
CASE (ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10)
WHEN 0 THEN 22000
WHEN 1 THEN 10600
WHEN 2 THEN 10300
WHEN 3 THEN 19300
WHEN 4 THEN 11700
WHEN 5 THEN 10600
WHEN 6 THEN 19500
WHEN 7 THEN 10000
WHEN 8 THEN 15000
WHEN 9 THEN 900
WHEN 10 THEN 2000
ELSE 666
END as NewVal
from sysobjects
order by NewVal
This got a lot of else condition but can't really figure out why.
However, to test this for real I did the following.
create table #Values
(
IntVal int,
NewVal int,
ID uniqueidentifier
)
declare @ID uniqueidentifier
declare @Counter int = 1
while @Counter <= 1000000
begin
set @ID = NEWID()
insert #Values
SELECT
ABS(CAST(CAST(@ID AS VARBINARY) AS INT)) % 10,
CASE (ABS(CAST(CAST(@ID AS VARBINARY) AS INT)) % 10)
WHEN 0 THEN 22000
WHEN 1 THEN 10600
WHEN 2 THEN 10300
WHEN 3 THEN 19300
WHEN 4 THEN 11700
WHEN 5 THEN 10600
WHEN 6 THEN 19500
WHEN 7 THEN 10000
WHEN 8 THEN 15000
WHEN 9 THEN 900
WHEN 10 THEN 2000
ELSE 666
END as Val, @ID
set @Counter = @Counter + 1
end
select IntVal, NewVal, COUNT(*) from #Values group by IntVal, NewVal
drop table #Values
This is not exactly fast but i ran it about 20 times and it never returned 666. Interesting that the distribution is fairly consistent across all numbers each time too.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
May 24, 2011 at 2:04 pm
The only possible explanation I can give is it's falling into a "When" weirdness case.
May be the "When" invokes the newid() for each "when" and if the case has already passed then it falls into the Else case.
Which is exactly why when passing it to a variable does not make the when call newid() every time and the "else" case is not encountered.
Which prompts me for another question for you. Why are you using this logic, is it for randomization then use the RAND function.
Thanks
---------------------------------------------------------------------------------------
It begins by taking the first step.
May 24, 2011 at 2:17 pm
Basically this is how the CASE is executed by SQL Server...
SELECT CASE
WHEN (ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10) = 0 THEN 22000
WHEN (ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10) = 1 THEN 10600
WHEN (ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10) = 2 THEN 10300
WHEN (ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10) = 3 THEN 19300
WHEN (ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10) = 4 THEN 11700
WHEN (ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10) = 5 THEN 10600
WHEN (ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10) = 6 THEN 19500
WHEN (ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10) = 7 THEN 10000
WHEN (ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10) = 8 THEN 15000
WHEN (ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10) = 9 THEN 900
WHEN (ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10) = 10 THEN 2000
ELSE 666
END as NewVal
.
.
.
...and you're hitting the ELSE most of the time because the individual executions generate a new value based on NEWID and do not hit the right hand target of the equality statement on that particular CASE condition.
Credit to this post where I initially learned of the behavior:
http://www.sqlservercentral.com/Forums/Topic989646-338-1.aspx
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
May 24, 2011 at 2:25 pm
Excellent. I had not dug down to quite that level of understanding and was assuming it due to the NEWID() inside the case, especially since i was interested to see what values would product the seemingly impossible case else. The only way to do that is to move the function outside the select so you can see the guid and the result of the case at the same time. 😉 Chock up another interesting bit of learning to Jeff.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
May 24, 2011 at 3:19 pm
I recently found out that when using RAND in a SELECT statement, the RAND number is always the same. That's why i am not using RAND.
Here is the link where i discovered that tidbit: http://www.sql-server-helper.com/tips/generate-random-numbers.aspx.
@opc.three et al.
Thanks for the explanation; i too learned something new. But that leaves me stuck with my original problem: How do I insert - in a random-ish order - values from a fixed set of possible values?
Below is the code that I am using, which is not working as I hoped it would. Basically, I have a table of 'locations'. I want to create some 'segments' where each segment has a 'begin' location, a 'end' location, and an ADT value. I am hoping to insert an ADT value that is randomly selected from a fixed set of possible values. My current code:
-- Locations are Mile points along a highway.
-- a given location has a key identifying the highway and a 'measure' along the highway.
-- the measure is not the same as the mile point
CREATE TABLE [Location](
[LOC_ID] [varchar](24) NOT NULL,
[LRS_KEY] [int] NOT NULL,
[LRS_MEAS] [decimal](6, 2) NOT NULL,
[MP_NO] [decimal](5, 2) NULL
)
GO
-- a segment is defined as having a begin point and an end point
-- the begin and end points are 'Locations', and cannot be the same location
CREATE TABLE [Segment](
[SEG_ID] [varchar](24) NOT NULL,
[BEG_LOC_ID] [varchar](24) NOT NULL,
[END_LOC_ID] [varchar](24) NOT NULL,
[SEG_ACTL_LEN] [decimal](6, 2) NOT NULL,
[SEG_LEN_TYP] [decimal](3, 2) NOT NULL,
[ADT_WT] [int] NOT NULL
)
GO
ALTER TABLE [Segment] WITH CHECK ADD CONSTRAINT [FK_SEG__END_LOC_ID] FOREIGN KEY([END_SPIS_LOC_ID])
REFERENCES [Location] ([SPIS_LOC_ID])
GO
ALTER TABLE [Segment] WITH CHECK ADD CONSTRAINT [FK_SEG__BEG_LOC_ID] FOREIGN KEY([BEG_SPIS_LOC_ID])
REFERENCES [Location] ([SPIS_LOC_ID])
GO
-- insert some new segments
-- get the locations from two stretches of highway, sorted by their measure values
WITH CTE AS
(SELECT LOC_ID, MP_NO,
ROW_NUMBER() OVER (ORDER BY LRS_MEAS) AS RN
FROM [Location]
WHERE LRS_KEY in (100001, 100002))
-- create some segments based on the locations from above
INSERT INTO [Segment]
(
SEG_ID,
BEG_LOC_ID,
END_LOC_ID,
SEG_ACTL_LEN,
SEG_LEN_TYP,
ADT_WT
)
SELECT
C.RN,
C.LOC_ID,
endLoc.LOC_ID,
endLoc.MP_NO - C.MP_NO,
0.10,
-- insert a value for ADT randomly selected from a fixed list of possible values
CASE (ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10)
WHEN 0 THEN 22000
WHEN 1 THEN 10600
WHEN 2 THEN 10300
WHEN 3 THEN 19300
WHEN 4 THEN 11700
WHEN 5 THEN 10600
WHEN 6 THEN 19500
WHEN 7 THEN 10000
WHEN 8 THEN 15000
WHEN 9 THEN 900
WHEN 10 THEN 2000
ELSE 666
END
FROM CTE C INNER JOIN
[Location] endLoc on C.MP_NO + 0.09 = endLoc.MP_NO
ORDER BY C.RN
May 24, 2011 at 3:38 pm
How about something along these lines:
WITH cte(test_num)
AS (
SELECT TOP 1000
(ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)))
FROM master.sys.objects o1
CROSS JOIN master.sys.objects o2
)
SELECT test_num % 10 as number_to_use_in_case
FROM cte ;
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
May 24, 2011 at 3:44 pm
I was playing with this problem for about an hour before reading the excellent observation by opc.three. I also like his pre-calculation in the cte above.
As another alternative, you could also create a UDF
create function RandomADT (@a int)
returns int
as
begin
return CASE @a
WHEN 0 THEN 22000
WHEN 1 THEN 10600
WHEN 2 THEN 10300
WHEN 3 THEN 19300
WHEN 4 THEN 11700
WHEN 5 THEN 10600
WHEN 6 THEN 19500
WHEN 7 THEN 10000
WHEN 8 THEN 15000
WHEN 9 THEN 900
WHEN 10 THEN 2000
ELSE 666
END
end
and call it in your query like so:
INSERT INTO [Segment]
(
SEG_ID,
BEG_LOC_ID,
END_LOC_ID,
SEG_ACTL_LEN,
SEG_LEN_TYP,
ADT_WT
)
SELECT
C.RN,
C.LOC_ID,
endLoc.LOC_ID,
endLoc.MP_NO - C.MP_NO,
0.10,
dbo.RandomADT((ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 10))
FROM CTE C INNER JOIN
[Location] endLoc on C.MP_NO + 0.09 = endLoc.MP_NO
ORDER BY C.RN
______________________________________________________________________________
How I want a drink, alcoholic of course, after the heavy lectures involving quantum mechanics.
May 24, 2011 at 3:58 pm
Thank you all for your help. I took opc.three's suggestion and moved the newid() computation into the CTE. Now it works like hoped for.
Thanks again.
WITH CTE AS
(SELECT LOC_ID, MP_NO,
ROW_NUMBER() OVER (ORDER BY LRS_MEAS) AS RN, ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) AS adt_base
FROM [Location]
WHERE LRS_KEY in (100001, 100002))
-- create some segments based on the locations from above
INSERT INTO [Segment]
(
SEG_ID,
BEG_LOC_ID,
END_LOC_ID,
SEG_ACTL_LEN,
SEG_LEN_TYP,
ADT_WT
)
SELECT
C.RN,
C.LOC_ID,
endLoc.LOC_ID,
endLoc.MP_NO - C.MP_NO,
0.10,
-- insert a value for ADT randomly selected from a fixed list of possible values
CASE (adt_base % 10)
WHEN 0 THEN 22000
WHEN 1 THEN 10600
WHEN 2 THEN 10300
WHEN 3 THEN 19300
WHEN 4 THEN 11700
WHEN 5 THEN 10600
WHEN 6 THEN 19500
WHEN 7 THEN 10000
WHEN 8 THEN 15000
WHEN 9 THEN 900
WHEN 10 THEN 2000
ELSE 10730
END
FROM CTE C INNER JOIN
[Location] endLoc on C.MP_NO + 0.09 = endLoc.MP_NO
ORDER BY C.RN
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply