RAND Unions

Question

RAND Unions

Viewing 15 posts - 16 through 30 (of 30 total)

You must be logged in to reply to this topic. Login to reply

Koen Verbeeck SSC Guru Points: 259115 More actions · Answer 1

gautamcs22 (9/29/2014)
and what is the possibility if the each rand() return different result ?

Zero, as the first RAND() is seeded.

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

andrew.ing SSC Eights! Points: 853 More actions · Answer 2

Something I've never quite understood is why RAND() is like it is - that is, contrary to what perhaps most people would want and expect: creating a (pseudo) random number each time or row it is called, analagous to NEWID().

If you need a random number to be generated once and then reused for every row there are explicit ways of doing this, say, with variables. But to get a random number generated on each call requires what look like hacks - creating seeds based on time or other table columns.

I expect the behaviour is a hangover from older language convention; Help says it mimics C. And it can't be changed; but would it be worth asking for RAND2(), with a sort of 'does what it says on the tin' promise?

RLilj33 SSCrazy Points: 2153 More actions · Answer 3

Must be Monday. Read it as how many columns instead of how many rows. Fortunately, I stopped for some coffee on my way in to work and was able to force the eyelids open enough to re-read how many rows.

These 2 points are compliments of my local Kwik Trip store... have a happy National Coffee Day everyone!

SQLRNNR SSC Guru Points: 281334 More actions · Answer 4

andrew.ing (9/29/2014)
Something I've never quite understood is why RAND() is like it is - that is, contrary to what perhaps most people would want and expect: creating a (pseudo) random number each time or row it is called, analagous to NEWID().
If you need a random number to be generated once and then reused for every row there are explicit ways of doing this, say, with variables. But to get a random number generated on each call requires what look like hacks - creating seeds based on time or other table columns.
I expect the behaviour is a hangover from older language convention; Help says it mimics C. And it can't be changed; but would it be worth asking for RAND2(), with a sort of 'does what it says on the tin' promise?

Surely there are some developers at microsoft that could correct it or at least add an alternative as you say. It must be too low on the "fix" list.

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

Marcia J SSCertifiable Points: 5649 More actions · Answer 5

TomThomson (9/28/2014)
Good question.
Just like the last question on Random, I think the explanation could be improved.
First, the misleading "for a connection" is there again. It shouldn't be. What RAND returns for a given seed is fixed independent of teh connection.
The second is that the explanation doesn't explan why anything but the first coulum is the same in each of the three components of the union. The reason is that a RAND call with an explicit non-null seed determines the sequence of values generated by that RAND and all subsequent RAND without an explicit seed parameter in the same connection before the next RAND call with an explicit non-null seed parameter.
RAND calls with an explicit NULL seed parameter aren't relevant to todays question, but in case anyone wonders what there effect is it's quite simple: deliver NULL as result but they don't affect the sequence for calls with no explicit parameter. For example the only difference in the values delivered by the two call sequences
RAND(100), RAND(), RAND() and
RAND(100), RAND(), RAND(NULL), RAND()
is that the second sequence has a NULL in third position, in between the 2nd and 3rd values produced by the first sequence.

Thanks for the additional explanation. I knew that RAND(100) would always return the same value. I missed that non-seeded calls to RAND would be affected by the prior seeded call.

Bill Talada SSChampion Points: 11954 More actions · Answer 6

A different rand() value per row is possible with a little bit of convoluted work:

CREATE VIEW rnd AS SELECT RAND() AS nbr;

CREATE FUNCTION GetRnd() RETURNS real AS

BEGIN

DECLARE @r real

SET @r = (select nbr from rnd);

RETURN @r;

END

GO

SELECT name, dbo.GetRnd() AS n

FROM sys.tables t

GO

gbritton1 SSCertifiable Points: 6522 More actions · Answer 7

I just executed this against the AdventureWorks2012 database hosted by RedGate on Azure. I got one row. Changing the UNION to UNION ALL, I get:

(No column name)(No column name)(No column name)

0.7154366573674850.284633807679820.0131039082850364

As you can see, the three rows are identical, so you get one row with a plain UNION

stormsentinelcammy SSCrazy Points: 2493 More actions · Answer 8

i totally read the question wrong, for some reason in my head i was thinking how many columns since that was the first thing i was thinking about, but the question was asking about how many rows, i got a super easy answer wrong :(, good question in any case

Steve Jones - SSC Editor SSC Guru Points: 736478 More actions · Answer 9

stormsentinelcammy (9/29/2014)
i totally read the question wrong, for some reason in my head i was thinking how many columns since that was the first thing i was thinking about, but the question was asking about how many rows, i got a super easy answer wrong :(, good question in any case

My apologies. A few people got caught here. I shouldn't have 3 columns and 3 rows as it can confuse. I'll change the question to 2 columns and 4 rows. Hopefully that makes it more a focus on RAND and UNION.

stephen.long.1 SSCrazy Points: 2582 More actions · Answer 10

Steve, thanks for the two points on a (relatively) easy question. It's a good way to start off the work week. 🙂

TomThomson SSC Guru Points: 104773 More actions · Answer 11

SQLRNNR (9/29/2014)
andrew.ing (9/29/2014)
Something I've never quite understood is why RAND() is like it is - that is, contrary to what perhaps most people would want and expect: creating a (pseudo) random number each time or row it is called, analagous to NEWID().
If you need a random number to be generated once and then reused for every row there are explicit ways of doing this, say, with variables. But to get a random number generated on each call requires what look like hacks - creating seeds based on time or other table columns.
I expect the behaviour is a hangover from older language convention; Help says it mimics C. And it can't be changed; but would it be worth asking for RAND2(), with a sort of 'does what it says on the tin' promise?
Surely there are some developers at microsoft that could correct it or at least add an alternative as you say. It must be too low on the "fix" list.

Asking for a function to return a different value in each row seems a little strange. Do you really want to have something like INSERT Pig (InsertTime, Principal, PigID, PigValue) SELECT GETDATE(), Principal, PigID, Value from PendingPig to call GETDATE() once for each of the potentionally 1000s of rows inserted? If not, why should RAND() be any different?

Adding an alternative function that behaves like NEWID() would make some things easier, perhaps, but how often is it needed? Far more useful (although it wouldn;t address that issue at all) would be a source of genuine random numbers to use when a pseudo-random sequence isn't good enough, but I guess that one has to wait for hardware to become standard.

Tom

Michael Poppers SSCrazy Points: 2151 More actions · Answer 12

stephen.long.1 (9/29/2014)
Steve, thanks for the two points on a (relatively) easy question. It's a good way to start off the work week. 🙂

+1

akljfhnlaflkj SSC Guru Points: 76202 More actions · Answer 13

akljfhnlaflkj

SSC Guru

Points: 76202

September 29, 2014 at 12:59 pm

#1748913

+1

akljfhnlaflkj SSC Guru Points: 76202 More actions · Answer 14

I got caught up with that keyword versus with an ALL just last week so it was fresh on my mind.

andrew.ing SSC Eights! Points: 853 More actions · Answer 15

TomThomson (9/29/2014)
SQLRNNR (9/29/2014)
andrew.ing (9/29/2014)
Something I've never quite understood is why RAND() is like it is - that is, contrary to what perhaps most people would want and expect: creating a (pseudo) random number each time or row it is called, analagous to NEWID().
If you need a random number to be generated once and then reused for every row there are explicit ways of doing this, say, with variables. But to get a random number generated on each call requires what look like hacks - creating seeds based on time or other table columns.
I expect the behaviour is a hangover from older language convention; Help says it mimics C. And it can't be changed; but would it be worth asking for RAND2(), with a sort of 'does what it says on the tin' promise?
Surely there are some developers at microsoft that could correct it or at least add an alternative as you say. It must be too low on the "fix" list.
Asking for a function to return a different value in each row seems a little strange. Do you really want to have something like INSERT Pig (InsertTime, Principal, PigID, PigValue) SELECT GETDATE(), Principal, PigID, Value from PendingPig to call GETDATE() once for each of the potentionally 1000s of rows inserted? If not, why should RAND() be any different?
Adding an alternative function that behaves like NEWID() would make some things easier, perhaps, but how often is it needed? Far more useful (although it wouldn;t address that issue at all) would be a source of genuine random numbers to use when a pseudo-random sequence isn't good enough, but I guess that one has to wait for hardware to become standard.

That's interesting. I hadn't considered whether GETDATE() was evaluated per row or once for the whole statement. I think I assumed it was evaluated per row - like NEWID() and unlike RAND() (although I might have felt that RAND() was evaluated but returned a static value).

To turn your example round, I wouldn't feel comfortable writing

SELECT PigId, PigName, RAND() FROM ...

even though my intention was, "Generate one random number then use that for every row". The very name of the function is apt to mislead. And I don't know whether I'd trust it to keep the same value - there might be another rule I wasn't aware of. For clarity I'd probably rephrase it using a variable; it would make my intention clearer.

It's not a big deal of course, but the confusion it causes (hence this QOTD) seems as if it could have been easily averted at the (long past) design stage.

One use I had was to assign a random sort order value to a set of rows. I had to go through the usual hoops to make RAND() random.