January 3, 2014 at 2:37 am
pietlinden (1/2/2014)
Jeff,I think that's how I got the random number of records per enrollment so that the distribution is irregular. I noticed that Chris's code returns the same number of results per Protocol. I guess this is where I have to re-read parts of Guru's Guide to T-SQL - might be an old book, but it's easy to follow and he deals with a lot of the standard problems in T-SQL (like this one).
Think I'll read that and post back. Otherwise I may have to resort to the dreaded cursor... (might have to try it just to see how bad it is!) Might work okay as I only need it to run once ...
Pieter
The sample code works, and I know the theory behind it works too - I regularly use something similar for returning a random number of rows at random from a list of words to construct a random sentence:
-- Returns about 100,000 rows per second
WITH
[4] AS (SELECT n = 0 FROM (VALUES (0), (0), (0), (0)) d (n)),
[16] AS (SELECT n = 0 FROM [4] a, [4] b),
[256] AS (SELECT n = 0 FROM [16] a, [16] b),
[65536] AS (SELECT n = 0 FROM [256] a, [256] b),
iTally AS (SELECT n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM [65536] a, [16] b)
SELECT TOP 10000
-- Shamelessly nicked from JBM's random data generator
t.n,
-- 10 years worth of dates ranging from 1/1/2000 to 12/31/2009
[RandomDate] = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
-- 100 different account numbers
[RandomAccount] = ABS(CHECKSUM(NEWID()))%100+1,
-- Amounts from -99.99 to + 99.99
[RandomValue] = CAST(CHECKSUM(NEWID())%10000 /100.0 AS MONEY),
-- Randomised number of words picked randomly from list
iTVF.Sentence
FROM iTally t
CROSS APPLY ( -- note 'n' in the TOP expression forces evaluation for every row
SELECT TOP ((t.n+ABS(CHECKSUM(NEWID())))%10+1)
[data()] = word
FROM (VALUES ('the'),('quick'), ('brown'), ('fox'), ('jumped'), ('over'), ('the'), ('lazy'), ('dog')) Words (word)
ORDER BY NEWID()
FOR XML PATH('')
) iTVF(Sentence)
I'm not sure what the problem is but if you post up the code you've used I'm sure we can work something out.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 3, 2014 at 2:39 am
Jeff Moden (1/2/2014)
You can also just create a view called (for example) ANewID that generates a GUID as a single column, single row result and simply join to it as a CROSS JOIN or CROSS APPLY.
Absolutely, it works a treat and there are less hoops to go through than black-boxing the works inside an IF.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 3, 2014 at 8:25 am
ChrisM@Work (1/3/2014)
Jeff Moden (1/2/2014)
You can also just create a view called (for example) ANewID that generates a GUID as a single column, single row result and simply join to it as a CROSS JOIN or CROSS APPLY.Absolutely, it works a treat and there are less hoops to go through than black-boxing the works inside an IF.
It also prevents resorting to cursors of any kind. 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
January 3, 2014 at 8:29 am
Jeff Moden (1/3/2014)
ChrisM@Work (1/3/2014)
Jeff Moden (1/2/2014)
You can also just create a view called (for example) ANewID that generates a GUID as a single column, single row result and simply join to it as a CROSS JOIN or CROSS APPLY.Absolutely, it works a treat and there are less hoops to go through than black-boxing the works inside an IF.
It also prevents resorting to cursors of any kind. 😀
Some of my best friends are cursers 😀
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 8, 2014 at 9:28 pm
Thanks for the ideas, everybody.
I got it to work, which is good, but I had to resort to <whisper>a dreaded cursor.</whisper> I know, it's a total cheat, but I'm just trying to pump a table full of fake data. I ended up resorting to the Dark Side because I couldn't get the random number stuff working any other way. (Maybe I just did something wrong?)
Here's how I did it... (not entirely sure the CROSS APPLY was necessary... but that's what happens when you Frankenstein a solution together.) In all fairness, except for the CROSS APPLY, I understood it all. I did a bunch of reading on CROSS APPLY, but apparently not enough to get it through my head.
Here's what I came up with (warning, cursor ahead!):
ALTER PROC AddSomeSymptoms
@EnrollmentID INT
AS
BEGIN
SET NOCOUNT ON;
DECLARE @NextCycle INT;
-- get the next cycle for this enrollment.
SELECT @NextCycle = COALESCE(MAX(Cycle),0)+1
FROM Symptom
WHERE enrollmentID = @EnrollmentID;
-- Is enrollee dead?
IF NOT EXISTS (SELECT 1
FROM Symptom s INNER JOIN Data d ON s.symptomID = d.ID
WHERE d.Grade = 5
AND s.enrollmentID = @EnrollmentID)
BEGIN
-- get next cycle for this enrollee
INSERT INTO Symptom(Cycle, EnrollmentID, SymptomID)
SELECT
@NextCycle AS Cycle
, @EnrollmentID
, ID
/*, n
, Category
, Subcategory
, Term
, Grade
*/
FROM (SELECT n=1 /*UNION ALL SELECT 2*/) d
CROSS APPLY dbo.IF_GetRandomSymptoms3 (d.n,@EnrollmentID)
ORDER BY n;
END
SET NOCOUNT OFF;
END
I guess my only questions left are:
(1) If I do a terrible thing and use a cursor, can I tell if an enrollee is deceased (has a Grade 5) without having to test for existence each time?
(2) Do I even need the Cross Apply?
I just couldn't figure out how to get the AddSymptom stuff to work for all the cycles and get different numbers of symptoms each time. Pretty much everything I tried didn't like using non-deterministic functions (Table-Valued Functions), so that's why I resorted to the Dark Side...
Thanks for the schooling!,
Pieter
January 9, 2014 at 3:14 am
pietlinden (1/8/2014)
Thanks for the ideas, everybody.I got it to work, which is good, but I had to resort to <whisper>a dreaded cursor.</whisper> I know, it's a total cheat, but I'm just trying to pump a table full of fake data. I ended up resorting to the Dark Side because I couldn't get the random number stuff working any other way. (Maybe I just did something wrong?)
You gave up?
Here's how I did it... (not entirely sure the CROSS APPLY was necessary... but that's what happens when you Frankenstein a solution together.) In all fairness, except for the CROSS APPLY, I understood it all. I did a bunch of reading on CROSS APPLY, but apparently not enough to get it through my head.
APPLY is similar to JOIN. Paul White's excellent articles are linked in my sig. Run through his examples and you're done.
Here's what I came up with (warning, cursor ahead!):
ALTER PROC AddSomeSymptoms
@EnrollmentID INT
AS
BEGIN
SET NOCOUNT ON;
DECLARE @NextCycle INT;
-- get the next cycle for this enrollment.
SELECT @NextCycle = COALESCE(MAX(Cycle),0)+1
FROM Symptom
WHERE enrollmentID = @EnrollmentID;
-- Is enrollee dead?
IF NOT EXISTS (SELECT 1
FROM Symptom s INNER JOIN Data d ON s.symptomID = d.ID
WHERE d.Grade = 5
AND s.enrollmentID = @EnrollmentID)
BEGIN
-- get next cycle for this enrollee
INSERT INTO Symptom(Cycle, EnrollmentID, SymptomID)
SELECT
@NextCycle AS Cycle
, @EnrollmentID
, ID
/*, n
, Category
, Subcategory
, Term
, Grade
*/
FROM (SELECT n=1 /*UNION ALL SELECT 2*/) d
CROSS APPLY dbo.IF_GetRandomSymptoms3 (d.n,@EnrollmentID)
ORDER BY n;
END
SET NOCOUNT OFF;
END
I guess my only questions left are:
(1) If I do a terrible thing and use a cursor, can I tell if an enrollee is deceased (has a Grade 5) without having to test for existence each time?
(2) Do I even need the Cross Apply?
I just couldn't figure out how to get the AddSymptom stuff to work for all the cycles and get different numbers of symptoms each time. Pretty much everything I tried didn't like using non-deterministic functions (Table-Valued Functions), so that's why I resorted to the Dark Side...
I'd be more than happy to help you with solving the problems you were experiencing with a set-based solution, but not with the fallback. It's a little too Vichy France.
Thanks for the schooling!,
Pieter
I can't help thinking you're skipping a class...
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 9, 2014 at 5:56 am
I probably am skipping a class. The challenge I couldn't get around was this one:
the "continue adding treatment cycles (groups of "symptom" records) until either an enrolled patient has a grade 5 (fatal) or the patient completes some number of cycles of treatment.
I tried a TVF, but it wouldn't allow any non-determinant functions (like NEWID). Everything else I came up with had a fixed number of records per treatment cycle.
I actually did read Paul's articles (Cross/Outer Apply I & II). I even remember he was calculating averages of grades... But I couldn't get it to do a random TOP VALUES query with different random @TOP values for each enrollee. So eventually I resorted to cheating off the guy next to me and using a cursor.
January 9, 2014 at 6:11 am
Try again with the variable number of rows. Here's a script to run in tempdb (so it won't screw anything up). After running the whole script, rerun the final query a few times.
USE TEMPDB
-- sample table
SELECT TOP (50000)
ID = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
[Version] = 'CTCAE v4.03 (MedDRAv12.0)',
Grade = 1+ABS(CHECKSUM(NEWID()))%5
INTO Data -- mockup of symptom table
FROM SYSCOLUMNS a, SYSCOLUMNS b
go
CREATE VIEW vwRandomSymptoms AS
SELECT TOP(1+ABS(CHECKSUM(NEWID()))%10) -- between 1 and 10 rows
*
FROM (
SELECT * FROM (SELECT TOP 449 * FROM Data WHERE [Version] = 'CTCAE v4.03 (MedDRAv12.0)' AND Grade = 1 ORDER BY NEWID()) d
UNION ALL
SELECT * FROM (SELECT TOP 300 * FROM Data WHERE [Version] = 'CTCAE v4.03 (MedDRAv12.0)' AND Grade = 2 ORDER BY NEWID()) d
UNION ALL
SELECT * FROM (SELECT TOP 200 * FROM Data WHERE [Version] = 'CTCAE v4.03 (MedDRAv12.0)' AND Grade = 3 ORDER BY NEWID()) d
UNION ALL
SELECT * FROM (SELECT TOP 50 * FROM Data WHERE [Version] = 'CTCAE v4.03 (MedDRAv12.0)' AND Grade = 4 ORDER BY NEWID()) d
UNION ALL
SELECT * FROM (SELECT TOP 1 * FROM Data WHERE [Version] = 'CTCAE v4.03 (MedDRAv12.0)' AND Grade = 5 ORDER BY NEWID()) d
) e
ORDER BY NEWID()
GO
CREATE FUNCTION IF_GetRandomSymptoms
(
@Param INT
)
RETURNS TABLE
AS
RETURN
(
SELECT
v.*,
CycleNo = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM vwRandomSymptoms v
WHERE v.ID = (v.ID * @Param/@Param)
)
GO
-- usage
SELECT *
FROM (SELECT n = 1 UNION ALL SELECT 2 UNION ALL SELECT 3) d
CROSS APPLY dbo.IF_GetRandomSymptoms (d.n)
ORDER BY n
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 9, 2014 at 6:37 am
Here's another method using just a view as Jeff suggested. Same table as previously, different view.
CREATE VIEW vwSymptoms AS
SELECT * FROM (SELECT TOP 449 * FROM Data WHERE [Version] = 'CTCAE v4.03 (MedDRAv12.0)' AND Grade = 1 ORDER BY NEWID()) d
UNION ALL
SELECT * FROM (SELECT TOP 300 * FROM Data WHERE [Version] = 'CTCAE v4.03 (MedDRAv12.0)' AND Grade = 2 ORDER BY NEWID()) d
UNION ALL
SELECT * FROM (SELECT TOP 200 * FROM Data WHERE [Version] = 'CTCAE v4.03 (MedDRAv12.0)' AND Grade = 3 ORDER BY NEWID()) d
UNION ALL
SELECT * FROM (SELECT TOP 50 * FROM Data WHERE [Version] = 'CTCAE v4.03 (MedDRAv12.0)' AND Grade = 4 ORDER BY NEWID()) d
UNION ALL
SELECT * FROM (SELECT TOP 1 * FROM Data WHERE [Version] = 'CTCAE v4.03 (MedDRAv12.0)' AND Grade = 5 ORDER BY NEWID()) d
GO
-- usage
SELECT *
FROM (SELECT EnrollmentID = 1 UNION ALL SELECT 2 UNION ALL SELECT 3) d
CROSS APPLY (
SELECT TOP(1+(d.EnrollmentID+ABS(CHECKSUM(NEWID())))%10)
ID, [Version], Grade
FROM vwSymptoms
ORDER BY NEWID()
) s
ORDER BY d.EnrollmentID
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 9, 2014 at 9:06 am
Cruel Taskmaster <g>,
I have a dumb question - if I use a non-looping solution, how do I indicate that no more treatment cycles should be added for an enrollee after he has a grade 5 symptom?
thanks!
Pieter
January 9, 2014 at 9:13 am
pietlinden (1/9/2014)
Cruel Taskmaster <g>,I have a dumb question - if I use a non-looping solution, how do I indicate that no more treatment cycles should be added for an enrollee after he has a grade 5 symptom?
thanks!
Pieter
I knew you were going to ask that! I'd suggest you ignore it for the moment and concentrate on a functional subsystem with the intention of revisiting. The thing is, without a working subsystem, it's hard to tell exactly where to interrupt the process.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 9, 2014 at 10:29 am
Okay, I re-read SQLKiwi's articles on APPLY. Basically, the "left" side is the set of enrollees to "process", and the right side is the symptoms. Got that part. I think I tried doing APPLY, but I couldn't figure out how to use ORDER BY NewID() to randomize. I'm pretty sure that's where I gave up and cheated. (Jeez, do I have to go to the Principal's office too?)
January 9, 2014 at 11:13 am
Each enrollee has a (random) small number of cycles, each cycle has a (random) number of symptoms - is this correct?
If a group 5 symptom is allocated at any point, it's marked as the last symptom of the set of symptoms/cycles for the enrollee?
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
January 9, 2014 at 12:39 pm
Correct. It's modeling cancer treatment therapy, so sometimes they get fatal complications, and after that you can't test them anymore. (Cheery huh?)
January 9, 2014 at 1:14 pm
pietlinden (1/9/2014)
Correct. It's modeling cancer treatment therapy, so sometimes they get fatal complications, and after that you can't test them anymore. (Cheery huh?)
1989/90 I worked for a while in clinical trials analysis. It wasn't much fun and I've kept clear of it since. Whilst I was there, a bubbly young lass, Maria, was taken on to key in the results of an antidepressant study. These included a paragraph or two filled in by the patient. Maria grew quieter and quieter as the months passed, and within six months of starting work she'd taken her own life. What made it doubly tragic was that her brother Tony worked there too.
Anyway, back to business. What's the maximum number of cycles? Can they overlap? Presumably you want a datetime on each symptom row?
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
Viewing 15 posts - 16 through 30 (of 34 total)
You must be logged in to reply to this topic. Login to reply