CTE - Avoiding the cursor

  • I just realized I'm going about this all wrong. I don't need *random* numbers specifically. I just need numbers that are NOT the TaxID. Which means I can use a Tally table. And if I can use a Tally table, I don't need to re-create this silly table every single time I restore the database. I can just store it (or my random numbers table) in another DB after the initial generation. DOH!

    BTW: For any who are interested, this is my code for generating non-repetitive random numbers. I noticed that there weren't a whole lot of solutions for this problem when I googled it.

    DECLARE @StoreCnt int, @CntUp int, @NewTaxID varchar(10);

    SELECT @StoreCnt = Count(StoreID) FROM dbo.tblStore;

    SET @CntUp = 0;

    CREATE TABLE dbo.#RandomNumbers (TaxID varchar(10));

    WHILE @CntUp < (@StoreCnt + 1000)

    BEGIN

    SET @NewTaxID = '11-' + convert(varchar(7),(left(convert(bigint,RAND()*100000000000),7)));

    IF (SELECT TaxID FROM dbo.#RandomNumbers WHERE TaxID = @NewTaxID) IS NULL

    BEGIN

    INSERT INTO dbo.#RandomNumbers (TaxID)

    SELECT @NewTaxID;

    END;

    SET @CntUp = @CntUp + 1;

    END;

    Yeah, I'm making this harder than it needs to be, but I want to explore all options which is why I'm doing this to myself.

    As far as the precon class I submitted for Orlando, it's "T-SQL 101" (for raw beginners who know nothing about SQL coding). I'm really good at teaching the intro stuff even if sometimes the advanced stuff slips by me.

    And now that my focus has changed, I'm going back to look once again at everyone's suggestions to see how I can fit that into what I'm doing.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Eugene Elutin (7/6/2010)


    Post #945638 explains how you can generate random number using NEWID() function without using FranchiseID column to randomize.

    It doesn't uses FranchiseID in the TaxID also.

    Combine hints form two posts together and you will get a solution wihtout using a cursor. What is wrong with it?

    My issue with NEWID() is the resulting datatype (uniqueidentifier) and the fact that it includes letters in it. It doesn't fit with what should be an int result.

    And yes, I know TaxID as stated is a varchar(), but it's only that way for the hyphens that are usually included. There shouldn't be any letters.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Hi, I was on holiday, so couldn't reply earlier.

    Brandie Tarvin (7/7/2010)


    Eugene Elutin (7/6/2010)


    Post #945638 explains how you can generate random number using NEWID() function without using FranchiseID column to randomize. ...

    My issue with NEWID() is the resulting datatype (uniqueidentifier) and the fact that it includes letters in it. It doesn't fit with what should be an int result...

    Sorry, but your real issue is not reading answers carefully and not trying to run given samples of code.

    NEWID() function indeed returns uniqueidentifier and it does include letters, however post #945638 explains how it can be used to generate quite unique value which only contain digits. I will show it again here:

    select RIGHT(CAST(CAST(CAST(NEWID() AS VARBINARY) AS BIGINT) AS VARCHAR(30)),9)

    Now you can see how NEWID() function can be used to generate random numbers. Unlike RAND(), it will generate unique random numbers for each raw affected by select query.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Eugene Elutin (7/12/2010)

    Sorry, but your real issue is not reading answers carefully and not trying to run given samples of code.

    And perhaps you should test your own code too?

    Yes, I admit I missed a few things in this thread. I'm a speed reader, which tends to bite me in the posterior when reading technical stuff. I didn't see the BigInt conversion because I was busy looking at the NewID() function.

    But I have tested your solution both using a normal update and using a Tally Table. In both methods, it still repeats numbers. So, I'm back where I was with needing unique, non-repeating numbers.

    I appear to have a solution to the problem of repetition, however, that I'm working on right now and it does appear avoid the issue of using the Loop, Cursor, and CTE.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Brandie Tarvin (7/16/2010)


    ...

    But I have tested your solution both using a normal update and using a Tally Table. In both methods, it still repeats numbers. So, I'm back where I was with needing unique, non-repeating numbers.

    ...

    Dear Brandie,

    Solution provided in Post #945594 does work fine it does generate unique random numbers, then looks for duplicates and regenerate FedTaxID for them. However it did used RAND() function in expression toghether with FranchiseID. You said you don't want to use FranchiseID when generating number, which is a strange requirement, as it used as kind of "seed" for RAND() function and it doesn't mean that result random number will look like it. Ok, I have advised to use NEWID(), here was my mistake, as use of RAND() together with its "seed" could not be just replaced with NEWID(). If done so, generated numbers will be unique for each record in Franchise table, but you need them to be the same for a FranchiseID. Therefore using NEWID() function will require some additonal grouping...

    here is a complete query (please notice how NEWID() gets converted to digits...):

    UPDATE Store SET FedTaxID = FI.NewFTID

    FROM Store S

    JOIN Franchise F

    ON F.StoreID = S.StoreID

    JOIN (SELECT FranchiseID, MAX(LEFT(ABS(CHECKSUM(NEWiD())), 9)) AS NewFTID

    FROM Franchise

    GROUP BY FranchiseID) FI

    ON FI.FranchiseID = F.FranchiseID

    -- CHECK IF DUPLICATE FedTaxID EXISTS FOR DIFFERENT FranchiseID

    WHILE EXISTS(

    SELECT S.FedTaxID

    FROM Store S

    JOIN Franchise F

    ON F.StoreID = S.StoreID

    GROUP BY S.FedTaxID

    HAVING COUNT(DISTINCT F.FranchiseID) > 1

    )

    BEGIN

    -- REGENERATE FedTaxID WHERE DUPLICATES FOUND

    UPDATE Store SET FedTaxID = FI.NewFTID

    FROM Store S

    JOIN Franchise F

    ON F.StoreID = S.StoreID

    JOIN (SELECT FranchiseID, MAX(LEFT(ABS(CHECKSUM(NEWID())), 9)) AS NewFTID

    FROM Franchise

    GROUP BY FranchiseID) FI

    ON FI.FranchiseID = F.FranchiseID

    WHERE EXISTS (SELECT NULL

    FROM Store S1

    JOIN Franchise F1

    ON F1.StoreID = S1.StoreID

    WHERE S1.FedTaxID = S.FedTaxID

    GROUP BY S1.FedTaxID

    HAVING COUNT(DISTINCT F1.FranchiseID) > 1

    )

    END

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Just an FYI:

    Earlier, I was seeing 3 duplicate numbers when running your original NewID() solution. Repeated runnings seems to generate between 0 - 5 duplicate numbers with no discernible pattern to the repeats or non-repeats.

    I haven't run the code more than 20 times, but I have to wonder what the max number of duplicates will end up being after more repeated runnings. During investigating this whole issue, I've found a number of interesting issues that have absolutely no bearing on my current situation but gives me fodder for future research. @=)

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Eugene Elutin (6/30/2010)


    Actually, recursive CTE is not far from RBAR (can be even worse in performance)...

    Not strictly true. A good example of RBAR would be performing a running totals update using a cursor. What makes the scalable and much faster "Quirky Update" method "set-based" other than the simple fact it can be written (and processed) as a single UPDATE statement? How fast does a single-statement running totals rCTE have to be before it slips out of the RBAR camp? Run the following code and prepare to be surprised.

    DROP TABLE #Numbers

    SELECT TOP 100000

    n = ROW_NUMBER() OVER (ORDER BY a.name),

    CalcValue = CAST(NULL AS BIGINT)

    INTO #Numbers

    FROM master.dbo.syscolumns a, master.dbo.syscolumns b

    CREATE UNIQUE CLUSTERED INDEX CIn ON #Numbers ([n] ASC)

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    -- 'Quirky' update

    DECLARE @Lastval INT = 0, @CalcValue BIGINT

    UPDATE #Numbers SET

    @CalcValue = CalcValue = (@Lastval + n),

    @Lastval = n

    -- (100000 row(s) affected) / CPU time = 454 ms, elapsed time = 526 ms.

    -- Table #Numbers... Scan count 1, logical reads 314, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    -- Recursive CTE

    ;WITH Calculator (n, CalcValue) AS (

    SELECT n.n,

    CalcValue = CAST(n.n AS BIGINT)

    FROM #Numbers n

    WHERE n.n = 1

    UNION ALL

    SELECT n.n,

    CalcValue = n.n + c.n

    FROM #Numbers n

    INNER JOIN Calculator c ON c.n + 1 = n.n -- nice

    )

    SELECT n, CalcValue

    FROM Calculator

    OPTION (MAXRECURSION 0)

    -- (100000 row(s) affected) / CPU time = 3203 ms, elapsed time = 3483 ms.

    -- Table 'Worktable'. Scan count 2, logical reads 600001, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SET STATISTICS IO Off

    SET STATISTICS TIME Off

    @Brandie:-

    Why do the new values have to be randomised?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris Morris-439714 (7/16/2010)


    Eugene Elutin (6/30/2010)


    Actually, recursive CTE is not far from RBAR (can be even worse in performance)...

    Not strictly true. A good example of RBAR would be performing a running totals update using a cursor. What makes the scalable and much faster "Quirky Update" method "set-based" other than the simple fact it can be written (and processed) as a single UPDATE statement?

    That's what I thought. And that's what I was originally going for. Thank you for verifying that fact for me, Chris.

    Chris Morris-439714 (7/16/2010)


    @Brandie:-

    Why do the new values have to be randomised?

    1) There is a school of thought that thinks random numbers are less likely to be converted to "real" TaxIDs. Yeah, there's a debate behind this that we could get into, but allow me to now refer you to reason # 2.

    2) The boss said so.

    Personally, if I'm going to argue a point with someone about the workability of solutions, the random / not random argument is at the bottom of my list. I've got bigger fish to fry, if you know what I mean. Boss wants random? Boss gets random.

    The upside of this whole thread is I got a brand new article for site publication out of it. @=) Eugene gets mentioned for providing me with the seed of my final solution. And so does Jeff Moden because, well, it's Jeff. @=)

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply