Help- Very Complex SQL query

  • Matt Miller (#4) (2/20/2010)


    ChrisM@home (2/20/2010)


    Paul White (2/20/2010)


    Using ChrisM's Arun's super test-data script, the following appeals to me:

    snip

    'Course it does, it's got a CROSS APPLY in it!:hehe:

    CROSS APPLY is an awesome operator to use for compartmenting complex stuff for an output column without fear of changing the cardinality of the result set. Almost a scalar function replacement, with full code visibility.

    Not that I don't like Crossy Apply, but - I'm missing the part where it's guaranteed to not change the cardinality. The times when I use it (with XML), it's being used to explode out sub nodes into separate rows, so it's changing cardinality on me (specifically based on me asking it to, of course).

    Can you give me an example of how you're using it in this way? I'm thinking you have some specific examples in mind, and I love to pick up new tricks.

    Sure thing, Matt. If you don't have a couple of good examples from me within a couple of days, then please PM me a reminder. It's a very good point you've made here and deserves a studied response.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • @ Matt: until I've posted evidence to support this statement, consider it retracted with humble apologies.

    Here's where it started, some messing about with CROSS APPLY - note that this query was for experimentation only, there are better ways to achieve the same result:

    DROP TABLE #Sample

    CREATE TABLE #Sample (ID CHAR(4), DodgyStringData VARCHAR(60))

    INSERT INTO #Sample (ID, DodgyStringData)

    SELECT 'ID_1', 'xyz 54050, FY11, Apr, Budget, Version_1, 0160117' UNION ALL

    SELECT 'ID_2', 'abc 54051, FY11, May, Budget, Version_2, 0160117' UNION ALL

    SELECT 'ID_3', 'abc 541, FY11, May, Actual, Version_12, 0160117'

    SELECT s.ID, q.Account, q.[Year], q.[Month], q.Scenario, q.[Version], q.Entity

    FROM #Sample s

    CROSS APPLY (

    SELECT

    MAX(CASE x.ColNo WHEN 1 THEN x.Stringybit END) AS [Account],

    MAX(CASE x.ColNo WHEN 2 THEN x.Stringybit END) AS [Year],

    MAX(CASE x.ColNo WHEN 3 THEN x.Stringybit END) AS [Month],

    MAX(CASE x.ColNo WHEN 4 THEN x.Stringybit END) AS [Scenario],

    MAX(CASE x.ColNo WHEN 5 THEN x.Stringybit END) AS [Version],

    MAX(CASE x.ColNo WHEN 6 THEN x.Stringybit END) AS [Entity]

    FROM (

    SELECT ColNo = ROW_NUMBER() OVER(ORDER BY n.n),

    Startpos = n.n,

    Endpos = ISNULL(NULLIF(CHARINDEX(',', s.DodgyStringData, n.n+1), 0), LEN(s.DodgyStringData)),

    Stringybit = SUBSTRING(s.DodgyStringData, n.n, ISNULL(NULLIF(CHARINDEX(',', s.DodgyStringData, n.n+1), 0), LEN(s.DodgyStringData))-n.n)

    FROM (SELECT TOP 100 [n] = ROW_NUMBER() OVER(ORDER BY a.[ID]) FROM #Sample a, #Sample b, #Sample c, #Sample d) n

    WHERE SUBSTRING(',' + s.DodgyStringData, n.n, 1) = ',') x

    ) q

    And of course on its own it shows nothing, really. I was surprised that it worked without a GROUP BY to support the MAX() aggregate function but that's about it.

    Can I suggest we open a new thread for this discussion, rather than railroading this one?


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • Matt's right: there isn't a cardinality guarantee. APPLY invokes a table-valued-function (TVF) for each row returned by the outer table expression. A TVF can have many rows of output for each row of input, as shown here:

    SELECT *

    FROM (

    -- One row on the input side

    VALUES (1)

    ) Source (col1)

    CROSS

    APPLY (

    SELECT *

    FROM (

    -- Three rows on the APPLY side

    VALUES(1), (2), (3)

    ) V (col2)

    ) TVF;

    Output:

    col1col2

    11

    12

    13

    Paul

  • Paul White (2/20/2010)


    Matt's right: there isn't a cardinality guarantee. APPLY invokes a table-valued-function (TVF) for each row returned by the outer table expression. A TVF can have many rows of output for each row of input, as shown here:

    SELECT *

    FROM (

    -- One row on the input side

    VALUES (1)

    ) Source (col1)

    CROSS

    APPLY (

    SELECT *

    FROM (

    -- Three rows on the APPLY side

    VALUES(1), (2), (3)

    ) V (col2)

    ) TVF;

    Output:

    col1col2

    11

    12

    13

    Paul

    Blimey, what was I thinking? :blush: Of course it works like this, it would be inconsistent for it to work any other way. I guess I got "confined" by the examples I've been working with recently. Thanks guys.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • ChrisM@home (2/21/2010)


    Blimey, what was I thinking? :blush: Of course it works like this, it would be inconsistent for it to work any other way. I guess I got "confined" by the examples I've been working with recently. Thanks guys.

    Heh. No worries, Chris.

  • Paul White (2/20/2010)


    WayneS (2/20/2010)


    ...you also wouldn't be able to use the CROSS APPLY from Paul's post (which I see he has already provided a totally awesome solution that considers this).

    Fixed that for you, Wayne :laugh:

    There you go, putting words in my mouth... errr... post again. :Whistling:

    But I have to agree... after looking at it closer, it is totally awesome 🙂

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Paul White (2/21/2010)


    ChrisM@home (2/21/2010)


    Blimey, what was I thinking? :blush: Of course it works like this, it would be inconsistent for it to work any other way. I guess I got "confined" by the examples I've been working with recently. Thanks guys.

    Heh. No worries, Chris.

    Agreed. we all get stuck in a viewpoint from time to time. This is why we have these conversations: the extra perspectives help to keep us pointing in the right direction.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Sash Mav (2/19/2010)


    Ohh buddy - that worked like a charm. I am such a fool not to think from basics.

    Arun certainly has the correct idea but, being the author of the article that a lot of the folks are citing on this thread, you need to know that unless you follow the rules, you're just asking for the method to fail. The only reason why the code Arun uses works is because it's a Temp Heap and the data was inserted absolutely correctly. Even that's no guarantee. Please see the article others are pointing to and if you're not going to use the method correctly, then don't use it at all.

    Arun, I appreciate you trying to help but you might want to take a peek at the article as well. If you're going to show folks how to use it, then please show them correctly or not at all. Don't do it for your sake. Do it for my sake and the sake of the people you're trying to help. I'm already taking enough heat from folks on the subject and you're not helping by posting incomplete renditions of the method. Please post it right or don't post it. Thanks.

    For those interested (and I know of 2 of you that better be), here's the article.

    http://www.sqlservercentral.com/articles/T-SQL/68467/

    Study it and obey the rules... if you don't, it WILL bite you hard some day in the future especially in SQL Server 2000 where "merry go round" indexes and heaps were the norm rather than the exception.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (2/21/2010)


    [Please see the article others are pointing to and if you're not going to use the method correctly, then do use it at all.

    Jeff, don't you mean "then don't use it at all"? Also, did you intend to include the link to the article? Your post sounds like you meant to...;-)

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (2/21/2010)


    Jeff Moden (2/21/2010)


    [Please see the article others are pointing to and if you're not going to use the method correctly, then do use it at all.

    Jeff, don't you mean "then don't use it at all"? Also, did you intend to include the link to the article? Your post sounds like you meant to...;-)

    Fixed on both parts. Thanks for the catch, Wayne.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • So, would anyone like to post a correct quirky-update solution to the updated requirement, just for completeness?

    If not, I'll have a crack. Not fussed though, I've posted quite enough code already 😉

  • Paul White (2/21/2010)


    So, would anyone like to post a correct quirky-update solution to the updated requirement, just for completeness?

    Good idea Paul... we've talked about the need for doing it right, but it hasn't been done.

    Modifying Arun's original code, we end up with:

    create table #temp

    (

    RowID INT IDENTITY PRIMARY KEY CLUSTERED, -- need a clustered primary key for the quirky update

    date1 datetime null,

    date2 datetime null

    )

    insert into #temp (date1,date2)

    select '2009-01-01','2009-01-01' union all

    select '2009-01-02',null union all

    select '2009-01-03',null union all

    select '2009-01-04',null union all

    select '2009-01-05','2009-01-05' union all

    select '2009-01-06',null union all

    select '2009-01-07',null union all

    select '2009-01-08',null union all

    select '2009-01-09',null union all

    select '2009-01-10','2009-01-10' union all

    select '2009-01-11',null union all

    select '2009-01-12',null union all

    select '2009-01-13',null union all

    select '2009-01-14',null union all

    select '2009-01-15','2009-01-15' union all

    select '2009-01-16',null union all

    select '2009-01-17',null union all

    select '2009-01-18',null union all

    select '2009-01-19',null

    declare @datetime datetime,

    @RowID int

    set @datetime = null

    update #temp

    set @RowID = RowID, -- anchor column from clustered index

    @datetime = date2 = (case when date2 is not null then date2 else @datetime end)

    FROM #temp WITH (TABLOCKX) -- TABLOCKX NOT needed for temp tables, but always use it for consistency so you don't forget it when required!

    OPTION (MAXDOP 1) -- prevent parallelism

    select * from #temp

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Wayne,

    Thanks for that. I'm a bit old-school and also include an INDEX(0) too. I was rather hoping for a quirky implementation for the updated requirements though, the ones I based the following code on.

    Paul White (2/20/2010)


    SQL 2000 version

    UPDATE RowsToUpdate

    SET notional = ISNULL(new_value, 50)

    FROM (

    SELECT T1.value_date_minus_1,

    T1.notional,

    new_value =

    (

    SELECT MIN(notional)

    FROM #tmp_Dates_Notional T2

    WHERE T2.value_date_minus_1 >= T1.value_date_minus_1

    AND T2.notional IS NOT NULL

    )

    FROM #tmp_Dates_Notional T1

    WHERE T1.notional IS NULL

    ) AS RowsToUpdate;

    Paul

  • Paul White (2/21/2010)


    Wayne,

    Thanks for that. I'm a bit old-school and also include an INDEX(0) too. I was rather hoping for a quirky implementation for the updated requirements though, the ones I based the following code on.

    Paul White (2/20/2010)


    SQL 2000 version

    UPDATE RowsToUpdate

    SET notional = ISNULL(new_value, 50)

    FROM (

    SELECT T1.value_date_minus_1,

    T1.notional,

    new_value =

    (

    SELECT MIN(notional)

    FROM #tmp_Dates_Notional T2

    WHERE T2.value_date_minus_1 >= T1.value_date_minus_1

    AND T2.notional IS NOT NULL

    )

    FROM #tmp_Dates_Notional T1

    WHERE T1.notional IS NULL

    ) AS RowsToUpdate;

    Paul

    I guess I didn't understand that one... it sets all the rows to 50 instead of doing the requested data smear.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Paul White (2/21/2010)


    Wayne,

    Thanks for that. I'm a bit old-school and also include an INDEX(0) too. I was rather hoping for a quirky implementation for the updated requirements though...

    I must have missed the updated requirements... let me look at what those are and see what I can do.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

Viewing 15 posts - 31 through 45 (of 62 total)

You must be logged in to reply to this topic. Login to reply