how to do this join?

  • Say I have 2 tables like this.

    CREATE TABLE #TableA (Period CHAR(7), SomeValue INT)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-01',10)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-02',20)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-03',30)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-04',40)

    CREATE TABLE #TableB (Period CHAR(7))
    INSERT #TableB(Period) VALUES ('2018-01')
    INSERT #TableB(Period) VALUES ('2018-02')
    INSERT #TableB(Period) VALUES ('2018-03')
    INSERT #TableB(Period) VALUES ('2018-04')
    INSERT #TableB(Period) VALUES ('2018-05')
    INSERT #TableB(Period) VALUES ('2018-06')
    INSERT #TableB(Period) VALUES ('2018-07')
    INSERT #TableB(Period) VALUES ('2018-08')
    INSERT #TableB(Period) VALUES ('2018-09')

    If I join the 2 tables like this....

    SELECT
        b.Period AS B_Period,
        a.SomeValue
    FROM #TableB b
    LEFT OUTER JOIN #TableA a
        ON a.Period = b.Period

    ... then I have missing values for the periods that are not in #TableA.

    For those values, I would like them to be filled with the latest value in #TableA.
    So in this example, the value 40 from Period '2018-04' should should filled out for '2018-05' to '2018-09'

  • DoolinDalton - Thursday, January 3, 2019 8:25 AM

    Say I have 2 tables like this.

    CREATE TABLE #TableA (Period CHAR(7), SomeValue INT)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-01',10)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-02',20)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-03',30)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-04',40)

    CREATE TABLE #TableB (Period CHAR(7))
    INSERT #TableB(Period) VALUES ('2018-01')
    INSERT #TableB(Period) VALUES ('2018-02')
    INSERT #TableB(Period) VALUES ('2018-03')
    INSERT #TableB(Period) VALUES ('2018-04')
    INSERT #TableB(Period) VALUES ('2018-05')
    INSERT #TableB(Period) VALUES ('2018-06')
    INSERT #TableB(Period) VALUES ('2018-07')
    INSERT #TableB(Period) VALUES ('2018-08')
    INSERT #TableB(Period) VALUES ('2018-09')

    If I join the 2 tables like this....

    SELECT
        b.Period AS B_Period,
        a.SomeValue
    FROM #TableB b
    LEFT OUTER JOIN #TableA a
        ON a.Period = b.Period

    ... then I have missing values for the periods that are not in #TableA.

    For those values, I would like them to be filled with the latest value in #TableA.
    So in this example, the value 40 from Period '2018-04' should should filled out for '2018-05' to '2018-09'

    I'm not saying this is the most efficient way to do it but it works:
    SELECT
    b.Period AS B_Period,
    ISNULL(a.SomeValue,x.SomeValue)  SomeValue
    FROM #TableB b
    LEFT OUTER JOIN #TableA a
    ON a.Period = b.Period
    OUTER APPLY(SELECT TOP(1) SomeValue FROM #TableA x WHERE x.Period < b.Period ORDER BY x.Period DESC) x(SomeValue)

  • Jonathan AC Roberts - Thursday, January 3, 2019 8:43 AM

    DoolinDalton - Thursday, January 3, 2019 8:25 AM

    Say I have 2 tables like this.

    CREATE TABLE #TableA (Period CHAR(7), SomeValue INT)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-01',10)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-02',20)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-03',30)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-04',40)

    CREATE TABLE #TableB (Period CHAR(7))
    INSERT #TableB(Period) VALUES ('2018-01')
    INSERT #TableB(Period) VALUES ('2018-02')
    INSERT #TableB(Period) VALUES ('2018-03')
    INSERT #TableB(Period) VALUES ('2018-04')
    INSERT #TableB(Period) VALUES ('2018-05')
    INSERT #TableB(Period) VALUES ('2018-06')
    INSERT #TableB(Period) VALUES ('2018-07')
    INSERT #TableB(Period) VALUES ('2018-08')
    INSERT #TableB(Period) VALUES ('2018-09')

    If I join the 2 tables like this....

    SELECT
        b.Period AS B_Period,
        a.SomeValue
    FROM #TableB b
    LEFT OUTER JOIN #TableA a
        ON a.Period = b.Period

    ... then I have missing values for the periods that are not in #TableA.

    For those values, I would like them to be filled with the latest value in #TableA.
    So in this example, the value 40 from Period '2018-04' should should filled out for '2018-05' to '2018-09'

    I'm not saying this is the most efficient way to do it but it works:
    SELECT
    b.Period AS B_Period,
    ISNULL(a.SomeValue,x.SomeValue)  SomeValue
    FROM #TableB b
    LEFT OUTER JOIN #TableA a
    ON a.Period = b.Period
    OUTER APPLY(SELECT TOP(1) SomeValue FROM #TableA x WHERE x.Period < b.Period ORDER BY x.Period DESC) x(SomeValue)

    I feel like there is something clever you can do within the JOIN condition.

    I have it like this now ....

    SELECT
        b.Period AS B_Period,
        a.Period AS A_Period,
        a.SomeValue
    FROM #TableB b
    LEFT OUTER JOIN #TableA a
        ON a.Period <= b.Period

    ... and thinking there is a "AND" condition that will involve some kind of MAX.... hmmm.

    I can see yours works but not sure what OUTER APPLY is and want to explore more on the above.


  • SELECT
    b.Period AS B_Period,
    a1.Period AS A_period,
    a1.SomeValue
    FROM #TableB b
    OUTER APPLY (
        SELECT TOP (1) a.*
        FROM #TableA a
        WHERE a.Period <= b.Period
        ORDER BY a.Period DESC
    ) AS a1

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • DoolinDalton - Thursday, January 3, 2019 9:36 AM

    Jonathan AC Roberts - Thursday, January 3, 2019 8:43 AM

    DoolinDalton - Thursday, January 3, 2019 8:25 AM

    Say I have 2 tables like this.

    CREATE TABLE #TableA (Period CHAR(7), SomeValue INT)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-01',10)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-02',20)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-03',30)
    INSERT #TableA(Period,SomeValue) VALUES ('2018-04',40)

    CREATE TABLE #TableB (Period CHAR(7))
    INSERT #TableB(Period) VALUES ('2018-01')
    INSERT #TableB(Period) VALUES ('2018-02')
    INSERT #TableB(Period) VALUES ('2018-03')
    INSERT #TableB(Period) VALUES ('2018-04')
    INSERT #TableB(Period) VALUES ('2018-05')
    INSERT #TableB(Period) VALUES ('2018-06')
    INSERT #TableB(Period) VALUES ('2018-07')
    INSERT #TableB(Period) VALUES ('2018-08')
    INSERT #TableB(Period) VALUES ('2018-09')

    If I join the 2 tables like this....

    SELECT
        b.Period AS B_Period,
        a.SomeValue
    FROM #TableB b
    LEFT OUTER JOIN #TableA a
        ON a.Period = b.Period

    ... then I have missing values for the periods that are not in #TableA.

    For those values, I would like them to be filled with the latest value in #TableA.
    So in this example, the value 40 from Period '2018-04' should should filled out for '2018-05' to '2018-09'

    I'm not saying this is the most efficient way to do it but it works:
    SELECT
    b.Period AS B_Period,
    ISNULL(a.SomeValue,x.SomeValue)  SomeValue
    FROM #TableB b
    LEFT OUTER JOIN #TableA a
    ON a.Period = b.Period
    OUTER APPLY(SELECT TOP(1) SomeValue FROM #TableA x WHERE x.Period < b.Period ORDER BY x.Period DESC) x(SomeValue)

    I feel like there is something clever you can do within the JOIN condition.

    I have it like this now ....

    SELECT
        b.Period AS B_Period,
        a.Period AS A_Period,
        a.SomeValue
    FROM #TableB b
    LEFT OUTER JOIN #TableA a
        ON a.Period <= b.Period

    ... and thinking there is a "AND" condition that will involve some kind of MAX.... hmmm.

    I can see yours works but not sure what OUTER APPLY is and want to explore more on the above.

    Yes, it can be simplified. You will need to still use OUTER APPLY instead of LEFT JOIN:
    SELECT
    b.Period AS B_Period,
    a.SomeValue SomeValue
    FROM #TableB b
    OUTER APPLY(SELECT TOP(1) SomeValue FROM #TableA a WHERE a.Period <= b.Period ORDER BY a.Period DESC) a(SomeValue)

  • You can also do it with a CTE and ROW_NUMBER():
    ;WITH CTE AS (
    SELECT
    b.Period AS B_Period,
    a.SomeValue,
    ROW_NUMBER() OVER (PARTITION BY b.Period ORDER BY a.Period DESC) RowNum
    FROM #TableB b
    LEFT JOIN #TableA a
     ON a.Period <= b.Period
    )
    SELECT B_Period,SomeValue
    FROM CTE
    WHERE RowNum = 1

  • I see most of the answers use APPLY.

    I'm a bit thrown off because I am use to seeing APPLY being used with a function....
    But I think I get it.

    Thank you.

  • DoolinDalton - Thursday, January 3, 2019 10:11 AM

    I see most of the answers use APPLY.

    I'm a bit thrown off because I am use to seeing APPLY being used with a function....
    But I think I get it.

    Thank you.

    CROSS APPLY is a bit like INNER JOIN and OUTER APPLY is a bit like LEFT JOIN.
    The main difference is the APPLY operator is logically executed for each row.

  • DoolinDalton - Thursday, January 3, 2019 10:11 AM

    I see most of the answers use APPLY.

    I'm a bit thrown off because I am use to seeing APPLY being used with a function....
    But I think I get it.

    Thank you.

    And APPLY is still being utilized with a 'function' in this case - it is just that this 'function' is inline and written as a TOP query.  The function is applied to each row in the outer query...

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • APPLY works out to be the same as a correlated subquery that can return more than 1 row for each correlation.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Thursday, January 3, 2019 8:22 PM

    APPLY works out to be the same as a correlated subquery that can return more than 1 row for each correlation.

    I prefer to phrase this as "it can return one than one value", because it can return multiple columns and/or rows.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen - Friday, January 4, 2019 9:22 AM

    Jeff Moden - Thursday, January 3, 2019 8:22 PM

    APPLY works out to be the same as a correlated subquery that can return more than 1 row for each correlation.

    I prefer to phrase this as "it can return one than one value", because it can return multiple columns and/or rows.

    Drew

    True enough.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply