SQL Server Query Writing Help Needed

  • Hi Cadavre,

    Thanks for helping with this request. What you have is almost exactly what i want. However, what i found that a lot of the accounts have last two rows [repcode] as the same with different current date. This is what i want, however, if the last two rows of the record has the same repcode starting with a Z or X or ending with XX, i want to go to the third to the last row to get the repcode.

    So, sorry, but i'm not able to organize in the post how you have it so i will do my best.

    AccountDim table

    Account # RepCode Currentdate

    12345678 LM01 2011-09-01

    12345678 LM02 2011-10-11

    12345678 XAB0 2011-10-25

    12345678 XAB0 2011-10-31

    AccountTable

    Account# RepCode CurrentDate

    12345678 XAB0 2011-10-31

    your query gives me result as such:

    Account# RepCodePR RepCodeCU CurrentDate

    12345678 XAB0 XAB0 2011-10-31

    How can i get this result?

    12345678 LM02 XAB0 2011-10-31

    please advise.

    Your help is most appreciated!

  • Paraphrasing here.

    You want the current info from Account, but you want the last record from Accountdim before the RepCode changed to what is in Account, correct?

  • ankita.patel01 81294 (11/3/2011)


    Hi Cadavre,

    Thanks for helping with this request. What you have is almost exactly what i want. However, what i found that a lot of the accounts have last two rows [repcode] as the same with different current date. This is what i want, however, if the last two rows of the record has the same repcode starting with a Z or X or ending with XX, i want to go to the third to the last row to get the repcode.

    So, sorry, but i'm not able to organize in the post how you have it so i will do my best.

    Sorry, but I find it irritating enough when the requirements are "changed" during coding when I'm at work, there's no chance of me being able to handle it when I don't have to.

    As I said in my previous post, please read this article[/url] --> http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url].

    It explains the best way to post DDL and sample data in a way that is readily consumable by the volunteers that post here and will ensure that someone can post working, tested code for you.

    Either that or I'll leave you in the much more patient and capable hands of Lynn 🙂


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Lynn,

    this is correct! please advise.

  • Cadavre,

    I didnt realize the data was doing this until i saw it. Maybe this will help you.

    CREATE TABLE #Account ([Account#] BIGINT, RepCode CHAR(4))

    INSERT INTO #Account

    SELECT 12345678, 'XX01'

    CREATE TABLE #Accountdim ([Account#] BIGINT, RepCode CHAR(4), CurrentDate DATETIME)

    INSERT INTO #Accountdim

    SELECT 12345678, 'AB01', '2011-09-01'

    UNION ALL SELECT 12345678, 'AB05', '2011-10-18'

    UNION ALL SELECT 12345678, 'XX01', '2011-11-01'

    UNION ALL SELECT 12345678, 'XX01', '2011-11-02'

    expected results::

    Account# RepCode[pr] RepCode[cu] CurrentDate

    -------------------- ----------- ----------- -----------------------

    12345678 AB05 XX01 2011-10-21 00:00:00.000

  • sorry...

    expected result set should be...

    Account# RepCode[pr] RepCode[cu]

    -------------------- ----------- -----------

    12345677 AB05 XX01

  • Does this do basically what you are looking for?

    create table #Account (

    AccountNo int,

    RepCode char(4)

    );

    create table #Accountdim (

    AccountNo int,

    RepCode char(4),

    CurrentDate datetime

    );

    insert into #Account (AccountNo, RepCode)

    values (12345678, 'XX01'),(12345677, 'XAB0'),(12345679,'Z001');

    insert into #Accountdim (AccountNo, RepCode, CurrentDate)

    values

    (12345678, 'AB01', '2011-09-01'),

    (12345678, 'AB05', '2011-10-18'),

    (12345678, 'PP21', '2011-11-01'),

    (12345678, 'XX01', '2011-11-02'),

    (12345677, 'AB01', '2011-08-01'),

    (12345677, 'AB05', '2011-10-08'),

    (12345677, 'TY00', '2011-10-21'),

    (12345677, 'XAB0', '2011-10-30'),

    (12345679, 'AB01', '2011-09-01'),

    (12345679, 'AB05', '2011-09-18'),

    (12345679, 'LM01', '2011-11-01'),

    (12345679, 'Z001', '2011-11-02');

    with cte_Accountdim as (

    select

    ad.AccountNo,

    ad.RepCode,

    CurrentDate,

    row_number() over (partition by ad.AccountNo order by ad.CurrentDate desc) as RowNum

    from

    #Accountdim ad

    inner join #Account ac

    on (ad.AccountNo = ac.AccountNo)

    where

    ad.RepCode <> ac.RepCode

    )

    select

    acc.AccountNo,

    acc.RepCode as RepCodeCurrent,

    ca.RepCode as RepCodePrevious,

    ca.CurrentDate

    from

    #Account acc

    inner join cte_Accountdim ca

    on (acc.AccountNo = ca.AccountNo)

    where

    ca.RowNum = 1

    and (acc.RepCode like '[X,Z]%' or acc.RepCode like '%XX');

    drop table #Account;

    drop table #AccountDim;

  • ankita.patel01 81294 (11/3/2011)


    Cadavre,

    I didnt realize the data was doing this until i saw it. Maybe this will help you.

    CREATE TABLE #Account ([Account#] BIGINT, RepCode CHAR(4))

    INSERT INTO #Account

    SELECT 12345678, 'XX01'

    CREATE TABLE #Accountdim ([Account#] BIGINT, RepCode CHAR(4), CurrentDate DATETIME)

    INSERT INTO #Accountdim

    SELECT 12345678, 'AB01', '2011-09-01'

    UNION ALL SELECT 12345678, 'AB05', '2011-10-18'

    UNION ALL SELECT 12345678, 'XX01', '2011-11-01'

    UNION ALL SELECT 12345678, 'XX01', '2011-11-02'

    expected results::

    Account# RepCode[pr] RepCode[cu] CurrentDate

    -------------------- ----------- ----------- -----------------------

    12345678 AB05 XX01 2011-10-21 00:00:00.000

    How's this?

    SELECT [Account#], RepCode AS 'RepCode[pr]', cuRepCode AS 'RepCode[cu]',

    CurrentDate

    FROM (SELECT CASE WHEN a.RepCode = b.RepCode

    THEN NULL ELSE a.[Account#] END AS [Account#],

    a.CurrentDate, a.RepCode, b.RepCode AS cuRepCode,

    ROW_NUMBER() OVER(PARTITION BY CASE WHEN a.RepCode = b.RepCode

    THEN NULL ELSE a.[Account#] END

    ORDER BY DATEDIFF(dd,0,a.CurrentDate) DESC) AS rn

    FROM #Accountdim a

    INNER JOIN #Account b ON a.[Account#] = b.[Account#]

    WHERE CurrentDate >= '2009-01-01') accountdim_1

    WHERE [Account#] IS NOT NULL AND rn = 1


    --EDIT--

    Perf-testing 🙂

    IF object_id('tempdb..#Account') IS NOT NULL

    BEGIN

    DROP TABLE #Account

    END

    --100 Random rows of data

    SELECT TOP 100 IDENTITY(INT,1,1) AS [Account#],

    CHAR((ABS(CHECKSUM(NEWID())) % 24) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 24) + 65) +

    RIGHT('0'+CONVERT(VARCHAR(2),(ABS(CHECKSUM(NEWID())) % 99) + 1),2) AS RepCode

    INTO #Account

    FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3

    IF object_id('tempdb..#Accountdim') IS NOT NULL

    BEGIN

    DROP TABLE #Accountdim

    END

    --1,000,000 Random rows of data

    SELECT TOP 1000000 IDENTITY(INT,1,1) AS ID,

    (ABS(CHECKSUM(NEWID())) % 100) + 1 AS [Account#],

    CHAR((ABS(CHECKSUM(NEWID())) % 24) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 24) + 65) +

    RIGHT('0'+CONVERT(VARCHAR(2),(ABS(CHECKSUM(NEWID())) % 99) + 1),2) AS RepCode,

    RAND(CHECKSUM(NEWID())) * 30000 + CAST('1945' AS DATETIME) AS CurrentDate

    INTO #Accountdim

    FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3

    --Add index

    CREATE NONCLUSTERED INDEX [tmp_Accountdim__on__currentDate__inc__Account#_&_RepCode]

    ON [dbo].[#Accountdim] ([CurrentDate])

    INCLUDE ([Account#],[RepCode])

    PRINT '========== BASELINE =========='

    SET STATISTICS TIME ON

    SELECT COUNT(*) FROM #Account

    UNION ALL

    SELECT COUNT(*) FROM #Accountdim

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',80)

    PRINT '========== SELECT SUBQUERY =========='

    SET STATISTICS TIME ON

    SELECT [Account#], RepCode AS 'RepCode[pr]', cuRepCode AS 'RepCode[cu]',

    CurrentDate

    FROM (SELECT CASE WHEN a.RepCode = b.RepCode

    THEN NULL ELSE a.[Account#] END AS [Account#],

    a.CurrentDate, a.RepCode, b.RepCode AS cuRepCode,

    ROW_NUMBER() OVER(PARTITION BY CASE WHEN a.RepCode = b.RepCode

    THEN NULL ELSE a.[Account#] END

    ORDER BY DATEDIFF(dd,0,a.CurrentDate) DESC) AS rn

    FROM #Accountdim a

    INNER JOIN #Account b ON a.[Account#] = b.[Account#]

    WHERE CurrentDate >= '2009-01-01') accountdim_1

    WHERE [Account#] IS NOT NULL AND rn = 1

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',80)

    PRINT '========== CTE =========='

    SET STATISTICS TIME ON

    ;

    WITH cte_Accountdim AS (

    SELECT ad.[Account#], ad.RepCode, CurrentDate,

    ROW_NUMBER() OVER (PARTITION BY ad.[Account#] ORDER BY ad.CurrentDate DESC) AS RowNum

    FROM #Accountdim ad

    INNER JOIN #Account ac ON (ad.[Account#] = ac.[Account#])

    WHERE ad.RepCode <> ac.RepCode)

    SELECT acc.[Account#], acc.RepCode AS RepCodeCurrent,

    ca.RepCode AS RepCodePrevious, ca.CurrentDate

    FROM #Account acc

    INNER JOIN cte_Accountdim ca ON (acc.[Account#] = ca.[Account#])

    WHERE ca.RowNum = 1

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',80)

    ========== BASELINE ==========

    (2 row(s) affected)

    SQL Server Execution Times:

    CPU time = 187 ms, elapsed time = 195 ms.

    ================================================================================

    ========== SELECT SUBQUERY ==========

    (100 row(s) affected)

    SQL Server Execution Times:

    CPU time = 1045 ms, elapsed time = 873 ms.

    ================================================================================

    ========== CTE ==========

    (100 row(s) affected)

    SQL Server Execution Times:

    CPU time = 4906 ms, elapsed time = 2564 ms.

    ================================================================================


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • How about this?

    SELECT A.[Account#] , A.RepCode [OldRepCode] , CrsAppOutput.RepCode [CurrentRepCode]

    FROM #Account A

    CROSS APPLY

    ( SELECT TOP 1 AD.RepCode

    FROM #Accountdim AD

    WHERE A.[Account#] = AD.[Account#]

    AND A.RepCode <> AD.RepCode

    AND AD.CurrentDate >= '2009-01-01'

    ORDER BY AD.CurrentDate DESC

    ) CrsAppOutput

    WHERE

    ( A.RepCode LIKE '[X,Z]%'

    OR A.RepCode LIKE '%XX' )

  • this is fine. thank you for all your help lynn and Cadavre!

  • ankita.patel01 81294 (11/3/2011)


    this is fine. thank you for all your help lynn and Cadavre!

    Test all three solutions against your own data, but ColdCoffee's CROSS APPLY is looking unbeatable on my test set-up.

    BEGIN TRAN

    IF object_id('tempdb..#Account') IS NOT NULL

    BEGIN

    DROP TABLE #Account

    END

    --100 Random rows of data

    SELECT TOP 100 IDENTITY(INT,1,1) AS [Account#],

    CHAR((ABS(CHECKSUM(NEWID())) % 24) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 24) + 65) +

    RIGHT('0'+CONVERT(VARCHAR(2),(ABS(CHECKSUM(NEWID())) % 99) + 1),2) AS RepCode

    INTO #Account

    FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3

    IF object_id('tempdb..#Accountdim') IS NOT NULL

    BEGIN

    DROP TABLE #Accountdim

    END

    --1,000,000 Random rows of data

    SELECT TOP 1000000 IDENTITY(INT,1,1) AS ID,

    (ABS(CHECKSUM(NEWID())) % 100) + 1 AS [Account#],

    CHAR((ABS(CHECKSUM(NEWID())) % 24) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 24) + 65) +

    RIGHT('0'+CONVERT(VARCHAR(2),(ABS(CHECKSUM(NEWID())) % 99) + 1),2) AS RepCode,

    RAND(CHECKSUM(NEWID())) * 30000 + CAST('1945' AS DATETIME) AS CurrentDate

    INTO #Accountdim

    FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3

    --Add index

    CREATE NONCLUSTERED INDEX [tmp_Accountdim__on__currentDate__inc__Account#_&_RepCode]

    ON [dbo].[#Accountdim] ([CurrentDate])

    INCLUDE ([Account#],[RepCode])

    PRINT '========== BASELINE =========='

    SET STATISTICS TIME ON

    SELECT COUNT(*) FROM #Account

    UNION ALL

    SELECT COUNT(*) FROM #Accountdim

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',80)

    PRINT '========== SELECT SUBQUERY =========='

    SET STATISTICS TIME ON

    SELECT [Account#], RepCode AS 'RepCode[pr]', cuRepCode AS 'RepCode[cu]',

    CurrentDate

    FROM (SELECT CASE WHEN a.RepCode = b.RepCode

    THEN NULL ELSE a.[Account#] END AS [Account#],

    a.CurrentDate, a.RepCode, b.RepCode AS cuRepCode,

    ROW_NUMBER() OVER(PARTITION BY CASE WHEN a.RepCode = b.RepCode

    THEN NULL ELSE a.[Account#] END

    ORDER BY DATEDIFF(dd,0,a.CurrentDate) DESC) AS rn

    FROM #Accountdim a

    INNER JOIN #Account b ON a.[Account#] = b.[Account#]

    WHERE CurrentDate >= '2009-01-01') accountdim_1

    WHERE [Account#] IS NOT NULL AND rn = 1

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',80)

    PRINT '========== CTE =========='

    SET STATISTICS TIME ON

    ;

    WITH cte_Accountdim AS (

    SELECT ad.[Account#], ad.RepCode, CurrentDate,

    ROW_NUMBER() OVER (PARTITION BY ad.[Account#] ORDER BY ad.CurrentDate DESC) AS RowNum

    FROM #Accountdim ad

    INNER JOIN #Account ac ON (ad.[Account#] = ac.[Account#])

    WHERE ad.RepCode <> ac.RepCode)

    SELECT acc.[Account#], acc.RepCode AS RepCodeCurrent,

    ca.RepCode AS RepCodePrevious, ca.CurrentDate

    FROM #Account acc

    INNER JOIN cte_Accountdim ca ON (acc.[Account#] = ca.[Account#])

    WHERE ca.RowNum = 1

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',80)

    PRINT '========== CROSS APPLY =========='

    SET STATISTICS TIME ON

    SELECT A.[Account#] , A.RepCode [OldRepCode] , CrsAppOutput.RepCode [CurrentRepCode]

    FROM #Account A

    CROSS APPLY

    ( SELECT TOP 1 AD.RepCode

    FROM #Accountdim AD

    WHERE A.[Account#] = AD.[Account#]

    AND A.RepCode <> AD.RepCode

    AND AD.CurrentDate >= '2009-01-01'

    ORDER BY AD.CurrentDate DESC

    ) CrsAppOutput

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',80)

    ROLLBACK

    ========== BASELINE ==========

    (2 row(s) affected)

    SQL Server Execution Times:

    CPU time = 188 ms, elapsed time = 188 ms.

    ================================================================================

    ========== SELECT SUBQUERY ==========

    (100 row(s) affected)

    SQL Server Execution Times:

    CPU time = 1108 ms, elapsed time = 828 ms.

    ================================================================================

    ========== CTE ==========

    (100 row(s) affected)

    SQL Server Execution Times:

    CPU time = 5189 ms, elapsed time = 2790 ms.

    ================================================================================

    ========== CROSS APPLY ==========

    (100 row(s) affected)

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 10 ms.

    ================================================================================


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Results from my laptop ( my laptop sucks 😀 )

    (100 row(s) affected)

    (1000000 row(s) affected)

    ========== BASELINE ==========

    (2 row(s) affected)

    SQL Server Execution Times:

    CPU time = 110 ms, elapsed time = 112 ms.

    ================================================================================

    ========== SELECT SUBQUERY ==========

    (100 row(s) affected)

    SQL Server Execution Times:

    CPU time = 1074 ms, elapsed time = 883 ms.

    ================================================================================

    ========== CTE ==========

    (100 row(s) affected)

    SQL Server Execution Times:

    CPU time = 4510 ms, elapsed time = 4002 ms.

    ================================================================================

    ========== CROSS APPLY ==========

    (100 row(s) affected)

    SQL Server Execution Times:

    CPU time = 16 ms, elapsed time = 12 ms.

    ================================================================================

Viewing 12 posts - 16 through 26 (of 26 total)

You must be logged in to reply to this topic. Login to reply