SQL Server Query Writing Help Needed

  • Hello,

    I need to write a query that does the following.

    We have two tables.

    Account - Need to pull fields RepCode [this is the current rep code assigned to the account, need only those starting with Z, X or ending in XX], Account #

    AccountDim - need to pull fields repcode [this is the previous rep code that is the second last repcode before it is changed to the account tables rep code], most current date

    Data looks like

    Account table

    Account# RepCode

    12345678 XX01

    12345677 XAB0

    12345679 Z001

    Accountdim table

    Account# RepCode CurrentDate

    12345678 AB01 2011-09-01

    12345678 AB05 2011-10-18

    12345678 PP21 2011-11-01

    12345678 XX01 2011-11-02

    12345677 AB01 2011-08-01

    12345677 AB05 2011-10-08

    12345677 TY00 2011-10-21

    12345677 XAB0 2011-10-30

    12345679 AB01 2011-09-01

    12345679 AB05 2011-09-18

    12345679 LM01 2011-11-01

    12345679 Z001 2011-11-02

    Expected result:

    Account# RepCode[pr] RepCode[cu] CurrentDate

    12345678 PP21 XX01

    12345677 TY00 XAB0

    12345679 LM01 Z001

    These are huge tables.. both contain millions of rows and hundred or so fields.

    can someone help with writing the query ?? I need to get this done ASAP...

    Thanks!!!!

  • Here you go. Let me know if you have any questions.

    ;WITH [PreviousUpdate]

    AS

    ( SELECT AccountNo

    ,MAX(RecordDate) AS [CurrentDate]

    FROM AccountDim

    GROUP BY AccountNo

    ),

    [PreviousRep]

    AS

    ( SELECT AccountNo

    ,RepCode

    FROM AccountDim

    JOIN PreviousUpdate

    ON AccountDim.AccountNo = PreviousUpdate.AccountNo

    AND AccountDim.RecordDate = PreviousUpdate.CurrentDate

    )

    SELECT Account.AccountNo AS [AccountNo]

    ,Account.RepCode AS [CurrentRepCode]

    ,PreviousRep.RepCode AS [PreviousRepCode]

    ,PreviousUpdate.CurrentDate AS [CurrentDate]

    FROM Account

    JOIN PreviousUpdate

    ON Account.AccountNo = PreviousUpdate.AccountNo

    JOIN PreviousRep

    ON Account.AccountNo = PreviousRep.AccountNo

    WHERE Account.RepCode LIKE '[X,Z]%'

    OR Account.RepCode LIKE '%XX'

  • Try this for a starting point.

    with cte_Accountdim as (

    select

    AccountNo,

    RepCode,

    CurrentDate,

    row_number() over (partition by AccountNo order by CurrentDate desc) as RowNum

    from

    dbo.Accountdim

    )

    select

    acc.AccountNo,

    acc.RepCode,

    ca.AccountNo,

    ca.RepCode,

    ca.CurrentDate

    from

    dbo.Account acc

    inner join cte_Accountdim ca

    on (acc.AccountNo = ca.AccountNo)

    where

    ca.RowNum = 1;

  • Fixed my code a bit:

    with cte_Accountdim as (

    select

    AccountNo,

    RepCode,

    CurrentDate,

    row_number() over (partition by AccountNo order by CurrentDate desc) as RowNum

    from

    dbo.Accountdim

    )

    select

    acc.AccountNo,

    acc.RepCode,

    ca.AccountNo,

    ca.RepCode,

    ca.CurrentDate

    from

    dbo.Account acc

    inner join cte_Accountdim ca

    on (acc.AccountNo = ca.AccountNo)

    where

    ca.RowNum = 1

    and (acc.RepCode like '[X,Z]%' or acc.RepCode like '%XX');

  • David Moutray (11/2/2011)


    Here you go. Let me know if you have any questions.

    ;WITH [PreviousUpdate]

    AS

    ( SELECT AccountNo

    ,MAX(RecordDate) AS [CurrentDate]

    FROM AccountDim

    GROUP BY AccountNo

    ),

    [PreviousRep]

    AS

    ( SELECT AccountNo

    ,RepCode

    FROM AccountDim

    JOIN PreviousUpdate

    ON AccountDim.AccountNo = PreviousUpdate.AccountNo

    AND AccountDim.RecordDate = PreviousUpdate.CurrentDate

    )

    SELECT Account.AccountNo AS [AccountNo]

    ,Account.RepCode AS [CurrentRepCode]

    ,PreviousRep.RepCode AS [PreviousRepCode]

    ,PreviousUpdate.CurrentDate AS [CurrentDate]

    FROM Account

    JOIN PreviousUpdate

    ON Account.AccountNo = PreviousUpdate.AccountNo

    JOIN PreviousRep

    ON Account.AccountNo = PreviousRep.AccountNo

    WHERE Account.RepCode LIKE '[X,Z]%'

    OR Account.RepCode LIKE '%XX'

    Sure do have a lot joins going on here. Don't really need all of those if you are using SQL Server 2005 or later.

  • Here is another option:

    with cte_maxacc as (

    select

    acc.AccountNo,

    acc.RepCode,

    max(ad.CurrentDate) as MaxDate

    from

    dbo.Account acc

    inner join dbo.Accountdim ad

    on (acc.AccountNo = ad.AccountNo)

    group by

    acc.AccountNo,

    acc.RebCode

    )

    select

    cma.AccountNo,

    cma.RepCode,

    acd.RepCode,

    cma.MaxDate

    from

    cte_maxacc cma

    inner join dbo.Accountdim acd

    on (cma.AccountNo = acd.AccountNo and

    cma.MaxDate = acd.CurrentDate)

    where

    cma.RepCode like '[X,Z]' or

    cma.RepCode like '%XX';

  • Account - Need to pull fields RepCode [this is the current rep code assigned to the account, need only those starting with Z, X or ending in XX]

    Because of this requirement, the queries as stated will not perform well. A search clause like WHERE RepCode LIKE '[X,Z]%' is not SARGable, and it can't take advantage of an index on the RepCode field. That is going to be a major factor in performance for a table with millions of records.

    One way to get around this would be to create a computed column against RepCode. Depending on the distribution of data, you might get a performance boost by creating a filtered, persisted index against the computed column. Then the query could use the computed column in the WHERE clause.

    That's a lot of trouble to go to unless performance is important, though.

  • This is not bringing back expected result or the expected row count back.

  • ankita.patel01 81294 (11/3/2011)


    This is not bringing back expected result or the expected row count back.

    That's not going to help anyone fix the code...

    What's it doing wrong? What should the queries be returning and what are they returning? Remember we can't see your screen and we can't read your mind.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • sorry. i was in the process of putting my thoughts together so i can show you what i meant in the comments...

    i know i missed to include this, but i put the filter in the where clause on my own, i need current date > '2009-01-01', but results are coming back with dates < 2009.

    the current (max) date being pulled is coming wrong.

    the current and previous repcodes are exactly the same.

  • this is the result i want.

    Account# RepCode[pr] RepCode[cu] CurrentDate

    12345678 PP21 XX01

    12345677 TY00 XAB0

    12345679 LM01 Z001

    i want account# from tblAccount, Previous second last rep code from tblaccountdim, current repcode from tblaccount.

    Thanks!!!!! I appreciate any help!

  • i have this query... need help combining into one.

    select max(currentdate) as currentdate

    , d.accountnumber

    , d.repcode as PreviousRepCode

    into #tempdim

    from tblaccountdim d

    where d.currentdate > '2009-01-01'

    and ( left(d.repcode ,1) not in ('X', 'Z')

    --left(d.repcode),1) < 'Z'

    and right(rtrim(d.repcode),2) != 'XX')

    group by accountnumber, d.repcode

    --closed accounts

    select accountnumber

    , repcode

    --into #tempacct

    from tblaccountdimcurrent

    where currentdate > '2011-10-10'

    and (repcode like 'X%'

    or repcode like 'Z%'

    or repcode like '%XX')

    select Currentdate

    , d.PreviousRepcode

    , a.Repcode as CurrentRepCode

    , a.account

    into #tempall

    from #tempacct a

    inner join #tempdim d

    on a.account = d.accountnumber

    any thoughts??

  • Hello and welcome to SSC!

    It seems that your DDL script has become detached from your post, or perhaps you were unaware of the benefits of providing one.

    When you have time, please read this article[/url] about the best way to provide us with working sample data and DDL scripts. This will allow the unpaid volunteers of this site to provide you with working, tested code for your particular problem.

    For now, I've had a guess at your requirements based on the posts by yourself and others.

    First, lets' create a test environment.

    CREATE TABLE #Account ([Account#] BIGINT, RepCode CHAR(4))

    INSERT INTO #Account

    SELECT 12345678, 'XX01'

    UNION ALL SELECT 12345677, 'XAB0'

    UNION ALL SELECT 12345679, 'Z001'

    CREATE TABLE #Accountdim ([Account#] BIGINT, RepCode CHAR(4), CurrentDate DATETIME)

    INSERT INTO #Accountdim

    SELECT 12345678, 'AB01', '2011-09-01'

    UNION ALL SELECT 12345678, 'AB05', '2011-10-18'

    UNION ALL SELECT 12345678, 'PP21', '2011-11-01'

    UNION ALL SELECT 12345678, 'XX01', '2011-11-02'

    UNION ALL SELECT 12345677, 'AB01', '2011-08-01'

    UNION ALL SELECT 12345677, 'AB05', '2011-10-08'

    UNION ALL SELECT 12345677, 'TY00', '2011-10-21'

    UNION ALL SELECT 12345677, 'XAB0', '2011-10-30'

    UNION ALL SELECT 12345679, 'AB01', '2011-09-01'

    UNION ALL SELECT 12345679, 'AB05', '2011-09-18'

    UNION ALL SELECT 12345679, 'LM01', '2011-11-01'

    UNION ALL SELECT 12345679, 'Z001', '2011-11-02'

    Now let's look at the query.

    SELECT account.[Account#], accountdim.RepCode AS 'RepCode[pr]',

    account.RepCode AS 'RepCode[cu]', accountdim.CurrentDate

    FROM #Account account

    INNER JOIN (SELECT [Account#], CurrentDate, RepCode

    FROM (SELECT [Account#], CurrentDate, RepCode,

    ROW_NUMBER() OVER(PARTITION BY [Account#] ORDER BY CurrentDate DESC) AS rn

    FROM #Accountdim

    WHERE CurrentDate >= '2009-01-01') accountdim_1

    WHERE rn = 2) accountdim ON account.[Account#] = accountdim.[Account#]

    Which returns

    Account# RepCode[pr] RepCode[cu] CurrentDate

    -------------------- ----------- ----------- -----------------------

    12345677 TY00 XAB0 2011-10-21 00:00:00.000

    12345678 PP21 XX01 2011-11-01 00:00:00.000

    12345679 LM01 Z001 2011-11-01 00:00:00.000

    Is that what you're after?

    --EDIT--

    And here's some performance testing.

    IF object_id('tempdb..#Account') IS NOT NULL

    BEGIN

    DROP TABLE #Account

    END

    --100 Random rows of data

    SELECT TOP 100 IDENTITY(INT,1,1) AS [Account#],

    CHAR((ABS(CHECKSUM(NEWID())) % 24) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 24) + 65) +

    RIGHT('0'+CONVERT(VARCHAR(2),(ABS(CHECKSUM(NEWID())) % 99) + 1),2) AS RepCode

    INTO #Account

    FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3

    IF object_id('tempdb..#Accountdim') IS NOT NULL

    BEGIN

    DROP TABLE #Accountdim

    END

    --1,000,000 Random rows of data

    SELECT TOP 1000000 IDENTITY(INT,1,1) AS ID,

    (ABS(CHECKSUM(NEWID())) % 100) + 1 AS [Account#],

    CHAR((ABS(CHECKSUM(NEWID())) % 24) + 65) + CHAR((ABS(CHECKSUM(NEWID())) % 24) + 65) +

    RIGHT('0'+CONVERT(VARCHAR(2),(ABS(CHECKSUM(NEWID())) % 99) + 1),2) AS RepCode,

    RAND(CHECKSUM(NEWID())) * 30000 + CAST('1945' AS DATETIME) AS CurrentDate

    INTO #Accountdim

    FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3

    --Add index

    CREATE NONCLUSTERED INDEX [tmp_Accountdim__on__currentDate__inc__Account#_&_RepCode]

    ON [dbo].[#Accountdim] ([CurrentDate])

    INCLUDE ([Account#],[RepCode])

    PRINT '========== BASELINE =========='

    SET STATISTICS TIME ON

    SELECT COUNT(*) FROM #Account

    UNION ALL

    SELECT COUNT(*) FROM #Accountdim

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',80)

    PRINT '========== QUERY =========='

    SET STATISTICS TIME ON

    SELECT account.[Account#], accountdim.RepCode AS 'RepCode[pr]',

    account.RepCode AS 'RepCode[cu]', accountdim.CurrentDate

    FROM #Account account

    INNER JOIN (SELECT [Account#], CurrentDate, RepCode

    FROM (SELECT [Account#], CurrentDate, RepCode,

    ROW_NUMBER() OVER(PARTITION BY [Account#] ORDER BY CurrentDate DESC) AS rn

    FROM #Accountdim

    WHERE CurrentDate >= '2009-01-01') accountdim_1

    WHERE rn = 2) accountdim ON account.[Account#] = accountdim.[Account#]

    ORDER BY account.[Account#]

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',80)

    ========== BASELINE ==========

    (2 row(s) affected)

    SQL Server Execution Times:

    CPU time = 141 ms, elapsed time = 144 ms.

    ================================================================================

    ========== QUERY ==========

    (100 row(s) affected)

    SQL Server Execution Times:

    CPU time = 625 ms, elapsed time = 369 ms.

    ================================================================================


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • ankita.patel01 81294 (11/3/2011)


    sorry. i was in the process of putting my thoughts together so i can show you what i meant in the comments...

    i know i missed to include this, but i put the filter in the where clause on my own, i need current date > '2009-01-01', but results are coming back with dates < 2009.

    the current (max) date being pulled is coming wrong.

    the current and previous repcodes are exactly the same.

    Please elaborate on the above, and well as on this requirement:

    AccountDim - need to pull fields repcode [this is the previous rep code that is the second last repcode before it is changed to the account tables rep code], most current date

  • Hey Lynn,

    The post by 'Cadavre' is what i'm trying to do... i did put my query and i can see it... but i'm not sure if you can see it in my post.

    SELECT account.[Account#], accountdim.RepCode AS 'RepCode[pr]',

    account.RepCode AS 'RepCode[cu]', accountdim.CurrentDate

    FROM #Account account

    INNER JOIN (SELECT [Account#], CurrentDate, RepCode

    FROM (SELECT [Account#], CurrentDate, RepCode,

    ROW_NUMBER() OVER(PARTITION BY [Account#] ORDER BY CurrentDate DESC) AS rn

    FROM #Accountdim

    WHERE CurrentDate >= '2009-01-01') accountdim_1

    WHERE rn = 2) accountdim ON account.[Account#] = accountdim.[Account#]

    But i'm getting this error message.

    Msg 4104, Level 16, State 1, Line 1

    The multi-part identifier "tblaccount.Account" could not be bound.

    Msg 4104, Level 16, State 1, Line 1

    The multi-part identifier "tblaccount.Account" could not be bound.

    Msg 4104, Level 16, State 1, Line 1

    The multi-part identifier "tblaccount.RepCode" could not be bound.

    any thoughts?

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply