Extract rows from a table/CTE when a column matches

  • HI All

    I have been struggling with what seems to be a fairly simple piece of t-sql code.

    I have a CTE which creates the following results.

    Invoice       First4     AmountMst

    94579320  9457      -27144.0000000000000000

    94593496  9459     -27144.0000000000000000

    SI012101   SI01      -30480.4500000000000000

    SI012102   SI01      -30480.4500000000000000

    Hopefully you can see there are four unique invoices, however two rows have the same first four characters the column is First4

    All I want to return from the next query is the two rows with matching First4 SI01

    I have tried creating two CTEs and joining them to each other. I have tried group by but this just gives me the one row SI01 I need both.

    Here is the code which generates the CTE which won't help much I know.

    ; with dupes as

    (

    SELECT Invoice

    ,SUBSTRING(Invoice,1,4) as First4

    ,AmountMst

    from vendtrans

    where invoice in

    (

    '94579320'

    ,'94593496'

    ,'SI012101'

    ,'SI012102')

    )

    select * from dupes

    The next step I need to do is the one I can't seem to crack.

    Any help greatly appreciated.

     

     

  • I might be tempted to do it in 2 stages

    select * from dupes inner join (select first4 from dupes group by first4 having count(*) >1) dupes2  on dupes2.first4=dupes.first4

    you might have to fiddle with that as i did it in a web browser and i don't have your original table

     

    MVDBA

  • Thanks Mike

    I ended up doing this

     

    ;WITH First4Count_Cte (Invoice, First4, First4Count, Amountmst) AS (

    select

    Invoice,

    First4,

    row_number () OVER (PARTITION BY First4 ORDER BY First4) as First4Count,

    Amountmst

    from tbl_test

    ) select * from First4Count_Cte

    where First4 in (

    select First4 from First4Count_Cte where First4Count > 1

    )

  • Your solution will get the correct result, but it is running the entire First4Count_Cte twice.    Basically generating a huge number of rows and then filtering them out.    The GROUP BY solution generates a smaller number of rows and joins directly to the dupes table.   If you care about performance, you should probably test both solutions.    I suspect the CPU load will be lighter using the solution using the GROUP BY.

     

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Could we by any chance look at the underlying CTE that you mentioned?

     

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply