Most efficient way to count subset of rows in a table?

  • Hi all,

    I’ve a table with production objects, and another with possible items composition of the object. I need to count how many occurrences of each standard composition appears:

    Table PROD:

    PROD_ID    COMPONENT   TYPE

    --------- ----------- -----

    1         AAA         X

    1         BBB         Y

    2         AAA         X

    3         BBB         Y

    4         AAA         Y

    5         AAA         X

    5         BBB         Y

    Table ITEM_COMP

    ITEM_ID   COMPONENT   TYPE

    --------  ---------   -----

    7         AAA         X

    7         BBB         Y

    8         AAA         X

    9         BBB         Y

    The result should be:

    ITEM_ID   OCCURRENCES

    --------  ----

    7         2

    8         1

    9         1

    Table PROD have millions of rows, my way is too slow (I’ve a loop where each PROD object are separately queried against the ITEM_COMP), some have an idea for a most efficient way?

    > >

    ThX

    > >

    NeuralC

  • Not sure I get what you need but this will procude the requested resultset.

    Select ITEM_ID, count(*) as Total from dbo.ITEM_COMP

  • Ok! My example isn't very happy!!

     

    Changing the tables:

    PROD_ID    COMPONENT   TYPE

    --------- ----------- -----

    1         AAA         X

    1         BBB         Y

    2         AAA         X

    3         BBB         Y

    4         AAA         Y

    5         AAA         X

    5         BBB         Y

    6         AAA         X

    6         BBB         Y

    7         AAA         X

    Table ITEM_COMP

    ITEM_ID   COMPONENT   TYPE

    --------  ---------   -----

    7         AAA         X

    7         BBB         Y

    8         AAA         X

    9         BBB         Y

    The result should be:

    ITEM_ID   OCCURRENCES

    --------  ----

    7         3

    8         2

    9         1

     

    ThX

    NeuralC

  • So you need to count the number of time the items are in both the ITEM_COMP and products table, but only if they are in the ITEM_COMP table?

  • Then item_id 8 should correspond to 1, shouldn't it?

  • Hey jesper.. delete your first post... or edit the table tags.

  • Sure, thanks

  • Well, what I was trying to say was that I don't see how the result arrives - and that Remi is missing group by item_id in his first suggestion

  • Not really...

    An example (for simplicity I ignore the TYPE column): the item with ID 7 (in ITEM_COMP table) should be composed by component AAA and BBB. In my products table, I’ve 3 products with this composition: products with ID 1, 5 and 6.

    Resuming I’m counting products with the same composition of the items ‘x’…

    ThX

    NeuralC

     

  • OK, I understand what you want, it's a bit complicated

    I have to leave now, I will have a look at it tomorrow, if it is still unsolved (but that's not likely when Remi is arond ).

  • Don't have a lot of time to speed this up but at least is going to give you what you need:

     

    select item_id,( select  count(distinct prod_id )

        from prod p1

        where exists( select * from  item_comp it1

             where it1.component = p1.component and it1.type =p1.type and it1.item_id = itm.item_id)

          and cnt = (select count(*) from prod p2 where p1.prod_id = p2.prod_id) 

       &nbsp occurrences 

    from

    (

    select item_id, count(*) cnt

    from item_comp

    group by item_id

    ) itm

     


    * Noel

  • I thought about it at home and came up with the following. As Noel, I haven't thought much about performance. Maybe you should consider making dt1, dt2 and dt3 real tables (instead of derived tables), such that they can be indexed.

    create table prod (PROD_ID int, COMPONENT varchar(3), TYPE varchar(1))

    go

    insert into prod

    select 1,         'AAA',         'X'

    union all

    select 1,         'BBB',         'Y'

    union all

    select 2,         'AAA',         'X'

    union all

    select 3,         'BBB',         'Y'

    union all

    select 4,         'AAA',         'Y'

    union all

    select 5,         'AAA',         'X'

    union all

    select 5,         'BBB',         'Y'

    union all

    select 6,         'AAA',         'X'

    union all

    select 6,         'BBB',         'Y'

    union all

    select 7,         'AAA',         'X'

     

    create table ITEM_COMP(ITEM_ID int, COMPONENT varchar(3), TYPE varchar(1))

    go

    insert into ITEM_COMP

    select 7,         'AAA',         'X'

    union all

    select 7,         'BBB',         'Y'

    union all

    select 8,         'AAA',         'X'

    union all

    select 9,         'BBB',         'Y'

    select dt1.item_id, count(*) as occurencies from

    (

    select c.item_id, p.prod_id, count(*) as count

    from item_comp c inner join prod p on c.component = p.component and c.type = p.type

    group by c.item_id, p.prod_id

    )

    dt1

    inner join

    (

    select item_id, count(*) as count from item_comp group by item_id

    )

    dt2

    on dt1.item_id = dt2.item_id

    inner join

    (

    select prod_id, count(*) as count from prod group by prod_id

    )

    dt3

    on dt1.prod_id = dt3.prod_id

    where dt1.count = dt2.count and dt1.count = dt3.count

    group by dt1.item_id

  • Hi all,

    Noeld, you approach aren’t correct. If you have one item with same number of components, and only one is equals it counts it. One example: if you change ITEM_ID 7 to ‘AAA’ + ‘CCC’ (instead of ‘AAA’ + ‘BBB’) it counts also 3 occurrences (it’s because of the exists).

    Jesper, your approach works ok, in half of the time of the previous approach. I compare 30000 products (with 2 components=>60000 rows) with 10 items in 4 seconds.

     Thanks both,

    NeuralC

     

  • NeuralC

    You are correct the exists was my oversite I stand corrected and I am glad you got the right solution from Jesper

     


    * Noel

  • I should add that if an item can appear more than once in an item composition, or in a product, then my approach will fail... This can be fixed as well, but there is no need to go into this if it won't happen...

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply