Nth Select

  • I have a table with 85224 rows.  There are 89 distinct codes in the table.  My challenge is:

     

    Perform 3 NTH selects on this table to have the final result of:

     

    25,000

    25,000

    25,000

    Remaining 10,224

     

    My problem here is that I have to take approximately the same number of rows per code and quantities per code can vary: for example code 228 has one row, code 114 has 7909 rows.

     

    The only business rule is if there is one row per code it should go to first group of 25000 if there are 3 rows per code one should go to each of the 25,000 groups.  I tried to resolve this with cursor, but without success. SEQN is unique, ID is not.

     

    Sample of data

    SEQN

    Code

    ID

    1

    214

    100489

    2

    214

    100622

    3

    415

    1006842

    4

    414

    1007003

    5

    416

    1007018

    6

    214

    1007126

    7

    117

    1007255

    8

    215

    1007274

    9

    413

    1007281

    10

    115

    1007587

    11

    214

    1007721

    12

    416

    100775

    13

    216

    100840

    14

    411

    100871

    15

    213

    1010619

    16

    114

    1010623

    17

    425

    1010636

    18

    214

    101189

    19

    224

    101364

    20

    413

    101776

    21

    134

    102031

    22

    212

    102469

    23

    114

    102701

    24

    136

    102702

    25

    112

    10287

    26

    214

    102987

    27

    315

    103080

    28

    434

    103373

    29

    413

    103422

    30

    213

    103802

     

     

    Any help would be much appreciated.

     

    Thank you everyone.

     

    M.

     

  • So If i read this right a code with 9 records would be divided between the first three but a record with 10 would be divided between the 4?

    --step 1 distribute your codes(89 rows)

    --table1

    insert tbl1

    select distinct code,[id]

    from tbl

    --table2

    insert tbl2

    select distinct code,[id]

    from tbl

    where exists(

    select code

    from tbl

    where tbl.code = t2.code

    group by code

    having count(*)>=2

    )

    --table3

    Insert tbl3

    select distinct code,[id]

    from tbl

    where exists(

    select code

    from tbl t2

    where tbl.code = t2.code

    group by code

    having count(*)=3

    )

    --step2 create your population for table 1

    declare @C int,@ct int

    set @C = (select min(code) from tbl1)

    set @CT = 0

    while @C is not null or @CT < 25000

    begin

    Insert fintabl1

    select top 33 percent seqn,code,[id]

    from tbl t join tbl1 t1

    on t.code = t.code

    where t.code = @C

    order by code,seqn

    set @CT = (select count(*) from tbl1)

    set @C = (select min(code) from tbl1 where code > @C)

    end

    --create your population for table 2

    set @C = (select min(code) from tbl2)

    set @CT = 0

    while @C is not null or @CT < 25000

    begin

    Insert fintabl2

    select top 33 percent seqn,code,[id]

    from tbl t join tbl2 t2

    on t.code = t.code

    where t.code = @C

    and not exists (

    select seqn

    from tbl1 t1

    where t1.seqn = t.seqn

    )

    order by code,seqn

    set @CT = (select count(*) from tbl1)

    set @C = (select min(code) from tbl2 where code > @C)

    end

    --create your population for table 3

    set @C = (select min(code) from tbl3)

    set @CT = 0

    while @C is not null or @CT < 25000

    begin

    Insert fintabl3

    select top 33 percent seqn,code,[id]

    from tbl t join tbl3 t3

    on t.code = t.code

    where t.code = @C

    and not exists (

    select seqn

    from tbl1 t1

    where t1.seqn = t.seqn

    )

    and not exists (

    select seqn

    from tbl2 t2

    where t2.seqn = t.seqn

    )

    order by code,seqn

    set @CT = (select count(*) from tbl3)

    set @C = (select min(code) from tbl3 where code > @C)

    end

    --table 4

    insert tbl4

    select seqn,code,[id]

    from tbl left join (

    select seqn

    from fintabl1

    union all

    select seqn

    from fintabl2

    union all

    select seqn

    from fintabl3

    )tbls

    on tbl.seqn = tbls.seqn

    where tbls.seqn is null

  • Not necessarily the idea is to reach nth select of 25000 first, then second and third and at the end put the remaining records into the fourth group of 10224 keeping in mind that quantity of each code should be represented equally between the nth selects.

     

    Thank you very much for the time you have taken to help me.

     

    M.

     

  • It might be helpful to post what you have even if you believe it to be wrong.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply