January 7, 2005 at 8:34 am
I have a table with 85224 rows. There are 89 distinct codes in the table. My challenge is:
Perform 3 NTH selects on this table to have the final result of:
25,000
25,000
25,000
Remaining 10,224
My problem here is that I have to take approximately the same number of rows per code and quantities per code can vary: for example code 228 has one row, code 114 has 7909 rows.
The only business rule is if there is one row per code it should go to first group of 25000 if there are 3 rows per code one should go to each of the 25,000 groups. I tried to resolve this with cursor, but without success. SEQN is unique, ID is not.
Sample of data
SEQN | Code | ID |
1 | 214 | 100489 |
2 | 214 | 100622 |
3 | 415 | 1006842 |
4 | 414 | 1007003 |
5 | 416 | 1007018 |
6 | 214 | 1007126 |
7 | 117 | 1007255 |
8 | 215 | 1007274 |
9 | 413 | 1007281 |
10 | 115 | 1007587 |
11 | 214 | 1007721 |
12 | 416 | 100775 |
13 | 216 | 100840 |
14 | 411 | 100871 |
15 | 213 | 1010619 |
16 | 114 | 1010623 |
17 | 425 | 1010636 |
18 | 214 | 101189 |
19 | 224 | 101364 |
20 | 413 | 101776 |
21 | 134 | 102031 |
22 | 212 | 102469 |
23 | 114 | 102701 |
24 | 136 | 102702 |
25 | 112 | 10287 |
26 | 214 | 102987 |
27 | 315 | 103080 |
28 | 434 | 103373 |
29 | 413 | 103422 |
30 | 213 | 103802 |
Any help would be much appreciated.
Thank you everyone.
M.
January 7, 2005 at 9:44 am
So If i read this right a code with 9 records would be divided between the first three but a record with 10 would be divided between the 4?
--step 1 distribute your codes(89 rows)
--table1
insert tbl1
select distinct code,[id]
from tbl
--table2
insert tbl2
select distinct code,[id]
from tbl
where exists(
select code
from tbl
where tbl.code = t2.code
group by code
having count(*)>=2
)
--table3
Insert tbl3
select distinct code,[id]
from tbl
where exists(
select code
from tbl t2
where tbl.code = t2.code
group by code
having count(*)=3
)
--step2 create your population for table 1
declare @C int,@ct int
set @C = (select min(code) from tbl1)
set @CT = 0
while @C is not null or @CT < 25000
begin
Insert fintabl1
select top 33 percent seqn,code,[id]
from tbl t join tbl1 t1
on t.code = t.code
where t.code = @C
order by code,seqn
set @CT = (select count(*) from tbl1)
set @C = (select min(code) from tbl1 where code > @C)
end
--create your population for table 2
set @C = (select min(code) from tbl2)
set @CT = 0
while @C is not null or @CT < 25000
begin
Insert fintabl2
select top 33 percent seqn,code,[id]
from tbl t join tbl2 t2
on t.code = t.code
where t.code = @C
and not exists (
select seqn
from tbl1 t1
where t1.seqn = t.seqn
)
order by code,seqn
set @CT = (select count(*) from tbl1)
set @C = (select min(code) from tbl2 where code > @C)
end
--create your population for table 3
set @C = (select min(code) from tbl3)
set @CT = 0
while @C is not null or @CT < 25000
begin
Insert fintabl3
select top 33 percent seqn,code,[id]
from tbl t join tbl3 t3
on t.code = t.code
where t.code = @C
and not exists (
select seqn
from tbl1 t1
where t1.seqn = t.seqn
)
and not exists (
select seqn
from tbl2 t2
where t2.seqn = t.seqn
)
order by code,seqn
set @CT = (select count(*) from tbl3)
set @C = (select min(code) from tbl3 where code > @C)
end
--table 4
insert tbl4
select seqn,code,[id]
from tbl left join (
select seqn
from fintabl1
union all
select seqn
from fintabl2
union all
select seqn
from fintabl3
)tbls
on tbl.seqn = tbls.seqn
where tbls.seqn is null
January 7, 2005 at 10:09 am
Not necessarily the idea is to reach nth select of 25000 first, then second and third and at the end put the remaining records into the fourth group of 10224 keeping in mind that quantity of each code should be represented equally between the nth selects.
Thank you very much for the time you have taken to help me.
M.
January 7, 2005 at 12:54 pm
It might be helpful to post what you have even if you believe it to be wrong.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply