getting four equally sized datasets from a table with no sequential key

david.hassell

Newbie

Points: 2
More actions
April 7, 2008 at 5:06 am

#186120

My apologies if I am in the wrong forum, but this seems to be more of a sql question than an SSIS question.
I work with large datasets in SSIS and am attempting to find a way to run some of the processes in parallel, the problem with splitting my datasets into equal sizes is that the part of the multipart key that makes a row unique is not sequential. So the values I would be comparing against could be 1,3,10,22 etc. I would like to be able to do something similar to the following in an 4 SQL tasks:
With t(cnt) as ( select max(id) as cnt from mytable)
Select m.myfields from mytable2 m, t where m.id <= (t.cnt * .25)
Select m.myfields from mytable2 m, t where m.id <= (t.cnt * .50)
Select m.myfields from mytable2 m, t where m.id <= (t.cnt * .75)
Select m.myfields from mytable2 m, t where m.id > (t.cnt * .75)
david.hassell

Newbie

Points: 2
More actions
April 7, 2008 at 5:11 am

#799033

d'oh...
Reading another thread I think I found my answer, I believe I can use the rownumber function to accomplish what I am looking to do.
Todd Engen

SSCrazy Eights

Points: 9869
More actions
April 7, 2008 at 5:29 am

#799035

Since you mentioned the ROW_NUMBER function you might want to look at the NTILE ranking function too.
For example, to generate 4 groups based on col1 from Table1.
SELECT col1,col2, NTILE(4) OVER(ORDER BY col1)
FROM Table1

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply