getting four equally sized datasets from a table with no sequential key

  • My apologies if I am in the wrong forum, but this seems to be more of a sql question than an SSIS question.

    I work with large datasets in SSIS and am attempting to find a way to run some of the processes in parallel, the problem with splitting my datasets into equal sizes is that the part of the multipart key that makes a row unique is not sequential. So the values I would be comparing against could be 1,3,10,22 etc. I would like to be able to do something similar to the following in an 4 SQL tasks:

    With t(cnt) as ( select max(id) as cnt from mytable)

    Select m.myfields from mytable2 m, t where m.id <= (t.cnt * .25)

    Select m.myfields from mytable2 m, t where m.id <= (t.cnt * .50)

    Select m.myfields from mytable2 m, t where m.id <= (t.cnt * .75)

    Select m.myfields from mytable2 m, t where m.id > (t.cnt * .75)

  • d'oh...

    Reading another thread I think I found my answer, I believe I can use the rownumber function to accomplish what I am looking to do.

  • Since you mentioned the ROW_NUMBER function you might want to look at the NTILE ranking function too.

    For example, to generate 4 groups based on col1 from Table1.

    SELECT col1,col2, NTILE(4) OVER(ORDER BY col1)

    FROM Table1

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply