Need the Best Solution for splitting into batches

Question

Post reply

Need the Best Solution for splitting into batches

venus.pvr

SSCarpal Tunnel

Points: 4813
More actions
September 11, 2012 at 8:47 am

#261424

I have the below requirement:
I need to split the one column into 3 columns such that any one row doesn't have the same value for any two columns. I have a solution below, but it is not the minimum rows possible. i need to rearrange them to get as minimum rows as possible.
Below are the table scripts required for the exercise.
CREATE TABLE [dbo].[Table_2](
[a] [int] NULL
) ON [PRIMARY]
INSERT INTO [dbo].[Table_2]([a])
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 2 UNION ALL
SELECT 2 UNION ALL
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 4 UNION ALL
SELECT 5 UNION ALL
SELECT 6 UNION ALL
SELECT 7 UNION ALL
SELECT 8 UNION ALL
SELECT 9 UNION ALL
SELECT 10 UNION ALL
SELECT 3
Below is the solution I had written:
DECLARE @maxPID INT
DECLARE @pID INT=1
CREATE TABLE #SplitBatches(
Batch1 VARCHAR(50),
Batch2 VARCHAR(50),
Batch3 VARCHAR(50))
SELECT @maxPID=MAX(pID)
FROM
(SELECT ROW_NUMBER() OVER(PARTITION BY a ORDER BY a) AS pID,
a AS List
FROM dbo.Table_2
) AS cl
WHILE(@pID<=@maxPID)
BEGIN
;WITH ListCTE AS
(
SELECT
ROW_NUMBER() OVER(PARTITION BY a ORDER BY a) AS pID,
CAST(a AS VARCHAR(10)) AS List
FROM dbo.Table_2
)
INSERT INTO #SplitBatches
SELECT
ISNULL([0],''),
ISNULL([1],''),
ISNULL([2],'')
FROM (SELECT
(ROW_NUMBER()OVER(ORDER BY List)-1)/3 AS BatchNumber,
(ROW_NUMBER()OVER(ORDER BY List)-1)%3 AS BatchOrder,
List AS ClientListing
FROM ListCTE WHERE pID=@pID)
ClientListingCTE
PIVOT (MAX(ClientListing) FOR batchorder IN ([0],[1],[2])) AS pvt
SET @pID=@pID+1
END
SELECT Batch1,Batch2,Batch3 FROM #SplitBatches
DROP TABLE #SplitBatches
But I need the solution such that there are as minimum rows as possible in the result set.
Please let me know if you have any suggestions.
Thanks and Regards,
Praveena

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Robert Klimes SSCoach Points: 17865 More actions · Answer 1

would just adding a group by to your final statement get desired results

SELECT Batch1,Batch2,Batch3 FROM #SplitBatches

GROUP BY Batch1,Batch2,Batch3

Bob
-----------------------------------------------------------------------------
How to post to get the best help[/url]

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 2

SELECT

col1 = MAX(CASE WHEN seq = 1 THEN a END),

col2 = MAX(CASE WHEN seq = 2 THEN a END),

col3 = MAX(CASE WHEN seq = 3 THEN a END)

FROM (

SELECT

a,

Batch = CAST(rn/3.1 AS INT),

Seq = ISNULL(NULLIF(rn%3,0),3)

FROM (

SELECT

a,

rn = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))

FROM Table_2

) s

) b

GROUP BY Batch

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

venus.pvr SSCarpal Tunnel Points: 4813 More actions · Answer 3

Bob,

thank you for the reply. I need all the 19 values even if they are duplicated(count of rows in Table_2). Just rearranged in batches of 3, such that no two columns have the same value for a row. Adding a group by would eliminate some of those values.

venus.pvr SSCarpal Tunnel Points: 4813 More actions · Answer 4

ChrisM,

No two columns should have same value within a row. With the query you provided, below is the result set:

111

222

123

124

567

8910

3NULLNULL

First two rows have same values for all the columns.

Thank you very much for the quick responses.

venus.pvr SSCarpal Tunnel Points: 4813 More actions · Answer 5

The solution I provided would give the results.

1102

345

678

9

123

12

But rearranging them to something like below would give less number of rows within the results. I removed the second row and added values to the last 3 rows, thus 1 less row in the result set. Not sure if it is possible though.

1102

678

9

123

125

124

12 3

Thanks and Regards,

Praveena.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 6

venus.pvr (9/11/2012)
ChrisM,
No two columns should have same value within a row. With the query you provided, below is the result set:
111
222
123
124
567
8910
3NULLNULL
First two rows have same values for all the columns.
Thank you very much for the quick responses.

Yes, you are right. Try this.

SELECT a.a, b.a, c.a

FROM (

SELECT TOP 33 percent a, rn = ROW_NUMBER() OVER(ORDER BY a)

FROM Table_2 ORDER BY a ASC

) a

LEFT JOIN (

SELECT TOP 50 percent a, rn = ROW_NUMBER() OVER(ORDER BY a)

FROM (SELECT TOP 66 percent a FROM Table_2 ORDER BY a DESC) d ORDER BY a ASC

) b ON b.rn = a.rn

LEFT JOIN (

SELECT a, rn = ROW_NUMBER() OVER(ORDER BY a)

FROM (SELECT TOP 33 percent a FROM Table_2 ORDER BY a DESC) d

) c ON c.rn = a.rn

It ain't perfect and won't work with every set - it depends upon the distribution of the data. But it works with your sample.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden