August 6, 2009 at 2:16 pm
I don't even know what to search for to find the answer to my question.
I have a table that lists items (fires) in categories (fire centres). I have six categories and each can have 0 or more items. The number of items varies on a daily basis.
What I want to do is write a query that will divide them up in such a way that I have roughly the same number of items in each of two groups (columns on a web page). I've faked it right now by creating another table that lists each of the six categories and assigns them either to column 1 or column 2, but the number of items keeps changing and so I'm always having to adjust that table.
Can anyone suggest a way I might dynamically code this?
Here's the web page http://bcwildfire.ca/ if it helps to better understand my problem!
Thanks!
August 6, 2009 at 2:52 pm
You probably want to look at the NTILE windowed aggregate.
For more infor look here:
http://msdn.microsoft.com/en-us/library/ms175126.aspx
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
August 6, 2009 at 3:10 pm
Thanks, that looks promising. Is there a way to tell it to keep all of the items for one category together, though? If it has to be that one column has 30 records and the other 18 because there's no better way to break them up, so be it -- rather than making each of them have 24 records and one category is split over the two columns.
Doing the query as listed below results in my Coastal fire centre having one record in column 1 and the rest in column 2. I need to keep all of the Coastal records together.
create table FireList (
FireCentreName varchar(15)
, ProjectID int
, DisplayName varchar(50)
)
go
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Prince George', 250, 'Junction of Smith and Liard River')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Kamloops', 260, 'Terrace Mountain')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Kamloops', 263, 'Mount McLean')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Coastal', 266, 'Wolf River')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Northwest', 270, 'Salmon River Rd. - Kispiox Valley')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Kamloops', 271, 'Momich Lake Fires')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Cariboo', 278, 'Corkscrew Basin')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Southeast', 280, 'Galena Bay')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Kamloops', 281, 'Intlpam')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Coastal', 283, 'Camel Back East')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Coastal', 284, 'Copper Mt')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Coastal', 286, 'Antler Lake')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Coastal', 288, 'Blackcomb Mountain')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Cariboo', 289, 'Heckman Pass Area')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Cariboo', 290, 'Choelquoit Lake')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Cariboo', 291, '5 Miles E of Sugarloaf Mtn')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Coastal', 294, 'Uztlius')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Coastal', 295, 'North End of Pitt Lake')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Cariboo', 296, '1km S of Chilcotin River in Alexis Creek')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Coastal', 299, 'Mid Slope East Side Saloompt')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Prince George', 300, 'Southwest of MacKenzie along Phillips Creek')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Prince George', 301, '9 km Southwest of McBride')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Kamloops', 302, 'Yalakom Fires')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Northwest', 303, 'West of Pimpernel Mountain')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Northwest', 304, 'North Side of the Morice River')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Kamloops', 305, 'Brookmere')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Northwest', 306, 'NE of Duck Lake')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Cariboo', 307, 'Alexis Creek')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Kamloops', 308, 'Seton Portage')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Prince George', 309, 'Manson Creek')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Coastal', 310, 'Nuxalk Mountain, Bella Coola')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Coastal', 311, 'Ruby Bowl, Blackcomb Mountain')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Cariboo', 312, '61 Km on 4000rd')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Southeast', 313, 'Beaton Incomappleux')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Coastal', 315, 'East side of Nimpkish')
Insert FireList (FireCentreName, ProjectId, DisplayName) values ('Coastal', 316, 'Spuzzum Creek')
go
select FireCentreName, ProjectID, DisplayName
,NTILE(2) OVER(ORDER BY FireCentreName DESC) AS 'ColumnNumber'
from FireList
--drop table FireList
Results of query:
FireCentreNameProjectIDDisplayNameColumnNumber
Southeast280Galena Bay1
Southeast313Beaton Incomappleux1
Prince George309Manson Creek1
Prince George300Southwest of MacKenzie along Phillips Creek1
Prince George3019 km Southwest of McBride1
Prince George250Junction of Smith and Liard River1
Northwest270Salmon River Rd. - Kispiox Valley1
Northwest303West of Pimpernel Mountain1
Northwest304North Side of the Morice River1
Northwest306NE of Duck Lake1
Kamloops308Seton Portage1
Kamloops305Brookmere1
Kamloops302Yalakom Fires1
Kamloops271Momich Lake Fires1
Kamloops260Terrace Mountain1
Kamloops263Mount McLean1
Kamloops281Intlpam1
Coastal283Camel Back East1
Coastal284Copper Mt2
Coastal286Antler Lake2
Coastal288Blackcomb Mountain2
Coastal266Wolf River2
Coastal294Uztlius2
Coastal295North End of Pitt Lake2
Coastal299Mid Slope East Side Saloompt2
Coastal310Nuxalk Mountain, Bella Coola2
Coastal311Ruby Bowl, Blackcomb Mountain2
Coastal315East side of Nimpkish2
Coastal316Spuzzum Creek2
Cariboo31261 Km on 4000rd2
Cariboo307Alexis Creek2
Cariboo2961km S of Chilcotin River in Alexis Creek2
Cariboo278Corkscrew Basin2
Cariboo289Heckman Pass Area2
Cariboo290Choelquoit Lake2
Cariboo2915 Miles E of Sugarloaf Mtn2
August 6, 2009 at 4:28 pm
So what I've done on my website is the ASP code only looks at the column number when it changes categories, and ignores it from then on. So that keeps my categories together, but it doesn't necessarily end up with anything approaching a balanced column (although right now it just happens to come out pretty well)
However, if category 1 (Coastal) had 30 records, and all the rest together had 32, the NTILE would put the first row for the second category (Cariboo) into column 1, and then I would ignore the fact that the rest of the Cariboo fires were in column 2, so it would come out pretty lopsided.
So it's still not ideal the way I have it... but I'm getting closer.
August 6, 2009 at 4:48 pm
I should mention that I don't care at all what sequence the categories come out in. Right now the NTILE query is causing them to be alphabetical, but that's unimportant.
August 6, 2009 at 9:18 pm
Using the sample table from the code above, will this do the trick for you? Ordering by newID basically randomizes it. Then the row_number() is assigned and the odd numbers all go in column 1. This ensures as even a split as possible.
Please note, if you have three rows (or any odd number) column 1 will always get more rows than column 2, because the row numbering begins with 1. We could randomly generate an integer between 0 and 9, but that would sometimes lead to one column having two+ rows more than the other column. Let me know if you would rather use that approach.
;with cte1 as
(select *,row_number() over (partition by fireCentreName order by newid()) as rowno from firelist)
,cte2 as
(select *, case when right(rowNo,1) in (1,3,5,7,9) then 1 else 2 end as colNo
from cte1)
select FireCentreName,ProjectID,DisplayName,ColNo
from cte2
order by FireCentreName, ProjectID
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
August 7, 2009 at 6:54 am
The ntile(2) function will get you close but you still have to worry about the fire center that is in both columns. This tests for that and updates the center to be in 1 or the other column.
SELECT *, ntile(2) OVER (ORDER BY FireCentreName) AS rowNum
INTO #temp
FROM FireList
DECLARE @misMatch VARCHAR(15)
SELECT DISTINCT @misMatch = fireCentreName FROM #temp
WHERE rowNum = 1
AND fireCentreName IN (SELECT fireCentreName FROM #temp WHERE rowNum = 2)
IF (SELECT COUNT(*) FROM #temp WHERE fireCentreName = @misMatch AND rowNum = 1) > (SELECT COUNT(*) FROM #temp WHERE fireCentreName = @misMatch AND rowNum = 2)
UPDATE #temp SET rowNum = 1 WHERE FireCentreName = @misMatch
ELSE
UPDATE #temp SET rowNum = 2 WHERE FireCentreName = @misMatch
SELECT * FROM #temp
August 7, 2009 at 7:30 am
Not guaranteed to split equally, but groups nicely:
select FireCentreName, ProjectID, DisplayName, RANK() OVER(ORDER BY FireCentreName) % 2 + 1 AS 'ColumnNumber'
from FireList
order by ColumnNumber, FireCentreName -- Not required, just for display clarity
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
August 7, 2009 at 7:34 am
Thats nice. I never even thought to do a mod on the rank() function.
August 7, 2009 at 8:03 am
Matt Wilhoite (8/7/2009)
Thats nice. I never even thought to do a mod on the rank() function.
Thank you Matt :w00t:
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
August 7, 2009 at 8:54 am
"Not guaranteed to spit evenly" is a mild understatement. If you change one row of the sample data to move a station from Kamloops to Cariboo, all the groups have even counts and the entire list will show up in column 2. Using DENSE_RANK instead works better, but there still may be degenerate cases.
Some problems call for an actual algorithm that can't be expressed in a single query. You might consider doing the column split in your web code instead of trying to do it in T-SQL, but it can be done. This code splits the sample data into two columns by assigning each group (in descending group size order) to the shortest column. It produces two columns of 18 (even if one Kamloops is changed to Cariboo).
SELECT IDENTITY(INT,1,1) AS ID, FireCentreName, COUNT(*) AS [Count], 1 AS [Column]
INTO #fc
FROM dbo.FireList
GROUP BY FireCentreName
ORDER BY COUNT(*) DESC
DECLARE @ID INT, @n INT, @Col1 INT, @Col2 INT
SELECT @Col1 = 0, @Col2 = 0, @ID = 1, @n = MAX(ID) FROM #fc
WHILE @ID <= @n BEGIN
IF @Col1 <= @Col2
UPDATE #fc SET [Column] = 1, @Col1 = @Col1 + [Count], @ID = @ID + 1 WHERE ID = @ID
ELSE
UPDATE #fc SET [Column] = 2, @Col2 = @Col2 + [Count], @ID = @ID + 1 WHERE ID = @ID
END
SELECT [Column], SUM([Count]) FROM #fc GROUP BY [Column]
SELECT c.[Column], f.FireCentreName, f.ProjectID, f.DisplayName
FROM FireList AS f
INNER JOIN #fc AS c ON f.FireCentreName = c.FireCentreName
ORDER BY c.[Column]
August 7, 2009 at 8:58 am
Here's an attempt:
DECLARE@median SMALLINT
, @denseRank SMALLINT
SELECT@median = CEILING(MAX([RowNum])/2)
FROM(SELECTROW_NUMBER() OVER (ORDER BY [FireCentreName]) [RowNum]
FROM[FireList]) median PRINT @median
SELECT@denseRank = MAX(dR)
FROM(SELECTDENSE_RANK() OVER(ORDER BY [FireCentreName]) [dR]
FROM[FireList]) denseRank PRINT @denseRank
SELECTf.[FireCentreName]
, f.[ProjectID]
, f.[DisplayName]
, DENSE_RANK() OVER(ORDER BY f.[FireCentreName]) [DenseRanked]
, RANK() OVER(ORDER BY f.[FireCentreName]) [Ranked]
, CASE
WHEN @denseRank <= 2THEN DENSE_RANK() OVER(ORDER BY f.[FireCentreName])
WHEN RANK() OVER(ORDER BY f.[FireCentreName]) < @medianTHEN 1
ELSE 2 END [ColumnNum]
FROM[FireList] f
[font="Arial Narrow"]bc[/font]
August 7, 2009 at 9:23 am
Scott Coleman (8/7/2009)
Using DENSE_RANK instead works better, but there still may be degenerate cases.
As it happens, originally I did use DENSE_RANK, but changed it on a whim as I posted.
DENSE_RANK seems to be fine - not bad considering it's one line.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
August 9, 2009 at 12:18 am
Thanks so much for the replies, guys! I'm on holidays for a week but will try these out when I get back!
August 9, 2009 at 1:37 pm
Beverley (8/9/2009)
Thanks so much for the replies, guys! I'm on holidays for a week but will try these out when I get back!
Perfect! Hey guys... I've got this really nasty problem... would you do all my work for me so I don't have to worry about it when I go on holiday next week? Thanks so much! 😛
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 25 total)
You must be logged in to reply to this topic. Login to reply