April 1, 2016 at 10:34 pm
I have a table which stores Databasename and sizes. I want to be able to assign group numbers to set of database on fly based on there size.
DBName Size
AW1 2000
AW2 200
AW3 9000
AW4 100000
AW5 600
Now i want the databases to be in 3 groups such that total size in any group should not be less than average ( total size of all databases / number of groups). I basically want groups to be created so that when i execute my maintenance plans , the tasks are evenly spread out as much as possible
April 2, 2016 at 3:08 am
Is there are limitations to no of databases that can be a part of the a group ? Should the no of groups always be 3 ?
DECLARE @t TABLE
(
dbname VARCHAR(100),
size INT
)
INSERT @t
SELECT 'AW1',2000
UNION
SELECT 'AW2',200
UNION
SELECT 'AW3',9000
UNION
SELECT 'AW4',100000
UNION
SELECT 'AW5',600
SELECT DENSE_RANK()OVER(ORDER BY groupsize)GroupNo,
DBName,
GroupSize
FROM (SELECT T1.*,
SUM(size)
OVER (
partition BY db
ORDER BY db)GroupSize
FROM (SELECT T2.*,
T3.dbname DB
FROM @t T2
CROSS APPLY(SELECT *
FROM @t
WHERE T2.dbname != dbname)T3)T1)T
SELECT SUM(size) / 3 AS AverageSize FROM @t
--------------------------------------------------------------------------------------------------
I am just an another naive wannabe DBA trying to learn SQL Server
April 2, 2016 at 3:38 am
curious_sqldba (4/1/2016)
I have a table which stores Databasename and sizes. I want to be able to assign group numbers to set of database on fly based on there size.
DBName Size
AW1 2000
AW2 200
AW3 9000
AW4 100000
AW5 600
Now i want the databases to be in 3 groups such that total size in any group should not be less than average ( total size of all databases / number of groups). I basically want groups to be created so that when i execute my maintenance plans , the tasks are evenly spread out as much as possible
based on your sample data can you please tell what you expect your answer to be?
<deleted......reread OP>
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
April 2, 2016 at 1:01 pm
curious_sqldba (4/1/2016)
I have a table which stores Databasename and sizes. I want to be able to assign group numbers to set of database on fly based on there size.
DBName Size
AW1 2000
AW2 200
AW3 9000
AW4 100000
AW5 600
Now i want the databases to be in 3 groups such that total size in any group should not be less than average ( total size of all databases / number of groups). I basically want groups to be created so that when i execute my maintenance plans , the tasks are evenly spread out as much as possible
With the possible exceptions of backups/restores, there's very little that the size of a database has to do with any type of maintenance. For example, there may be a larger volume of non-clustered indexes to rebuild on AW1 than in AW3 or even AW4.
So, to echo JLS's question, what are you actually expecting for output here and, to answer my question, what kind(s) of maintenance are you talking about? Without knowing that and because things like backups are all or nothing, I see no way that, using the numbers given, trying to balance the load by average will even come close to being incorrect.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 2, 2016 at 1:20 pm
On the outside chance that you want the "all or nothing" type of load balancing... please see the following.
curious_sqldba (4/1/2016)
I have a table which stores Databasename and sizes. I want to be able to assign group numbers to set of database on fly based on there size.
DBName Size
AW1 2000
AW2 200
AW3 9000
AW4 100000
AW5 600
Now i want the databases to be in 3 groups such that total size in any group should not be less than average ( total size of all databases / number of groups). I basically want groups to be created so that when i execute my maintenance plans , the tasks are evenly spread out as much as possible
[font="Arial Black"]Test Data[/font]
Ok... first of all, you've been around this forum long enough to know that that's NOT the way to post sample data. 😉 Make it easy on those trying to help you and give yourself the best chance at getting not only a coded answer, but the right answer. If you don't know what I'm talking about, then take the time to read and heed the article at the first link under "Helpful Links" in my signature line below. Thanks.
[font="Arial Black"]Impossible Average[/font]
Getting to your problem, it's not possible to meet your "average" with the numbers given nor should you ever rely on such a thing simply because you can't guarantee a thing with things like database sizes. So we'll shoot at making sure the "tasks are evenly spread out as much as possible".
[font="Arial Black"]Realizing the Horrible Truth[/font]
There are several names for this problem. In this case, the name of "Load Balancing" would be appropriate. The generic name for it is "Bin Stacking". As Hugo Kornelis will tell you, it's impossible to do in T-SQL without the use of a real live procedural loop and he's absolutely correct. Anyone that tries to do it "in a single query" or in some sort of supposedly "Set Based" fashion will always run into exceptions or huge performance/resource usage problems whereas the procedural method will always do it correctly and quickly. Anyone that thinks they "see a pattern" is seeing things that are only there for the current given set of numbers that they're looking at, which can certainly change from day to day.
[font="Arial Black"]An Overview[/font]
The way this problem works is that you have 3 "Bins" (Maintenance Groups) and you intend to assign databases to groups based on their size all while keeping the groups as even as possible. That means that you need to assign the databases according to their descending size and add the size of each database to whatever the smallest total group size is, which changes with every added database. That's why it has to be procedural to perform well. Otherwise, you'd have to calculate every possibility and that makes for much worse than a simple Cartesian Product that will require you to connect a garden hose to your computer to keep it cool enough while calculating larger groups of numbers.
So, here's the plan...
1. Find the smallest size group using the GroupNumber as a tie breaker.
2. Find an unassigned database with the largest size and assign it to the group number in found in Step 1.
3. Add the DB size to the group size.
4. Loop back to step one until you run out of databases.
5. Assign subgroup numbers to each group number to assist in process control for your maintenance routines.
Here's the code. As normal, the details are in the code, where they belong. 😀
--=====================================================================================================================
-- Create your DB size table
-- This is actually a part of the solution.
-- Note the two extra columns I've added
--=====================================================================================================================
--===== If the test table exists, drop it to make reruns easier.
IF OBJECT_ID('tempdb..#DBGrouping') IS NOT NULL DROP TABLE #DBGrouping;
--===== Local Variables
DECLARE @DBCount INT
;
--===== Create and populate the test table on the fly.
-- The ISNULL below makes the column NOT NULL.
SELECT d.DBname
,Size
,RelativeSize = ISNULL(ROW_NUMBER() OVER (ORDER BY d.Size DESC),0)
,MaintenanceGroup = 0
,MaintenanceSubGroup = 0
INTO #DBGrouping
FROM
( --=== This simulates the sys.Databases table you read from
-- to create your table
SELECT 'AW1',2000 UNION ALL
SELECT 'AW2',200 UNION ALL
SELECT 'AW3',9000 UNION ALL
SELECT 'AW4',100000 UNION ALL
SELECT 'AW5',600
)d(DBname,Size)
;
--===== Remember how many rows there are from the table population above.
SELECT @DBCount = @@ROWCOUNT
;
--===== Add the necesssary covering index for performance sake.
ALTER TABLE #DBGrouping
ADD PRIMARY KEY CLUSTERED (RelativeSize)
;
--=====================================================================================================================
-- Create the table that will keep track of group size.
--=====================================================================================================================
--===== If the group table exists, drop it to make reruns easier.
IF OBJECT_ID('tempdb..#Group') IS NOT NULL DROP TABLE #Group
;
--===== Create the Group table
SELECT GroupNumber = ISNULL(d.N,0) --ISNULL makes the column NOT NULL
,GroupSize = 0
INTO #Group
FROM (SELECT N FROM (VALUES (1),(2),(3))v(N))d
;
--===== Add a clustered index
ALTER TABLE #Group
ADD PRIMARY KEY CLUSTERED (GroupNumber)
;
--=====================================================================================================================
-- Assign the databases to groups
--=====================================================================================================================
--===== Local variables
DECLARE @Counter INT
,@GroupNumber INT
,@Size INT
;
--===== Preset the loop counter to start at the largest size
-- accordinng to the database size in your table.
SELECT @Counter = 1
;
--===== Assign each database in descending order according to size
WHILE @Counter <= @DBCount
BEGIN
--===== Get the smallest group size from the group table
-- because that's where we need to make the next assignment.
SELECT TOP 1
@GroupNumber = GroupNumber
FROM #Group
ORDER BY GroupSize, GroupNumber
;
--===== Get the database size according to the counter.
-- This will always be the largest size that we've
-- not yet assigned to a group. This also bumps the
-- counter for the loop.
UPDATE #DBGrouping
SET @Size = Size
,MaintenanceGroup = @GroupNumber
,@Counter = @Counter + 1
WHERE RelativeSize = @Counter
;
--===== Add the size of the DB we just found to the correct
-- group in the group table.
UPDATE #Group
SET GroupSize = GroupSize + @Size
WHERE GroupNumber = @GroupNumber
;
END --of loop
;
--===== Assign the subgrouping order for each group for process controls.
-- The CTE is a trick to allow us to use a Widowing funtion in an UPDATE.
WITH cteUpdate AS
(
SELECT MSG = ROW_NUMBER() OVER (PARTITION BY MaintenanceGroup
ORDER BY RelativeSize)
,MaintenanceSubGroup
FROM #DBGrouping
)
UPDATE cteUpdate
SET MaintenanceSubGroup = MSG
;
--=====================================================================================================================
-- Let's see the results.
--=====================================================================================================================
SELECT * FROM #Group ORDER BY GroupNumber;
SELECT * FROM #DBGrouping ORDER BY MaintenanceGroup, MaintenanceSubGroup;
Here's the total distribution by group...
GroupNumber GroupSize
----------- -----------
1 100000
2 9000
3 2800
... and here are the group assignments for each database...
DBname Size RelativeSize MaintenanceGroup MaintenanceSubGroup
------ ----------- -------------------- ---------------- -------------------
AW4 100000 1 1 1
AW3 9000 2 2 1
AW1 2000 3 3 1
AW5 600 4 3 2
AW2 200 5 3 3
--Jeff Moden
Change is inevitable... Change for the better is not.
April 14, 2016 at 1:26 pm
Thank you Jeff and yes i should have provided more details, sorry posted in haste:-P. Below is the script for the table and the code that i used to get database grouping. I think the way i explained made it look more complicated than it is.
-- Create table
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[DatabaseGrouping](
[DatabaseName] [varchar](400) NOT NULL,
[Size] [float] NOT NULL,
[GroupNumber] [int] NOT NULL,
[Created] [datetime] NOT NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [dbo].[DatabaseGrouping] ADD DEFAULT (getdate()) FOR [Created]
GO
-- Script to load above table with group numbers
Create PROCEDURE [dbo].[usp_DatabaseGrouping]
AS
BEGIN
TRUNCATE TABLE [dbo].[DatabaseGrouping]
CREATE TABLE #LoadDbSize
(
DBName VARCHAR(400) ,
DBSize FLOAT
)
INSERT INTO #LoadDbSize
( DBName ,
DBSize
)
SELECT [Database Name] = DB_NAME(database_id) ,
[Size in MB] = CAST(( ( SUM(Size) * 8 ) / 1024.0 ) AS DECIMAL(18,
2))
FROM sys.master_files
where database_id > 4
GROUP BY ( DB_NAME(database_id) )
INSERT INTO [dbo].[DatabaseGrouping]
( DatabaseName ,
Size ,
GroupNumber
)
SELECT DBName ,
DBSize ,
1 + ( ( ROW_NUMBER() OVER ( ORDER BY DBSize ) - 1 )
% 4 ) GroupNumber
FROM #LoadDbSize
DROP TABLE #LoadDbSize
END
[/code]
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply