NTILE help

  • I have the following table of departments, members, and actions:

    CREATE TABLE [dbo].[GroupMembership](

    [grp] [varchar](200) NULL,

    [member] [varchar](200) NULL,

    [act] [varchar](3) NULL

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    GO

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Sales', N'jsmith', N'add')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Sales', N'jjones', N'add')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Sales', N'ljames', N'add')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Marketing', N'srichards', N'add')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Marketing', N'lking', N'add')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Marketing', N'lsandy', N'add')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Marketing', N'hwilliams', N'add')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Marketing', N'mwilliams', N'add')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Marketing', N'bwillis', N'add')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Marketing', N'tljones', N'add')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Marketing', N'wsmith', N'add')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Marketing', N'mhenry', N'add')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Sales', N'sanderson', N'rem')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Sales', N'dcook', N'rem')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Sales', N'rwilliams', N'rem')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Sales', N'hnill', N'rem')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Sales', N'tnill', N'rem')

    INSERT [dbo].[GroupMembership] ([grp], [member], [act]) VALUES (N'Sales', N'chill', N'rem')

    SELECT *

    FROM GroupMembership

    Ideally, I'd like the table to look like this:

    CREATE TABLE [dbo].[GroupMembershipNew](

    [grp_no] [int] NULL,

    [grp_sub_no] [int] NULL,

    [grp] [varchar](200) NULL,

    [member] [varchar](200) NULL,

    [act] [varchar](3) NULL

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    GO

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (1, 1, N'Sales', N'jsmith', N'add')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (1, 1, N'Sales', N'jjones', N'add')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (1, 1, N'Sales', N'ljames', N'add')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (2, 1, N'Marketing', N'srichards', N'add')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (2, 1, N'Marketing', N'lking', N'add')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (2, 1, N'Marketing', N'lsandy', N'add')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (2, 2, N'Marketing', N'hwilliams', N'add')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (2, 2, N'Marketing', N'mwilliams', N'add')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (2, 2, N'Marketing', N'bwillis', N'add')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (2, 3, N'Marketing', N'tljones', N'add')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (2, 3, N'Marketing', N'wsmith', N'add')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (2, 3, N'Marketing', N'mhenry', N'add')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (3, 1, N'Sales', N'sanderson', N'rem')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (3, 1, N'Sales', N'dcook', N'rem')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (3, 1, N'Sales', N'rwilliams', N'rem')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (3, 2, N'Sales', N'hnill', N'rem')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (3, 2, N'Sales', N'tnill', N'rem')

    INSERT [dbo].[GroupMembershipNew] ([grp_no], [grp_sub_no], [grp], [member], [act]) VALUES (3, 2, N'Sales', N'chill', N'rem')

    SELECT *

    FROM GroupMembershipNew

    I'm having a brain freeze -- I know that I can number each unique action-department combination using DENSE_RANK():

    SELECT DENSE_RANK() OVER(ORDER BY act, grp desc) AS grp_no, *

    FROM GroupMembership

    but what I'm really looking to do is get the number in each group, divide it by 3, and use that number in the NTILE function, so that the number of groups is contingent upon the number of records in that group. The table I posted above demonstrates that:

    There are 3 sales-add records, so 3/3 = 1. Using 1 with the NTILE function results in only 1 grouping.

    There are 9 marketing-add records, so 9/3 = 3. Using 3 with the NTILE function results in 3 equal groupings, 1-3.

    There are 6 sales-rem records, so 6/3 = 2. Using 2 with the NTILE function results in 2 equal groupings, 1-2.

    I'm sure there's a set-based way to do this (possibly with a recursive cte), but even if it's in a loop, that might be OK too. Again, there may be a really easy solution that I'm way over thinking, but does anyone have any ideas?

    Thanks,

    Mike

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • Hi Mike

    NTILE() is great for splitting tables into equal partitions for e.g. marketing, I think this better suits your requirement:

    SELECT

    grp_no,

    grp_sub_no = ((rn+2)/3*3)/3,

    grp,

    member,

    act

    FROM (

    SELECT

    grp_no,

    grp,

    member,

    act,

    rn = ROW_NUMBER() OVER(PARTITION BY grp_no ORDER BY grp)

    FROM (

    SELECT grp_no = dense_RANK() OVER(ORDER BY act, grp DESC), grp, member, act

    FROM GroupMembership

    ) d

    ) x

    Cheers

    ChrisM


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • mikes84 (12/20/2011)


    Thanks for the response, Chris. Would you mind explaining this line a little bit more?

    grp_sub_no = ((rn+2)/3*3)/3

    The actual number I'd like to split the groups up by is 700. Therefore, if one group had 2200 "add"s then it would have sub_grp_nos from 1-4. Does that make sense, and do you know how the above would change to accommodate groups of 700?

    Thanks,

    Mike

    Hi Mike

    Sure...the expression can be simplified to (rn+2)/3 - it was late 🙂

    The sample code below will show you how this expression works, and I've modified it so you can plug in any group size.

    DECLARE @GroupPartitionSize INT

    SET @GroupPartitionSize = 3 -- 700 in your actual data

    SELECT

    grp_no,

    cnt, -- workings, not part of solution

    rn, -- workings

    frac = (rn+(@GroupPartitionSize-1.0))/@GroupPartitionSize, -- workings

    grp_sub_no = (rn+(@GroupPartitionSize-1))/@GroupPartitionSize,

    grp,

    member,

    act

    FROM (

    SELECT

    grp_no,

    grp,

    member,

    act,

    cnt = COUNT(*) OVER(PARTITION BY grp_no),

    rn = ROW_NUMBER() OVER(PARTITION BY grp_no ORDER BY grp)

    FROM (

    SELECT

    grp_no = DENSE_RANK() OVER(ORDER BY act, grp DESC),

    grp,

    member,

    act

    FROM GroupMembership

    ) d

    ) x


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • 2 ChrisM@home

    even more simple 🙂declare @subgroupsize int

    set @subgroupsize=3

    select dense_rank() over(order by act,grp desc) grp_no,1+(rn-1)/@subgroupsize grp_sub_no

    ,grp,member,act

    from

    (select * ,ROW_NUMBER() over(partition by act,grp order by (select 0)) rn

    from GroupMembership

    ) a

  • VIG (12/21/2011)


    2 ChrisM@home

    even more simple 🙂declare @subgroupsize int

    set @subgroupsize=3

    select dense_rank() over(order by act,grp desc) grp_no,1+(rn-1)/@subgroupsize grp_sub_no

    ,grp,member,act

    from

    (select * ,ROW_NUMBER() over(partition by act,grp order by (select 0)) rn

    from GroupMembership

    ) a

    It won't get any simpler or faster than this. Even when you consider that it's easier to improve an existing query than to write one from scratch, this is still an excellent solution, VIG.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • 2 ChrisM@home

    10x

  • Thanks guys, very helpful! I have one more question. Is there any way make it so that if there's a remainder, it's at the beginning instead of at the end?

    For example, if my subgroup size is 3 and I have a group with 8 records, it'd be idea if it looked like this (where the remainder - 2 - is assigned to subgroup 1):

    11Marketingsrichardsadd

    11Marketinglking add

    12Marketinghwilliamsadd

    12Marketingmwilliamsadd

    12Marketingbwillisadd

    13Marketingtljonesadd

    13Marketingwsmithadd

    13Marketingmhenryadd

    Instead of:

    11Marketingsrichardsadd

    11Marketinglking add

    11Marketinglsandyadd

    12Marketinghwilliamsadd

    12Marketingmwilliamsadd

    12Marketingbwillisadd

    13Marketingtljonesadd

    13Marketingwsmithadd

    Where group 3 gets the remainder of 2.

    Thanks in advance,

    Mike

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • mikes84 (12/21/2011)


    Thanks guys, very helpful! I have one more question. Is there any way make it so that if there's a remainder, it's at the beginning instead of at the end?

    declare @subgroupsize int

    set @subgroupsize=3

    select grp_no

    ,1+ MAX(grp_sub_no) over(partition by grp_no,act)- grp_sub_no grp_sub_no

    ,grp,member,act

    from

    (select dense_rank() over(order by act,grp desc) grp_no

    ,rn/@subgroupsize grp_sub_no,grp,member,act

    from

    (select *,count(*) over (partition by act,grp)-ROW_NUMBER() over(partition by act,grp order by (select 0)) rn

    from GroupMembership

    ) a

    )b

  • DECLARE @GroupPartitionSize INT

    SET @GroupPartitionSize = 6 -- 700 in your actual data

    SELECT

    grp_no,

    cnt, -- workings, not part of solution

    rn, -- workings

    frac = (rn - 1.0 +ISNULL(@GroupPartitionSize-NULLIF(cnt%@GroupPartitionSize,0),0)+ @GroupPartitionSize)/@GroupPartitionSize, -- workings

    grp_sub_no = (rn -1+ISNULL(@GroupPartitionSize-NULLIF(cnt%@GroupPartitionSize,0),0) + @GroupPartitionSize)/@GroupPartitionSize,

    grp,

    member,

    act

    FROM (

    SELECT

    grp_no,

    grp,

    member,

    act,

    cnt = COUNT(*) OVER(PARTITION BY grp_no),

    rn = ROW_NUMBER() OVER(PARTITION BY grp_no ORDER BY grp)

    FROM (

    SELECT

    grp_no = DENSE_RANK() OVER(ORDER BY act, grp DESC),

    grp,

    member,

    act

    FROM GroupMembership

    ) d

    ) x


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply