Subset Sum without Cursor

  • Shawn Therrien (5/11/2009)


    Jeffrey Williams (5/11/2009)


    Shawn Therrien (5/11/2009)


    I do apologize. I didn't realize that VPN software didn't work with a 64bit OS and was stranded away from my work materials.

    Not sure what VPN software you use, but if you are looking for an IPSec solution on x64 that can replace a Cisco client, you can go to:

    http://www.ncp-e.com/en.html

    I had the same issue with our Cisco client and use this now. Works very well - in fact, I think it works better. However, it is not inexpensive - but not too bad.

    Thanks, I'll look into it.

    My IT dept told me to install Virtual PC on my computer and throw a 32bit OS onto it. :rolleyes: I guess it's a solution to get me remoting into my work PC. I guess a little better than just installing a 32bit OS on my home box and ignore the extra ram.

    I was told the same thing - but, when I asked them to purchase the full product so I could install it in Virtual PC, they weren't sure how to respond. I ended up buying this myself for $144 US - and it has been worth every penny so far.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • I've used the data that you posted to mock-up a preliminary routine to solve this:

    Alter Proc spInstructorAssignments_SS_Solution as

    /*select * from InstructorSummary order by StudentId

    select * from Assignment --Where StudentId=101

    */

    ;WITH cteTally as (

    Select top 1024-- Only need 2^10 numbers

    (Row_Number() Over(order by object_id)) as N

    From sys.system_columns

    Order By N

    ), cteInstructorSummary as (

    Select *--sequence and count the instructors for each student

    , Row_Number() Over(Partition By MonthID, StudentID Order By DaysTotal) as InstrSeq

    , Count(InstructorID) Over(Partition By MonthID, StudentID) as InstrCnt

    From InstructorSummary

    ), cteAssignment as (

    Select *--sequence the assignments for each student

    , Row_Number() Over(Partition By MonthID, StudentID Order By Days) as CoursSeq

    From Assignment

    ), cteAssignSum as (

    -- rollup total days for each student

    Select MonthID, StudentID

    , Sum(Days) as TotalDays

    , Count(*) as AssignCnt

    From cteAssignment A

    Group By MonthID, StudentID

    ), cteStudentInstructors as (

    --Produce summary info for student's instructor info

    -- for students with 2 instructors in a month

    Select MonthID, StudentID

    , Sum(DaysTotal) as TotalDays

    , Min(InstrCnt) as InstrCnt

    , Min(Case When InstrSeq=1 Then InstructorID End) as Instr1

    , Min(Case When InstrSeq=1 Then DaysTotal End) as Days1

    , Min(Case When InstrSeq=2 Then InstructorID End) as Instr2

    , Min(Case When InstrSeq=2 Then DaysTotal End) as Days2

    From cteInstructorSummary

    Where InstrCnt = 2

    Group By MonthID, StudentID

    ), cteSubset2Cases as (

    --Determine all of the SubsetSum(partitions=2) candidates

    Select si.MonthId, si.StudentID, si.TotalDays

    , si.Instr1

    , si.Days1

    , si.Instr2

    , AssignCnt

    , AssignCnt/2 as EvenCnt

    , (AssignCnt+1)/2 as OddCnt

    From cteStudentInstructors si

    Join cteAssignSum ss ON ( si.MonthId = ss.MonthID

    And si.StudentID = ss.StudentID

    And si.TotalDays = ss.TotalDays )

    ), cteSs2Assignments as (

    --Assignments for subset2 cases

    Select A.*

    , S.TotalDays

    , S.Instr1

    , S.Days1

    , CoursSeq % 2 as Half

    From cteAssignment A

    Join cteSubset2Cases S

    ON A.MonthID = S.MonthId

    And A.StudentID = S.StudentID

    ), cteSs2AssignHalf as (

    Select *--separate half of the Assignments

    , Count(*) Over(partition by MonthId, StudentID, Half) as AssgCnt

    , Power(2, Row_Number() over(

    partition by MonthId, StudentID, Half

    order by Days)-1) as AssgMsk

    From cteSs2Assignments

    ), cteSs2OddAssignGroups as (

    Select t.N--Determine all possible totals for the Odd assignments

    , s.MonthID, s.StudentID

    , a.ID, a.Course, IsNull(a.Days, 0) as Days

    , a.CoursSeq, a.AssgMsk

    , s.OddCnt, s.Instr1, s.Days1

    From cteTally t

    Join cteSubset2Cases s ON t.N 0

    And a.Half = 1--odd half

    And s.MonthID = a.MonthID

    And s.StudentID = a.StudentID

    ), cteSs2OddGroupSums as (

    Select MonthId, StudentId, N as Grp

    , Sum(Days) as SubtotalDays

    , Min(Instr1) as Instr1

    , Min(Days1) as Days1

    , count(Course) as CoursCnt

    , Sum(Days) as TotIdx

    From cteSs2OddAssignGroups

    Group By MonthId, StudentId, N

    ), cteSs2EvenAssignGroups as (

    --Determine all possible totals for the Even assignments

    Select t.N

    , s.MonthID, s.StudentID

    , a.ID, a.Course, IsNull(a.Days, 0) as Days

    , a.CoursSeq, a.AssgMsk

    , s.EvenCnt, s.Instr1, s.Days1

    From cteTally t

    Join cteSubset2Cases s ON t.N 0

    And a.Half = 0--even half

    And s.MonthID = a.MonthID

    And s.StudentID = a.StudentID

    ), cteSs2EvenGroupSums as (

    Select MonthId, StudentId, N as Grp

    , Sum(Days) as SubtotalDays

    , Min(Instr1) as Instr1

    , Min(Days1) as Days1

    , count(Course) as CoursCnt

    , Min(Days1) - Sum(Days) as TotIdx

    From cteSs2EvenAssignGroups

    Group By MonthId, StudentId, N

    )

    SELECT o.MonthId, o.StudentID

    , e.subtotaldays + o.subtotaldays as TotalDays

    , e.CoursCnt + o.CoursCnt as TotalCourses

    , o.Grp as OddGrp

    , e.Grp as EvenGrp

    , o.subTotalDays as OddDays

    , e.subTotalDays as EvenDays

    , o.Instr1

    , o.Days1

    , o.TotIdx

    From cteSs2EvenGroupSums e

    Join cteSs2OddGroupSums o

    ON e.TotIdx = o.totidx

    And e.StudentID = o.StudentID

    And e.MonthID = o.MonthID

    Order by MonthID, StudentID, e.totidx desc

    This probably does not return info the way that you may need, because I have not seen how you want that yet.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • First off, let me say that this is great! I'm looking at it and after playing with it a bit I've learned a few new techniques ordering subsets, tables, using CTE's in on top of each other... and I'm only a quarter way down and the really interesting things haven't happened yet!

    As to how this ends up... just like it started. We had values in Assignment.InstructorId that were lost in a transformation. In the end of this Subset Sum we end up with a valid array of Assignments being tagged by each Instructor.

    There are likely dozens of "correct" results as the best we can do is get the numbers right. With the data that was lost there just isn't a way to magically have the database determine exactly which Instructor actually tutored which courses unless there is only one Instructor.

    So for the first combination (student 101, Instructors 501,502) we have the following values:

    DaysCourse

    1Algebra 1A

    3English 1A

    3PE-Running

    1Reading 2B

    2Weaving

    Solutions would include:

    DaysCourseInstructorId

    1Algebra 1A501

    3English 1A501

    3PE-Running502

    1Reading 2B501

    2Weaving502

    DaysCourseInstructorId

    1Algebra 1A501

    3English 1A502

    3PE-Running501

    1Reading 2B501

    2Weaving502

    There are a few other variations that are equally fine, in this case exchanging values of 2 and 3 around to match 5 in different combinations.

    The subset sum cursor we have went from about 10 tables down to two for this example, so I'm still working out the kinks. It feels like it's about 98% there, but it's not placing all of the instructorid's into Assignment.InstrutorId.

    Thank you so much for your help with this! Exactly the sort of thing thats helping me learn some great techniques. I don't quite see where this will end up, but yesterday I couldn't have provided such an array of statistics without a cursor.

  • Shawn Therrien (5/12/2009)


    ...There are a few other variations that are equally fine, in this case exchanging values of 2 and 3 around to match 5 in different combinations....

    The partial solution that I posted enumerates all of the valid solutions, but hasn't done the arbitrary narrowing down to a single solution yet, nor does it do anything with it.

    The subset sum cursor we have went from about 10 tables down to two for this example, so I'm still working out the kinks. It feels like it's about 98% there, but it's not placing all of the instructorid's into Assignment.InstrutorId.

    My example does not do this yet either. Let me know and I can add this whenever you want.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • I forgot to mention that my example only addresses the case of Two instructors in one month. The one instructor case is striagh-forward of course. The three instructor case is especially complicated, but can be solved as a nested two-instructor problem.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • so that would be a rbar promotion ?

    I know I'm way off, but in would a pivot help to determine the set ?

    IF object_id('v_SSC_Assignment_InstructorCorse') IS NOT NULL

    DROP view v_SSC_Assignment_InstructorCorse

    go

    Create view v_SSC_Assignment_InstructorCorse

    as

    Select MonthId

    , InstructorId

    , StudentId

    , Course

    , [Days]

    , Sum(Days) over ( partition by StudentId) as TotalStudentDays

    from Assignment

    go

    declare @pivotin as varchar(max)

    Declare @separator char(1)

    Select @pivotin = ''

    , @separator = ''

    Select @pivotin = @pivotin + @separator + '[' + convert(varchar(15),InstructorId) + ']'

    , @separator = ','

    from (Select InstructorId from Assignment group by InstructorId ) i

    EXECUTE ('SELECT *

    FROM v_SSC_Assignment_InstructorCorse

    pivot (sum([Days]) FOR InstructorId

    in (' + @pivotin+ ')) pvt')

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • This is the same partial solution as before, just with better doucmentation and some column names:

    Alter Proc spInstructorAssignments_SS_Solution as

    /*select * from InstructorSummary order by StudentId

    select * from Assignment --Where StudentId=101

    Note: this implements the Horowitz-Sahni solution to

    the Subset Sums problem, which should run in O(N*2^(N/2)).

    -- RBarryYoung, May, 2009

    */

    ;WITH cteTally as (-- Provides numbers for binary enumeration of subsets

    Select top 1024--(only need 2^10 numbers, because 10 is the max subset-half size)

    (Row_Number() Over(order by object_id)) as N

    From sys.system_columns

    Order By N

    ), cteInstructorSummary as (

    Select *--sequence and count the instructors for each student

    , Row_Number() Over(Partition By MonthID, StudentID Order By DaysTotal) as InstrSeq

    , Count(InstructorID) Over(Partition By MonthID, StudentID) as InstrCnt

    From InstructorSummary

    ), cteAssignment as (

    Select *--sequence the assignments for each student

    , Row_Number() Over(Partition By MonthID, StudentID Order By Days) as CoursSeq

    From Assignment

    ), cteAssignSum as (

    -- rollup total days and assignments for each student

    Select MonthID, StudentID

    , Sum(Days) as TotalDays

    , Count(*) as AssignCnt

    From cteAssignment A

    Group By MonthID, StudentID

    ), cteStudentInstructors as (

    --Produce summary info for student's instructor info

    -- for students with 2 instructors in a month

    Select MonthID, StudentID

    , Sum(DaysTotal) as TotalDays

    , Min(InstrCnt) as InstrCnt

    , Min(Case When InstrSeq=1 Then InstructorID End) as Instr1

    , Min(Case When InstrSeq=1 Then DaysTotal End) as Days1

    , Min(Case When InstrSeq=2 Then InstructorID End) as Instr2

    , Min(Case When InstrSeq=2 Then DaysTotal End) as Days2

    From cteInstructorSummary

    Where InstrCnt = 2

    Group By MonthID, StudentID

    ), cteSubset2Cases as (

    --Determine all of the SubsetSum(partitions=2) candidates

    --(must be 2 instructors

    -- and instructors totals must be equal to students totals)

    Select si.MonthId, si.StudentID, si.TotalDays

    , si.Instr1

    , si.Days1

    , si.Instr2

    , AssignCnt

    , AssignCnt/2 as EvenCnt

    , (AssignCnt+1)/2 as OddCnt

    From cteStudentInstructors si

    Join cteAssignSum ss ON ( si.MonthId = ss.MonthID

    And si.StudentID = ss.StudentID

    And si.TotalDays = ss.TotalDays )

    ), cteSs2Assignments as (

    --All Assignments that are SubsetSum(2) cases

    Select A.*

    , S.TotalDays

    , S.Instr1

    , S.Days1

    , CoursSeq % 2 as Half-- sets odd=1 or even=0

    From cteAssignment A

    Join cteSubset2Cases S

    ON A.MonthID = S.MonthId

    And A.StudentID = S.StudentID

    ), cteSs2AssignHalf as (

    Select *--divides the Assignments into odd and even halves

    , Count(*) Over(partition by MonthId, StudentID, Half) as AssgCnt

    , Power(2, Row_Number() over(

    partition by MonthId, StudentID, Half

    order by Days)-1) as AssgMsk--used for binary enumeration masking

    From cteSs2Assignments

    ), cteSs2OddAssignGroups as (

    Select t.N--Determine all possible subsets for the Odd assignments

    , s.MonthID, s.StudentID

    , a.ID, a.Course, IsNull(a.Days, 0) as Days

    , a.CoursSeq, a.AssgMsk

    , s.OddCnt, s.Instr1, s.Days1

    From cteTally t

    Join cteSubset2Cases s

    ON t.N 0

    And a.Half = 1--Odd half

    And s.MonthID = a.MonthID

    And s.StudentID = a.StudentID

    ), cteSs2OddGroupSums as (-- Sum all of the odd subsets

    Select MonthId, StudentId, N as Grp

    , Sum(Days) as OddSum

    , Min(Instr1) as Instr1

    , Min(Days1) as Days1

    , count(Course) as CoursCnt

    From cteSs2OddAssignGroups

    Group By MonthId, StudentId, N

    ), cteSs2EvenAssignGroups as (

    Select t.N--Determine all possible subsets for the Even assignments

    , s.MonthID, s.StudentID

    , a.ID, a.Course, IsNull(a.Days, 0) as Days

    , a.CoursSeq, a.AssgMsk

    , s.EvenCnt, s.Instr1, s.Days1

    From cteTally t

    Join cteSubset2Cases s

    ON t.N 0

    And a.Half = 0--Even half

    And s.MonthID = a.MonthID

    And s.StudentID = a.StudentID

    ), cteSs2EvenGroupSums as (-- Sum all of the even subsets

    Select MonthId, StudentId, N as Grp

    , Sum(Days) as EvenSum

    , Min(Instr1) as Instr1

    , Min(Days1) as Days1

    , count(Course) as CoursCnt

    -- EvenSumRemainder(below) is the key trick for SQL to match the halves efficiently

    , Min(Days1) - Sum(Days) as EvenSumRemainder

    From cteSs2EvenAssignGroups

    Group By MonthId, StudentId, N

    )

    SELECT o.MonthId, o.StudentID

    , EvenSum + OddSum as TotalDays

    , e.CoursCnt + o.CoursCnt as TotalCourses

    , o.Grp as OddGrp

    , e.Grp as EvenGrp

    , EvenSum

    , EvenSumRemainder

    , OddSum

    , o.Instr1

    , o.Days1

    From cteSs2EvenGroupSums e

    Join cteSs2OddGroupSums o

    ON e.EvenSumRemainder = o.OddSum-- match the Odd total to the Even remainder

    --(note: EvenSumRemainder reverses the half-sum so that the odd & even halves

    -- can be matched with an equi-join merge (may need to force it though) )

    And e.StudentID = o.StudentID

    And e.MonthID = o.MonthID

    Order by MonthID, StudentID, TotalCourses desc

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

Viewing 7 posts - 16 through 21 (of 21 total)

You must be logged in to reply to this topic. Login to reply