Recursive queries - get the top N quantity

Question

Recursive queries - get the top N quantity

steve-lauzon

SSC Veteran

Points: 283
More actions
April 12, 2010 at 4:15 pm

#213783

Hi all,
I need to select the top N quantity
1. I have a division table that has the total quantity associated to it
2. I have a division bloc table that contains the weight that each bloc represents within a specified region. In the data below, BlocID #1 represents 10% of division #1
I need to retreive N amount of quantity. In the example below, 166
I need to know, how much QTY will be retreive from each individual blocs, starting from the bloc that has the highest weight. I must not exceed the amount of quantity to be retreive - this is where I'm confused... how not to exceed the amount.
I have absolutely no clue how to do this... my guess is with a recursive query... Can you expert's give a newbie advice 🙂
Thanks
Here is the tables that I'm using
[Code="sql"]
CREATE TABLE #tDivision (
DivisionID int primary key not null
,DivisionName nvarchar(100)
,DivisionQty int
)
CREATE TABLE #tDivisionBloc (
BlocID int primary key not null
,DivisionID int
,BlocWeight float
)
--Insert test data in division
INSERT #tDivision (DivisionID, DivisionName, DivisionQty) values (1, 'Division 1',500)
--Insert test data in divisionbloc
INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (1, 1, 0.1)
INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (2, 1, 0.05)
INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (3, 1, 0.02)
INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (4, 1, 0.03)
INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (5, 1, 0.20)
INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (6, 1, 0.1)
INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (7, 1, 0.05)
INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (8, 1, 0.05)
INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (9, 1, 0.15)
INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (10, 1, 0.02)
INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (11, 1, 0.2)
INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (12, 1, 0.03)
DECLARE @QtyToRetreive int = 366
--Recursive query here...
DROP TABLE #tDivision
DROP TABLE #tDivisionBloc
[/code]
Thanks

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004432 More actions · Answer 1

I don't believe it can be done without a wee bit more information. Either the Qty column is missing from the second table or we're supposed to guess that the total quanity in the first table equals the total weight in the second table. Also, what happens when the largest block is finally depleted enough to "tie" with the second largest block? Do you then want to deplete both until they become tied with the 3rd block and then deplete all 3 at the same rate?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

WayneS SSC Guru Points: 95461 More actions · Answer 2

You didn't give us all the information we're going to need in order to help you.

For instance, does "highest weight" = #tDivisionBloc.BlocWeight?

Does "Qty" = #tDivision.DivisionQty, or #tDivisionQty * #tDivisionBloc.BlocWeight?

Based upon the data you've provided, what do you expect to get for the results?

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

steve-lauzon SSC Veteran Points: 283 More actions · Answer 3

Thanks... here are the clarifications

For the sake of the argument, I will focus only on DivisionID = 1. I've also edited the code above

The QTY from the second table is calculated as follows

#tDivision.DivisionQty* #tDivisionBloc.BlocWeight where #tDivision.DivisionID = #tDivisionBloc.DivisionID

500 (Quantity of Division #1) * #tDivisionBloc.BlocWeight

Therefore

#tDivisionBloc.BlocID #1 = 500 * 10% = 50

#tDivisionBloc.BlocID #2 = 500 * 5% = 25

#tDivisionBloc.BlocID #3 = 500 * 2% = 10

#tDivisionBloc.BlocID #4 = 500 * 3% = 15

#tDivisionBloc.BlocID #5 = 500 * 20% = 100

#tDivisionBloc.BlocID #6 = 500 * 10% = 50

#tDivisionBloc.BlocID #7 = 500 * 5% = 25

#tDivisionBloc.BlocID #8 = 500 * 5% = 25

#tDivisionBloc.BlocID #9 = 500 * 15% = 75

#tDivisionBloc.BlocID #10 = 500 * 2% = 10

#tDivisionBloc.BlocID #11 = 500 * 20% = 100

#tDivisionBloc.BlocID #12 = 500 * 3% = 15

I need to get the first N quantity (366 in the example) starting from the bloc that has the most numbers. The result should be the following

Bloc #5 : 100 --> 266 need to be retrieve after

Bloc #11 : 100 --> 166 need to be retrieve after

Bloc #9 : 75 --> 91 need to be retrieve after

Bloc #1 : 50 --> 41 need to be retrieve after

Bloc #6 : 41 (out of the 50)

AND I MUST STOP HERE

Does this help 🙁 ?

Thanks again

Jeff Moden SSC Guru Points: 1004432 More actions · Answer 4

Yes, it helps a lot. Thanks. Just one final question... is there a column in the second table to show what has been consumed and what has not?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 5

Also - does it matter which blocks get consumed first in the case of a tie? For example - if batches 1,3,6,8 each had 100 and you were looking for 350, does it matter which one might have something left over?

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

WayneS SSC Guru Points: 95461 More actions · Answer 6

steve-lauzon (4/12/2010)
Thanks... here are the clarifications ... Does this help

Absolutely. Based on these clarifications, this will work for you.

SET NOCOUNT ON

if object_id('tempdb..#tDivision') IS NOT NULL DROP TABLE #tDivision

if object_id('tempdb..#tDivisionBloc') IS NOT NULL DROP TABLE #tDivisionBloc

if object_id('tempdb..#temp') IS NOT NULL DROP TABLE #temp

CREATE TABLE #tDivision (

DivisionID int primary key not null

,DivisionName nvarchar(100)

,DivisionQty int

)

CREATE TABLE #tDivisionBloc (

BlocID int primary key not null

,DivisionID int

,BlocWeight float

)

--Insert test data in division

INSERT #tDivision (DivisionID, DivisionName, DivisionQty) values (1, 'Division 1',500)

INSERT #tDivision (DivisionID, DivisionName, DivisionQty) values (2, 'Division 2',200)

INSERT #tDivision (DivisionID, DivisionName, DivisionQty) values (3, 'Division 3',250)

INSERT #tDivision (DivisionID, DivisionName, DivisionQty) values (4, 'Division 4',100)

INSERT #tDivision (DivisionID, DivisionName, DivisionQty) values (5, 'Division 5',50)

INSERT #tDivision (DivisionID, DivisionName, DivisionQty) values (6, 'Division 6',1000)

INSERT #tDivision (DivisionID, DivisionName, DivisionQty) values (7, 'Division 7',350)

--Insert test data in divisionbloc

INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (1, 1, 0.1)

INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (2, 1, 0.05)

INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (3, 1, 0.02)

INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (4, 1, 0.03)

INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (5, 1, 0.20)

INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (6, 1, 0.1)

INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (7, 1, 0.05)

INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (8, 1, 0.05)

INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (9, 1, 0.15)

INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (10, 1, 0.02)

INSERT #tDivisionBloc (BlocID, DivisionID, BlocWeight) values (11, 1, 0.2)

DECLARE @QtyToRetrieve int = 166

--Recursive query here...

DECLARE @DivisionID int = 1

-- get the data. Calculate the Qty and put a column for the accumulation of the selected QtyToRetrieve

SELECT t1.* ,

Qty = t2.DivisionQty * t1.BlocWeight,

Accumulator = 0

INTO #temp

FROM #tDivisionBloc t1

JOIN #tDivision t2

ON t1.DivisionID = t2.DivisionID

WHERE t1.DivisionID = @DivisionID

ORDER BY t1.DivisionID, t1.BlocWeight DESC

-- the following update requires a clustered index on the proper columns to work properly.

CREATE CLUSTERED INDEX t1 ON #temp(BlocWeight DESC)

-- need some variables to control the update statement

declare @Accumulator int,

@BlocWeight float

-- This form of the UPDATE statement has some rules for proper usage.

-- See Jeff Moden's article at http://www.sqlservercentral.com/articles/T-SQL/68467/

-- for a complete discussion of how this works, and all of the rules for utilizing it.

UPDATE t

SET @Accumulator = Accumulator = CASE WHEN @QtyToRetrieve > 0 THEN CASE WHEN @QtyToRetrieve >= Qty THEN Qty ELSE @QtyToRetrieve END

ELSE 0 END,

@BlocWeight = BlocWeight, --<< anchor column

@QtyToRetrieve = @QtyToRetrieve - @Accumulator

FROM #temp t WITH (TABLOCKX)

OPTION (MAXDOP 1)

-- show the final results

select * from #temp

DROP TABLE #tDivision

DROP TABLE #tDivisionBloc

DROP TABLE #temp

Edit: simplified the select into and update statements; handle multiple DivisionIDs in #tDivisionBloc

Edit2: corrected name misspelling

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Jeff Moden SSC Guru Points: 1004432 More actions · Answer 7

Heh... I was just getting ready to post a nearly identical solution. I'm pretty proud of you guys... you beat me at my own game anymore. 😛 Well done, Wayne.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

WayneS SSC Guru Points: 95461 More actions · Answer 8

Jeff Moden (4/12/2010)
Heh... I was just getting ready to post a nearly identical solution. I'm pretty proud of you guys... you beat me at my own game anymore. 😛 Well done, Wayne.

Well Jeff, thanks for teaching me!

Jeff, I've got one question for you... you were asking about whether the 2nd table already has an accumulator column. I understand the need for it, but I couldn't figure out a better way of doing it against the second table if it did have this column without doing a join to the first table to multiply the two columns together... which violates one of the rules in your article. Do you see a way of doing it against the second table directly?

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

WayneS SSC Guru Points: 95461 More actions · Answer 9

Matt Miller (#4) (4/12/2010)
Also - does it matter which blocks get consumed first in the case of a tie? For example - if batches 1,3,6,8 each had 100 and you were looking for 350, does it matter which one might have something left over?

Steve, if this does matter, then you will need to add this to the clustered index to make it work correctly.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Jeff Moden SSC Guru Points: 1004432 More actions · Answer 10

Just convert the Qty from the temp table back to weight and do a final update on the original second table. Don't forget to update the Qty on the first original table, as well.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004432 More actions · Answer 11

This could probably stand a tweek here and there but here's the code I came up with that I was going to post before you beat me to the punch. It does update the original tables to show what has been consumed. The tweeks I'm talking about are adding the appropriate implicit transaction and a precheck to ensure enough inventory is available in the bloc table before the whole shootin' match starts. I also tend to do things in a very straight forward manner and someone could probably beat it for performance if they combine some of those straight forward things. I also did everything by weight and then converted to "DrawQty" which probably slows it down a bit more but I thought doing it by weight might somehow be more straight forward. Here's the code...

--=================================================================================================

-- Setup the test tables. This is NOT a part of the solution.

--=================================================================================================

--===== Presets

SET NOCOUNT ON

;

--===== Conditionally drop Temp Tables to make reruns easier

-- Drop the "real" tables

IF OBJECT_ID('TempDB..#tDivision') IS NOT NULL

DROP TABLE #tDivision

;

IF OBJECT_ID('TempDB..#tDivisionBloc') IS NOT NULL

DROP TABLE #tDivisionBloc

;

-- Drop the working tables

IF OBJECT_ID('TempDB..#Work') IS NOT NULL

DROP TABLE #Work

;

--===== Recreate the "real" tables

CREATE TABLE #tDivision

(

DivisionID INT PRIMARY KEY NOT NULL,

DivisionName NVARCHAR(100),

DivisionQty INT

)

;

CREATE TABLE #tDivisionBloc

(

BlocID INT PRIMARY KEY NOT NULL,

DivisionID INT NOT NULL,

BlocWeight DECIMAL(38,19) NOT NULL,

WeightConsumed DECIMAL(38,19) NOT NULL DEFAULT 0

)

;

--===== Populate the "real" tables

-- Insert test data in division

INSERT INTO #tDivision

(DivisionID, DivisionName, DivisionQty)

SELECT 1, 'Division 1', 500 UNION ALL

SELECT 2, 'Division 2', 400

;

-- Insert test data in divisionbloc

INSERT INTO #tDivisionBloc

(BlocID, DivisionID, BlocWeight)

SELECT 1, 1, 0.1 UNION ALL

SELECT 2, 1, 0.05 UNION ALL

SELECT 3, 1, 0.02 UNION ALL

SELECT 4, 1, 0.03 UNION ALL

SELECT 5, 1, 0.20 UNION ALL

SELECT 6, 1, 0.1 UNION ALL

SELECT 7, 1, 0.05 UNION ALL

SELECT 8, 1, 0.05 UNION ALL

SELECT 9, 1, 0.15 UNION ALL

SELECT 10, 1, 0.02 UNION ALL

SELECT 11, 1, 0.2 UNION ALL

SELECT 12, 1, 0.03

;

--===== Add some data for Division 2

INSERT INTO #tDivisionBloc

(BlocID, DivisionID, BlocWeight)

SELECT BlocID+12, 2, BlocWeight*4

FROM #tDivisionBloc

;

-- SELECT DivisionID, SUM(BlocWeight)

-- FROM #tDivisionBloc

-- GROUP BY DivisionID

--;

--=================================================================================================

-- Solve the problem.

--=================================================================================================

--===== Declare some variables that will become parameters in the future.

DECLARE @pQtyToRetreive INT,

@pDivisionID INT

;

--===== Pretend we've received some values in the parameters

SELECT @pQtyToRetreive = 366,

@pDivisionID = 1

;

--===== Determine the total weight required

DECLARE @DrawWeight DECIMAL(38,19),

@UnitWeight DECIMAL(38,19)

;

SELECT @Unitweight = SUM(bloc.BlocWeight-bloc.WeightConsumed)/div.DivisionQty,

@DrawWeight = @Unitweight*@pQtyToRetreive

FROM #tDivisionBloc bloc

INNER JOIN #tDivision div

ON bloc.DivisionID = div.DivisionID

AND div.DivisionID = @pDivisionID

AND bloc.WeightConsumed < bloc.BlocWeight

GROUP BY div.DivisionQty

;

--===== Grab the data we need and move it into a temp table.

-- This also creates a couple of extra columns we need.

SELECT ROW_NUMBER() OVER (ORDER BY bloc.BlocWeight-bloc.WeightConsumed DESC) AS SortOrder,

bloc.BlocID,

bloc.DivisionID,

bloc.BlocWeight-bloc.WeightConsumed AS WeightAvailable,

CAST(0 AS DECIMAL(38,19)) AS RunningTotal,

CAST(0 AS DECIMAL(38,19)) AS WeightConsumed,

CAST(0 AS INT) AS DrawQuantity

INTO #Work

FROM #tDivisionBloc bloc

INNER JOIN #tDivision div

ON bloc.DivisionID = div.DivisionID

AND div.DivisionID = 1--@pDivisionID

AND bloc.WeightConsumed < bloc.BlocWeight

;

--===== Add the quintessential clustered index

CREATE CLUSTERED INDEX IX_#Work_SortOrder

ON #Work (SortOrder) WITH FILLFACTOR = 100

;

--===== Create the running total

-- Declare some necessary variables

DECLARE @PrevRunningTotal DECIMAL(38,19),

@PrevSortOrder INT,

@WeightConsumed DECIMAL(38,19)

;

-- Start the variables above in a known condition

SELECT @PrevRunningTotal = 0,

@PrevSortOrder = 0

;

-- Do the running total

UPDATE #Work

SET @PrevRunningTotal = RunningTotal

= WeightAvailable + @PrevRunningTotal,

@WeightConsumed = WeightConsumed

= CASE

WHEN @PrevRunningTotal <= @DrawWeight THEN WeightAvailable

WHEN @PrevRunningTotal - WeightAvailable > @DrawWeight THEN 0

WHEN @PrevRunningTotal > @DrawWeight THEN WeightAvailable -(@PrevRunningTotal - @DrawWeight)

END,

DrawQuantity = @WeightConsumed/@UnitWeight,

@PrevSortOrder = SortOrder --Just an "anchor"

FROM #Work WITH(TABLOCKX)

OPTION (MAXDOP 1)

;

--===== Update the "real" tables

UPDATE #tDivisionBloc

SET WeightConsumed = bloc.WeightConsumed+work.WeightConsumed

FROM #tDivisionBloc bloc

INNER JOIN #Work work

ON bloc.BlocID = work.BlocID

;

UPDATE #tDivision

SET DivisionQty = DivisionQty - @pQtyToRetreive

;

--===== Let's see what we've got in the tables

SELECT * FROM #Work

;

SELECT * FROM #tDivisionBloc

;

SELECT * FROM #tDivision

;

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004432 More actions · Answer 12

P.S. I don't believe that FLOAT is the thing to use here. Could be wrong, though.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

steve-lauzon SSC Veteran Points: 283 More actions · Answer 13

Holy smokes !

Thanks a lot guys... you've just saved me a bunch of time - just the solution I was searching for

Can't wait for the day that I'll be able to give such advice to newbies like me

Thanks for your help

A+

Jeff Moden SSC Guru Points: 1004432 More actions · Answer 14

steve-lauzon (4/13/2010)
Holy smokes !
Thanks a lot guys... you've just saved me a bunch of time - just the solution I was searching for
Can't wait for the day that I'll be able to give such advice to newbies like me
Thanks for your help
A+

The method both Wayne and I used is known as a "Quirky" Update. There are some very specific rules to using it correctly and I recommend you visit the article that he has a link for in the comments of his code. It's a long article but it's worth the read because it's a powerful tool in the right hands and a major disaster in the wrong hands.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)