Min Max Quartile

Question

Min Max Quartile

Viewing 7 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 1

Garadin (12/9/2008)
Garadin (12/9/2008)
Chris,
Although it does seem like it might be more efficient due to the derived tables, the execution plans are virtually identical (once I add the aliases to your outer ProductID's). I actually had it this way and then went the other route. Check out this post from last week, particularly Jeff's post towards the bottom evaluating the 3 methods of doing this. (Left Join/NULL, Exists and IN)
http://www.sqlservercentral.com/Forums/Topic611416-338-1.aspx
Which would have probably been true if I didn't have outer references in my IN's. :crazy: After testing, Chris is absolutely right, try his out, with my tests on my Dev box, his version completes in 34 seconds for 1.4M rows.

Seth, I've just worked on this again from scratch and came up with your original code which is correct, mine is broken.

DECLARE @Products TABLE(

ProductIdent INT IDENTITY(1,1),

ProductID VARCHAR(10),

UnitPrice money)

INSERT INTO @Products(ProductID, UnitPrice)

SELECT 'P1',0 UNION ALL

SELECT 'P1',5 UNION ALL --

SELECT 'P1',10 UNION ALL --

SELECT 'P1',20 UNION ALL

SELECT 'P2',80 UNION ALL

SELECT 'P2',90 UNION ALL --

SELECT 'P2',100 UNION ALL --

SELECT 'P2',200

SELECT p.*

FROM @Products p

WHERE ProductIdent NOT IN (SELECT TOP 25 PERCENT ProductIdent FROM @Products WHERE ProductID = p.ProductID ORDER BY UnitPrice)

AND ProductIdent NOT IN (SELECT TOP 25 PERCENT ProductIdent FROM @Products WHERE ProductID = p.ProductID ORDER BY UnitPrice DESC)Problem is, as you say, it's the first and last 25% for each productID, which means of course that these quartiles have to be calculated for each productID. On a table of a million or more rows it's gonna hurt.

I'm wondering if this can be done with "quirky update".

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Kelvin Phayre Ten Centuries Points: 1358 More actions · Answer 2

I've done this and got the right answer. It basically takes the last record from each query. So the max of price ascending is my min price and the min of price descending is my max. :hehe: :w00t: What do you guys think?

DECLARE @Products TABLE(

ProductID VARCHAR(10),

MinOrMaxchar(1),

UnitPrice float)

INSERT INTO @Products(ProductID, MinOrMax, UnitPrice)

SELECT TOP 25 PERCENT ProductID, 'X', UnitPrice

FROM scheme.stockm s join SalesProducts P on s.warehouse='WG' and s.product=P.ProductID where ProductID='214700'

ORDER BY UnitPrice asc

INSERT INTO @Products(ProductID, MinOrMax, UnitPrice)

SELECT TOP 25 PERCENT ProductID, 'I', UnitPrice

FROM scheme.stockm s join SalesProducts P on s.warehouse='WG' and s.product=P.ProductID where ProductID='214700'

ORDER BY UnitPrice desc

select max (UnitPrice) from @Products where MinOrMax = 'X'

select min (UnitPrice) from @Products where MinOrMax = 'I'

select ProductID,

MINSP=max (case when (MinOrMax = 'X') then UnitPrice end),

MAXSP=min (case when (MinOrMax = 'I') then UnitPrice end)

from @Products

group by ProductID

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 3

Only one problem there Kelvin - you've restricted it to a single ProductID.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Garadin One Orange Chip Points: 29613 More actions · Answer 4

Excellent call Chris. The below shows an example of doing exactly that. The times are from my highly overloaded and underpowered dev box.

[font="Courier New"]-------------------- SETUP --------------------------------

CREATE TABLE #Product(

ProductIdent INT IDENTITY(1,1),

ProductID INT,

UnitPrice money,

ProductSequence INT)

DECLARE @L INT

SET @L = 1

WHILE @L <=70 -- Takes about 17 Seconds

BEGIN

INSERT INTO #Product(ProductID, UnitPrice)

SELECT RIGHT(CHECKSUM(NEWID()),3), ABS(CHECKSUM(NEWID())/100000) *.01

FROM Tally -- 20K row tally table

PRINT @L

SET @L = @L + 1

END -- Total of 1.4M Records

CREATE CLUSTERED INDEX [CX_Product] --Takes About 40 Seconds

ON #Product(ProductID, UnitPrice) ON [PRIMARY] -- This is an important part of the solution and is necessary

--------------------- SOLUTION ----------------------------

DECLARE @ProductSequence INT,

@PrevProductID INT,

@ProductIDent INT

UPDATE #Product -- Takes about 29 Seconds

SET @ProductSequence = ProductSequence = CASE WHEN ProductID = @PrevProductID THEN @ProductSequence + 1

ELSE 1

END,

@PrevProductID = ProductID,

@ProductIdent = ProductIdent

FROM #Product WITH (INDEX(0))

-- Grab the count for each Product

SELECT ProductID, COUNT(ProductIdent) ProductCount

INTO #ProdCount

FROM #Product

GROUP BY ProductID

-- Final Query

SELECT P.ProductID, MIN(UnitPrice) MinPrice, MAX(UnitPrice) MaxPrice

FROM #Product P

INNER JOIN #ProdCount PC ON P.ProductID = PC.ProductID

WHERE ProductSequence BETWEEN PC.ProductCount/4 AND (PC.ProductCount - (PC.ProductCount/4))

GROUP BY P.ProductID

[/font]

Seth Phelabaum

Consistency is only a virtue if you're not a screwup. 😉

Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

Kelvin Phayre Ten Centuries Points: 1358 More actions · Answer 5

Guys, many many thanks for your help and guidance. I had realised I had only used one product ... I was showing you what I needed to do individually for EVERY product 😉

Please see below my final solution. Again many many thanks.

DECLARE @ProductCode varchar(20)

DECLARE ProdList CURSOR FOR

select distinct ProductID from SalesProducts

DECLARE @Products TABLE(

ProductID VARCHAR(10),

MinOrMaxchar(1),

UnitPrice float)

drop table SalesProductsMinMax

CREATE TABLE [dbo].[SalesProductsMinMax] (

ProductID char(20),

MinUnitPrice float,

MaxUnitPrice float

)

OPEN ProdList;

FETCH ProdList into @ProductCode;

WHILE @@FETCH_STATUS = 0 BEGIN

INSERT INTO @Products(ProductID, MinOrMax, UnitPrice)

SELECT TOP 25 PERCENT ProductID, 'X', UnitPrice

FROM scheme.stockm s join SalesProducts P on s.warehouse='WG' and s.product=P.ProductID where ProductID=@ProductCode

ORDER BY UnitPrice asc

INSERT INTO @Products(ProductID, MinOrMax, UnitPrice)

SELECT TOP 25 PERCENT ProductID, 'I', UnitPrice

FROM scheme.stockm s join SalesProducts P on s.warehouse='WG' and s.product=P.ProductID where ProductID=@ProductCode

ORDER BY UnitPrice desc

FETCH ProdList into @ProductCode;

END;

CLOSE ProdList;

DEALLOCATE ProdList;

insert into SalesProductsMinMax ( ProductID, MinUnitPrice, MaxUnitPrice)

select ProductID,

MINSP=max (case when (MinOrMax = 'X') then UnitPrice end),

MAXSP=min (case when (MinOrMax = 'I') then UnitPrice end)

from @Products

group by ProductID

Garadin One Orange Chip Points: 29613 More actions · Answer 6

What is the performance of that cursor like? On my system, it ran for 7 minutes with the processor pegged at 100% before I stopped it (coming from my indexed temp table, and not using your joins, so mine should be faster)

My final solution, which takes advantage of the lightning fast running totals method takes a little over 30 seconds to run.

Seth Phelabaum

Consistency is only a virtue if you're not a screwup. 😉

Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

Kelvin Phayre Ten Centuries Points: 1358 More actions · Answer 7

Took about 1 minute to run. On SQL 2k 12gig memory 8gig assigned to SQL, 8 processors. ProductID was index on SalesProducts. I had 944 unique products.

Now that I have worked out how to top and tail prices I need to now do the same to daily sales figures for products over the year so I can work out the minimum and safety level stock levels for our MRP. I have extracted the total sold grouped by day of the week, week number and month number. I may also have to split the year so I can have variable levels based on the season.

Very interesting stuff what !!! 😎

Thanks again