October 6, 2006 at 12:17 pm
Following are two stored procedures:
Why is the second one more than twice as fast on SQL 2000? The select statements are identical.
CREATE PROC Invoices_StoreGroupItemMovementByDateRange1
(
@StoreGroupID int,
@StartDate datetime,
@EndDate datetime
)
AS
SELECT DISTINCT I.ItemNo, I.ItemDescription
(SELECT SUM(II.QTYPackSizeDescription) FROM Invoices II WHERE (II.ItemNo = I.ItemNo) AND (II.StoreKey = I.StoreKey) AND (II.InvoiceDate = I.InvoiceDate)) AS Mvmt
FROM Invoices I
WHERE (I.StoreKey IN (SELECT StoreKey FROM StoreGroupMembers S WHERE S.StoreGroupID = @StoreGroupID))
AND (I.InvoiceDate >= @StartDate) AND (I.InvoiceDate <= @EndDate)
ORDER BY I.ItemNo
GO
CREATE PROC Invoices_StoreGroupItemMovementByDateRange2(
@SG int,
@SD datetime,
@ED datetime
)
AS
DECLARE @StoreGroupID int,
@StartDate datetime,
@EndDate datetime
SET @StoreGroupID = @SG
SET @StartDate = @SD
SET @EndDate = @ED
SELECT DISTINCT I.ItemNo, I.ItemDescription AS ItemDesc,
(SELECT SUM(II.QTYPackSizeDescription) FROM Invoices II WHERE (II.ItemNo = I.ItemNo) AND (II.StoreKey = I.StoreKey) AND (II.InvoiceDate = I.InvoiceDate)) AS ItemMovement
FROM Invoices I
WHERE (I.StoreKey IN (SELECT StoreKey FROM StoreGroupMembers S WHERE S.StoreGroupID = @StoreGroupID))
AND (I.InvoiceDate >= @StartDate) AND (I.InvoiceDate <= @EndDate)
ORDER BY I.ItemNo
GO
October 6, 2006 at 3:47 pm
October 7, 2006 at 11:44 am
Execution plans may be viewed at:
http://www.afslr.com/ExecPlans/Default.htm
I am executing both stored procedures from SQL Query Analyzer at my work station connected to the LAN.
These procedures are on the production SQL Server [don't have a test one set up for this project]. Will running sp_recompile / DBCC FREEPROCCACHE cause any problems? . I do not have a problem kicking everyone off if this will improve performance. Also, would like a heads up as to how long this might take. There are 1380 stored procedures on this particular database (not including the ones that start with "dt_".
Thank you for your time,
Billy Hildebrand
October 7, 2006 at 12:25 pm
I'm not sure why but I changed the query in both from
"... II.Stores = I.Stores..." to
"...II.StoreKey IN (SELECT StoreKey FROM StoreGroupMembers S WHERE S.StoreGroupID = @StoreGroupID)) AND (II.InvoiceDate = I.InvoiceDate)..."
As the first way was only getting the sum of the item sold for the first store in the sub query instead of for all of the stores in the store group [blond-attack, newbie syndrome, <sigh>]
Now both queries take the same amount of time (albiet longer than either previous but returning 10 times the records -- the accurate data I was wanting).
Thanks again,
Billy Hildebrand
October 8, 2006 at 5:21 pm
Ugh! Correlated Sub-Queries = Instant RBAR. Not good... real performance killer. Should be a derived table. WHERE IN... not good... should be an equi-join (inner join).
Haven't tested it but I'm thinking the following will run a bit faster and with the "accurate data" you are wanting...
SELECT DISTINCT i.ItemNo,
i.ItemDescription AS ItemDesc,
d.ItemMovement
FROM Invoices i,
StoreGroupMembers s,
(--Derived table "d" finds ItemMovement
SELECT ii.ItemNo,
ii.StoreKey,
ii.InvoiceDate,
SUM(ii.QTYPackSizeDescription) AS ItemMovement
FROM Invoices ii
GROUP BY ii.ItemNo, ii.StoreKey, ii.InvoiceDate
) d
WHERE i.StoreKey = s.StoreKey
AND i.ItemNo = d.ItemNo
AND i.StoreKey = d.StoreKey
AND i.InvoiceDate = d.InvoiceDate
AND s.StoreGroupID = @StoreGroupID
AND i.InvoiceDate >= @StartDate
AND i.InvoiceDate <= @EndDate
ORDER BY i.ItemNo
If you have "the religion", you could could certainly and easily change this to explicit INNER JOINs
--Jeff Moden
Change is inevitable... Change for the better is not.
October 8, 2006 at 5:23 pm
p.s. The above was from your original query... it does not include the change due to the "blond-attack"
--Jeff Moden
Change is inevitable... Change for the better is not.
October 9, 2006 at 3:30 am
Instant RBAR ????? ( sorry it's monday morning! )
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
October 9, 2006 at 4:18 pm
Sorry... it's a "Moden-ism" pronounced "ree-bar" and stands for "Row By Agonizing Row" Why "ree-bar"? Because it's like the metal stakes stuck in cement with a similar name... they don't go anywhere fast, either
--Jeff Moden
Change is inevitable... Change for the better is not.
October 10, 2006 at 1:30 am
I like it - something else I can throw at developers!!! - only joking - good term though!
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
October 10, 2006 at 6:02 am
Thanks Jeff!
This query does work faster with both procedures. Alas the result is flawed as it was in my meagor attempts because it returns the sum of an item bought over a period of the first store in the derived table as it did in my sub-queries.
As a self-taught newbie, I guess when it comes to the "religion", I don't have a lot of the wisdom. Please share the good news with me that I might spread the word, too.
Below are the data-structures of the tables and the statement of the problem. How can I return distinct item number, item description and the sum of the quantity for all stores in a store group? The Invoices.QTYPackSizeDescription is the quantity attribute and the Invoices.StoreKey has a one-to-many relationship with StoreGroupMembers.StoreKey.
CREATE TABLE [Invoices] (
[Company] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[StoreKey] [float] NOT NULL ,
[InvoiceNo] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[InvoiceDate] [smalldatetime] NOT NULL ,
[WarehouseSlot] [varchar] (5) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[QTYPackSizeDescription] [decimal](18, 0) NULL ,
[ItemDescription] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ItemNo] [decimal](18, 0) NOT NULL ,
[RetailUPC] [float] NULL ,
[RetQty] [float] NULL ,
[RetailPrice] [float] NULL ,
[Markup] [float] NULL ,
[Cost] [float] NULL ,
[ExtCost] [float] NULL ,
[Deal] [float] NULL ,
[Tax] [float] NULL ,
[Type] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MassMerch] [varchar] (3) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CatchWeightFlag] [varchar] (3) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SubstituteItem] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
GO
CREATE TABLE [StoreGroupMembers] (
[StoreGroupID] [numeric](18, 0) NOT NULL ,
[StoreKey] [numeric](18, 0) NOT NULL ,
CONSTRAINT [PK_StoreGroupMembers] PRIMARY KEY CLUSTERED
(
[StoreKey]
) WITH FILLFACTOR = 90 ON [PRIMARY]
) ON [PRIMARY]
GO
Problem:
Return a record set containing item number, item description and quantity sold of each distinct item for a particular store group over a specified date range.
Note:
I relize there is not a primary key on the Invoices table. The closest thing to a key consists of the company, storekey, invoice number, invoice date, item number and quantity. I have several indexes for the main queries I run against this table including one that includes the attributes just mentioned.
Asside:
Data is generated on a main frame and invoice numbers are reset to '00001' when they reach '99999' and item number and quantity can appear on the invoice more than once. The mainframe program was written long ago and I import data into my SQL server for reporting (and other) purposes. We are in the process of replacing our 390s with 400s and all of the programming is to be update and eventually I will have more normalized data. In the interim...
Thank you for your time,
Billy Hildebrand
October 10, 2006 at 6:10 am
My managers at work liked it so well, they allowed the term in our coding standards... and, ALL the developers know what RBAR means... no need for a lengthy explanation anymore.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 10, 2006 at 6:15 am
As a self-taught newbie, I guess when it comes to the "religion", I don't have a lot of the wisdom. Please share the good news with me that I might spread the word, too. |
The "religion" I spoke of is that some folks absolutely refuse to make joins in the WHERE clause (known as "equi-joins") and will take the time (doesnt take much) to write them as INNER JOINS using the "new" ANSI join standards.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 10, 2006 at 6:30 am
I'm willing to use equi-joins. I'm just not sure how it would work in this problem. Wouldn't it still return results for the first record in the join?
Using your query, modified with an IN in the HAVING clause works, but as you mentioned "...not good...":
SELECT DISTINCT i.ItemNo,
i.ItemDescription AS ItemDesc,
d.ItemMovement
FROM Invoices i,
StoreGroupMembers s,
(--Derived table "d" finds ItemMovement
SELECT ii.StoreKey,
ii.ItemNo,
ii.InvoiceDate,
SUM(ii.QTYPackSizeDescription) AS ItemMovement
FROM Invoices ii
GROUP BY ii.StoreKey, ii.ItemNo, ii.InvoiceDate
HAVING ii.StoreKey IN(SELECT sg.StoreKey FROM StoreGroupMembers sg WHERE sg.StoreGroupID = @StoreGroupID) d
WHERE i.StoreKey = s.StoreKey
AND i.ItemNo = d.ItemNo
AND i.StoreKey = d.StoreKey
AND i.InvoiceDate = d.InvoiceDate
AND i.InvoiceDate >= @StartDate
AND i.InvoiceDate <= @EndDate
AND s.StoreGroupID = @StoreGroupID
ORDER BY i.ItemNo
How can I change this with an equi-join to I can get rid of the RBAR?
Thank you for your time.
Billy
October 10, 2006 at 1:38 pm
Have you tried simplifying it like this?
SELECT I.ItemNo, I.ItemDescription AS ItemDesc, SUM(I.QTYPackSizeDescription) ItemMovement
FROM Invoices I, StoreGroupMembers S
WHERE I.StoreKey = S.StoreKey
AND S.StoreGroupID = @StoreGroupID
AND I.InvoiceDate BETWEEN @StartDate AND @EndDate
GROUP BY I.ItemNo, I.ItemDescription
ORDER BY I.ItemNo
Or for those with religion:
SELECT I.ItemNo, I.ItemDescription AS ItemDesc, SUM(I.QTYPackSizeDescription) ItemMovement
FROM Invoices I INNER JOIN StoreGroupMembers S ON (I.StoreKey = S.StoreKey
AND S.StoreGroupID = @StoreGroupID)
WHERE I.InvoiceDate BETWEEN @StartDate AND @EndDate
GROUP BY I.ItemNo, I.ItemDescription
ORDER BY I.ItemNo
October 12, 2006 at 6:28 am
Mark,
Thank you so very much for the simplification. I had tried a couple of different ways but had failed to add the additional operator in the ON portion. Your two queries have enlightened me and your time is greatly appreciated.
Personally, I've never used a FROM with more than one table source unless it was joined in someway (no comma separated tables).
Thanks to everyone that responded.
Billy Hildebrand
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply