October 8, 2009 at 10:51 am
Thanks Jeff,
Now I get it... I was so confused to see the seek because of the search by apples that I didn't pay attention to the 78% of the sort... Of course at first I just had a scan because my original query version had no WHERE predicate. So I stuck in the WHERE clause to make sure I'd get a seek... 🙂
Glad to see I was basically on the right path there (the ABS function on the formula is the cause of trouble).... *phew... wipes sweat from brow*
Though interesting... I would have though that by referencing the field from the SELECT clause that it wouldn't have to scan to recalculate the order by values since they were in the SELECT clause.
October 8, 2009 at 11:02 am
If you really want to have some fun, here's a great way to optimize the example queries that Mark wrote... I'll let you find the changes I made because it's more fun that way but do notice the actual execution plans for both queries... no "SORT" operators and the relative query cost of each query has plummeted compared to the data insert.
drop table #inventory
go
create table #inventory(itemcode varchar(10),startlevel int, endlevel int, cost money, StockValue AS ABS((endlevel - startlevel)*Cost) )
create index funfun on #inventory(itemcode,StockValue DESC)
insert into #inventory
select
'Apples',
100,
50,
.35
UNION ALL
SELECT
'Bannanas',
75,
50,
.25
UNION ALL
SELECt
'Peaches',
80,
20,
.10
SELECT
itemcode,
stockvalue --= ABS((endlevel - startlevel)*Cost)
FROM
#inventory
WHERE
itemcode = 'Apples'
order by
ABS((endlevel - startlevel)*Cost)
SELECT
itemcode,
stockvalue --= ABS((endlevel - startlevel)*Cost)
FROM
#inventory
WHERE
itemcode = 'Apples'
order by
stockvalue
If you can't make such a change to the real table, the careful use of an INDEXED VIEW may also do the trick at some expense to storage size because the data will materialize.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 8, 2009 at 11:05 am
Wow.. that really took me by surprise, i didn't expect that!
I see what you're saying about the sorting after the math is done and not just sorting the column- but if I change that last select from your snippet to
ALTER TABLE #inventory
ADD StockVal AS ABS((endlevel - startlevel) * Cost) PERSISTED
CREATE INDEX funfun2 ON #inventory(itemcode,stockval)
SELECT
itemcode, stockval
FROM #inventory
WHERE
itemcode = 'Apples'
order by stockval
We drop the sort. Going back to what you said earlier, is this only due to the fact that we are now sorting the column itself and not the math? Just making sure I am following along 🙂
EDIT: Jeff snuck his in before mine :pinch:
October 8, 2009 at 11:10 am
Mike McQueen (10/8/2009)
Wow.. that really took me by surprise, i didn't expect that!I see what you're saying about the sorting after the math is done and not just sorting the column- but if I change that last select from your snippet to
ALTER TABLE #inventory
ADD StockVal AS ABS((endlevel - startlevel) * Cost) PERSISTED
CREATE INDEX funfun2 ON #inventory(itemcode,stockval)
SELECT
itemcode, stockval
FROM #inventory
WHERE
itemcode = 'Apples'
order by stockval
We drop the sort. Going back to what you said earlier, is this only due to the fact that we are now sorting the column itself and not the math? Just making sure I am following along 🙂
EDIT: Jeff snuck his in before mine :pinch:
Heh... I may have snuck it in before you but I wrote a 2k solution. I flat forgot about "PERSISTED" in 2k5 which eliminates the need for an INDEXED VIEW.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 8, 2009 at 11:11 am
And, to answer your question, "YES"... it's because you are now sorting on an indexed column which contains the answer to the math calculation instead of sorting on a math calculation.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 8, 2009 at 11:16 am
While we're on the topic of computed columns and execution plans, I have a question that has been nagging at me.
If you check out the execution plan of the last select I posted, with PERSISTED data, it has a Compute Scalar. From my (limited) knowledge, I thought PERSISTED indicated that the data was stored as a column. If the data is now a static column (for the time being, until one of the columns making up the equation is updated) why doesn't the execution plan only show a scan/seek and no compute?
October 8, 2009 at 4:45 pm
changed the where a bit, tested with 100,000 rows:
SELECT * FROM (
select acct_code, part_no,location,description,sku_no,list_status
,pub_date,pub_status,succession_status,void,in_stock,avg_cost
,std_cost
,((avg_cost - std_cost)/avg_cost)*100 as unit_cost_var
,(avg_cost - std_cost)*in_stock as extended_diff
,in_stock*avg_cost as extended_avg_cost
from dfi_inv_master_list with (NOLOCK)
) cte
where (extended_diff >= @a
or extended_diff <= -1* @a)
and avg_cost <> 0 and in_stock <> 0
order by ABS(extended_diff) desc
October 8, 2009 at 5:45 pm
zukko (10/8/2009)
changed the where a bit, tested with 100,000 rows:SELECT * FROM (
select acct_code, part_no,location,description,sku_no,list_status
,pub_date,pub_status,succession_status,void,in_stock,avg_cost
,std_cost
,((avg_cost - std_cost)/avg_cost)*100 as unit_cost_var
,(avg_cost - std_cost)*in_stock as extended_diff
,in_stock*avg_cost as extended_avg_cost
from dfi_inv_master_list with (NOLOCK)
) cte
where (extended_diff >= @a
or extended_diff <= -1* @a)
and avg_cost <> 0 and in_stock <> 0
order by ABS(extended_diff) desc
That's real nice... would you mind sharing the code where you generated the 100k rows and maybe give us a hint about what you found, please?
--Jeff Moden
Change is inevitable... Change for the better is not.
October 8, 2009 at 8:02 pm
sorry forgot to add the full script 😛
CREATE TABLE [dbo].[dfi_inv_master_list](
[acct_code] [int] IDENTITY(1,1) NOT NULL,
[part_no] [int] NULL,
[location] [int] NULL,
[description] [int] NULL,
[sku_no] [int] NULL,
[list_status] [int] NULL,
[pub_date] [int] NULL,
[pub_status] [int] NULL,
[succession_status] [int] NULL,
[void] [int] NULL,
[in_stock] [float] NULL,
[avg_cost] [float] NULL,
[std_cost] [float] NULL
) ON [PRIMARY]
SELECT top 100000
0 as part_no,
0 as location,
0 as description,
0 as sku_no,
0 as list_status,
0 as pub_date,
0 as pub_status,
0 as succession_status,
0 as void,
identity(int,1,1) as in_stock,
100 as avg_cost,
200 as std_cost
into #tmpA FROM sys.columns t1,sys.columns t2
INSERT INTO dfi_inv_master_list SELECT * FROM #tmpA
drop table #tmpA
set statistics time on
Declare @a money
SELECT @a = 1000
SELECT * FROM (
select acct_code, part_no ,location ,description ,sku_no ,list_status
,pub_date ,pub_status ,succession_status ,void ,in_stock ,avg_cost
,std_cost
,((avg_cost - std_cost)/avg_cost)*100 as unit_cost_var
,(avg_cost - std_cost)*in_stock as extended_diff
,in_stock*avg_cost as extended_avg_cost
from dfi_inv_master_list with (NOLOCK)
) cte
where (extended_diff >= @a
or extended_diff <= -1* @a)
and avg_cost <> 0 and in_stock <> 0
order by ABS(extended_diff) desc
set statistics time off
this helped take some CPU time off the query (at least for me).
October 16, 2009 at 2:54 am
Mike McQueen (10/8/2009)
While we're on the topic of computed columns and execution plans, I have a question that has been nagging at me.If you check out the execution plan of the last select I posted, with PERSISTED data, it has a Compute Scalar. From my (limited) knowledge, I thought PERSISTED indicated that the data was stored as a column. If the data is now a static column (for the time being, until one of the columns making up the equation is updated) why doesn't the execution plan only show a scan/seek and no compute?
1. There's no need to persist the column if you are then going to create an index on it.
2. A compute scalar always appears in the plan for a persisted computed column, for internal reasons. It doesn't mean the value is being recomputed - unless it says so.
The QO will (annoyingly) often choose to recompute the value, if the base columns for the computation are available to it, and if the computed column isn't stored on the same data access path. This is because the estimated cost of the scalar computation is so low. For example, the QO will often choose to recompute over performing an RID- or KEY-lookup to fetch the persisted value...the lookup will cost a few logical I/Os, whereas the compute scalar appears to be almost free. This is particularly annoying when the persisted value is based on the result of a complex scalar function. Indexing appropriately (which often means INCLUDEing the computed value) is one way around this.
October 16, 2009 at 7:05 am
Paul-
Thanks for the input! This has been bothering me for a while now. This makes sense though, we had a sneaking suspicion that this may be the case.
October 17, 2009 at 12:30 am
zukko (10/8/2009)
sorry forgot to add the full script 😛
Thanks zukko.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 12 posts - 16 through 26 (of 26 total)
You must be logged in to reply to this topic. Login to reply