query optimizing

  • Thanks Jeff,

    Now I get it... I was so focused on the seek from the search on 'Apples' that I didn't pay attention to the 78% on the sort... Of course, at first I just had a scan because my original query version had no WHERE predicate, so I stuck in the WHERE clause to make sure I'd get a seek... 🙂

    Glad to see I was basically on the right path there (the ABS function in the formula is the cause of the trouble)... *phew... wipes sweat from brow*

    Interesting, though... I would have thought that by referencing the column from the SELECT clause, it wouldn't have to recalculate the ORDER BY values, since they were already computed in the SELECT clause.
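    A minimal sketch of what I mean (the #inventory table and column names here are just illustrative):

    ```sql
    -- ORDER BY can reference the select-list alias, but unless the value is
    -- persisted or indexed, the optimizer is still free to recompute the
    -- underlying expression when it sorts.
    SELECT itemcode,
           ABS((endlevel - startlevel) * cost) AS StockValue
    FROM   #inventory
    WHERE  itemcode = 'Apples'
    ORDER  BY StockValue  -- alias reference... logically the same expression
    ```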



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link
    For tips on how to post your problems

  • If you really want to have some fun, here's a great way to optimize the example queries that Mark wrote. I'll let you find the changes I made, because it's more fun that way, but do notice the actual execution plans for both queries: no "Sort" operators, and the relative query cost of each query has plummeted compared to the data insert.

    DROP TABLE #inventory
    GO

    CREATE TABLE #inventory
        (
        itemcode   VARCHAR(10),
        startlevel INT,
        endlevel   INT,
        cost       MONEY,
        StockValue AS ABS((endlevel - startlevel) * cost)
        )

    CREATE INDEX funfun ON #inventory (itemcode, StockValue DESC)

    INSERT INTO #inventory (itemcode, startlevel, endlevel, cost)
    SELECT 'Apples',  100, 50, .35 UNION ALL
    SELECT 'Bananas',  75, 50, .25 UNION ALL
    SELECT 'Peaches',  80, 20, .10

    SELECT itemcode,
           stockvalue --= ABS((endlevel - startlevel) * cost)
    FROM #inventory
    WHERE itemcode = 'Apples'
    ORDER BY ABS((endlevel - startlevel) * cost)

    SELECT itemcode,
           stockvalue --= ABS((endlevel - startlevel) * cost)
    FROM #inventory
    WHERE itemcode = 'Apples'
    ORDER BY stockvalue

    If you can't make such a change to the real table, careful use of an INDEXED VIEW may also do the trick, at some expense in storage, because the data will be materialized.
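    A rough sketch of what the indexed-view route might look like (the view and index names here are made up, and the unique clustered index assumes itemcode is unique; indexed views need SCHEMABINDING and a permanent table, so a temp table like #inventory won't do):

    ```sql
    CREATE VIEW dbo.Inventory_StockValue
    WITH SCHEMABINDING
    AS
    SELECT itemcode,
           ABS((endlevel - startlevel) * cost) AS StockValue
    FROM   dbo.inventory
    GO

    -- The unique clustered index is what materializes the view's rows on disk.
    CREATE UNIQUE CLUSTERED INDEX ucx_Inventory_StockValue
        ON dbo.Inventory_StockValue (itemcode, StockValue)
    ```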

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Wow... that really took me by surprise, I didn't expect that!

    I see what you're saying about sorting after the math is done rather than just sorting the column, but if I change that last SELECT from your snippet to

    ALTER TABLE #inventory
        ADD StockVal AS ABS((endlevel - startlevel) * Cost) PERSISTED

    CREATE INDEX funfun2 ON #inventory (itemcode, StockVal)

    SELECT itemcode, StockVal
    FROM #inventory
    WHERE itemcode = 'Apples'
    ORDER BY StockVal

    we drop the sort. Going back to what you said earlier, is this only because we are now sorting on the column itself and not the math? Just making sure I am following along 🙂

    EDIT: Jeff snuck his in before mine :pinch:

  • Mike McQueen (10/8/2009)

    [...] We drop the sort. Going back to what you said earlier, is this only due to the fact that we are now sorting the column itself and not the math?

    EDIT: Jeff snuck his in before mine :pinch:

    Heh... I may have snuck it in before you, but I wrote a SQL Server 2000 solution. I flat forgot about "PERSISTED" in 2005, which eliminates the need for an INDEXED VIEW.

    --Jeff Moden

  • And, to answer your question, "YES"... it's because you are now sorting on an indexed column which contains the answer to the math calculation instead of sorting on a math calculation.

    --Jeff Moden

  • While we're on the topic of computed columns and execution plans, I have a question that has been nagging at me.

    If you check out the execution plan of the last select I posted, with PERSISTED data, it has a Compute Scalar. From my (limited) knowledge, I thought PERSISTED indicated that the data was stored as a column. If the data is now a static column (for the time being, until one of the columns making up the equation is updated) why doesn't the execution plan only show a scan/seek and no compute?

  • changed the WHERE a bit, tested with 100,000 rows:

    SELECT *
    FROM (
         SELECT acct_code, part_no, location, description, sku_no, list_status,
                pub_date, pub_status, succession_status, void, in_stock,
                avg_cost, std_cost,
                ((avg_cost - std_cost) / avg_cost) * 100 AS unit_cost_var,
                (avg_cost - std_cost) * in_stock AS extended_diff,
                in_stock * avg_cost AS extended_avg_cost
         FROM dfi_inv_master_list WITH (NOLOCK)
         ) cte
    WHERE (extended_diff >= @a OR extended_diff <= -1 * @a)
      AND avg_cost <> 0
      AND in_stock <> 0
    ORDER BY ABS(extended_diff) DESC

  • zukko (10/8/2009)

    changed the WHERE a bit, tested with 100,000 rows: [...]

    That's real nice... would you mind sharing the code you used to generate the 100K rows, and maybe give us a hint about what you found, please?

    --Jeff Moden

  • sorry, forgot to add the full script 😛

    CREATE TABLE [dbo].[dfi_inv_master_list]
        (
        [acct_code]         [int] IDENTITY(1,1) NOT NULL,
        [part_no]           [int] NULL,
        [location]          [int] NULL,
        [description]       [int] NULL,
        [sku_no]            [int] NULL,
        [list_status]       [int] NULL,
        [pub_date]          [int] NULL,
        [pub_status]        [int] NULL,
        [succession_status] [int] NULL,
        [void]              [int] NULL,
        [in_stock]          [float] NULL,
        [avg_cost]          [float] NULL,
        [std_cost]          [float] NULL
        ) ON [PRIMARY]

    SELECT TOP 100000
           0 AS part_no, 0 AS location, 0 AS description, 0 AS sku_no,
           0 AS list_status, 0 AS pub_date, 0 AS pub_status,
           0 AS succession_status, 0 AS void,
           IDENTITY(INT, 1, 1) AS in_stock,
           100 AS avg_cost, 200 AS std_cost
    INTO #tmpA
    FROM sys.columns t1, sys.columns t2

    INSERT INTO dfi_inv_master_list
           (part_no, location, description, sku_no, list_status, pub_date,
            pub_status, succession_status, void, in_stock, avg_cost, std_cost)
    SELECT part_no, location, description, sku_no, list_status, pub_date,
           pub_status, succession_status, void, in_stock, avg_cost, std_cost
    FROM #tmpA

    DROP TABLE #tmpA

    SET STATISTICS TIME ON

    DECLARE @a MONEY
    SELECT @a = 1000

    SELECT *
    FROM (
         SELECT acct_code, part_no, location, description, sku_no, list_status,
                pub_date, pub_status, succession_status, void, in_stock,
                avg_cost, std_cost,
                ((avg_cost - std_cost) / avg_cost) * 100 AS unit_cost_var,
                (avg_cost - std_cost) * in_stock AS extended_diff,
                in_stock * avg_cost AS extended_avg_cost
         FROM dfi_inv_master_list WITH (NOLOCK)
         ) cte
    WHERE (extended_diff >= @a OR extended_diff <= -1 * @a)
      AND avg_cost <> 0
      AND in_stock <> 0
    ORDER BY ABS(extended_diff) DESC

    SET STATISTICS TIME OFF

    This helped take some CPU time off the query (at least for me).
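    Applying the computed-column trick from earlier in the thread to this table might look like the sketch below (the column and index names are made up):

    ```sql
    -- Persist the sort expression so an index can hand back the rows in
    -- order instead of the plan needing a Sort operator.
    ALTER TABLE dbo.dfi_inv_master_list
        ADD abs_extended_diff AS ABS((avg_cost - std_cost) * in_stock) PERSISTED

    CREATE INDEX ix_abs_extended_diff
        ON dbo.dfi_inv_master_list (abs_extended_diff DESC)
    ```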

  • Mike McQueen (10/8/2009)

    [...] If the data is now a static column (for the time being, until one of the columns making up the equation is updated) why doesn't the execution plan only show a scan/seek and no compute?

    1. There's no need to persist the column if you are then going to create an index on it.

    2. A compute scalar always appears in the plan for a persisted computed column, for internal reasons. It doesn't mean the value is being recomputed - unless it says so.

    The QO will (annoyingly) often choose to recompute the value, if the base columns for the computation are available to it, and if the computed column isn't stored on the same data access path. This is because the estimated cost of the scalar computation is so low. For example, the QO will often choose to recompute over performing an RID- or KEY-lookup to fetch the persisted value...the lookup will cost a few logical I/Os, whereas the compute scalar appears to be almost free. This is particularly annoying when the persisted value is based on the result of a complex scalar function. Indexing appropriately (which often means INCLUDEing the computed value) is one way around this.
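    For instance, against the #inventory example from earlier in the thread, one way to keep the persisted value on the access path is to INCLUDE it in the index used for the seek (the index name here is made up):

    ```sql
    -- Covering index: the seek on itemcode now carries the stored StockVal
    -- with it, so there's no lookup for the optimizer to dodge by recomputing.
    CREATE INDEX ix_inventory_cover
        ON #inventory (itemcode) INCLUDE (StockVal)
    ```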

  • Paul-

    Thanks for the input! This has been bothering me for a while now. It makes sense, though; we had a sneaking suspicion that this might be the case.

  • zukko (10/8/2009)


    sorry forgot to add the full script 😛

    Thanks zukko.

    --Jeff Moden
